Considerations in the design of stegtunnel - A method of passing hidden
data in TCP/IP headers.
Stegtunnel is a tool written to hide data within TCP/IP header fields. It was designed to be undetectable, even by people familiar with the tool. It can hide the data underneath real TCP connections, using real, unmodified clients and servers to provide the TCP conversation. In this way, detection of odd-looking sessions is avoided.
Steganography is the art of concealing messages within seemingly innocuous data, sometimes called a cover. Steganography is often found in close proximity to cryptography, and there is one maxim that, while intended for cryptography, can affect steganography as well.
"The enemy knows the system being used" - Claude Shannon
Methods using "Security through Obscurity," that is, steganographic methods that rely on Shannon's maxim proving false, have had a long and proud history. Methods such as positioning window shades to convey a bit of information, or hiding information in "dead drop" locations, depend essentially on the adversary not knowing the system. These systems have digital analogs, such as using normally unused fields or adding extra fields that most software reading the data will ignore. A slightly less obvious approach would be to host a weblog and vary the capitalization of the URLs embedded in its links.
These methods work only as long as the system itself remains unknown. They must be custom built to stay effective, and are thus unsuitable for open-source steganographic tools.
Strong Open-Source Steganography
A strong open-source steganographic tool is one in which it is infeasible to determine, without knowledge of the key, whether the cover material contains additional embedded information. "Breaking" the stegosystem does not require determining exactly what the information is, only determining that additional information is there.
Recovery of the information is not necessary to consider the stegosystem broken because the information may (and perhaps should) be encrypted before being injected into the steganographic cover. The purpose of the steganography is to conceal the existence of the data flow; if it fails at that, it is broken. Stegtunnel was designed with the goal of being undetectable unless the key is known.
How Stegtunnel Works
Stegtunnel hides data in the sequence number and IPID fields of packets used for TCP connections. While improvements to the system have been suggested, and will likely be implemented in future releases of stegtunnel, what follows is a description of version 1 of the protocol, which was demonstrated initially at Rubi-Con 5.
The software released at Rubi-Con 5 consisted of two parts, stegclient and stegserver. Both make use of an artifice called a "silent IP address": a currently unoccupied IP address on the host's local subnet. Stegtunnel listens for packets destined to this IP address, and replies to ARP requests for it.
This enables our userspace program to completely handle the IP conversation, as no kernel state about packets sent to these IP addresses will be kept. Each stegtunnel connection is actually composed of two separate connections, one between the local host and the silent IP, and another between the silent IP and the remote host.
This prevents the kernel from injecting spurious RST packets into our modified connections, and prevents ACK storms as well.
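To make the silent IP idea concrete, here is a minimal sketch of the ARP reply a userspace program could send to claim an unoccupied address. It only builds the frame bytes; actually sending them needs a raw socket or a packet library, and nothing here is taken from the stegtunnel source. The function names and the example addresses are assumptions made for illustration.

```python
import socket
import struct

def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def build_arp_reply(our_mac: str, silent_ip: str,
                    requester_mac: str, requester_ip: str) -> bytes:
    """Build an Ethernet frame holding an ARP reply that says
    'silent_ip is at our_mac' (hypothetical helper, not stegtunnel's code)."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = mac_bytes(requester_mac) + mac_bytes(our_mac) + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), 6-byte MACs, 4-byte IPs, opcode 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender fields advertise the silent IP at our own hardware address.
    arp += mac_bytes(our_mac) + socket.inet_aton(silent_ip)
    # Target fields echo the host that asked.
    arp += mac_bytes(requester_mac) + socket.inet_aton(requester_ip)
    return ethernet + arp
```

Because the kernel has no address configured for the silent IP, it keeps no TCP state for it and has no reason to answer packets sent there, which is the property the design relies on.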
There are other ways of injecting changes into TCP connections. One way would be to make changes to the kernel directly, through a patch or loadable kernel module. This would be the cleanest method, but it would not be portable. Another would be to use a combination of routing table updates and firewall rule changes. Libdnet provides this functionality, and it is a likely future direction for stegtunnel.
Since stegtunnel is as much a protocol as it is a software package, any future implementations, such as a kernel module implementation, should be able to interoperate with the current silent IP implementation.
The first phase of the connection determines whether both sides share the same secret. This phase takes place during the SYN and SYN-ACK packets of the TCP connection. The receiving host uses the packet nonce and the passphrase hash to generate four pseudorandom bytes. If these bytes match the sequence number, it is assumed that the remote host that generated the packet knows the passphrase as well, and the session is considered "keyed."
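The keying check can be sketched as follows. This description does not name the hash function or say how the packet nonce is built, so SHA-256, HMAC, and the caller-supplied nonce bytes below are assumptions chosen for illustration, not stegtunnel's actual primitives.

```python
import hashlib
import hmac
import struct

def pseudorandom_bytes(passphrase: str, nonce: bytes, length: int) -> bytes:
    """Derive `length` bytes from the passphrase hash and a packet nonce.
    Assumption: SHA-256 and HMAC stand in for the real construction."""
    key = hashlib.sha256(passphrase.encode("utf-8")).digest()
    return hmac.new(key, nonce, hashlib.sha256).digest()[:length]

def make_sequence_number(passphrase: str, nonce: bytes) -> int:
    """Sending side: use four derived bytes as the 32-bit sequence number."""
    return struct.unpack("!I", pseudorandom_bytes(passphrase, nonce, 4))[0]

def is_keyed(passphrase: str, nonce: bytes, observed_sequence: int) -> bool:
    """Receiving side: regenerate the four bytes and compare them with
    the sequence number seen in the SYN or SYN-ACK."""
    expected = pseudorandom_bytes(passphrase, nonce, 4)
    return hmac.compare_digest(expected, struct.pack("!I", observed_sequence))
```

A receiver that holds a different passphrase derives different bytes, so the comparison fails and the connection is treated as an ordinary, unkeyed TCP session.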
A host taking part in a "keyed" session will hash each outbound packet for use as a nonce, along with the passphrase, and use that to generate 2 pseudorandom bytes. These bytes are then XORed with the cleartext to be sent, and the resultant bytes are used as the IPID of the outbound packet. Inbound packets from the session will have two pseudorandom bytes generated in the same way, and the XOR with the IPID will extract the plaintext.
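A minimal sketch of the per-packet embedding follows. The hash and the exact bytes that make up the packet nonce are not specified here, so SHA-256, HMAC, and the `packet_nonce` argument are assumptions made for illustration. The nonce would need to be computed over fields both ends see identically, and could not include the IPID itself.

```python
import hashlib
import hmac

def pad16(passphrase: str, packet_nonce: bytes) -> int:
    """Two pseudorandom bytes for one packet, as a 16-bit integer.
    Assumption: SHA-256 and HMAC stand in for the real construction."""
    key = hashlib.sha256(passphrase.encode("utf-8")).digest()
    two_bytes = hmac.new(key, packet_nonce, hashlib.sha256).digest()[:2]
    return int.from_bytes(two_bytes, "big")

def embed(passphrase: str, packet_nonce: bytes, cleartext: bytes) -> int:
    """Outbound: XOR two cleartext bytes with the pad to get the IPID."""
    if len(cleartext) != 2:
        raise ValueError("exactly two bytes fit in one IPID")
    return int.from_bytes(cleartext, "big") ^ pad16(passphrase, packet_nonce)

def extract(passphrase: str, packet_nonce: bytes, ipid: int) -> bytes:
    """Inbound: XOR the IPID with the same pad to recover the two bytes."""
    return (ipid ^ pad16(passphrase, packet_nonce)).to_bytes(2, "big")
```

Since XOR is its own inverse, `extract` undoes `embed` whenever both sides derive the same pad, which requires the same passphrase and the same view of the packet.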
Future Direction of Stegtunnel
Stegtunnel will likely add several modes in the future. One mode would allow it to pass data only in the initial sequence number, at the cost of severely limiting the bandwidth. This would expand the number of operating systems on which it could be used undetectably.
In addition, several weaknesses exist in stegtunnel that prevent it from fully meeting the design objectives. These weaknesses should be addressed. Some weaknesses will require a second version of the stegtunnel protocol to be implemented.
Weaknesses in the Protocol
Currently, the protocol does not handle dropped packets gracefully. In fact, a dropped packet will likely result in the leakage of information from the session, as a nonce collision is probable in the retransmit.
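The leak can be shown with a few lines of arithmetic. The sketch assumes the retransmitted packet hashes to the same nonce, and therefore the same two-byte pad, while carrying the next two bytes of cleartext. The pad value and the cleartext are made up for illustration.

```python
# Hypothetical two-byte pad; an eavesdropper never learns this value.
pad = 0xBEEF

first_cleartext = int.from_bytes(b"hi", "big")
second_cleartext = int.from_bytes(b"ho", "big")

# Both packets share a nonce, so both IPIDs are masked with the same pad.
ipid_original = first_cleartext ^ pad
ipid_retransmit = second_cleartext ^ pad

# XORing the two observed IPIDs cancels the pad completely.
leak = ipid_original ^ ipid_retransmit
print(hex(leak))  # 0x6
```

The observer learns the XOR of the two cleartext chunks without knowing the passphrase. If either chunk is guessable, the other follows directly.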
Out-of-order packets currently result in out-of-order decrypted plaintext. Future implementations may be able to use the TCP sequence numbers to order the packets properly. Since the data is in the IPID, however, it may be difficult to deal with overlapping packets or retransmits.
Due to the use of strongly pseudorandom-appearing output in the IPID and initial sequence number fields, the use of stegtunnel on systems that do not have random ISNs and IPIDs will be quite noticeable. Currently, OpenBSD and grsecurity Linux provide random IPIDs and ISNs. Syn Ack Labs would welcome knowledge about other OSes providing this kind of cover.
Currently, if the plaintext is known or has predictable characteristics, dictionary attacks may be mounted against packets from a stegtunnel session. The nonce is effectively in the clear, so care should be taken to pick only strong passphrases.
Many (all current?) pseudorandom number generators have internal cycles, and may be identified using jitter analysis. When analyzing operating system randomness pools, jitter analysis works best against a large number of samples in a short time frame, so that entropy mixing does not overwhelm the cycles present. Future work should study whether it is possible to confuse jitter analysis without introducing new detectable weaknesses within the code. Without this, the steganography may depend upon using short data streams, or data streams spread over enough time to allow internal entropy gathering to confuse any cycles present in the PRNG.
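The attack can be sketched as a loop over candidate passphrases. The pad derivation (SHA-256 with HMAC) and the shape of the observations are assumptions made for illustration. The important point is only that an observer can recompute the nonce from each captured packet.

```python
import hashlib
import hmac

def pad16(passphrase: str, packet_nonce: bytes) -> int:
    """Two pseudorandom bytes per packet (assumed construction, for illustration)."""
    key = hashlib.sha256(passphrase.encode("utf-8")).digest()
    return int.from_bytes(hmac.new(key, packet_nonce, hashlib.sha256).digest()[:2], "big")

def dictionary_attack(observations, wordlist):
    """Return the first candidate consistent with every observation, or None.

    observations: iterable of (packet_nonce, ipid, known_cleartext) tuples,
    where known_cleartext is the two bytes the observer expects in that packet.
    """
    observations = list(observations)
    for candidate in wordlist:
        if all(pad16(candidate, nonce) ^ ipid == int.from_bytes(cleartext, "big")
               for nonce, ipid, cleartext in observations):
            return candidate
    return None
```

One packet checks a guess against only 16 bits, so a wrong passphrase passes by chance about once in 65,536 tries. A handful of packets with known cleartext removes those false positives, which is why a weak passphrase offers little protection.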
Weaknesses in the Implementation
Because all connections are currently routed through the silent IP, only about 64,000 connections may be active at any given time, due to port exhaustion (a 16-bit port number allows at most 65,535 ports). This may not be much of a problem for the client, but it leaves the server vulnerable to resource exhaustion attacks. Future implementations should either aggressively time out inactive connections, or (probably better) move to the firewall/route-table adjustment style of tunneling.
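The first mitigation, aggressively timing out inactive connections, could look something like the sketch below. The class name, the port range, and the explicit `now` argument are all assumptions made for illustration. This is not how stegtunnel tracks connections.

```python
from collections import deque

class SilentIpPortTable:
    """Tracks which silent-IP ports are in use and reclaims idle ones
    (hypothetical helper, not taken from stegtunnel)."""

    def __init__(self, idle_timeout: float, first_port: int = 1024, last_port: int = 65535):
        self.idle_timeout = idle_timeout
        self.free = deque(range(first_port, last_port + 1))
        self.last_seen = {}  # port -> time of the most recent packet

    def expire(self, now: float) -> list:
        """Release every port idle for longer than the timeout; return them."""
        stale = [port for port, seen in self.last_seen.items()
                 if now - seen > self.idle_timeout]
        for port in stale:
            del self.last_seen[port]
            self.free.append(port)
        return stale

    def allocate(self, now: float) -> int:
        """Reclaim idle ports first, then hand out a free one."""
        self.expire(now)
        if not self.free:
            raise RuntimeError("silent IP port space exhausted")
        port = self.free.popleft()
        self.last_seen[port] = now
        return port

    def touch(self, port: int, now: float) -> None:
        """Record activity on a connection so it is not reclaimed."""
        if port in self.last_seen:
            self.last_seen[port] = now
```

A short timeout limits how long an attacker can pin ports with idle connections, but it does not raise the ceiling itself, which is why moving away from a single silent IP is described as the better fix.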