Lately I have spent most of my spare time teaching myself how to traverse firewalls and consumer/professional routers automatically in software. If consumers no longer need to know anything about router configuration, network and P2P based applications and hardware open up to a whole new audience of general consumers. This is a pretty specialized area with a major lack of good articles, so I wanted to provide a summary of my own research into the topic.
I am approaching this topic from a programmer's perspective, not an implementer's. There is a lot of information available about people setting up ALGs (Application Layer Gateways, which fix the packets that NAT screws up for things like IP telephony), SIP phones and a whole bunch of other crap. I want to know the guts from a network programming perspective at the TCP and UDP packet layer, not high-level implementation information. I want to implement NAT traversal techniques in a P2P manner so that my custom fabricated devices can do things like transfer files to each other or accept incoming connection requests on the fly.
Why Group NAT Traversal and P2P Technologies Together?
Clients should be able to perform intelligent NAT traversal themselves, without the need for a central server. That is my programming objective, which is why I am looking at NAT traversal from a P2P perspective.
Background Reading on NAT Traversal, P2P, and Multimedia Protocols Before We Get Started
If you are just starting to learn about NAT traversal and P2P, some of this may not make sense until you get a better understanding. My prior experience is TCP and UDP socket programming in Flash and C# .NET, none of which had to traverse NAT. The items I have read, am in the process of reading, or have seen prior to writing this are as follows:
Great Introduction Video:
Yahoo! Director of Engineering explains STUN and TURN
NAT Traversal Summary Articles and Papers:
NAT Traversal Techniques and Peer-to-Peer Applications by Zhou Hu
Peer-to-Peer Communication Across Network Address Translators by Bryan Ford, Pyda Srisuresh, and Dan Kegel
Wikipedia Page on SOCKS
P2P Summary Articles and Papers, thank gosh for Wikipedia:
Wikipedia Page on P2P (This actually has a good summary)
Wikipedia Page on Gnutella
Wikipedia Page on Gnutella2
Wikipedia Page on eDonkey
Wikipedia Page on XMPP
RFCs:
SIP RFC 3261
RTP Updated RFC 3550
RTMP Adobe's lackluster document on it
ICE RFC 5245
STUN Updated RFC 5389
TURN IETF BEHAVE Draft 16
Open Source Projects Code I Have Been Looking at:
Vuze
Shareaza
Pidgin
Jnushare (I don't think it's practical to use JXTA, but I was interested in checking out projects that used it.)
Java NAT Traversal Made Simple
How do IM Clients Handle NAT?
If ICE, STUN, and TURN are the best options for NAT traversal, then how the heck have IM clients been traversing NAT since the 1990s? My guess is that IM clients have used either SOCKS or another proxy technique similar to TURN to achieve such good connectivity records in the past. I may be wrong.
I couldn't find anything helpful at all in the Pidgin source code. I really couldn't even find where they handle the network protocol. Going by what Wikipedia lists for each proprietary protocol, it really looks like IM generally works via proxy, with XMPP being the only protocol that sort of starts to implement a distributed methodology. I think normal message transfer happens like this:
1. You log into your IM server and keep a persistent outbound connection to it all the time.
2. When you want to send or get a message, it must go through that central server.
3. Not too sure about transferring files. It would make sense for the central server to set up hole punching between two clients, but I see no indication of this happening in the Pidgin source code. If it relays the file through the central server, then that is a pretty big strain on the central server.
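To make steps 1 and 2 above concrete, here is a minimal sketch of the relay pattern I am guessing at. The line protocol (a login name, then "recipient:message" lines) and the names RelayServer and login are made up for illustration; this is not how Pidgin or any real IM protocol works on the wire.

```python
import socket
import threading


class RelayServer:
    """Toy central server (hypothetical protocol): relays lines between logged-in clients."""

    def __init__(self, host="127.0.0.1", port=0):
        self.listener = socket.create_server((host, port))
        self.address = self.listener.getsockname()
        self.clients = {}              # login name -> connected socket
        self.lock = threading.Lock()

    def serve_forever(self):
        while True:
            conn, _ = self.listener.accept()
            threading.Thread(target=self._handle, args=(conn,), daemon=True).start()

    def _handle(self, conn):
        with conn, conn.makefile("r", encoding="utf-8", newline="\n") as reader:
            name = reader.readline().strip()       # step 1: the client logs in
            with self.lock:
                self.clients[name] = conn
            conn.sendall(b"OK\n")
            for line in reader:                    # step 2: every message passes through here
                recipient, _, text = line.rstrip("\n").partition(":")
                with self.lock:
                    target = self.clients.get(recipient)
                if target is not None:
                    target.sendall(f"{name}:{text}\n".encode("utf-8"))
            with self.lock:
                self.clients.pop(name, None)


def login(server_address, name):
    """Open the persistent outbound connection and register a name."""
    sock = socket.create_connection(server_address, timeout=5)
    reader = sock.makefile("r", encoding="utf-8", newline="\n")
    sock.sendall(f"{name}\n".encode("utf-8"))
    if reader.readline() != "OK\n":
        raise ConnectionError("login rejected")
    return sock, reader
```

Both clients only ever make outbound TCP connections, which is why this pattern works behind almost any NAT. The cost is that every byte, including file transfers, flows through the server.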
Analysis of NAT Traversal in Vuze and Shareaza
Based on looking at the source code for Vuze and Shareaza, I think they attempt to traverse NAT in the following way:
Vuze:
1. They use UPnP.
2. They also use NAT-PMP, which must have just been someone's pet project because so few routers support it. (It's an alternative protocol to UPnP developed by Apple and mainly supported by their devices.)
3. I THINK that they establish outbound connections to the nearest DHT leaf node. I think this outbound connection to the DHT node may coordinate UDP hole punching for the final transfers.
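For reference, the core of UDP hole punching is small. The sketch below assumes some rendezvous point (for example the DHT node I am guessing at above) has already told each peer the other's public IP and port. The function name punch and its parameters are my own, and I have not confirmed that Vuze does it this way.

```python
import socket


def punch(sock, peer_endpoint, attempts=5, timeout=0.5):
    """Send probes to the peer's public endpoint until one of its packets arrives.

    Each outbound probe creates (or refreshes) a mapping in our own NAT, so a
    later packet from the peer's endpoint looks like a reply and is let through.
    """
    sock.settimeout(timeout)
    for _ in range(attempts):
        sock.sendto(b"punch", peer_endpoint)
        try:
            data, addr = sock.recvfrom(1500)
        except socket.timeout:
            continue                      # peer's probe has not reached us yet
        if addr == peer_endpoint:
            return True                   # the hole is open in both directions
    return False
```

Both peers have to call this at roughly the same time, using the same local socket they used to talk to the rendezvous point, otherwise the NAT may hand out a different public port. It generally does not work when a NAT picks a new public port per destination (symmetric NAT), which is where a relay such as TURN comes in.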
Shareaza:
1. They are Windows only, so they try to automatically add exceptions to the Windows Firewall.
2. They also implement UPnP.
3. I don't see any indication of hole punching or any other attempt at inbound connectivity. Did I miss something?
Although Insecure From a Security Standpoint, Take Advantage of UPnP
After doing a lot of reading and looking at the source code of Azureus (Java, now called Vuze), Shareaza (C/C++), Pidgin (C/C++), and a few other applications that I think would have to traverse NAT, it seems like there are a few common denominators. First off, Azureus and Shareaza implement UPnP (Universal Plug and Play) libraries, which I really think probably nails at least 50% of generic user cases. Despite UPnP being pretty dangerous from a security standpoint according to Steve Gibson, normal users have no clue about it and will have it enabled, so why not take advantage of it and get the first or second firewall out of the way as a programming problem?
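The first step of any UPnP port mapping is finding the router. Here is a rough sketch of the SSDP discovery request for an Internet Gateway Device and of pulling the LOCATION header out of the reply. The function names are mine, and the actual port mapping (a SOAP AddPortMapping call against the URL found here) is left out.

```python
import socket

SSDP_ADDR = ("239.255.255.250", 1900)   # standard SSDP multicast group and port
IGD_SEARCH_TARGET = "urn:schemas-upnp-org:device:InternetGatewayDevice:1"


def build_msearch(search_target=IGD_SEARCH_TARGET, mx=2):
    """Build the SSDP M-SEARCH request that asks gateways to announce themselves."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR[0]}:{SSDP_ADDR[1]}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",                     # max seconds a device may wait before replying
        f"ST: {search_target}",
        "", "",                          # blank line terminates the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_location(response):
    """Return the LOCATION header (device description URL) or None."""
    text = response.decode("ascii", errors="replace")
    for line in text.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            return value.strip()
    return None


def discover_gateway(timeout=3.0):
    """Multicast one M-SEARCH and return the first gateway's description URL, if any."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_msearch(), SSDP_ADDR)
        try:
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None                  # no UPnP gateway answered
    return parse_location(data)
```

If discover_gateway() returns None, UPnP is either disabled or unsupported on the router and you have to fall back to another technique.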
Shareaza also tries to poke exceptions in the Windows Firewall without prompting users about it (I think Shareaza is Windows only, so that makes a bit of sense). Check out the Firewall.cpp file under trunk/shareaza in its source code.
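I have not reproduced what Firewall.cpp does internally. As a stand-in, the same effect can be sketched by shelling out to the netsh command-line tool, which is a different mechanism than calling the firewall API from C++. The function names, rule name, and program path below are placeholders of my own.

```python
import subprocess
import sys


def firewall_rule_command(rule_name, program_path):
    """Build a netsh command that allows inbound traffic to one program."""
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={rule_name}",
        "dir=in",
        "action=allow",
        f"program={program_path}",
        "enable=yes",
    ]


def add_firewall_exception(rule_name, program_path):
    """Run the command; needs an elevated (administrator) process on Windows."""
    if sys.platform != "win32":
        raise OSError("Windows Firewall rules can only be added on Windows")
    subprocess.run(firewall_rule_command(rule_name, program_path), check=True)
```

Keep in mind this only opens the software firewall on the PC itself. The router's NAT is still in the way, which is why it goes together with UPnP.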
This is an area that I like because it is a tricky programming problem: no single implementation of NAT traversal will work for all situations. From a programmer's perspective, the only way to get close to guaranteed connectivity is to implement as many NAT traversal techniques as you can.
Despite most of the documentation you will find online being geared towards IP telephony, my personal belief based on research is that ICE, STUN, and TURN are the current best techniques for NAT traversal. These protocols are relatively new, originating in the 2000s.
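In code, "implement as many techniques as you can" ends up looking like a fallback chain. This is only a sketch of the control flow. The helper first_working is hypothetical, and each attempt is assumed to be a function that returns something usable (say, a public endpoint) on success, returns None on failure, or raises OSError.

```python
def first_working(techniques):
    """Try each (name, attempt) pair in order and return the first success.

    Returns (name, result) for the first attempt that yields a non-None
    result, or None if every technique fails.
    """
    for name, attempt in techniques:
        try:
            result = attempt()
        except OSError:
            continue                     # e.g. no gateway answered, socket error
        if result is not None:
            return name, result
    return None
```

A sensible order is cheapest first: UPnP, then NAT-PMP, then hole punching, with a relay as the last resort since it is the only one that costs server bandwidth.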
With ICE, STUN, and TURN you need a server with a public IP address running the "server" implementations of these protocols. So far I have played around with the Vovida C++ based Windows STUN server with lackluster results. To support backwards compatibility I think you must have two network interface cards on your server with two public IPs. The downside of the Vovida implementation is that I can't figure out how to successfully run the server on a single interface card with two different ports!
For my client STUN requests I have been using a Java application that uses Jason Derle's Stun4J wrapper. It would be pretty cool if you could use P2P clients on the public internet to act as STUN and TURN servers, along the lines of the paradigm that Skype uses with its "Super nodes".
That's all I have for now.
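To show what a client STUN request actually looks like on the wire, here is a rough sketch of an RFC 5389 Binding request and of decoding the XOR-MAPPED-ADDRESS attribute from the reply. This is not the Stun4J API. The function names are mine, only IPv4 is handled, and an older RFC 3489 server may answer with a plain MAPPED-ADDRESS instead, which this sketch ignores.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, body length (0), magic cookie, 12-byte transaction ID."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(response):
    """Return (ip, port) as seen by the server, or None if not present."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port and address are XORed with the magic cookie (IPv4 family = 0x01).
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            ip_int = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", ip_int)), port
        offset += 4 + attr_len + (-attr_len % 4)   # attributes are padded to 4 bytes
    return None


def query_public_endpoint(server, timeout=2.0):
    """Ask a STUN server (host, port) which public address it sees for us."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, server)
        data, _ = sock.recvfrom(2048)
    if data[8:20] != request[8:20]:
        return None                      # transaction ID mismatch, not our reply
    return parse_xor_mapped_address(data)
```

The (ip, port) pair that comes back is exactly what two peers would exchange before attempting a hole punch, so in principle any publicly reachable peer that can answer this request could play the STUN server role.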