Limitations of the TCP/IP protocol make it very difficult for a single application to saturate a network connection.
I'm just curious, but exactly which "limitations" are those? I can believe that parallel connections help in practice (especially when fetching small objects), but for large objects, I find it surprising you can't get reasonably close to saturating a single network connection with a modern TCP stack (e.g., using TCP window scaling).
It's pretty much impossible to saturate even a LAN connection with a single TCP connection. There are a number of issues at play here: RTT (Round Trip Time, i.e. ping/latency), window sizes, packet loss, and initcwnd (TCP's initial window).
The combination of the limitations imposed by the speed of light and TCP's windowing system means that you are buggered when transferring large files over high-latency TCP connections. I haven't checked their figures, but here's a TCP rate calculator I just found which lets you tune the different parameters: http://osn.fx.net.nz/LFN/
The greater the delay, the bigger the impact. For example, if we take a standard Windows XP machine and plug in the values for a standard Gigabit LAN (typically .2ms latency between hosts) we get a maximum speed of 700 Mbit/sec, but if we try it between two hosts, one of them in the USA (typically around 120ms RTT), the maximum transfer rate falls to 1.17 Mbit/sec.
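The arithmetic behind those two numbers is just window size divided by RTT. A quick sketch (the 17,520-byte figure is Windows XP's commonly cited default receive window, i.e. 12 segments of 1460 bytes; treat it as an assumption):

```python
# Max throughput of a single TCP connection with a fixed window:
# the sender can have at most one window of data in flight per RTT.
def max_throughput_mbit(window_bytes, rtt_seconds):
    return window_bytes * 8 / rtt_seconds / 1e6

WINDOW = 17520  # assumed XP default receive window (12 x 1460 bytes)

print(max_throughput_mbit(WINDOW, 0.0002))  # 0.2 ms LAN -> ~700 Mbit/sec
print(max_throughput_mbit(WINDOW, 0.120))   # 120 ms transatlantic -> ~1.17 Mbit/sec
```

Note that the link's actual bandwidth never appears: once the window is the bottleneck, only RTT matters.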
> There are a number of issues at play here: RTT (Round Trip Time, i.e. ping/latency), window sizes, packet loss and initcwnd (TCP's initial window).
Initial window size: not relevant AFAICS; I'm not talking about connection startup behavior.
RTT, Window size: if the bandwidth-delay product is large, obviously you need a large window size (>>65K). Thankfully, recent TCP stacks support TCP window scaling.
Packet loss: you need relatively large buffers (by the standards of traditional TCP) and a sane scheme for recovering from packet loss (e.g., SACK), but I don't see why this is a show stopper on modern TCP stacks.
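To put a number on the "large window" point above, the window you need to keep a link busy is the bandwidth-delay product. A small sketch (the 1 Gbit/sec / 120 ms figures reuse the example from earlier in the thread):

```python
# Window needed to fill a link = bandwidth * RTT (the bandwidth-delay product).
def window_needed_bytes(bandwidth_bit_per_s, rtt_seconds):
    return bandwidth_bit_per_s * rtt_seconds / 8

# 1 Gbit/sec at 120 ms RTT needs a ~15 MB window, vastly more than the
# 65,535-byte ceiling of TCP's unscaled 16-bit window field, which is
# exactly what the window scaling option exists to fix.
print(window_needed_bytes(1e9, 0.120))  # 15,000,000 bytes
```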
I'm not super familiar with the SPDY work, but from what I recall, it primarily addresses connection startup behavior, rather than steady-state behavior.