Re: Linux TCP's Robustness to Multipath Packet Reordering

From: Eric Dumazet <eric.dumazet@gmail.com>
To: Dominik Kaspar <dokaspar.ietf@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: Linux TCP's Robustness to Multipath Packet Reordering
Date: Mon, 25 Apr 2011 13:25:01 +0200
Message-ID: <1303730701.2747.110.camel@edumazet-laptop>
In-Reply-To: <BANLkTimpgXCpweZKCihCQkLjSZw5zL4=Pg@mail.gmail.com>

On Monday, 25 April 2011 at 12:37 +0200, Dominik Kaspar wrote:
> Hello,
> 
> Knowing how critical packet reordering is for standard TCP, I am
> currently testing how robust Linux TCP is when packets are forwarded
> over multiple paths (with different bandwidth and RTT). Since Linux
> TCP adapts its "dupAck threshold" to an estimated level of packet
> reordering, I expect it to be much more robust than a standard TCP
> that strictly follows the RFCs. Indeed, as you can see in the
> following plot, my experiments show a step-wise adaptation of Linux
> TCP to heavy reordering. After many minutes, Linux TCP finally reaches
> a data throughput close to the perfect aggregated data rate of two
> paths (emulated with characteristics similar to IEEE 802.11b (WLAN)
> and a 3G link (HSPA)):
> 
> http://home.simula.no/~kaspar/static/mptcp-emu-wlan-hspa-00.png
> 
> Does anyone have clues what's going on here? Why does the aggregated
> throughput increase in steps? And what could be the reason it takes
> minutes to adapt to the full capacity, when in other cases, Linux TCP
> adapts much faster (for example if the bandwidth of both paths are
> equal). I would highly appreciate some advice from the netdev
> community.
> 
> Implementation details:
> This multipath TCP experiment ran between a sending machine with a
> single Ethernet interface (eth0) and a client with two Ethernet
> interfaces (eth1, eth2). The machines are connected through a switch
> and tc/netem is used to emulate the bandwidth and RTT of both paths.
> TCP connections are established using iperf between eth0 and eth1 (the
> primary path). At the sender, an iptables' NFQUEUE is used to "spoof"
> the destination IP address of outgoing packets and force some to
> travel to eth2 instead of eth1 (the secondary path). This multipath
> scheduling happens in proportion to the emulated bandwidths, so if the
> paths are set to 500 and 1000 KB/s, then packets are distributed in a
> 1:2 ratio. At the client, iptables' RAWDNAT is used to translate the
> spoofed IP addresses back to their original, so that all packets end
> up at eth1, although a portion actually travelled to eth2. ACKs are
> not scheduled over multiple paths, but always travel back on the
> primary path. TCP does not notice anything of the multipath
> forwarding, except the side-effect of packet reordering, which can be
> huge if the path RTTs are set very differently.
> 

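The proportional scheduling described in the quoted message (paths of 500
and 1000 KB/s giving a 1:2 packet split) can be sketched as below. This is
only an illustration, not the NFQUEUE program used in the experiment: the
function name, the path labels, the credit-based weighted round-robin, and
the choice of which path gets which rate are all assumptions.

```python
from collections import Counter


def schedule(num_packets, bandwidths):
    """Assign packets to paths in proportion to `bandwidths`.

    Hypothetical credit-based weighted round-robin: for every packet each
    path earns credit equal to its bandwidth, the path holding the most
    credit sends the packet and pays back the sum of all bandwidths.
    """
    total = sum(bandwidths.values())
    credit = {path: 0 for path in bandwidths}
    assignment = []
    for _ in range(num_packets):
        for path, bw in bandwidths.items():
            credit[path] += bw
        chosen = max(credit, key=credit.get)  # path with the most credit
        credit[chosen] -= total
        assignment.append(chosen)
    return assignment


# Assumed rates in KB/s; the original message does not say which path is which.
paths = {"primary": 500, "secondary": 1000}
counts = Counter(schedule(3000, paths))
print(counts["primary"], counts["secondary"])
```

With these assumed rates, one packet in three goes to the primary path and
the rest to the secondary one, and the two paths are interleaved rather
than served in long bursts.
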
Hi Dominik

Implementation details of the tc/netem stages are important to fully
understand how the TCP stack can react.

Is TSO active on the sender side, for example?

Your results show that the bandwidth really changes only on some
exceptional events.

A tcpdump/pcap of the first ~10,000 packets would be nice to have (not on
the mailing list, but on your web site).
