From: w@1wt.eu (Willy Tarreau)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s
Date: Wed, 20 Nov 2013 18:34:36 +0100
Message-ID: <20131120173436.GK8581@1wt.eu>
In-Reply-To: <20131120171227.GG8581@1wt.eu>
On Wed, Nov 20, 2013 at 06:12:27PM +0100, Willy Tarreau wrote:
> Hi guys,
>
> On Sun, Nov 17, 2013 at 09:41:38AM -0800, Eric Dumazet wrote:
> > On Sun, 2013-11-17 at 15:19 +0100, Willy Tarreau wrote:
> >
> > >
> > > So it is fairly possible that in your case you can't fill the link if you
> > > consume too many descriptors. For example, if your server uses TCP_NODELAY
> > > and sends incomplete segments (which is quite common), it's very easy to
> > > run out of descriptors before the link is full.
> >
> > BTW I have a very simple patch for TCP stack that could help this exact
> > situation...
> >
> > Idea is to use TCP Small Queues so that we don't fill qdisc/TX ring with
> > very small frames, and let tcp_sendmsg() have more chance to fill
> > complete packets.
> >
> > Again, for this to work very well, you need that NIC performs TX
> > completion in reasonable amount of time...
>
> Eric, first I would like to confirm that I could reproduce Arnaud's issue
> using 3.10.19 (160 kB/s in the worst case).
>
> Second, I confirm that your patch partially fixes it and my performance
> can be brought back to what I had with 3.10-rc7, but with a lot of
> concurrent streams. In fact, in 3.10-rc7, I managed to constantly saturate
> the wire when transferring 7 concurrent streams (118.6 MB/s). With the patch
> applied, performance is still only 27 MB/s at 7 concurrent streams, and I
> need at least 35 concurrent streams to fill the pipe. Strangely, after
> 2 GB of cumulative data transferred, the bandwidth dropped 11-fold and
> fell to 10 MB/s again.
>
> If I revert both "0ae5f47eff tcp: TSQ can use a dynamic limit" and
> your latest patch, the performance is back to original.
>
> Now I understand there's a major issue with the driver. But since the
> patch emphasizes the situations where drivers take a lot of time to
> wake the queue up, don't you think there could be an issue with low
> bandwidth links (eg: PPPoE over xDSL, 10 Mbps ethernet, etc...) ?
> I'm a bit worried about what we might discover in this area I must
> confess (despite generally being mostly focused on 10+ Gbps).
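To put rough numbers on the concern quoted above, here is a
back-of-the-envelope sketch in Python. It assumes one TX descriptor per
segment and a fixed TX completion interval; the ring size (512) and the
10 ms interval are purely illustrative values, not measurements from
this driver, and the function name is made up for the example.

```python
# Back-of-the-envelope model: the TX ring can hold at most `descriptors`
# segments between two TX completions, so throughput is capped at
# descriptors * payload / interval, and can never exceed the link rate.
# All numbers below are illustrative assumptions.

def max_throughput(descriptors, payload_bytes, completion_interval_s,
                   link_bytes_per_s):
    """Upper bound in bytes/s, assuming one descriptor per segment."""
    ring_limited = descriptors * payload_bytes / completion_interval_s
    return min(ring_limited, link_bytes_per_s)

GBE = 125_000_000       # 1 Gbps expressed in bytes/s
TEN_MBPS = 1_250_000    # 10 Mbps expressed in bytes/s

if __name__ == "__main__":
    for payload in (1448, 256, 64):
        bound = max_throughput(512, payload, 0.010, GBE)
        print(f"{payload:5d}-byte segments: at most {bound / 1e6:.1f} MB/s")
```

In this simplified model small segments cap a GbE link well below wire
speed, while a 10 Mbps link stays limited by the wire rather than by the
ring. It says nothing about how TSQ interacts with a slow queue wake-up,
which is the part I am worried about.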
One important point: I was looking for the other patch you pointed to
in this long thread, and finally found it:
> So
> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=98e09386c0ef4dfd48af7ba60ff908f0d525cdee
>
> restored this minimal amount of buffering, and let the bigger amount for
> 40Gb NICs ;)
This one definitely restores original performance, so it's a much better
bet in my opinion :-)
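For anyone skimming the thread, the effect of a fixed floor can be
sketched as follows. This is not the kernel code: the function names,
the 128 kB floor, the 2 kB single-packet size and the "about 1 ms of
data at the current rate" rule are all assumptions, chosen only to show
why a floor helps slow flows while fast flows still get a larger budget.

```python
# Toy model of a per-socket limit on bytes queued below TCP
# (qdisc + TX ring). Hypothetical names and constants; see the
# commit linked above for the real logic.

def queue_limit(rate_bytes_per_s, floor_bytes=128 * 1024):
    """Allow about 1 ms of data at the current rate, never less than the floor."""
    dynamic = rate_bytes_per_s // 1000
    return max(floor_bytes, dynamic)

def queue_limit_no_floor(rate_bytes_per_s, one_packet_bytes=2048):
    """Same rule without the floor: only a single packet is guaranteed."""
    return max(one_packet_bytes, rate_bytes_per_s // 1000)

if __name__ == "__main__":
    for rate in (10_000_000, 125_000_000, 5_000_000_000):
        print(rate, queue_limit_no_floor(rate), queue_limit(rate))
```

With a slow flow (10 MB/s) the rule without a floor allows only about
10 kB below TCP, which a driver with slow TX completion drains long
before it wakes the queue. At 40 Gbps rates both rules give the same,
larger budget.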
Best regards,
Willy