From: arno@natisbad.org (Arnaud Ebalard)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s
Date: Wed, 20 Nov 2013 22:28:50 +0100
Message-ID: <87txf692zx.fsf@natisbad.org>
In-Reply-To: 20131120191145.GP8581@1wt.eu

Hi,

Willy Tarreau <w@1wt.eu> writes:

> From d1a00e593841223c7d871007b1e1fc528afe8e4d Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@1wt.eu>
> Date: Wed, 20 Nov 2013 19:47:11 +0100
> Subject: EXP: net: mvneta: try to flush Tx descriptor queue upon Rx
>  interrupts
>
> Right now the mvneta driver doesn't handle Tx IRQ, and solely relies on a
> timer to flush Tx descriptors. This causes jerky output traffic with bursts
> and pauses, making it difficult to reach line rate with very few streams.
> This patch tries to improve the situation which is complicated by the lack
> of public datasheet from Marvell. The workaround consists in trying to flush
> pending buffers during the Rx polling. The idea is that for symmetric TCP
> traffic, ACKs received in response to the packets sent will trigger the Rx
> interrupt and will anticipate the flushing of the descriptors.
>
> The results are quite good, a single TCP stream is now capable of saturating
> a gigabit.
>
> This is only a workaround, it doesn't address asymmetric traffic nor datagram
> based traffic.
>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  drivers/net/ethernet/marvell/mvneta.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 5aed8ed..59e1c86 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -2013,6 +2013,26 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
>  	}
>  
>  	pp->cause_rx_tx = cause_rx_tx;
> +
> +	/* Try to flush pending Tx buffers if any */
> +	if (test_bit(MVNETA_F_TX_DONE_TIMER_BIT, &pp->flags)) {
> +		int tx_todo = 0;
> +
> +		mvneta_tx_done_gbe(pp,
> +	                           (((1 << txq_number) - 1) &
> +	                           MVNETA_CAUSE_TXQ_SENT_DESC_ALL_MASK),
> +	                           &tx_todo);
> +
> +		if (tx_todo > 0) {
> +			mod_timer(&pp->tx_done_timer,
> +			          jiffies + msecs_to_jiffies(MVNETA_TX_DONE_TIMER_PERIOD));
> +		}
> +		else {
> +			clear_bit(MVNETA_F_TX_DONE_TIMER_BIT, &pp->flags);
> +			del_timer(&pp->tx_done_timer);
> +		}
> +	}
> +
>  	return rx_done;
>  }

With current Linus tree (head being b4789b8e: aacraid: prevent invalid
pointer dereference), as a baseline here is what I get:

 w/ tcp_wmem left at its default values (4096 16384 4071360):

  via netperf (TCP_MAERTS/TCP_STREAM): 151.13 / 935.50 Mbits/s
  via wget against apache: 15.4 MB/s
  via wget against nginx: 104 MB/s
 
 w/ tcp_wmem set to 4096 16384 262144:

  via netperf (TCP_MAERTS/TCP_STREAM): 919.89 / 935.50 Mbits/s
  via wget against apache: 63.3 MB/s
  via wget against nginx: 104 MB/s
 
With your patch on top of it (and tcp_wmem kept at its default value):

 via netperf: 939.16 / 935.44 Mbits/s
 via wget against apache: 65.9 MB/s (top reports 69.5 sy, 30.1 si
                                     and 72% CPU for apache2)
 via wget against nginx: 106 MB/s


With your patch and MVNETA_TX_DONE_TIMER_PERIOD set to 1 instead of 10
(still w/ tcp_wmem kept at its default value):

 via netperf: 939.12 / 935.84 Mbits/s
 via wget against apache: 63.7 MB/s
 via wget against nginx: 108 MB/s
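
To compare the wget figures (MB/s) with the netperf ones (Mbits/s), a
rough conversion helps. The sketch below is only illustrative: the
helper names are hypothetical, and it assumes 1 MB = 10^6 bytes. If
wget actually reports 2^20-byte units, the converted values would be
about 4.9% higher.

```c
#include <assert.h>

/*
 * Convert a rate in MB/s to Mbit/s.
 * Assumption: 1 MB = 10^6 bytes and 1 Mbit = 10^6 bits.
 */
double mbytes_to_mbits(double mbytes_per_s)
{
	assert(mbytes_per_s >= 0.0);
	return mbytes_per_s * 8.0;
}

/* Fraction of a nominal 1000 Mbit/s link used by a given rate. */
double gbe_utilisation(double mbits_per_s)
{
	assert(mbits_per_s >= 0.0);
	return mbits_per_s / 1000.0;
}
```

Under that assumption, mbytes_to_mbits(65.9) gives 527.2 Mbit/s for
apache and mbytes_to_mbits(108.0) gives 864 Mbit/s for nginx, against
roughly 939 Mbit/s measured by netperf.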

So:

 - First, Eric's patch sitting in Linus tree does fix the regression
   I had on 3.11.7 and early 3.12 (15.4 MB/s vs 256KB/s).

 - As can be seen in the results of the first test, Eric's patch still
   requires some additional tweaking of tcp_wmem to get netperf and
   apache somewhat happy w/ perfectible drivers (63.3 MB/s instead of
   15.4 MB/s by setting max tcp send buffer space to 256KB for apache).

 - For unknown reasons, nginx manages to provide a 104MB/s download rate
   even with a tcp_wmem set to default and no specific patch of mvneta.

 - Now, Willy's patch seems to make netperf happy (link saturated from
   server to client), w/o tweaking tcp_wmem.

 - Again with Willy's patch, I guess the "limitations" of the platform
   (1.2GHz CPU w/ 512MB of RAM) somehow prevent Apache from saturating
   the link. All I can say is that the same test some months ago on a
   1.6GHz ARMv5TE (kirkwood 88f6282) w/ 256MB of RAM gave me 108MB/s. I
   do not know if it is some apache regression, some mvneta vs
   mv643xx_eth difference or some CPU frequency issue, but having netperf
   and nginx happy makes me wonder about Apache.

 - Willy, setting MVNETA_TX_DONE_TIMER_PERIOD to 1 instead of 10 w/ your
   patch does not improve on the already good values I get with it.
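
One way to reason about why the timer period matters without the patch
but not with it is a simple bound: if sent Tx descriptors are reclaimed
only once per timer period, and at most a fixed number of bytes can be
queued for transmission in between, throughput cannot exceed that
amount divided by the period. The sketch below is a toy model, not
driver code; the function name and the in-flight limit are hypothetical
inputs, not values taken from this thread.

```c
#include <assert.h>

/*
 * Upper bound on Tx throughput, in bytes per second, when completed
 * Tx descriptors are reclaimed only once every `period_ms`
 * milliseconds and at most `inflight_limit_bytes` can be queued
 * between two reclaims. Both parameters are hypothetical.
 */
double timer_bound_bytes_per_s(double inflight_limit_bytes, double period_ms)
{
	assert(period_ms > 0.0);
	return inflight_limit_bytes * 1000.0 / period_ms;
}
```

For example, with an assumed limit of 131072 bytes, a 10 ms period
bounds the rate at 13107200 bytes/s, and a 1 ms period at 131072000
bytes/s. Once descriptors are also reclaimed from the Rx poll path, as
in the patch, this timer bound presumably stops being the limiting
factor, which would be consistent with the 1 vs 10 setting making no
visible difference above.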


In the end, if you iterate on your work to push a version of your patch
upstream, I'll be happy to test it. And thanks for the time you already
spent!

Cheers,

a+
