* [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
@ 2015-02-26 11:05 Jaedon Shin
2015-02-26 16:36 ` David Miller
2015-02-26 17:27 ` Eric Dumazet
0 siblings, 2 replies; 6+ messages in thread
From: Jaedon Shin @ 2015-02-26 11:05 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev, Jaedon Shin
This patch prevents the performance degradation of xmit after
605ad7f ("tcp: refine TSO autosizing").
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
---
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index ff83c46b..87eeff0 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
out:
spin_unlock_irqrestore(&ring->lock, flags);
+ if (index != DESC_INDEX)
+ skb_orphan(skb);
+
return ret;
}
--
2.3.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing 2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin @ 2015-02-26 16:36 ` David Miller 2015-02-27 14:33 ` Jaedon Shin 2015-02-26 17:27 ` Eric Dumazet 1 sibling, 1 reply; 6+ messages in thread From: David Miller @ 2015-02-26 16:36 UTC (permalink / raw) To: jaedon.shin; +Cc: f.fainelli, netdev From: Jaedon Shin <jaedon.shin@gmail.com> Date: Thu, 26 Feb 2015 20:05:58 +0900 > This patch prevents the performance degradation of xmit after > 605ad7f ("tcp: refine TSO autosizing"). > > Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> I doubt this is the correct way to fix this. Also, you need to describe in detail what the actual problem is, how you evaluated the cause, and what made you think that your choosen solution is the proper one. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing 2015-02-26 16:36 ` David Miller @ 2015-02-27 14:33 ` Jaedon Shin 2015-02-27 15:13 ` Eric Dumazet 0 siblings, 1 reply; 6+ messages in thread From: Jaedon Shin @ 2015-02-27 14:33 UTC (permalink / raw) To: David Miller; +Cc: Florian Fainelli, netdev, eric.dumazet > On Feb 27, 2015, at 1:36 AM, David Miller <davem@davemloft.net> wrote: > > From: Jaedon Shin <jaedon.shin@gmail.com> > Date: Thu, 26 Feb 2015 20:05:58 +0900 > >> This patch prevents the performance degradation of xmit after >> 605ad7f ("tcp: refine TSO autosizing"). >> >> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> > > I doubt this is the correct way to fix this. > > Also, you need to describe in detail what the actual problem > is, how you evaluated the cause, and what made you think that > your choosen solution is the proper one. The bcmgenet_tx_reclaim() of tx_ring[{0,1,2,3}] process only under 18 descriptors is too late reclaiming transmitted skb. Therefore, performance degradation of xmit after 605ad7f ("tcp: refine TSO autosizing"). # iperf -c 172.16.1.2 -i 1 ------------------------------------------------------------ Client connecting to 172.16.1.20, TCP port 5001 TCP window size: 45.0 KByte (default) ------------------------------------------------------------ [ 3] local 172.16.1.101 port 44020 connected with 172.16.1.20 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0- 1.0 sec 512 KBytes 4.19 Mbits/sec [ 3] 1.0- 2.0 sec 384 KBytes 3.15 Mbits/sec [ 3] 2.0- 3.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 3.0- 4.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 4.0- 5.0 sec 128 KBytes 1.05 Mbits/sec [ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 6.0- 7.0 sec 5.75 MBytes 48.2 Mbits/sec [ 3] 7.0- 8.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 8.0- 9.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 9.0-10.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 0.0-10.0 sec 40.5 MBytes 33.9 Mbits/sec Considered as a way to avoid the two. 1. Using skb_orphan found previous mailing. This is reporting finish of xmit immediately. 2. Using bcmgenet_tx_reclaim_all instead of bcmgenet_tx_reclaim in bcmgenet_poll. But, it is not suitable to use because of bcmgenet_poll is connected with bcmgenet_isr0. So, I had to use the shortest way. the result is as follows: # iperf -c 172.16.1.20 -i 1 ------------------------------------------------------------ Client connecting to 172.16.1.20, TCP port 5001 TCP window size: 45.0 KByte (default) ------------------------------------------------------------ [ 3] local 172.16.1.101 port 50763 connected with 172.16.1.20 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0- 1.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 1.0- 2.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 2.0- 3.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 3.0- 4.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 4.0- 5.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 5.0- 6.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 6.0- 7.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 7.0- 8.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 8.0- 9.0 sec 11.2 MBytes 94.4 Mbits/sec [ 3] 9.0-10.0 sec 11.1 MBytes 93.3 Mbits/sec [ 3] 0.0-10.0 sec 112 MBytes 94.0 Mbits/sec > Hmpff... > > Can you elaborate on the regression ? > > Is it because NIC delays TX completion irq or something ? > > bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC) > > Other tx completions seem to run from hard irq context (bcmgenet_isr1()) > > Your patch seems to imply a bug in hard irq signaling/handling. Good point. The use of skb_orphan is in hard irq. I had mistaken using hard irq in tx_ring[{0,1,2,3}]. The patch of v2 is to use tx_poll like bcmsysport. That too can get the good results. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing 2015-02-27 14:33 ` Jaedon Shin @ 2015-02-27 15:13 ` Eric Dumazet 0 siblings, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2015-02-27 15:13 UTC (permalink / raw) To: Jaedon Shin; +Cc: David Miller, Florian Fainelli, netdev On Fri, 2015-02-27 at 23:33 +0900, Jaedon Shin wrote: > > On Feb 27, 2015, at 1:36 AM, David Miller <davem@davemloft.net> wrote: > > > > From: Jaedon Shin <jaedon.shin@gmail.com> > > Date: Thu, 26 Feb 2015 20:05:58 +0900 > > > >> This patch prevents the performance degradation of xmit after > >> 605ad7f ("tcp: refine TSO autosizing"). > >> > >> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> > > > > I doubt this is the correct way to fix this. > > > > Also, you need to describe in detail what the actual problem > > is, how you evaluated the cause, and what made you think that > > your choosen solution is the proper one. > > The bcmgenet_tx_reclaim() of tx_ring[{0,1,2,3}] process only under 18 > descriptors is too late reclaiming transmitted skb. Therefore, > performance degradation of xmit after 605ad7f ("tcp: refine TSO autosizing"). But the real cause is that this driver disables interrupts unless a certain amount of TX descriptors are used. This is wrong and should be fixed. Only if this fix is proven to not be possible, skb_orphan() will be needed. Random guess, could you simply try this quick and dirty patch : (dirty because I am lazy and I think Florian is already working on it) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index ff83c46bc389..0c2e1b0cd180 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1028,9 +1028,6 @@ static void __bcmgenet_tx_reclaim(struct net_device *dev, last_c_index &= (num_tx_bds - 1); } - if (ring->free_bds > (MAX_SKB_FRAGS + 1)) - ring->int_disable(priv, ring); - if (netif_tx_queue_stopped(txq)) netif_tx_wake_queue(txq); @@ -1302,11 +1299,13 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) bcmgenet_tdma_ring_writel(priv, ring->index, ring->prod_index, TDMA_PROD_INDEX); - if (ring->free_bds <= (MAX_SKB_FRAGS + 1)) { + if (ring->free_bds <= (MAX_SKB_FRAGS + 1)) netif_tx_stop_queue(txq); + + if (unlikely(!ring->int_enabled)) { + ring->int_enabled = true; ring->int_enable(priv, ring); } - out: spin_unlock_irqrestore(&ring->lock, flags); diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h index b36ddec0cc0a..90d52893a1c8 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h @@ -530,6 +530,7 @@ struct bcmgenet_tx_ring { unsigned int prod_index; /* Tx ring producer index SW copy */ unsigned int cb_ptr; /* Tx ring initial CB ptr */ unsigned int end_ptr; /* Tx ring end CB ptr */ + bool int_enabled; void (*int_enable)(struct bcmgenet_priv *priv, struct bcmgenet_tx_ring *); void (*int_disable)(struct bcmgenet_priv *priv, ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing 2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin 2015-02-26 16:36 ` David Miller @ 2015-02-26 17:27 ` Eric Dumazet 2015-02-26 17:29 ` Florian Fainelli 1 sibling, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2015-02-26 17:27 UTC (permalink / raw) To: Jaedon Shin; +Cc: Florian Fainelli, netdev On Thu, 2015-02-26 at 20:05 +0900, Jaedon Shin wrote: > This patch prevents the performance degradation of xmit after > 605ad7f ("tcp: refine TSO autosizing"). > > Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> > --- > drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c > index ff83c46b..87eeff0 100644 > --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c > +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c > @@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) > out: > spin_unlock_irqrestore(&ring->lock, flags); > > + if (index != DESC_INDEX) > + skb_orphan(skb); > + Hmpff... Can you elaborate on the regression ? Is it because NIC delays TX completion irq or something ? bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC) Other tx completions seem to run from hard irq context (bcmgenet_isr1()) Your patch seems to imply a bug in hard irq signaling/handling. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing 2015-02-26 17:27 ` Eric Dumazet @ 2015-02-26 17:29 ` Florian Fainelli 0 siblings, 0 replies; 6+ messages in thread From: Florian Fainelli @ 2015-02-26 17:29 UTC (permalink / raw) To: Eric Dumazet, Jaedon Shin; +Cc: netdev On 26/02/15 09:27, Eric Dumazet wrote: > On Thu, 2015-02-26 at 20:05 +0900, Jaedon Shin wrote: >> This patch prevents the performance degradation of xmit after >> 605ad7f ("tcp: refine TSO autosizing"). >> >> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com> >> --- >> drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c >> index ff83c46b..87eeff0 100644 >> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c >> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c >> @@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) >> out: >> spin_unlock_irqrestore(&ring->lock, flags); >> >> + if (index != DESC_INDEX) >> + skb_orphan(skb); >> + > > Hmpff... > > Can you elaborate on the regression ? > > Is it because NIC delays TX completion irq or something ? > > bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC) > > Other tx completions seem to run from hard irq context (bcmgenet_isr1()) > > Your patch seems to imply a bug in hard irq signaling/handling. Yes, there is a bug in how packets transmitted are reclaimed, I will submit a fix for this shortly. -- Florian ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-02-27 15:13 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin 2015-02-26 16:36 ` David Miller 2015-02-27 14:33 ` Jaedon Shin 2015-02-27 15:13 ` Eric Dumazet 2015-02-26 17:27 ` Eric Dumazet 2015-02-26 17:29 ` Florian Fainelli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).