* [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
@ 2015-02-26 11:05 Jaedon Shin
2015-02-26 16:36 ` David Miller
2015-02-26 17:27 ` Eric Dumazet
0 siblings, 2 replies; 6+ messages in thread
From: Jaedon Shin @ 2015-02-26 11:05 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev, Jaedon Shin
This patch prevents the performance degradation of xmit after
605ad7f ("tcp: refine TSO autosizing").
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
---
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index ff83c46b..87eeff0 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
out:
spin_unlock_irqrestore(&ring->lock, flags);
+ if (index != DESC_INDEX)
+ skb_orphan(skb);
+
return ret;
}
--
2.3.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin
@ 2015-02-26 16:36 ` David Miller
2015-02-27 14:33 ` Jaedon Shin
2015-02-26 17:27 ` Eric Dumazet
1 sibling, 1 reply; 6+ messages in thread
From: David Miller @ 2015-02-26 16:36 UTC (permalink / raw)
To: jaedon.shin; +Cc: f.fainelli, netdev
From: Jaedon Shin <jaedon.shin@gmail.com>
Date: Thu, 26 Feb 2015 20:05:58 +0900
> This patch prevents the performance degradation of xmit after
> 605ad7f ("tcp: refine TSO autosizing").
>
> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
I doubt this is the correct way to fix this.
Also, you need to describe in detail what the actual problem
is, how you evaluated the cause, and what made you think that
your choosen solution is the proper one.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin
2015-02-26 16:36 ` David Miller
@ 2015-02-26 17:27 ` Eric Dumazet
2015-02-26 17:29 ` Florian Fainelli
1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2015-02-26 17:27 UTC (permalink / raw)
To: Jaedon Shin; +Cc: Florian Fainelli, netdev
On Thu, 2015-02-26 at 20:05 +0900, Jaedon Shin wrote:
> This patch prevents the performance degradation of xmit after
> 605ad7f ("tcp: refine TSO autosizing").
>
> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
> ---
> drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index ff83c46b..87eeff0 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
> out:
> spin_unlock_irqrestore(&ring->lock, flags);
>
> + if (index != DESC_INDEX)
> + skb_orphan(skb);
> +
Hmpff...
Can you elaborate on the regression ?
Is it because NIC delays TX completion irq or something ?
bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC)
Other tx completions seem to run from hard irq context (bcmgenet_isr1())
Your patch seems to imply a bug in hard irq signaling/handling.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
2015-02-26 17:27 ` Eric Dumazet
@ 2015-02-26 17:29 ` Florian Fainelli
0 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2015-02-26 17:29 UTC (permalink / raw)
To: Eric Dumazet, Jaedon Shin; +Cc: netdev
On 26/02/15 09:27, Eric Dumazet wrote:
> On Thu, 2015-02-26 at 20:05 +0900, Jaedon Shin wrote:
>> This patch prevents the performance degradation of xmit after
>> 605ad7f ("tcp: refine TSO autosizing").
>>
>> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
>> ---
>> drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> index ff83c46b..87eeff0 100644
>> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
>> @@ -1310,6 +1310,9 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
>> out:
>> spin_unlock_irqrestore(&ring->lock, flags);
>>
>> + if (index != DESC_INDEX)
>> + skb_orphan(skb);
>> +
>
> Hmpff...
>
> Can you elaborate on the regression ?
>
> Is it because NIC delays TX completion irq or something ?
>
> bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC)
>
> Other tx completions seem to run from hard irq context (bcmgenet_isr1())
>
> Your patch seems to imply a bug in hard irq signaling/handling.
Yes, there is a bug in how packets transmitted are reclaimed, I will
submit a fix for this shortly.
--
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
2015-02-26 16:36 ` David Miller
@ 2015-02-27 14:33 ` Jaedon Shin
2015-02-27 15:13 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Jaedon Shin @ 2015-02-27 14:33 UTC (permalink / raw)
To: David Miller; +Cc: Florian Fainelli, netdev, eric.dumazet
> On Feb 27, 2015, at 1:36 AM, David Miller <davem@davemloft.net> wrote:
>
> From: Jaedon Shin <jaedon.shin@gmail.com>
> Date: Thu, 26 Feb 2015 20:05:58 +0900
>
>> This patch prevents the performance degradation of xmit after
>> 605ad7f ("tcp: refine TSO autosizing").
>>
>> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
>
> I doubt this is the correct way to fix this.
>
> Also, you need to describe in detail what the actual problem
> is, how you evaluated the cause, and what made you think that
> your choosen solution is the proper one.
The bcmgenet_tx_reclaim() of tx_ring[{0,1,2,3}] process only under 18
descriptors is too late reclaiming transmitted skb. Therefore,
performance degradation of xmit after 605ad7f ("tcp: refine TSO autosizing").
# iperf -c 172.16.1.2 -i 1
------------------------------------------------------------
Client connecting to 172.16.1.20, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.101 port 44020 connected with 172.16.1.20 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 512 KBytes 4.19 Mbits/sec
[ 3] 1.0- 2.0 sec 384 KBytes 3.15 Mbits/sec
[ 3] 2.0- 3.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 3.0- 4.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 4.0- 5.0 sec 128 KBytes 1.05 Mbits/sec
[ 3] 5.0- 6.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 6.0- 7.0 sec 5.75 MBytes 48.2 Mbits/sec
[ 3] 7.0- 8.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 8.0- 9.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 9.0-10.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 0.0-10.0 sec 40.5 MBytes 33.9 Mbits/sec
Considered as a way to avoid the two.
1. Using skb_orphan found previous mailing. This is reporting finish
of xmit immediately.
2. Using bcmgenet_tx_reclaim_all instead of bcmgenet_tx_reclaim in
bcmgenet_poll. But, it is not suitable to use because of bcmgenet_poll
is connected with bcmgenet_isr0.
So, I had to use the shortest way. the result is as follows:
# iperf -c 172.16.1.20 -i 1
------------------------------------------------------------
Client connecting to 172.16.1.20, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.101 port 50763 connected with 172.16.1.20 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 1.0- 2.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 2.0- 3.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 3.0- 4.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 4.0- 5.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 5.0- 6.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 6.0- 7.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 7.0- 8.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 8.0- 9.0 sec 11.2 MBytes 94.4 Mbits/sec
[ 3] 9.0-10.0 sec 11.1 MBytes 93.3 Mbits/sec
[ 3] 0.0-10.0 sec 112 MBytes 94.0 Mbits/sec
> Hmpff...
>
> Can you elaborate on the regression ?
>
> Is it because NIC delays TX completion irq or something ?
>
> bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEC)
>
> Other tx completions seem to run from hard irq context (bcmgenet_isr1())
>
> Your patch seems to imply a bug in hard irq signaling/handling.
Good point.
The use of skb_orphan is in hard irq. I had mistaken using hard irq
in tx_ring[{0,1,2,3}].
The patch of v2 is to use tx_poll like bcmsysport.
That too can get the good results.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing
2015-02-27 14:33 ` Jaedon Shin
@ 2015-02-27 15:13 ` Eric Dumazet
0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2015-02-27 15:13 UTC (permalink / raw)
To: Jaedon Shin; +Cc: David Miller, Florian Fainelli, netdev
On Fri, 2015-02-27 at 23:33 +0900, Jaedon Shin wrote:
> > On Feb 27, 2015, at 1:36 AM, David Miller <davem@davemloft.net> wrote:
> >
> > From: Jaedon Shin <jaedon.shin@gmail.com>
> > Date: Thu, 26 Feb 2015 20:05:58 +0900
> >
> >> This patch prevents the performance degradation of xmit after
> >> 605ad7f ("tcp: refine TSO autosizing").
> >>
> >> Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
> >
> > I doubt this is the correct way to fix this.
> >
> > Also, you need to describe in detail what the actual problem
> > is, how you evaluated the cause, and what made you think that
> > your choosen solution is the proper one.
>
> The bcmgenet_tx_reclaim() of tx_ring[{0,1,2,3}] process only under 18
> descriptors is too late reclaiming transmitted skb. Therefore,
> performance degradation of xmit after 605ad7f ("tcp: refine TSO autosizing").
But the real cause is that this driver disables interrupts unless a
certain amount of TX descriptors are used.
This is wrong and should be fixed.
Only if this fix is proven to not be possible, skb_orphan() will be
needed.
Random guess, could you simply try this quick and dirty patch :
(dirty because I am lazy and I think Florian is already working on it)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index ff83c46bc389..0c2e1b0cd180 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1028,9 +1028,6 @@ static void __bcmgenet_tx_reclaim(struct net_device *dev,
last_c_index &= (num_tx_bds - 1);
}
- if (ring->free_bds > (MAX_SKB_FRAGS + 1))
- ring->int_disable(priv, ring);
-
if (netif_tx_queue_stopped(txq))
netif_tx_wake_queue(txq);
@@ -1302,11 +1299,13 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev)
bcmgenet_tdma_ring_writel(priv, ring->index,
ring->prod_index, TDMA_PROD_INDEX);
- if (ring->free_bds <= (MAX_SKB_FRAGS + 1)) {
+ if (ring->free_bds <= (MAX_SKB_FRAGS + 1))
netif_tx_stop_queue(txq);
+
+ if (unlikely(!ring->int_enabled)) {
+ ring->int_enabled = true;
ring->int_enable(priv, ring);
}
-
out:
spin_unlock_irqrestore(&ring->lock, flags);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
index b36ddec0cc0a..90d52893a1c8 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
@@ -530,6 +530,7 @@ struct bcmgenet_tx_ring {
unsigned int prod_index; /* Tx ring producer index SW copy */
unsigned int cb_ptr; /* Tx ring initial CB ptr */
unsigned int end_ptr; /* Tx ring end CB ptr */
+ bool int_enabled;
void (*int_enable)(struct bcmgenet_priv *priv,
struct bcmgenet_tx_ring *);
void (*int_disable)(struct bcmgenet_priv *priv,
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-02-27 15:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-26 11:05 [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing Jaedon Shin
2015-02-26 16:36 ` David Miller
2015-02-27 14:33 ` Jaedon Shin
2015-02-27 15:13 ` Eric Dumazet
2015-02-26 17:27 ` Eric Dumazet
2015-02-26 17:29 ` Florian Fainelli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).