* [PATCH] net: reduce net_rx_action() latency to 2 HZ @ 2013-03-05 17:15 Eric Dumazet 2013-03-21 15:03 ` Paul Gortmaker 0 siblings, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2013-03-05 17:15 UTC (permalink / raw) To: David Miller; +Cc: netdev From: Eric Dumazet <edumazt@google.com> We should use time_after_eq() to get maximum latency of two ticks, instead of three. Bug added in commit 24f8b2385 (net: increase receive packet quantum) Signed-off-by: Eric Dumazet <edumazt@google.com> --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/dev.c b/net/core/dev.c index 18d8b5a..461b315 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4107,7 +4107,7 @@ static void net_rx_action(struct softirq_action *h) * Allow this to run for 2 jiffies since which will allow * an average latency of 1.5/HZ. */ - if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) + if (unlikely(budget <= 0 || time_after_eq(jiffies, time_limit))) goto softnet_break; local_irq_enable(); ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-05 17:15 [PATCH] net: reduce net_rx_action() latency to 2 HZ Eric Dumazet @ 2013-03-21 15:03 ` Paul Gortmaker 2013-03-21 15:27 ` Eric Dumazet 0 siblings, 1 reply; 7+ messages in thread From: Paul Gortmaker @ 2013-03-21 15:03 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev, stable, Willy Tarreau [CC'ing stable & Willy - for the older releases not fed by http://patchwork.ozlabs.org/bundle/davem/stable/ ] On Tue, Mar 5, 2013 at 12:15 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > From: Eric Dumazet <edumazt@google.com> > > We should use time_after_eq() to get maximum latency of two ticks, > instead of three. > > Bug added in commit 24f8b2385 (net: increase receive packet quantum) I'm not sure what applications would notice the extra tick, but 24f8b takes us back to 2.6.29. It cherry picks cleanly onto 2.6.34, so it probably also does the same for Willy's 2.6.32 longterm too. Commit is now mainline d114a3338747255518 - v3.9-rc3~36^2~34. Paul. -- > > Signed-off-by: Eric Dumazet <edumazt@google.com> > --- > net/core/dev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 18d8b5a..461b315 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -4107,7 +4107,7 @@ static void net_rx_action(struct softirq_action *h) > * Allow this to run for 2 jiffies since which will allow > * an average latency of 1.5/HZ. > */ > - if (unlikely(budget <= 0 || time_after(jiffies, time_limit))) > + if (unlikely(budget <= 0 || time_after_eq(jiffies, time_limit))) > goto softnet_break; > > local_irq_enable(); > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-21 15:03 ` Paul Gortmaker @ 2013-03-21 15:27 ` Eric Dumazet 2013-03-21 17:25 ` Paul Gortmaker 0 siblings, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2013-03-21 15:27 UTC (permalink / raw) To: Paul Gortmaker; +Cc: David Miller, netdev, stable, Willy Tarreau, Tom Herbert On Thu, 2013-03-21 at 11:03 -0400, Paul Gortmaker wrote: > [CC'ing stable & Willy - for the older releases not fed by > http://patchwork.ozlabs.org/bundle/davem/stable/ ] > > On Tue, Mar 5, 2013 at 12:15 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > From: Eric Dumazet <edumazt@google.com> > > > > We should use time_after_eq() to get maximum latency of two ticks, > > instead of three. > > > > Bug added in commit 24f8b2385 (net: increase receive packet quantum) > > I'm not sure what applications would notice the extra tick, but 24f8b takes > us back to 2.6.29. It cherry picks cleanly onto 2.6.34, so it probably also > does the same for Willy's 2.6.32 longterm too. > > Commit is now mainline d114a3338747255518 - v3.9-rc3~36^2~34. BQL (Bytes Queue Limit) relies on TX completion being run often, and Qdisc being serviced often as well. If net_rx_action() hogs the cpu, net_tx_action() is delayed and NIC can stall. I wrote this patch because I was investigating a regression when a Google application began using BQL enabled kernels. About the latency in itself, following commit is way more interesting. commit c10d73671ad30f5 (softirq: reduce latencies) As without it, I could trigger more than 50ms latencies for the poor user thread interrupted by softirq processing. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-21 15:27 ` Eric Dumazet @ 2013-03-21 17:25 ` Paul Gortmaker 2013-03-21 17:43 ` Eric Dumazet 2013-03-24 1:02 ` Willy Tarreau 0 siblings, 2 replies; 7+ messages in thread From: Paul Gortmaker @ 2013-03-21 17:25 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, netdev, stable, Willy Tarreau, Tom Herbert, Steven Rostedt On 13-03-21 11:27 AM, Eric Dumazet wrote: > On Thu, 2013-03-21 at 11:03 -0400, Paul Gortmaker wrote: >> [CC'ing stable & Willy - for the older releases not fed by >> http://patchwork.ozlabs.org/bundle/davem/stable/ ] >> >> On Tue, Mar 5, 2013 at 12:15 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: >>> From: Eric Dumazet <edumazt@google.com> >>> >>> We should use time_after_eq() to get maximum latency of two ticks, >>> instead of three. >>> >>> Bug added in commit 24f8b2385 (net: increase receive packet quantum) >> >> I'm not sure what applications would notice the extra tick, but 24f8b takes >> us back to 2.6.29. It cherry picks cleanly onto 2.6.34, so it probably also >> does the same for Willy's 2.6.32 longterm too. >> >> Commit is now mainline d114a3338747255518 - v3.9-rc3~36^2~34. > > BQL (Bytes Queue Limit) relies on TX completion being run often, and > Qdisc being serviced often as well. If net_rx_action() hogs the cpu, > net_tx_action() is delayed and NIC can stall. > > I wrote this patch because I was investigating a regression when a > Google application began using BQL enabled kernels. > > About the latency in itself, following commit is way more interesting. > > commit c10d73671ad30f5 (softirq: reduce latencies) > > As without it, I could trigger more than 50ms latencies for the poor > user thread interrupted by softirq processing. That is also reasonably portable back to 2.6.34. And it is more interesting too -- it will be interesting in a preempt_rt context too, once RT moves ahead off the current 3.6 baseline, which still has the old count-limit of 10 vs the new 2ms time limit. RT (3.4 and 3.6 based) currently has this patch from Steven: http://git.kernel.org/cgit/linux/kernel/git/paulg/3.6-rt-patches.git/tree/net-tx-action-avoid-livelock-on-rt.patch Anyway, thanks for the heads up on this commit. Paul. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-21 17:25 ` Paul Gortmaker @ 2013-03-21 17:43 ` Eric Dumazet 2013-03-21 18:45 ` Steven Rostedt 2013-03-24 1:02 ` Willy Tarreau 1 sibling, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2013-03-21 17:43 UTC (permalink / raw) To: Paul Gortmaker Cc: David Miller, netdev, stable, Willy Tarreau, Tom Herbert, Steven Rostedt On Thu, 2013-03-21 at 13:25 -0400, Paul Gortmaker wrote: > That is also reasonably portable back to 2.6.34. And it is more > interesting too -- it will be interesting in a preempt_rt context > too, once RT moves ahead off the current 3.6 baseline, which still > has the old count-limit of 10 vs the new 2ms time limit. > > RT (3.4 and 3.6 based) currently has this patch from Steven: > http://git.kernel.org/cgit/linux/kernel/git/paulg/3.6-rt-patches.git/tree/net-tx-action-avoid-livelock-on-rt.patch Interesting, as Google has an internal patch removing this trylock() as well. I think I should upstream it eventually ;) commit 2f0a3f573b531dc57c268fd809dc65169edae369 Author: Eric Dumazet <edumazet@google.com> Date: Thu Dec 13 09:18:01 2012 -0800 net-dev_xmit_hold_queues: fix a busy loop in net_tx_action Under load, net_tx_action() fails to acquire qdisc lock and reschedules qdisc in a never ending loop. The spin_trylock() has almost no chance to complete because of ticket spinlock and xmit_hold_queue holding the lock for long period of times. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-21 17:43 ` Eric Dumazet @ 2013-03-21 18:45 ` Steven Rostedt 0 siblings, 0 replies; 7+ messages in thread From: Steven Rostedt @ 2013-03-21 18:45 UTC (permalink / raw) To: Eric Dumazet Cc: Paul Gortmaker, David Miller, netdev, stable, Willy Tarreau, Tom Herbert On Thu, 2013-03-21 at 10:43 -0700, Eric Dumazet wrote: > On Thu, 2013-03-21 at 13:25 -0400, Paul Gortmaker wrote: > > > That is also reasonably portable back to 2.6.34. And it is more > > interesting too -- it will be interesting in a preempt_rt context > > too, once RT moves ahead off the current 3.6 baseline, which still > > has the old count-limit of 10 vs the new 2ms time limit. > > > > RT (3.4 and 3.6 based) currently has this patch from Steven: > > http://git.kernel.org/cgit/linux/kernel/git/paulg/3.6-rt-patches.git/tree/net-tx-action-avoid-livelock-on-rt.patch > > Interesting, as Google has an internal patch removing this trylock() as > well. > > I think I should upstream it eventually ;) Yes please :-) -- Steve > > commit 2f0a3f573b531dc57c268fd809dc65169edae369 > Author: Eric Dumazet <edumazet@google.com> > Date: Thu Dec 13 09:18:01 2012 -0800 > > net-dev_xmit_hold_queues: fix a busy loop in net_tx_action > > Under load, net_tx_action() fails to acquire qdisc lock > and reschedules qdisc in a never ending loop. > > The spin_trylock() has almost no chance to complete because > of ticket spinlock and xmit_hold_queue holding the lock for long > period of times. > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] net: reduce net_rx_action() latency to 2 HZ 2013-03-21 17:25 ` Paul Gortmaker 2013-03-21 17:43 ` Eric Dumazet @ 2013-03-24 1:02 ` Willy Tarreau 1 sibling, 0 replies; 7+ messages in thread From: Willy Tarreau @ 2013-03-24 1:02 UTC (permalink / raw) To: Paul Gortmaker Cc: Eric Dumazet, David Miller, netdev, stable, Tom Herbert, Steven Rostedt On Thu, Mar 21, 2013 at 01:25:48PM -0400, Paul Gortmaker wrote: > On 13-03-21 11:27 AM, Eric Dumazet wrote: > > On Thu, 2013-03-21 at 11:03 -0400, Paul Gortmaker wrote: > >> [CC'ing stable & Willy - for the older releases not fed by > >> http://patchwork.ozlabs.org/bundle/davem/stable/ ] > >> > >> On Tue, Mar 5, 2013 at 12:15 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > >>> From: Eric Dumazet <edumazt@google.com> > >>> > >>> We should use time_after_eq() to get maximum latency of two ticks, > >>> instead of three. > >>> > >>> Bug added in commit 24f8b2385 (net: increase receive packet quantum) > >> > >> I'm not sure what applications would notice the extra tick, but 24f8b takes > >> us back to 2.6.29. It cherry picks cleanly onto 2.6.34, so it probably also > >> does the same for Willy's 2.6.32 longterm too. > >> > >> Commit is now mainline d114a3338747255518 - v3.9-rc3~36^2~34. > > > > BQL (Bytes Queue Limit) relies on TX completion being run often, and > > Qdisc being serviced often as well. If net_rx_action() hogs the cpu, > > net_tx_action() is delayed and NIC can stall. > > > > I wrote this patch because I was investigating a regression when a > > Google application began using BQL enabled kernels. > > > > About the latency in itself, following commit is way more interesting. > > > > commit c10d73671ad30f5 (softirq: reduce latencies) > > > > As without it, I could trigger more than 50ms latencies for the poor > > user thread interrupted by softirq processing. > > That is also reasonably portable back to 2.6.34. And it is more > interesting too -- it will be interesting in a preempt_rt context > too, once RT moves ahead off the current 3.6 baseline, which still > has the old count-limit of 10 vs the new 2ms time limit. > > RT (3.4 and 3.6 based) currently has this patch from Steven: > http://git.kernel.org/cgit/linux/kernel/git/paulg/3.6-rt-patches.git/tree/net-tx-action-avoid-livelock-on-rt.patch > > Anyway, thanks for the heads up on this commit. And thanks to you Paul for the heads up as well, I'll pick them from your branch :-) Cheers, Willy ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-03-24 1:02 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-03-05 17:15 [PATCH] net: reduce net_rx_action() latency to 2 HZ Eric Dumazet 2013-03-21 15:03 ` Paul Gortmaker 2013-03-21 15:27 ` Eric Dumazet 2013-03-21 17:25 ` Paul Gortmaker 2013-03-21 17:43 ` Eric Dumazet 2013-03-21 18:45 ` Steven Rostedt 2013-03-24 1:02 ` Willy Tarreau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).