* Re: bad networking related lag in v2.6.22-rc2 [not found] ` <20070523063052.GB26814@elte.hu> @ 2007-05-23 10:56 ` Patrick McHardy 2007-05-23 11:05 ` Ingo Molnar 2007-05-23 11:25 ` Herbert Xu 0 siblings, 2 replies; 11+ messages in thread From: Patrick McHardy @ 2007-05-23 10:56 UTC (permalink / raw) To: Ingo Molnar Cc: Anant Nitya, linux-kernel, Linus Torvalds, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List, Herbert Xu [-- Attachment #1: Type: text/plain, Size: 1275 bytes --] Ingo Molnar wrote: > if you feel inclined to try the git-bisection then by all means please > do it (it will certainly be helpful and educative), but it's optional: i > dont think you should 'need' to go through extra debugging chores, my > analysis based on the excellent trace you provided still holds and > whoever modified htb_dequeue()'s logic recently ought to be able to > figure that out (or send you a debug patch to further narrow the problem > down). > > The trace shows a _clearly_ anomalous loop: for example there's 56396 > (!) calls to rb_first() in htb_dequeue() [without the kernel ever > exiting that function]: > > earth4:~/s> grep rb_first trace-to-ingo.txt | wc -l > 56396 How is this trace to be understood? Is it simply a call trace in execution-order? If thats the case than we are exiting htb_dequeue, each call to qdisc_watchdog_schedule happens at the very end of that function, which would imply a bug in __qdisc_run. Looking at the recent changes to __qdisc_run, this indeed seems to be the case, when the qdisc is throttled and has packets queued we return a value != 0, causing __qdisc_run to loop until all packets have been sent, which may be a long time. Anant, can you please verify by testing the attached patch? Thanks. [-- Attachment #2: x --] [-- Type: text/plain, Size: 318 bytes --] diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index f28bb2d..f536060 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -174,7 +174,7 @@ requeue: out: BUG_ON((int) q->q.qlen < 0); - return q->q.qlen; + return skb ? q->q.qlen : 0; } void __qdisc_run(struct net_device *dev) ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 10:56 ` bad networking related lag in v2.6.22-rc2 Patrick McHardy @ 2007-05-23 11:05 ` Ingo Molnar 2007-05-23 11:25 ` Herbert Xu 1 sibling, 0 replies; 11+ messages in thread From: Ingo Molnar @ 2007-05-23 11:05 UTC (permalink / raw) To: Patrick McHardy Cc: Anant Nitya, linux-kernel, Linus Torvalds, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List, Herbert Xu * Patrick McHardy <kaber@trash.net> wrote: > How is this trace to be understood? Is it simply a call trace in > execution-order? [...] yeah. There's a help section at the top of the trace which explains the other fields too: _------=> CPU# / _-----=> irqs-off | / _----=> need-resched || / _---=> hardirq/softirq ||| / _--=> preempt-depth |||| / ||||| delay cmd pid ||||| time | caller \ / ||||| \ | / privoxy-12926 1.Ns1 0us : ktime_get_ts (ktime_get) the function name in braces is the parent function. So in this case the trace entry means we called ktime_get_ts() from ktime_get(). Ingo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 10:56 ` bad networking related lag in v2.6.22-rc2 Patrick McHardy 2007-05-23 11:05 ` Ingo Molnar @ 2007-05-23 11:25 ` Herbert Xu 2007-05-23 11:33 ` Patrick McHardy 2007-05-23 11:40 ` Ingo Molnar 1 sibling, 2 replies; 11+ messages in thread From: Herbert Xu @ 2007-05-23 11:25 UTC (permalink / raw) To: Patrick McHardy Cc: Ingo Molnar, Anant Nitya, linux-kernel, Linus Torvalds, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote: > > Looking at the recent changes to __qdisc_run, this indeed seems > to be the case, when the qdisc is throttled and has packets queued > we return a value != 0, causing __qdisc_run to loop until all > packets have been sent, which may be a long time. Good catch! I was obviously half awake at the time :) We could also fix it this way: [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty My previous patch that changed the return value of qdisc_restart incorrectly made the case where dequeue returns empty continue processing packets. This patch is based on diagnosis and fix by Patrick McHardy. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index f28bb2d..cbefe22 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -169,8 +169,8 @@ requeue: else q->ops->requeue(skb, q); netif_schedule(dev); - return 0; } + return 0; out: BUG_ON((int) q->q.qlen < 0); ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 11:25 ` Herbert Xu @ 2007-05-23 11:33 ` Patrick McHardy 2007-05-23 15:00 ` Linus Torvalds 2007-05-23 11:40 ` Ingo Molnar 1 sibling, 1 reply; 11+ messages in thread From: Patrick McHardy @ 2007-05-23 11:33 UTC (permalink / raw) To: Herbert Xu Cc: Ingo Molnar, Anant Nitya, linux-kernel, Linus Torvalds, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List Herbert Xu wrote: > On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote: > >>Looking at the recent changes to __qdisc_run, this indeed seems >>to be the case, when the qdisc is throttled and has packets queued >>we return a value != 0, causing __qdisc_run to loop until all >>packets have been sent, which may be a long time. > > > Good catch! I was obviously half awake at the time :) > > We could also fix it this way: Yes, that looks better, thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 11:33 ` Patrick McHardy @ 2007-05-23 15:00 ` Linus Torvalds 2007-05-23 17:16 ` Patrick McHardy 0 siblings, 1 reply; 11+ messages in thread From: Linus Torvalds @ 2007-05-23 15:00 UTC (permalink / raw) To: Patrick McHardy Cc: Herbert Xu, Ingo Molnar, Anant Nitya, linux-kernel, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List On Wed, 23 May 2007, Patrick McHardy wrote: > > Yes, that looks better, thanks. There appear to be other obvious problems in the recent "cleanups" in this area.. Look at psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound) { return min(tv1 - tv2, bound); } and compare it to the previous code: #define PSCHED_TDIFF_SAFE(tv1, tv2, bound) \ min_t(long long, (tv1) - (tv2), bound) and ponder how that "trivial cleanup" totally broke the thing. Hint: "psched_time_t" is an "u64". What does that mean for min(tv1 - tv2, bound); again, when "tv2" is larger than tv1. It _used_ to return a negative value. Now it returns a positive "bound" upper bound, because "tv1-tv2" will be used as a huge unsigned (and thus _positive_) integer. And was that accidental, or done on purpose? Sounds accidental to me, since you then want to return a "psched_tdiff_t", which is typedeffed to be "long". Doesn't sound very safe to me, especially since the commit message for this is "[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function", and there's no indication that anybody realized that it changed semantics in the process. Hmm? What _should_ that thing do? Linus ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 15:00 ` Linus Torvalds @ 2007-05-23 17:16 ` Patrick McHardy 0 siblings, 0 replies; 11+ messages in thread From: Patrick McHardy @ 2007-05-23 17:16 UTC (permalink / raw) To: Linus Torvalds Cc: Herbert Xu, Ingo Molnar, Anant Nitya, linux-kernel, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List Linus Torvalds wrote: > There appear to be other obvious problems in the recent "cleanups" in this > area.. > > Look at > > psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound) > { > return min(tv1 - tv2, bound); > } > > and compare it to the previous code: > > #define PSCHED_TDIFF_SAFE(tv1, tv2, bound) \ > min_t(long long, (tv1) - (tv2), bound) > > and ponder how that "trivial cleanup" totally broke the thing. > > Hint: "psched_time_t" is an "u64". What does that mean for > > min(tv1 - tv2, bound); > > again, when "tv2" is larger than tv1. It _used_ to return a negative > value. Now it returns a positive "bound" upper bound, because "tv1-tv2" > will be used as a huge unsigned (and thus _positive_) integer. And was > that accidental, or done on purpose? > > Sounds accidental to me, since you then want to return a "psched_tdiff_t", > which is typedeffed to be "long". > > Doesn't sound very safe to me, especially since the commit message for > this is "[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function", and > there's no indication that anybody realized that it changed semantics in > the process. I did realize it, but tv2 > tv1 can't happen and makes no sense for the users of this function. I probably should have provided a more detailed changelog entry. > Hmm? What _should_ that thing do? It is used to calculate the amount of tokens a tocken bucket has accumulated since the last refill, thus we always have tv1 >= tv2 (modulo ktime wraps). In fact tv2 > tv1 was never properly supported. This macro would have returned the negative long long value, but all users assign it to a psched_tdiff_t (long), so depending on the exact values, it might still be interpreted as a large positive value. Additionally there was a second implementation for the gettimeofday clocksource that didn't return the negative difference but the bound value. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 11:25 ` Herbert Xu 2007-05-23 11:33 ` Patrick McHardy @ 2007-05-23 11:40 ` Ingo Molnar 2007-05-23 21:30 ` David Miller 1 sibling, 1 reply; 11+ messages in thread From: Ingo Molnar @ 2007-05-23 11:40 UTC (permalink / raw) To: Herbert Xu Cc: Patrick McHardy, Anant Nitya, linux-kernel, Linus Torvalds, Andrew Morton, Thomas Gleixner, David S. Miller, Linux Netdev List * Herbert Xu <herbert@gondor.apana.org.au> wrote: > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > My previous patch that changed the return value of qdisc_restart > incorrectly made the case where dequeue returns empty continue > processing packets. > > This patch is based on diagnosis and fix by Patrick McHardy. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> also: Reported-and-debugged-by: Anant Nitya <kernel@prachanda.info> ... i gave your patch a quick test-boot and networking still works fine. Ingo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 11:40 ` Ingo Molnar @ 2007-05-23 21:30 ` David Miller 2007-05-24 5:41 ` Patrick McHardy 2007-05-24 7:12 ` Anant Nitya 0 siblings, 2 replies; 11+ messages in thread From: David Miller @ 2007-05-23 21:30 UTC (permalink / raw) To: mingo; +Cc: herbert, kaber, kernel, linux-kernel, torvalds, akpm, tglx, netdev From: Ingo Molnar <mingo@elte.hu> Date: Wed, 23 May 2007 13:40:21 +0200 > > * Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > My previous patch that changed the return value of qdisc_restart > > incorrectly made the case where dequeue returns empty continue > > processing packets. > > > > This patch is based on diagnosis and fix by Patrick McHardy. > > > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > also: > > Reported-and-debugged-by: Anant Nitya <kernel@prachanda.info> Applied, thanks everyone. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 21:30 ` David Miller @ 2007-05-24 5:41 ` Patrick McHardy 2007-05-24 6:40 ` David Miller 2007-05-24 7:12 ` Anant Nitya 1 sibling, 1 reply; 11+ messages in thread From: Patrick McHardy @ 2007-05-24 5:41 UTC (permalink / raw) To: David Miller Cc: mingo, herbert, kernel, linux-kernel, torvalds, akpm, tglx, netdev [-- Attachment #1: Type: text/plain, Size: 259 bytes --] David Miller wrote: >>* Herbert Xu <herbert@gondor.apana.org.au> wrote: >> >>>[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > Applied, thanks everyone. Even though it didn't fix this problem, this patch I sent earlier is also needed. [-- Attachment #2: x --] [-- Type: text/plain, Size: 1132 bytes --] [NET_SCHED]: sch_htb: fix event cache time calculation The event cache time must be an absolute value, when no event exists it is incorrectly set to 1s instead of 1s in the future. Signed-off-by: Patrick McHardy <kaber@trash.net> --- commit 49d1023ea0ea8377e740123d5954e88a00f78b7c tree 031c210f1b5e37ade5a4fa519f5808cd49225b89 parent 637fc540b0ad22bf7971929e906e704236af06cd author Patrick McHardy <kaber@trash.net> Mon, 21 May 2007 23:24:16 +0200 committer Patrick McHardy <kaber@trash.net> Mon, 21 May 2007 23:25:51 +0200 net/sched/sch_htb.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 99bcec8..035788c 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -976,8 +976,9 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) if (q->now >= q->near_ev_cache[level]) { event = htb_do_events(q, level); - q->near_ev_cache[level] = event ? event : - PSCHED_TICKS_PER_SEC; + if (!event) + event = q->now + PSCHED_TICKS_PER_SEC; + q->near_ev_cache[level] = event; } else event = q->near_ev_cache[level]; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-24 5:41 ` Patrick McHardy @ 2007-05-24 6:40 ` David Miller 0 siblings, 0 replies; 11+ messages in thread From: David Miller @ 2007-05-24 6:40 UTC (permalink / raw) To: kaber; +Cc: mingo, herbert, kernel, linux-kernel, torvalds, akpm, tglx, netdev From: Patrick McHardy <kaber@trash.net> Date: Thu, 24 May 2007 07:41:00 +0200 > David Miller wrote: > >>* Herbert Xu <herbert@gondor.apana.org.au> wrote: > >> > >>>[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > Applied, thanks everyone. > > > Even though it didn't fix this problem, this patch I sent earlier is > also needed. Thanks a lot for reminding me about this patch Patrick, applied. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: bad networking related lag in v2.6.22-rc2 2007-05-23 21:30 ` David Miller 2007-05-24 5:41 ` Patrick McHardy @ 2007-05-24 7:12 ` Anant Nitya 1 sibling, 0 replies; 11+ messages in thread From: Anant Nitya @ 2007-05-24 7:12 UTC (permalink / raw) To: linux-kernel Cc: David Miller, mingo, herbert, kaber, torvalds, akpm, tglx, netdev On Thursday 24 May 2007 03:00:56 David Miller wrote: > From: Ingo Molnar <mingo@elte.hu> > Date: Wed, 23 May 2007 13:40:21 +0200 > > > * Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty > > > > > > My previous patch that changed the return value of qdisc_restart > > > incorrectly made the case where dequeue returns empty continue > > > processing packets. > > > > > > This patch is based on diagnosis and fix by Patrick McHardy. > > > > > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> > > > > also: > > > > Reported-and-debugged-by: Anant Nitya <kernel@prachanda.info> > > Applied, thanks everyone. Networking lag I been seeing since 2.6.22-rc1, disappeared after applying this patch. Thanks to everyone who helped me run my system sane again. :) Reagards Ananitya -- Out of many thousands, one may endeavor for perfection, and of those who have achieved perfection, hardly one knows Me in truth. -- Gita Sutra Of Mysticism ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-05-24 7:14 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20070517174533.GA538@elte.hu>
[not found] ` <200705221147.56571.kernel@prachanda.hub>
[not found] ` <20070522062233.GA20002@elte.hu>
[not found] ` <200705231110.44526.kernel@prachanda.hub>
[not found] ` <20070523063052.GB26814@elte.hu>
2007-05-23 10:56 ` bad networking related lag in v2.6.22-rc2 Patrick McHardy
2007-05-23 11:05 ` Ingo Molnar
2007-05-23 11:25 ` Herbert Xu
2007-05-23 11:33 ` Patrick McHardy
2007-05-23 15:00 ` Linus Torvalds
2007-05-23 17:16 ` Patrick McHardy
2007-05-23 11:40 ` Ingo Molnar
2007-05-23 21:30 ` David Miller
2007-05-24 5:41 ` Patrick McHardy
2007-05-24 6:40 ` David Miller
2007-05-24 7:12 ` Anant Nitya
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).