From: Jarek Poplawski <jarkao2@o2.pl>
To: slavon@bigtelecom.ru
Cc: netdev@vger.kernel.org
Subject: Re: Tc bug (kernel crash) more info
Date: Thu, 30 Aug 2007 08:31:10 +0200
Message-ID: <20070830063110.GB1677@ff.dom.local>
In-Reply-To: <20070830001632.ki4u5bx9sow40o4s@mail.himki.net>

On Thu, Aug 30, 2007 at 12:16:32AM +0400, slavon@bigtelecom.ru wrote:
> Quoting Jarek Poplawski <jarkao2@o2.pl>:
> 
> >On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote:
> >...
> >>we have this kernel panic (then delete HTB) at all 2.6.18-x versions.
> >>on older kernel (2.6.x) we have another panic (then delete tc filter)...
> >>summary we have TC panics 1 year ago ;) Sysctl option "reboot on panic"
> >
> >I'm not sure: do you mean it was less often? Did you try to report it
> >here? (Delete HTB: qdisc or classes?)
> >
> 
> I couldn't catch the bug before. Now I have configured netconsole to
> catch panics. For every client I run commands like:

If an error keeps recurring, you should report it even without logs.
People here can sometimes help to catch it, and at the very least they
will know something is wrong in that area and look at the code more
carefully.

> 
> ### command to recreate HTB
> tc filter del dev eth1 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32
...

I need more time to think about it.

> On my desktop system I get a "black death" (2.6.22-r5): everything
> freezes (the KDE desktop stays on the monitor; mouse, keyboard,
> network and the rest stop working; the HDD LED is on; no panics).
> 
> Tell me what info you need. I will try to get it.

I still think that at least the .config and dmesg would be useful.

> 
> PS. We also have a strange bug on another computer (2.6.22-r5).
> The computer has 2 Xeon CPUs (4 CPUs).
> 
> After boot, CPU0 and CPU3 show SI = ~50%.
> After some time, CPU0 shows SI = 0% and the ksoftirqd/2 process uses
> 100% CPU!
> nat-new ~ # cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>   0:        403          0          0          0   IO-APIC-edge      timer
...
> LOC:   89312505   89314019   89310139   89313972
> ERR:          0
> MIS:          0
> 
> Only the LOC interrupts change!
> 
> Maybe this info is interesting for you. =)

Yes. It seems something loops or breaks with interrupts disabled. If
it's possible on this box, try 2.6.23-rc4 (with as few devices and as
many debug options enabled in the config as possible). Without anything
in the logs or on the screen this could be hard to track down, so you
may need to experiment with different configs and kernel versions.

Thanks,
Jarek P.

PS: if possible, you could try this patch, maybe with some fake load
plus these tc scripts (for testing only, against linux 2.6.22.5).
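
For readers who want the idea behind the htb_rate_timer() hunk in
isolation, here is a rough user-space sketch, not kernel code. It only
illustrates "try the lock, and if it is busy, re-arm for the next tick
instead of spinning". Everything in it is an assumption made for
illustration: atomic_flag stands in for dev->queue_lock, TICKS_PER_SEC
for HZ, and struct rate_timer and rate_timer_fire() are hypothetical
names that do not exist in sch_htb.c.

```c
#include <assert.h>
#include <stdatomic.h>

#define TICKS_PER_SEC 100          /* stand-in for HZ (assumed value) */

/* Hypothetical stand-in for the HTB private data used by the timer. */
struct rate_timer {
	atomic_flag *lock;         /* stand-in for dev->queue_lock */
	unsigned long next_run;    /* stand-in for q->rttim.expires */
	unsigned long runs;        /* how often the real work was done */
};

/*
 * Timer callback sketch: returns 1 if the periodic work ran,
 * 0 if the lock was busy and the work was deferred by one tick.
 */
static int rate_timer_fire(struct rate_timer *t, unsigned long now)
{
	/* test_and_set returns the old value: true means "already held" */
	if (atomic_flag_test_and_set(t->lock)) {
		t->next_run = now + 1;      /* retry on the next tick */
		return 0;
	}

	t->next_run = now + TICKS_PER_SEC;  /* normal one-second period */
	t->runs++;                          /* rate computation goes here */

	atomic_flag_clear(t->lock);         /* release the lock */
	return 1;
}
```

If the lock is free the work runs and the next expiry is one period
away; if something else holds it, the callback returns at once and
asks to be called again one tick later, so the timer never waits on
the lock itself.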

---

diff -Nurp linux-2.6.22.5-/net/sched/sch_htb.c linux-2.6.22.5/net/sched/sch_htb.c
--- linux-2.6.22.5-/net/sched/sch_htb.c	2007-07-09 01:32:17.000000000 +0200
+++ linux-2.6.22.5/net/sched/sch_htb.c	2007-08-29 20:32:26.000000000 +0200
@@ -394,6 +394,14 @@ static void htb_safe_rb_erase(struct rb_
 {
 	if (RB_EMPTY_NODE(rb)) {
 		WARN_ON(1);
+	} else if (RB_EMPTY_ROOT(root)) {
+		WARN_ON(1);
+	} else if (((unsigned long)rb & ~3) == 0) {
+		WARN_ON(1);
+	} else if (((unsigned long)root & ~3) == 0) {
+		WARN_ON(1);
+	} else if (rb_parent(rb) == NULL) {
+		WARN_ON(1);
 	} else {
 		rb_erase(rb, root);
 		RB_CLEAR_NODE(rb);
@@ -688,7 +696,11 @@ static void htb_rate_timer(unsigned long
 
 
 	/* lock queue so that we can muck with it */
-	spin_lock_bh(&sch->dev->queue_lock);
+	if (!spin_trylock_bh(&sch->dev->queue_lock)) {
+		q->rttim.expires = jiffies + 1;
+		add_timer(&q->rttim);
+		return;
+	}
 
 	q->rttim.expires = jiffies + HZ;
 	add_timer(&q->rttim);
@@ -1306,7 +1318,8 @@ static void htb_destroy(struct Qdisc *sc
 
 	qdisc_watchdog_cancel(&q->watchdog);
 #ifdef HTB_RATECM
-	del_timer_sync(&q->rttim);
+	if (!del_timer_sync(&q->rttim))
+		del_timer(&q->rttim);
 #endif
 	/* This line used to be after htb_destroy_class call below
 	   and surprisingly it worked in 2.4. But it must precede it
