netdev.vger.kernel.org archive mirror
* Fw: [Bugme-new] [Bug 3747] New: HTB causes machine lockups
From: Andrew Morton @ 2004-11-15 20:20 UTC
  To: netdev



Begin forwarded message:

Date: Mon, 15 Nov 2004 04:46:38 -0800
From: bugme-daemon@osdl.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 3747] New: HTB causes machine lockups


http://bugme.osdl.org/show_bug.cgi?id=3747

           Summary: HTB causes machine lockups
    Kernel Version: 2.4.26
            Status: NEW
          Severity: high
             Owner: acme@conectiva.com.br
         Submitter: alchemyx@uznam.net.pl


Distribution: PLD Linux 1.1 with 2.4.26 vanilla kernel
Hardware Environment: 2x Xeon 2.4 GHz, 1 GB of ram, 2x e1000 NICs
Software Environment: iproute2-2.4.7.ss020116
Problem Description:

I have some network traffic shaping rules on a Linux box; for example, there
are hundreds of them on eth1. I am using HTB in all cases (sometimes attaching
SFQ to HTB rules). Everything works fine until I issue:

tc qdisc del dev eth1 root

This causes a machine lockup. Sometimes it happens on consecutive days,
sometimes only once a month. There are no traces in the logs, no oopses, and so
on. The machine just freezes (you see the login prompt on screen, but you can't
type anything, can't ping the machine, and ctrl-alt-del and sysrq don't work).
The problem occurred on kernels from 2.4.21 up to 2.4.26. I haven't tried 2.4.27
or 2.6.x on that box yet.

Steps to reproduce:
Hard to tell, really. You need to set up a script that creates hundreds of HTB
rules and regenerates them every (let's say) 15 minutes, issuing 'tc qdisc del
dev ethX root' every time.
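
The reproduction steps above could be sketched roughly as follows. This is only
a dry-run generator under stated assumptions, not the reporter's actual script:
the function names gen_htb_rules and gen_teardown, the handle 1:, the class IDs,
and all rates are hypothetical. It prints tc commands instead of running them,
so the output can be inspected before anything touches a real device.

```shell
#!/bin/sh
# Hypothetical dry-run generator: prints tc commands, does not execute them.
# Device name, class count, class IDs, and rates are illustrative assumptions.

gen_htb_rules() {
    dev=$1
    count=$2
    # Root HTB qdisc plus one parent class that the leaf classes hang off.
    echo "tc qdisc add dev $dev root handle 1: htb default 10"
    echo "tc class add dev $dev parent 1: classid 1:1 htb rate 100mbit"
    i=0
    while [ "$i" -lt "$count" ]; do
        # tc reads class minor numbers as hex; start at 0x10.
        minor=$(printf '%x' $((i + 16)))
        echo "tc class add dev $dev parent 1:1 classid 1:$minor htb rate 64kbit ceil 1mbit"
        # Attach SFQ to each leaf, as the report says is sometimes done.
        echo "tc qdisc add dev $dev parent 1:$minor sfq perturb 10"
        i=$((i + 1))
    done
}

gen_teardown() {
    # The command that the report says precedes the lockup.
    echo "tc qdisc del dev $1 root"
}
```

To mimic the reported cycle, one might run something like
"{ gen_teardown eth1; gen_htb_rules eth1 300; } | sh" as root from a cron job
every 15 minutes, on a test machine only, since the report describes a hard
freeze.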


* Re: Fw: [Bugme-new] [Bug 3747] New: HTB causes machine lockups
From: Thomas Graf @ 2004-11-15 23:38 UTC
  To: alchemyx; +Cc: netdev, Andrew Morton

> Date: Mon, 15 Nov 2004 04:46:38 -0800
> From: bugme-daemon@osdl.org
> To: bugme-new@lists.osdl.org
> Subject: [Bugme-new] [Bug 3747] New: HTB causes machine lockups
> 
> I have some network traffic shaping rules on a Linux box; for example, there
> are hundreds of them on eth1. I am using HTB in all cases (sometimes attaching
> SFQ to HTB rules). Everything works fine until I issue:
> 
> tc qdisc del dev eth1 root
> 
> This causes a machine lockup. Sometimes it happens on consecutive days,
> sometimes only once a month. There are no traces in the logs, no oopses, and so
> on. The machine just freezes (you see the login prompt on screen, but you can't
> type anything, can't ping the machine, and ctrl-alt-del and sysrq don't work).
> The problem occurred on kernels from 2.4.21 up to 2.4.26. I haven't tried 2.4.27
> or 2.6.x on that box yet.
>
> Steps to reproduce:
> Hard to tell, really. You need to set up a script that creates hundreds of HTB
> rules and regenerates them every (let's say) 15 minutes, issuing 'tc qdisc del
> dev ethX root' every time.

Did you ever experience the same problems with other classful qdiscs such as
CBQ? Would it be possible for you to run a similar CBQ setup on a dummy device
and see if it happens as well? Is anything going on at the time of the
deadlocks like an interface going down etc.?
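
A comparable CBQ setup on a dummy device, as suggested above, might look
roughly like this. It is a sketch under assumptions: the function name
gen_cbq_rules, the device name dummy0, and every bandwidth, rate, and priority
value are hypothetical and would need adjusting to resemble the real HTB
configuration. As a dry run, it only prints the commands.

```shell
#!/bin/sh
# Hypothetical dry-run generator for a CBQ test on a dummy device.
# It prints commands only; all parameter values are illustrative assumptions.

gen_cbq_rules() {
    dev=$1
    count=$2
    # Load the dummy driver and bring the interface up.
    echo "modprobe dummy"
    echo "ip link set $dev up"
    # Root CBQ qdisc; CBQ needs the link bandwidth and an average packet size.
    echo "tc qdisc add dev $dev root handle 1: cbq bandwidth 10mbit avpkt 1000"
    i=0
    while [ "$i" -lt "$count" ]; do
        # tc reads class minor numbers as hex; start at 0x10.
        minor=$(printf '%x' $((i + 16)))
        echo "tc class add dev $dev parent 1: classid 1:$minor cbq bandwidth 10mbit rate 64kbit allot 1514 prio 5 avpkt 1000 bounded"
        i=$((i + 1))
    done
    # Tear everything down again, mirroring the step that precedes the lockup.
    echo "tc qdisc del dev $dev root"
}
```

Piping "gen_cbq_rules dummy0 300" to sh as root in a loop would exercise the
same create-and-delete cycle without touching the production interfaces.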


* Re: Fw: [Bugme-new] [Bug 3747] New: HTB causes machine lockups
From: Michał Margula @ 2004-11-22 18:10 UTC
  To: Thomas Graf; +Cc: netdev, Andrew Morton

Thomas Graf wrote:

>Did you ever experience the same problems with other classful qdiscs such as
>CBQ? Would it be possible for you to run a similar CBQ setup on a dummy device
>and see if it happens as well? Is anything going on at the time of the
>deadlocks like an interface going down etc.?

At the moment I can't set up such a CBQ configuration because I have no
free time; there is too much work. But in a few weeks I am going to convert
that box to a 2.6.x kernel. If the hangs don't stop after that, we can think
about converting it to CBQ, ok?

And by the way, today I got this in my logs:

Nov 22 17:50:01 sauron kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Nov 22 17:50:01 sauron kernel:  printing eip:
Nov 22 17:50:01 sauron kernel: f89cb555
Nov 22 17:50:01 sauron kernel: *pde = 00000000
Nov 22 17:50:01 sauron kernel: Oops: 0000
Nov 22 17:50:01 sauron kernel: CPU:    3
Nov 22 17:50:01 sauron kernel: EIP:    0010:[cls_u32:__insmod_cls_u32_O/lib/modules/2.4.26/kernel/net/sched/cls_+-146091/96]    Not tainted
Nov 22 17:50:01 sauron kernel: EFLAGS: 00010202
Nov 22 17:50:01 sauron kernel: eax: 000000b0   ebx: 00000000   ecx: c02f9888   edx: 000000b0
Nov 22 17:50:01 sauron kernel: esi: f04a7dd8   edi: f4eb5068   ebp: f4eb5068   esp: f04a7d90
Nov 22 17:50:01 sauron kernel: ds: 0018   es: 0018   ss: 0018
Nov 22 17:50:01 sauron kernel: Process tc (pid: 6600, stackpage=f04a7000)
Nov 22 17:50:01 sauron kernel: Stack: f4eb5000 00000930 c57fd824 c57fd800 f4eb5068 00000000 f4eb5060 c02021c5
Nov 22 17:50:01 sauron kernel:        f4eb5000 f04a7dd8 e6eb8880 c92aab00 c57fd800 c92aab00 c57fd818 dc9a8e90
Nov 22 17:50:01 sauron kernel:        f7ceb000 0000092f 00000000 00000000 00000001 c02020b4 e6eb8880 c57fd800
Nov 22 17:50:01 sauron kernel: Call Trace:    [tc_dump_tclass+221/304] [qdisc_class_dump+0/52] [netlink_dump+130/464] [skb_free_datagram+29/36] [netlink_recvmsg+190/300]
Nov 22 17:50:01 sauron kernel:   [netlink_recvmsg+229/300] [sock_recvmsg+61/188] [sys_recvmsg+356/516] [handle_mm_fault+92/188] [schedule+1115/1312] [pipe_write+518/616]
Nov 22 17:50:01 sauron kernel:   [sys_socketcall+501/512] [system_call+51/56]
Nov 22 17:50:01 sauron kernel:
Nov 22 17:50:01 sauron kernel: Code: 8b 03 0f 18 00 39 fb 75 c2 83 44 24 10 08 83 c5 08 ff 44 24
Nov 22 17:50:01 sauron kernel:  <6>HTB init, kernel part version 3.16

And my HTB rules stopped working.

-- 
Michał Margula, alchemyx@uznam.net.pl, http://alchemyx.uznam.net.pl/
"In life, only moments are beautiful" [Ryszard Riedel]
