* Tc bug (kernel crash) more info
@ 2007-08-29 9:34 Badalian Vyacheslav
2007-08-29 11:34 ` Jarek Poplawski
0 siblings, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-29 9:34 UTC (permalink / raw)
To: netdev
It crashed again. Do you need more panic reports, or does this message
contain all the information needed to fix the bug?
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
printing eip:
c01bf041
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: cls_u32 sch_sfq sch_htb netconsole xt_tcpudp iptable_filter ip_tables x_tables e752x_edac edac_mc i2c_i801
CPU: 2
EIP: 0060:[<c01bf041>] Not tainted VLI
EFLAGS: 00010282 (2.6.22-gentoo-r5-fw #6)
EIP is at rb_erase+0x110/0x22f
eax: e40fa334 ebx: 00000000 ecx: 00000000 edx: e40fa334
esi: e6add334 edi: e5a86134 ebp: f6840428 esp: c21c5d20
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process swapper (pid: 0, ti=c21c4000 task=c21b8a90 task.ti=c21c4000)
Stack: 00000001 e5a86134 00000000 e5a86000 00000055 f88391a7 f6840080 00057857
       00000000 f0ac4600 f6840080 f883aa3d f6e53ec0 e6b18380 f6e53ec0 00000008
       f6840428 f6840000 00000000 f6840080 00000000 61cf32bc 00000001 e6b18380
Call Trace:
[<f88391a7>] htb_safe_rb_erase+0x43/0x51 [sch_htb]
[<f883aa3d>] htb_dequeue+0x145/0x6d4 [sch_htb]
[<f88618fe>] sfq_enqueue+0x1c/0x18a [sch_sfq]
[<c02b7592>] __qdisc_run+0x1e/0x188
[<c02adcce>] dev_queue_xmit+0x152/0x25c
[<c02c8f77>] ip_output+0x280/0x2b9
[<c02c51cc>] ip_forward_finish+0x0/0x2e
[<c02c5465>] ip_forward+0x26b/0x2c6
[<c02c51cc>] ip_forward_finish+0x0/0x2e
[<c02c41fb>] ip_rcv+0x484/0x4bd
[<c02a8a0d>] __netdev_alloc_skb+0x1c/0x35
[<c02abd54>] netif_receive_skb+0x2b8/0x319
[<c0238034>] e1000_clean_rx_irq+0x375/0x441
[<c0237cbf>] e1000_clean_rx_irq+0x0/0x441
[<c02370ea>] e1000_clean+0x71/0x237
[<c02ada90>] net_rx_action+0x91/0x17d
[<c011c39a>] __do_softirq+0x5d/0xc1
[<c011c430>] do_softirq+0x32/0x36
[<c010439a>] do_IRQ+0x7e/0x90
[<c010d461>] smp_apic_timer_interrupt+0x74/0x80
[<c010439a>] do_IRQ+0x7e/0x90
[<c0102ed3>] common_interrupt+0x23/0x28
[<c0100ab2>] mwait_idle_with_hints+0x3c/0x40
[<c0100bbe>] cpu_idle+0x5a/0x6f
=======================
Code: 01 00 00 8b 4e 08 39 d9 0f 85 85 00 00 00 8b 4e 04 8b 01 a8 01 75 14 83 c8 01 89 ea 89 01 89 f0 83 26 fe e8 1e fd ff ff 8b 4e 04 <8b> 59 08 85 db 74 06 8b 03 a8 01 74 15 8b 41 04 85 c0 0f 84 c6
EIP: [<c01bf041>] rb_erase+0x110/0x22f SS:ESP 0068:c21c5d20
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 3 seconds..
^ permalink raw reply [flat|nested] 32+ messages in thread

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 11:34 UTC
To: Badalian Vyacheslav; +Cc: netdev

On 29-08-2007 11:34, Badalian Vyacheslav wrote:
> Again crash. Need more posts of panic or this message have full info
> that needed to fix bug?

Hi,

Please, try to not create new threads each time: reply to the previous
one if you have something new. And this one doesn't seem to show more.
You have written earlier it's '1-5 times on week', so you should have
got used to it a little, so no need to panic...

You would better try to write if there was some previous kernel
version, which worked better for you?

It seems, there could be some locking problem and your script could
mess htb queue from the second cpu (or is interrupted). Probably you
could have something more in logs about this, and maybe even this
script could be helpful (you should mask secret things only). Or maybe
you could try to add some echos to this script to figure out the part
which is the most suspected.

Of course .config and dmesg (zipped) could be helpful too.

If it's possible you can try it shortly without e.g. netconsole or
even without CONFIG_SMP.

Regards,
Jarek P.

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 12:14 UTC
To: Badalian Vyacheslav; +Cc: netdev

On Wed, Aug 29, 2007 at 01:34:47PM +0200, Jarek Poplawski wrote:
> On 29-08-2007 11:34, Badalian Vyacheslav wrote:
> > Again crash. Need more posts of panic or this message have full info
> > that needed to fix bug?
...
> If it's possible you can try it shortly without e.g. netconsole or
> even without CONFIG_SMP.

...or maybe even dare to try something current like 2.6.23-rc4?

Jarek P.

* Re: Tc bug (kernel crash) more info
From: Badalian Vyacheslav @ 2007-08-29 12:53 UTC
To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> On Wed, Aug 29, 2007 at 01:34:47PM +0200, Jarek Poplawski wrote:
>> On 29-08-2007 11:34, Badalian Vyacheslav wrote:
>>> Again crash. Need more posts of panic or this message have full info
>>> that needed to fix bug?
> ...
>> If it's possible you can try it shortly without e.g. netconsole or
>> even without CONFIG_SMP.
>
> ...or maybe even dare to try something current like 2.6.23-rc4?
>
> Jarek P.

We have had this kernel panic (when deleting HTB) on all 2.6.18-x
versions. On older kernels (2.6.x) we had a different panic (when
deleting a tc filter)... In summary, we have had TC panics for a year ;)
The sysctl option "reboot on panic" saves us. We have now brought up 2
backup computers and can try any patches to fix this problem.

On 2.6.22 we also see a strange hang: black screen, no response to the
keyboard, nothing in netconsole, and the hard disk LED stays solid red.
A "Black Dead".

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 13:30 UTC
To: Badalian Vyacheslav; +Cc: netdev

On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote:
...
> we have this kernel panic (then delete HTB) at all 2.6.18-x versions.
> on older kernel (2.6.x) we have another panic (then delete tc filter)...
> summary we have TC panics 1 year ago ;) Sysctl option "reboot on panic"

I'm not sure: do you mean it was less often? Did you try to report it
here? (Delete HTB: qdisc or classes?)

> save us. Now we up 2 backup computers and may try any patches to fix
> this problem.
>
> Also on 2.6.22 have strange dead. Black screen, no response to keyboard,
> no info in netconsole, HardDisk led is stable red. "Black Dead"
>

Yes, with all black it could be harder... Maybe 'set -x' at the
beginning (after #!/bin/sh line) of a script could manage to save
something before reboot or send with netconsole (but there could be
a lot of this with a large script...). Netconsole could be troublesome
too. One HTB deadlock problem during similar deleting was fixed in
2.6.23-rc (HTB timer problem) but the log was different. Anyway,
we probably need some more information (and trying).

Jarek P.
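
One way to act on the 'set -x' / "add some echos" suggestion is to log
every tc command just before it runs, so that the last logged line
names the command that preceded a crash. The following is only a
minimal sketch for a POSIX shell. The names run_logged, TRACE_LOG and
DRY_RUN are hypothetical and not part of the reporter's script.
Pointing TRACE_LOG at /dev/kmsg should place the lines in the kernel
log, where netconsole can forward them, but that is an assumption
about the target kernel and has not been checked on 2.6.22.

```shell
#!/bin/sh
# Hypothetical helper: record each command before executing it.
# TRACE_LOG: where to append the trace (a file, or /dev/kmsg as an assumption).
# DRY_RUN=1: only log the command, do not execute it.
TRACE_LOG="${TRACE_LOG:-/tmp/tc-trace.log}"
DRY_RUN="${DRY_RUN:-0}"

run_logged() {
    # Log first, so the line exists even if the box dies during execution.
    printf 'tc-script: %s\n' "$*" >> "$TRACE_LOG"
    if [ "$DRY_RUN" = "1" ]; then
        return 0
    fi
    "$@"
}
```

In the script each call would then be written as, for example,
"run_logged tc class del dev eth1 parent 1:8 classid 1:1c". Running
once with DRY_RUN=1 shows what would be executed without touching the
qdiscs.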

* Re: Tc bug (kernel crash) more info
From: slavon @ 2007-08-29 20:16 UTC
To: Jarek Poplawski; +Cc: netdev

Quoting Jarek Poplawski <jarkao2@o2.pl>:

> On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote:
> ...
>> we have this kernel panic (then delete HTB) at all 2.6.18-x versions.
>> on older kernel (2.6.x) we have another panic (then delete tc filter)...
>> summary we have TC panics 1 year ago ;) Sysctl option "reboot on panic"
>
> I'm not sure: do you mean it was less often? Did you try to report it
> here? (Delete HTB: qdisc or classes?)
>

I couldn't catch the bug before. Now I have configured netconsole to
catch panics. For every client we run commands like these:

### command to recreate HTB
tc filter del dev eth1 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32
tc filter del dev eth0 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32
tc class del dev eth1 parent 1:6 classid 1:1c
tc class del dev eth0 parent 1:6 classid 1:1c
tc class del dev eth1 parent 1:8 classid 1:1c
tc class del dev eth0 parent 1:8 classid 1:1c
tc class add dev eth1 parent 1:8 classid 1:1c htb rate 1kbit ceil 5000kbit burst 1b cburst 625b quantum 1500
tc qdisc add dev eth1 parent 1:1c handle 28 sfq perturb 10
tc class add dev eth0 parent 1:8 classid 1:1c htb rate 1kbit ceil 5000kbit burst 1b cburst 625b quantum 1500
tc qdisc add dev eth0 parent 1:1c handle 28 sfq perturb 10
tc filter add dev eth1 protocol ip parent 1:0 prio 5 handle 8:73:6 u32 ht 8:73: match ip dst 87.255.6.115 flowid 1:1c
tc filter add dev eth0 protocol ip parent 1:0 prio 5 handle 8:73:6 u32 ht 8:73: match ip src 87.255.6.115 flowid 1:1c
tc filter add dev eth1 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32 ht 4:9: match ip dst 172.16.161.9 flowid 1:1c
tc filter add dev eth0 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32 ht 4:9: match ip src 172.16.161.9 flowid 1:1c
###

I try to delete the class under both "parent 1:6" and "parent 1:8"
because I do not know which parent it had (limited-speed class 1:6 or
unlimited-speed class 1:8). When I delete a class, its qdisc is deleted
automatically.

If the computer has no traffic, everything is fine (the test system has
worked for 3 weeks). With a lot of traffic, the system has 1-5 kernel
panics a week.

>> save us. Now we up 2 backup computers and may try any patches to fix
>> this problem.
>>
>> Also on 2.6.22 have strange dead. Black screen, no response to keyboard,
>> no info in netconsole, HardDisk led is stable red. "Black Dead"
>>
>
> Yes, with all black it could be harder... Maybe 'set -x' at the
> beginning (after #!/bin/sh line) of a script could manage to save
> something before reboot or send with netconsole (but there could be
> a lot of this with a large script...). Netconsole could be troublesome
> too. One HTB deadlock problem during similar deleting was fixed in
> 2.6.23-rc (HTB timer problem) but the log was different. Anyway,
> we probably need some more information (and trying).

On my desktop system I also get the "Black dead" (2.6.22-r5).
Everything freezes: the monitor still shows the KDE desktop, but mouse,
keyboard, network and everything else stop working. The HDD LED is on.
No panics.

Tell me what information you need and I will try to get it.

PS. We also have a strange bug on another computer (2.6.22-r5).
It is a XEON_CPUx2 machine (4 CPUs).

After boot, CPU0 and CPU3 show SI = ~50%.
After some time CPU0 shows SI = 0% and the ksoftirqd/2 process uses 100% CPU!

nat-new ~ # cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:        403          0          0          0   IO-APIC-edge      timer
  1:        448          0          0          0   IO-APIC-edge      i8042
  6:          3          0          0          0   IO-APIC-edge      floppy
  8:          3          0          0          0   IO-APIC-edge      rtc
  9:         18          0          0          0   IO-APIC-fasteoi   acpi
 12:          4          0          0          0   IO-APIC-edge      i8042
 16:  100838998          0  656832858          0   IO-APIC-fasteoi   eth0
 17:  756133415          0  124233955          1   IO-APIC-fasteoi   eth1
 18:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1
 19:      27167          0          0          0   IO-APIC-fasteoi   gdth
 20:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
NMI:          0          0          0          0
LOC:   89312505   89314019   89310139   89313972
ERR:          0
MIS:          0

Only the LOC interrupt counters change!

Maybe this information is interesting for you. =)

Best regards.
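
The script above deletes class 1:1c under both possible parents because
the real parent is not known. A possible alternative is to look the
parent up first and issue a single delete. The sketch below assumes
that "tc class show dev <dev>" prints lines of the form
"class htb 1:1c parent 1:8 leaf 28: ...". The helper name class_parent
is hypothetical and is not taken from the reporter's script.

```shell
#!/bin/sh
# Hypothetical helper: read `tc class show dev <dev>` output on stdin and
# print the parent of the given classid (prints nothing if it is not found
# or if the class has no parent, e.g. a root class).
class_parent() {
    awk -v cid="$1" '
        $1 == "class" && $3 == cid {
            for (i = 4; i < NF; i++) {
                if ($i == "parent") { print $(i + 1); exit }
            }
        }
    '
}
```

A possible use would be
parent=$(tc class show dev eth1 | class_parent 1:1c)
followed by a delete only when $parent is non-empty, so that tc is
never asked to remove a class under a parent it does not have.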

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-30 6:31 UTC
To: slavon; +Cc: netdev

On Thu, Aug 30, 2007 at 12:16:32AM +0400, slavon@bigtelecom.ru wrote:
> Quoting Jarek Poplawski <jarkao2@o2.pl>:
>
> >On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote:
> >...
> >>we have this kernel panic (then delete HTB) at all 2.6.18-x versions.
> >>on older kernel (2.6.x) we have another panic (then delete tc filter)...
> >>summary we have TC panics 1 year ago ;) Sysctl option "reboot on panic"
> >
> >I'm not sure: do you mean it was less often? Did you try to report it
> >here? (Delete HTB: qdisc or classes?)
> >
>
> i was can't catch bug. now i have configured netconsole to catch panics.
> for every clinet run command like:

If some error repeats you should report it even without logs.
Sometimes people here could help to catch this, but at least they
know something is wrong around and look at the code more carefully.

>
> ### command to recreate HTB
> tc filter del dev eth1 protocol ip parent 1:0 prio 5 handle 4:9:a1 u32
...

I need more time to think about it.

> In my desktop system i have "Black dead" (2.6.22-r5) All freeze (on
> monitor KDE desctop. mouse, keyboard, network and other not work. HDD
> led is on. No panics.)
>
> Say that info you need. I will try get it.

I still think, at least .config and dmesg could be interesting.

>
> PS. And also have we have strange bug in another computer (2.6.22-r5).
> Have computer XEON_CPUx2 (4 CPU)
>
> after boot have CPU0 and CPU3 SI = ~50%
> after some time CPU0 SI = 0% and ksoftirqd/2 process have 100% cpu usage!
> nat-new ~ # cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>   0:        403          0          0          0   IO-APIC-edge      timer
...
> LOC:   89312505   89314019   89310139   89313972
> ERR:          0
> MIS:          0
>
> changes only LOC interrupts!
>
> Maybe its info intresting for you. =)

Yes. It seems something loops or breaks with disabled interrupts. If
it's possible on this box try this 2.6.23-rc4 (and as minimum devices
and as maximum debug options in config as possible). Without anything
in logs or from the screen it could be hard, so maybe you need to
experiment with different configs and kernel versions.

Thanks,
Jarek P.

PS: if it's possible you can try this patch maybe with some fake load
plus these tc scripts (for testing only, linux 2.6.22.5).

---

diff -Nurp linux-2.6.22.5-/net/sched/sch_htb.c linux-2.6.22.5/net/sched/sch_htb.c
--- linux-2.6.22.5-/net/sched/sch_htb.c	2007-07-09 01:32:17.000000000 +0200
+++ linux-2.6.22.5/net/sched/sch_htb.c	2007-08-29 20:32:26.000000000 +0200
@@ -394,6 +394,14 @@ static void htb_safe_rb_erase(struct rb_
 {
 	if (RB_EMPTY_NODE(rb)) {
 		WARN_ON(1);
+	} else if (RB_EMPTY_ROOT(root)) {
+		WARN_ON(1);
+	} else if (((unsigned long)rb & ~3) == 0) {
+		WARN_ON(1);
+	} else if (((unsigned long)root & ~3) == 0) {
+		WARN_ON(1);
+	} else if (rb_parent(rb) == NULL) {
+		WARN_ON(1);
 	} else {
 		rb_erase(rb, root);
 		RB_CLEAR_NODE(rb);
@@ -688,7 +696,11 @@ static void htb_rate_timer(unsigned long
 
 
 	/* lock queue so that we can muck with it */
-	spin_lock_bh(&sch->dev->queue_lock);
+	if (!spin_trylock_bh(&sch->dev->queue_lock)) {
+		q->rttim.expires = jiffies + 1;
+		add_timer(&q->rttim);
+		return;
+	}
 
 	q->rttim.expires = jiffies + HZ;
 	add_timer(&q->rttim);
@@ -1306,7 +1318,8 @@ static void htb_destroy(struct Qdisc *sc
 
 	qdisc_watchdog_cancel(&q->watchdog);
 #ifdef HTB_RATECM
-	del_timer_sync(&q->rttim);
+	if (!del_timer_sync(&q->rttim))
+		del_timer(&q->rttim);
 #endif
 	/* This line used to be after htb_destroy_class call below
 	   and surprisingly it worked in 2.4. But it must precede it

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-30 7:27 UTC
To: slavon; +Cc: netdev

On Thu, Aug 30, 2007 at 08:31:10AM +0200, Jarek Poplawski wrote:
> On Thu, Aug 30, 2007 at 12:16:32AM +0400, slavon@bigtelecom.ru wrote:
...
> > PS. And also have we have strange bug in another computer (2.6.22-r5).
> > Have computer XEON_CPUx2 (4 CPU)
> >
> > after boot have CPU0 and CPU3 SI = ~50%
> > after some time CPU0 SI = 0% and ksoftirqd/2 process have 100% cpu usage!
> > nat-new ~ # cat /proc/interrupts
> >            CPU0       CPU1       CPU2       CPU3
> >   0:        403          0          0          0   IO-APIC-edge      timer
> ...
> > LOC:   89312505   89314019   89310139   89313972
> > ERR:          0
> > MIS:          0
> >
> > changes only LOC interrupts!
> >
> > Maybe its info intresting for you. =)
>
> Yes. It seems something loops or breaks with disabled interrupts. If

On the other hand disabling local interrupts shouldn't be enough here,
so it's really strange... Did you get this remotely? Are you sure LOC
only? (Anyway this 2.6.23-rc4 should be interesting.)

Jarek P.
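
To double-check the "LOC only" observation, two saved copies of
/proc/interrupts taken a few seconds apart can be compared row by row.
This is only a sketch. The helper name irq_changed and the snapshot
file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved copies of /proc/interrupts and print
# the label (first column, e.g. "16:" or "LOC:") of every row that differs.
irq_changed() {
    awk '
        NR == FNR { before[$1] = $0; next }   # first file: remember each row by label
        ($1 in before) && before[$1] != $0 { print $1 }
    ' "$1" "$2"
}
```

Typical use: copy /proc/interrupts to /tmp/irq.1, wait about ten
seconds while the problem is visible, copy it again to /tmp/irq.2, and
run "irq_changed /tmp/irq.1 /tmp/irq.2". If only "LOC:" is printed, the
eth0/eth1 counters really did not move in that interval.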

* Re: Tc bug (kernel crash) more info
From: Badalian Vyacheslav @ 2007-08-30 9:09 UTC
To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> On Thu, Aug 30, 2007 at 08:31:10AM +0200, Jarek Poplawski wrote:
>> On Thu, Aug 30, 2007 at 12:16:32AM +0400, slavon@bigtelecom.ru wrote:
> ...
>>> PS. And also have we have strange bug in another computer (2.6.22-r5).
>>> Have computer XEON_CPUx2 (4 CPU)
>>>
>>> after boot have CPU0 and CPU3 SI = ~50%
>>> after some time CPU0 SI = 0% and ksoftirqd/2 process have 100% cpu usage!
>>> nat-new ~ # cat /proc/interrupts
>>>            CPU0       CPU1       CPU2       CPU3
>>>   0:        403          0          0          0   IO-APIC-edge      timer
>> ...
>>> LOC:   89312505   89314019   89310139   89313972
>>> ERR:          0
>>> MIS:          0
>>>
>>> changes only LOC interrupts!
>>>
>>> Maybe its info intresting for you. =)
>>>
>> Yes. It seems something loops or breaks with disabled interrupts. If
>
> On the other hand disabling local interrupts shouldn't be enough here,
> so it's really strange... Did you get this remotely? Are you sure LOC
> only? (Anyway this 2.6.23-rc4 should be interesting.)
>
> Jarek P.

Only LOC changes... The ICMP response time is 50-70 ms... After 1-2
hours the traffic level goes down, SI on CPU0 and CPU2 changes to above
50%, and ksoftirqd releases the CPU. I see this bug 3-4 times a week.
If you need information that can only be collected while the bug is
still in progress, I can try to get it for you.

Maybe this helps:

1U Intel server, motherboard se7501w2

nat-new ~ # lspci
00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
00:00.1 Class ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
00:03.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface C PCI-to-PCI Bridge (rev 01)
00:03.1 Class ff00: Intel Corporation E7500/E7501 Hub Interface C RASUM Controller (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB Controller #2 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42)
00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 02)
01:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
03:07.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
03:07.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
04:08.0 RAID bus controller: Intel Corporation RAID Controller

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-30 12:37 UTC
To: Badalian Vyacheslav; +Cc: netdev

On Thu, Aug 30, 2007 at 01:09:11PM +0400, Badalian Vyacheslav wrote:
> Jarek Poplawski wrote:
...
> >On the other hand disabling local interrupts shouldn't be enough here,
> >so it's really strange... Did you get this remotely? Are you sure LOC
> >only? (Anyway this 2.6.23-rc4 should be interesting.)
...
> Only LOC changes... icmp answer = 50-70ms... after 1-2 hours traffic
> level is down and SI on CPU0 and CPU2 change to above 50%. ksoftirqd
> free CPU usage. I have this bug 3-4 times in week. If you need info what
> i can see only in bug still processing - i may try get this info for you.

Any additional info could be helpful. I'm not sure if all these
computers do similar htb processing, or it's another problem?
As I've written before htb before 2.6.23-rc1 has a problem with
timer lockup during qdisc_destroy, so softirqs would be hit.
If it's htb's fault 2.6.23-rc4 or my testing patch should help.

I try to find in htb code another weak points. BTW, if during
such lockups any processes are killed 'by hand' etc., without
restarting the whole system, please let us know.

> maybe help:
>
> 1U server INTEL, mb se7501w2
>
> nat-new ~ # lspci

lspci -v (or -vv should be more usable - but with dmesg at least)

Jarek P.

* Re: Tc bug (kernel crash) more info
From: Badalian Vyacheslav @ 2007-08-30 13:43 UTC
To: Jarek Poplawski; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 1907 bytes --]

Jarek Poplawski wrote:
> On Thu, Aug 30, 2007 at 01:09:11PM +0400, Badalian Vyacheslav wrote:
>> Jarek Poplawski wrote:
> ...
>>> On the other hand disabling local interrupts shouldn't be enough here,
>>> so it's really strange... Did you get this remotely? Are you sure LOC
>>> only? (Anyway this 2.6.23-rc4 should be interesting.)
> ...
>> Only LOC changes... icmp answer = 50-70ms... after 1-2 hours traffic
>> level is down and SI on CPU0 and CPU2 change to above 50%. ksoftirqd
>> free CPU usage. I have this bug 3-4 times in week. If you need info what
>> i can see only in bug still processing - i may try get this info for you.
>
> Any additional info could be helpful. I'm not sure if all these
> computers do similar htb processing, or it's another problem?
> As I've written before htb before 2.6.23-rc1 has a problem with
> timer lockup during qdisc_destroy, so softirqs would be hit.
> If it's htb's fault 2.6.23-rc4 or my testing patch should help.
>
> I try to find in htb code another weak points. BTW, if during
> such lockups any processes are killed 'by hand' etc., without
> restarting the whole system, please let us know.

I will try the patch ;)

The "CPU SI" problem is a different bug on a different computer, the
NAT box.

NAT: simply has iptables rules (NAT) and ipcad to generate netflow.
FW: has iptables rules to block FORWARD access and TC rules for shaping.

Network layout:
CORE <-> (FW 2.6.22 and FW-BackupLink) <-> NAT <-> OUT_ROUTER = 500mbs/500mbs traffic

FW gets kernel panics when I try to delete an HTB class (I will try the patch).
NAT sometimes shows the "SI usage bug" when traffic is more than 250mbs.

txt.txt contains the dmesg and lspci -vv output for NAT.

Sorry for my English.

>> maybe help:
>>
>> 1U server INTEL, mb se7501w2
>>
>> nat-new ~ # lspci
>
> lspci -v (or -vv should be more usable - but with dmesg at least)
>
> Jarek P.

[-- Attachment #2: txt.txt --]
[-- Type: text/plain, Size: 50668 bytes --]

00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Capabilities: [40] Vendor Specific Information

00:00.1 Class ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:03.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface C PCI-to-PCI Bridge (rev 01) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64
    Bus: primary=00, secondary=02, subordinate=04, sec-latency=0
    I/O behind bridge: 00002000-00002fff
    Memory behind bridge: fe800000-feafffff
    Prefetchable memory behind bridge: f8300000-fc5fffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-

00:03.1 Class ff00: Intel Corporation E7500/E7501 Hub Interface C RASUM Controller (rev 01)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB Controller #1 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Interrupt: pin A routed to IRQ 18
    Region 4: I/O ports at 3020 [size=32]

00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB Controller #2 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Interrupt: pin B routed to IRQ 20
    Region 4: I/O ports at 3000 [size=32]

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
    I/O behind bridge: 00001000-00001fff
    Memory behind bridge: fc700000-fe7fffff
    Prefetchable memory behind bridge: f8200000-f82fffff
    Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B-

00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
    Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0

00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02) (prog-if 8a [Master SecP PriP])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Interrupt: pin A routed to IRQ 18
    Region 0: I/O ports at 01f0 [size=8]
    Region 1: I/O ports at 03f4 [size=1]
    Region 2: I/O ports at 0170 [size=8]
    Region 3: I/O ports at 0374 [size=1]
    Region 4: I/O ports at 03a0 [size=16]
    Region 5: Memory at 88000000 (32-bit, non-prefetchable) [size=1K]

00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 02)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Interrupt: pin B routed to IRQ 21
    Region 4: I/O ports at 0580 [size=32]

01:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64 (2000ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 11
    Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: I/O ports at 1000 [size=256]
    Region 2: Memory at fe3f0000 (32-bit, non-prefetchable) [size=4K]
    Expansion ROM at f8200000 [disabled] [size=128K]
    Capabilities: [5c] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Region 0: Memory at feae0000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: [50] PCI-X non-bridge device
        Command: DPERE- ERO- RBC=512 OST=1
        Status: Dev=02:1c.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64, Cache Line Size: 64 bytes
    Bus: primary=02, secondary=04, subordinate=04, sec-latency=48
    Memory behind bridge: fe900000-fe9fffff
    Prefetchable memory behind bridge: 00000000f8400000-00000000fc4fffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
    Capabilities: [50] PCI-X bridge device
        Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=conv
        Status: Dev=02:1d.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
        Upstream: Capacity=65535 CommitmentLimit=65535
        Downstream: Capacity=65535 CommitmentLimit=65535

02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Region 0: Memory at feaf0000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: [50] PCI-X non-bridge device
        Command: DPERE- ERO- RBC=512 OST=1
        Status: Dev=02:1e.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64, Cache Line Size: 64 bytes
    Bus: primary=02, secondary=03, subordinate=03, sec-latency=64
    I/O behind bridge: 00002000-00002fff
    Memory behind bridge: fe800000-fe8fffff
    Prefetchable memory behind bridge: 00000000f8300000-00000000f83fffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
    Capabilities: [50] PCI-X bridge device
        Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=100MHz
        Status: Dev=02:1f.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
        Upstream: Capacity=65535 CommitmentLimit=65535
        Downstream: Capacity=65535 CommitmentLimit=65535

03:07.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at fe8c0000 (64-bit, non-prefetchable) [size=128K]
    Region 4: I/O ports at 2040 [size=64]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk+ DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [e4] PCI-X non-bridge device
        Command: DPERE- ERO+ RBC=512 OST=1
        Status: Dev=03:07.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
    Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Address: 0000000000000000 Data: 0000

03:07.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
    Subsystem: Intel Corporation Unknown device 341a
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin B routed to IRQ 17
    Region 0: Memory at fe8e0000 (64-bit, non-prefetchable) [size=128K]
    Region 4: I/O ports at 2000 [size=64]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk+ DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [e4] PCI-X non-bridge device
        Command: DPERE- ERO+ RBC=512 OST=1
        Status: Dev=03:07.1 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
    Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Address: 0000000000000000 Data: 0000

04:08.0 RAID bus controller: Intel Corporation RAID Controller
    Subsystem: Intel Corporation SRCZCR
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 19
    Region 0: Memory at fa000000 (32-bit, prefetchable) [size=32M]
    [virtual] Expansion ROM at fe9f8000 [disabled] [size=32K]
    Capabilities: [80] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Linux version 2.6.22-gentoo-r5-nat (root@nat-new) (gcc version 4.1.1 (Gentoo 4.1.1-r3)) #3 SMP Thu Aug 30 06:56:52 MSD 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009a800 (usable)
 BIOS-e820: 000000000009a800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007fff0000 (usable)
 BIOS-e820: 000000007fff0000 - 000000007ffff000 (ACPI data)
 BIOS-e820: 000000007ffff000 - 0000000080000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
1151MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
Entering add_active_range(0, 0, 524272) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal 4096 -> 229376
  HighMem 229376 -> 524272
early_node_map[1] active PFN ranges
  0: 0 -> 524272
On node 0 totalpages: 524272
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 2303 pages used for memmap
  HighMem zone: 292593 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP 000FF9B0, 0014 (r0 INTEL )
ACPI: RSDT 7FFF0000, 0030 (r1 INTEL SWV25 1 MSFT 1000000)
ACPI: FACP 7FFF0030, 0074 (r1 INTEL SWV25 1 MSFT 1000000)
ACPI: DSDT 7FFF0190, 28C4 (r1 INTEL SWV25 100 INTL 20020918)
ACPI: FACS 7FFFF000, 0040
ACPI: APIC 7FFF00B0, 0090 (r1 INTEL SWV25 1 MSFT 1000000)
ACPI: OEMR 7FFF0140, 0050 (r1 INTEL SWV25 1 MSFT 1000000)
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
Processor #6 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7 15:2 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x00] high level lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high level lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x09] address[0xfec81000] gsi_base[24])
IOAPIC[1]: apic_id 9, version 32, address 0xfec81000, GSI 24-47
ACPI: IOAPIC (id[0x0a] address[0xfec81400] gsi_base[48])
IOAPIC[2]: apic_id 10, version 32, address 0xfec81400, GSI 48-71
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 3 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at 88000000 (gap: 80000000:7ec00000) Built 1 zonelists. Total pages: 520177 Kernel command line: BOOT_IMAGE=linux-2.6.22-r5 ro root=802 mapped APIC to ffffd000 (fee00000) mapped IOAPIC to ffffc000 (fec00000) mapped IOAPIC to ffffb000 (fec81000) mapped IOAPIC to ffffa000 (fec81400) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 4096 (order: 12, 16384 bytes) Detected 2791.139 MHz processor. Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 2074076k/2097088k available (2347k kernel code, 21856k reserved, 1205k data, 248k init, 1179584k highmem) virtual kernel memory layout: fixmap : 0xffe17000 - 0xfffff000 (1952 kB) pkmap : 0xff800000 - 0xffc00000 (4096 kB) vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB) lowmem : 0xc0000000 - 0xf8000000 ( 896 MB) .init : 0xc0480000 - 0xc04be000 ( 248 kB) .data : 0xc034aca8 - 0xc04782bc (1205 kB) .text : 0xc0100000 - 0xc034aca8 (2347 kB) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 5587.52 BogoMIPS (lpj=9308033) Mount-cache hash table entries: 512 CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU0: Intel P4/Xeon Extended MCE MSRs (12) available CPU0: Thermal monitoring enabled Compat vDSO mapped to ffffe000. Checking 'hlt' instruction... OK. 
Freeing SMP alternatives: 14k freed ACPI: Core revision 20070126 ACPI Warning (dswload-0698): Type override - [DEB_] had invalid type (Integer) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [MLIB] had invalid type (Integer) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [DATA] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [SIO_] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [LEDP] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [GPEN] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [GPST] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [WUES] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [WUSE] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [SBID] had invalid type (String) for Scope operator, changed to (Scope) [20070126] ACPI Warning (dswload-0698): Type override - [SWCE] had invalid type (String) for Scope operator, changed to (Scope) [20070126] Parsing all Control Methods: Table [DSDT](id 0001) - 358 Objects with 28 Devices 95 Methods 25 Regions tbxface-0587 [02] tb_load_namespace : ACPI Tables successfully acquired evxfevnt-0091 [02] enable : Transition to ACPI mode successful CPU0: Intel(R) Xeon(TM) CPU 2.80GHz stepping 05 Booting processor 1/1 eip 2000 Initializing CPU#1 Calibrating delay using timer specific routine.. 
5583.27 BogoMIPS (lpj=9302835) CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU1: Intel P4/Xeon Extended MCE MSRs (12) available CPU1: Thermal monitoring enabled CPU1: Intel(R) Xeon(TM) CPU 2.80GHz stepping 05 Booting processor 2/6 eip 2000 Initializing CPU#2 Calibrating delay using timer specific routine.. 5583.29 BogoMIPS (lpj=9302868) CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#2. CPU2: Intel P4/Xeon Extended MCE MSRs (12) available CPU2: Thermal monitoring enabled CPU2: Intel(R) Xeon(TM) CPU 2.80GHz stepping 05 Booting processor 3/7 eip 2000 Initializing CPU#3 Calibrating delay using timer specific routine.. 5583.30 BogoMIPS (lpj=9302888) CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#3. CPU3: Intel P4/Xeon Extended MCE MSRs (12) available CPU3: Thermal monitoring enabled CPU3: Intel(R) Xeon(TM) CPU 2.80GHz stepping 05 Total of 4 processors activated (22338.39 BogoMIPS). ENABLING IO-APIC IRQs ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 checking TSC synchronization [CPU#0 -> CPU#1]: passed. 
checking TSC synchronization [CPU#0 -> CPU#2]: passed.
checking TSC synchronization [CPU#0 -> CPU#3]: passed.
Brought up 4 CPUs
migration_cost=110,693
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfdb75, last bus=4
PCI: Using configuration type 1
Setting up standard PCI resources
evgpeblk-0952 [04] ev_create_gpe_block : GPE 00 to 0F [_GPE] 2 regs on int 0x9
evgpeblk-0952 [04] ev_create_gpe_block : GPE 10 to 2F [_GPE] 4 regs on int 0x9
evgpeblk-1048 [03] ev_initialize_gpe_bloc: Found 4 Wake, Enabled 0 Runtime GPEs in this block
evgpeblk-1048 [03] ev_initialize_gpe_bloc: Found 6 Wake, Enabled 0 Runtime GPEs in this block
Completing Region/Field/Buffer/Package initialization:............................................................................
Initialized 25/25 Regions 8/8 Fields 29/29 Buffers 14/23 Packages (367 nodes)
Initializing Device/Processor/Thermal objects by executing _INI methods:.
Executed 1 _INI methods requiring 0 _STA executions (examined 34 objects)
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0480-04bf claimed by ICH4 GPIO
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P5.P5P6._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P5.P5P7._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 11 devices
ACPI: ACPI bus type pnp unregistered
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
pnp: 00:01: ioport range 0x3f0-0x3f1 has been reserved
pnp: 00:01: ioport range 0x400-0x4bf could not be reserved
pnp: 00:01: ioport range 0x4d0-0x4d1 has been reserved
pnp: 00:01: ioport range 0x40b-0x40b has been reserved
pnp: 00:01: ioport range 0x4d6-0x4d6 has been reserved
Time: tsc clocksource has been installed.
PCI: Bridge: 0000:02:1d.0
  IO window: disabled.
  MEM window: fe900000-fe9fffff
  PREFETCH window: f8400000-fc4fffff
PCI: Bridge: 0000:02:1f.0
  IO window: 2000-2fff
  MEM window: fe800000-fe8fffff
  PREFETCH window: f8300000-f83fffff
PCI: Bridge: 0000:00:03.0
  IO window: 2000-2fff
  MEM window: fe800000-feafffff
  PREFETCH window: f8300000-fc5fffff
PCI: Bridge: 0000:00:1e.0
  IO window: 1000-1fff
  MEM window: fc700000-fe7fffff
  PREFETCH window: f8200000-f82fffff
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1572864 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
Machine check exception polling timer started.
IA-32 Microcode Update Driver: v1.14a <tigran@aivazian.fsnet.co.uk>
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Boot video device is 0000:01:0c.0
Real Time Clock Driver v1.12ac
intel_rng: FWH not detected
Linux agpgart interface v0.102 (c) Dave Jones
Hangcheck: starting hangcheck timer 0.9.0 (tick is 180 seconds, margin is 60 seconds).
Hangcheck: Using get_cycles().
input: Power Button (FF) as /class/input/input0
ACPI: Power Button (FF) [PWRF]
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
00:08: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:09: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
FDC 0 is a National Semiconductor PC87306
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: module loaded
Intel(R) PRO/1000 Network Driver - version 7.3.20-k2-NAPI
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI Interrupt 0000:03:07.0[A] -> GSI 30 (level, low) -> IRQ 16
e1000: 0000:03:07.0: e1000_probe: (PCI-X:100MHz:64-bit) 00:0e:0c:66:ab:0e
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
ACPI: PCI Interrupt 0000:03:07.1[B] -> GSI 31 (level, low) -> IRQ 17
e1000: 0000:03:07.1: e1000_probe: (PCI-X:100MHz:64-bit) 00:0e:0c:66:ab:0f
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH3: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 18
ICH3: chipset revision 2
ICH3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x03a0-0x03a7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0x03a8-0x03af, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006) megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006) GDT-HA: Storage RAID Controller Driver. Version: 3.05 ACPI: PCI Interrupt 0000:04:08.0[A] -> GSI 48 (level, low) -> IRQ 19 GDT-HA: Found 1 PCI Storage RAID Controllers Configuring GDT-PCI HA at 4/8 IRQ 19 GDT-HA 0: Name: SRCZCR scsi0 : SRCZCR scsi 0:0:0:0: Direct-Access Intel Host Drive #00 PQ: 0 ANSI: 2 scsi 0:2:6:0: Processor ESG-SHV SCA HSBP M18 0.07 PQ: 0 ANSI: 2 sd 0:0:0:0: [sda] 143299800 512-byte hardware sectors (73369 MB) sd 0:0:0:0: [sda] Assuming Write Enabled sd 0:0:0:0: [sda] Assuming drive cache: write through sd 0:0:0:0: [sda] 143299800 512-byte hardware sectors (73369 MB) sd 0:0:0:0: [sda] Assuming Write Enabled sd 0:0:0:0: [sda] Assuming drive cache: write through sda: sda1 sda2 sda3 sd 0:0:0:0: [sda] Attached SCSI disk I2O subsystem v1.325 i2o: max drivers = 8 Fusion MPT base driver 3.04.04 Copyright (c) 1999-2007 LSI Logic Corporation Fusion MPT SPI Host driver 3.04.04 usbmon: debugfs is not available ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver USB Universal Host Controller Interface driver v3.0 ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 18 PCI: Setting latency timer of device 0000:00:1d.0 to 64 uhci_hcd 0000:00:1d.0: UHCI Host Controller uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1 uhci_hcd 0000:00:1d.0: irq 18, io base 0x00003020 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 20 PCI: Setting latency timer of device 0000:00:1d.1 to 64 uhci_hcd 0000:00:1d.1: UHCI Host Controller uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2 uhci_hcd 0000:00:1d.1: irq 20, io base 0x00003000 usb usb2: configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected usbcore: registered new 
interface driver usblp drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver Initializing USB Mass Storage driver... usbcore: registered new interface driver usb-storage USB Mass Storage support registered. PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1 PNP: PS/2 controller doesn't have AUX irq; using default 12 serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard as /class/input/input1 ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 21 usbcore: registered new interface driver usbhid drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver oprofile: using NMI interrupt. nf_conntrack version 0.5.0 (8192 buckets, 65536 max) TCP cubic registered NET: Registered protocol family 1 NET: Registered protocol family 17 Starting balanced_irq Using IPI Shortcut mode kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 248k freed EDAC MC: Ver: 2.0.1 Aug 30 2007 EDAC e7xxx: tolm = 80000, remapbase = 4000, remaplimit = 0 EDAC MC0: Giving out device to e7xxx_edac E7501: DEV 0000:00:00.0 EXT3 FS on sda2, internal journal Adding 3911788k swap on /dev/sda1. Priority:-1 extents:1 across:3911788k ip_tables: (C) 2000-2006 Netfilter Core Team e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX e1000: eth1: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX UDP: bad checksum. From 82.27.165.203:6881 to 87.255.1.89:1024 ulen 282 UDP: bad checksum. From 202.69.162.152:60033 to 87.255.1.89:1024 ulen 282 UDP: bad checksum. From 82.25.94.36:52777 to 87.255.1.43:52907 ulen 285 UDP: short packet: From 219.146.165.98:63467 3849/106 to 87.255.1.39:7903 UDP: bad checksum. 
From 201.53.114.76:18845 to 87.255.1.59:27212 ulen 276 UDP: short packet: From 209.222.2.2:12043 25649/75 to 87.255.1.43:52907 UDP: bad checksum. From 125.92.10.232:10982 to 87.255.1.6:1352 ulen 145 UDP: bad checksum. From 190.140.147.153:12765 to 87.255.1.43:1288 ulen 68 UDP: short packet: From 222.128.33.233:1924 2065/1480 to 87.255.1.73:18459 UDP: bad checksum. From 61.49.226.238:25333 to 87.255.1.54:7105 ulen 280 UDP: short packet: From 76.172.242.93:10256 58125/10 to 87.255.1.35:59675 UDP: bad checksum. From 68.103.175.217:14262 to 87.255.1.35:59675 ulen 380 UDP: bad checksum. From 68.12.171.227:45609 to 87.255.1.110:22983 ulen 25 UDP: bad checksum. From 68.12.171.227:45609 to 87.255.1.110:22983 ulen 25 UDP: bad checksum. From 68.12.171.227:45609 to 87.255.1.110:22983 ulen 25 UDP: bad checksum. From 88.148.211.148:20152 to 87.255.1.14:47576 ulen 282 UDP: bad checksum. From 222.188.215.87:25546 to 87.255.1.43:1036 ulen 310 UDP: bad checksum. From 75.118.118.25:60111 to 87.255.1.54:1385 ulen 289 UDP: bad checksum. From 72.155.16.125:64718 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 89.40.138.242:12225 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 68.13.132.220:33056 to 87.255.1.45:30568 ulen 33 UDP: bad checksum. From 59.172.152.214:6653 to 87.255.1.75:4672 ulen 210 UDP: bad checksum. From 68.72.231.22:65516 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 221.218.52.175:30997 to 87.255.1.19:1388 ulen 280 UDP: bad checksum. From 59.173.192.113:20318 to 87.255.1.19:1388 ulen 280 UDP: short packet: From 219.146.165.99:32720 1865/106 to 87.255.1.6:242 UDP: bad checksum. From 89.208.32.229:27023 to 87.255.1.106:1382 ulen 282 UDP: bad checksum. From 85.140.101.140:10207 to 87.255.1.6:39724 ulen 315 UDP: bad checksum. From 75.118.114.187:9034 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 125.33.83.127:7808 to 87.255.1.67:1389 ulen 276 UDP: bad checksum. 
From 62.165.211.135:31276 to 87.255.1.69:6349 ulen 33 UDP: short packet: From 24.192.142.124:8609 58125/10 to 87.255.1.100:1383 UDP: bad checksum. From 80.4.96.108:49152 to 87.255.1.89:1383 ulen 282 UDP: bad checksum. From 222.93.49.206:8301 to 87.255.1.54:7105 ulen 280 UDP: bad checksum. From 87.18.31.216:13100 to 87.255.1.37:25117 ulen 46 UDP: bad checksum. From 68.220.96.97:50001 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 65.6.116.40:62628 to 87.255.1.65:4127 ulen 33 UDP: bad checksum. From 202.44.178.101:28604 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 202.44.178.101:28604 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 59.127.220.134:12590 to 87.255.1.19:1391 ulen 310 UDP: bad checksum. From 122.169.4.72:4273 to 87.255.1.69:6349 ulen 33 UDP: short packet: From 219.146.165.99:28513 14818/106 to 87.255.1.6:52824 UDP: bad checksum. From 65.8.0.232:59928 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 58.64.42.170:11772 to 87.255.1.25:1391 ulen 280 UDP: bad checksum. From 125.127.173.190:1071 to 87.255.1.7:61159 ulen 70 UDP: bad checksum. From 74.103.66.241:60001 to 87.255.1.43:1036 ulen 280 UDP: bad checksum. From 84.236.72.241:6881 to 87.255.1.59:1391 ulen 289 UDP: bad checksum. From 201.45.89.132:11997 to 87.255.1.95:1391 ulen 88 UDP: bad checksum. From 213.118.235.231:58480 to 87.255.1.92:64035 ulen 76 UDP: short packet: From 89.142.143.74:4764 16827/310 to 87.255.1.95:56662 UDP: short packet: From 222.128.33.233:1924 2304/1480 to 87.255.1.73:1391 UDP: bad checksum. From 82.25.27.254:10930 to 87.255.1.6:10261 ulen 315 UDP: bad checksum. From 195.182.8.208:8767 to 87.255.1.43:1774 ulen 190 UDP: bad checksum. From 60.221.172.243:16001 to 87.255.1.90:24094 ulen 70 UDP: bad checksum. From 195.182.8.208:8767 to 87.255.1.43:1774 ulen 190 UDP: bad checksum. From 216.117.225.80:17153 to 87.255.1.100:32104 ulen 33 UDP: bad checksum. From 195.182.8.208:8767 to 87.255.1.43:1774 ulen 190 UDP: bad checksum. 
From 82.20.112.186:53140 to 87.255.1.5:46615 ulen 285 UDP: bad checksum. From 60.221.172.243:16001 to 87.255.1.90:24094 ulen 70 UDP: bad checksum. From 201.235.3.10:28881 to 87.255.1.60:20475 ulen 36 UDP: bad checksum. From 86.6.19.199:34660 to 87.255.1.89:1391 ulen 315 UDP: bad checksum. From 86.136.199.161:34803 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 196.29.43.198:61239 to 87.255.1.70:10965 ulen 106 UDP: bad checksum. From 58.64.42.170:11772 to 87.255.1.49:8332 ulen 280 UDP: bad checksum. From 89.40.137.76:28986 to 87.255.1.14:7507 ulen 33 UDP: bad checksum. From 80.97.183.220:20764 to 87.255.1.69:56013 ulen 33 UDP: bad checksum. From 122.4.221.21:62420 to 87.255.1.95:63641 ulen 70 UDP: short packet: From 122.5.130.142:16001 0/70 to 87.255.1.5:46615 UDP: bad checksum. From 80.97.183.220:22921 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 125.40.32.216:19948 to 87.255.1.82:8080 ulen 106 UDP: bad checksum. From 151.33.64.18:63598 to 87.255.1.65:1394 ulen 12 UDP: bad checksum. From 60.26.131.200:18938 to 87.255.1.25:8516 ulen 109 UDP: bad checksum. From 219.131.38.89:1542 to 87.255.1.43:1036 ulen 273 UDP: bad checksum. From 75.118.114.187:9034 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 81.106.145.78:16923 to 87.255.1.89:1395 ulen 306 UDP: bad checksum. From 82.39.35.157:49155 to 87.255.1.61:26583 ulen 285 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 100 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 385 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1207 UDP: bad checksum. From 60.10.201.110:58630 to 87.255.1.43:45981 ulen 112 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 170 UDP: bad checksum. From 59.172.92.39:9796 to 87.255.1.58:36124 ulen 276 UDP: bad checksum. From 82.0.17.204:1828 to 87.255.1.70:1396 ulen 289 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1316 UDP: bad checksum. 
From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 131 UDP: bad checksum. From 60.10.201.110:36000 to 87.255.1.43:45981 ulen 112 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 79 UDP: bad checksum. From 81.100.196.245:49109 to 87.255.1.41:1396 ulen 282 UDP: bad checksum. From 125.40.32.216:19948 to 87.255.1.82:8080 ulen 106 UDP: bad checksum. From 65.82.222.45:58117 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 60.10.201.110:36000 to 87.255.1.43:45981 ulen 112 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 218 UDP: bad checksum. From 125.92.10.232:10982 to 87.255.1.6:39724 ulen 145 conntrack_ftp: partial EPRT 3941011105+1 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 181 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 182 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 193 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 177 UDP: bad checksum. From 70.152.16.84:51130 to 87.255.1.61:11800 ulen 33 UDP: bad checksum. From 124.120.182.165:20000 to 87.255.1.49:1397 ulen 280 UDP: bad checksum. From 218.206.139.231:18137 to 87.255.1.89:1397 ulen 280 UDP: bad checksum. From 219.157.13.224:21814 to 87.255.1.89:1386 ulen 109 UDP: bad checksum. From 82.28.117.13:4667 to 87.255.1.70:10965 ulen 289 UDP: bad checksum. From 122.4.221.21:62420 to 87.255.1.95:63641 ulen 70 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 171 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 435 UDP: short packet: From 60.19.221.226:16001 0/70 to 87.255.1.89:1395 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 388 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 237 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 437 UDP: bad checksum. 
From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 313 UDP: short packet: From 24.104.98.2:52014 25649/106 to 87.255.1.50:1397 UDP: bad checksum. From 58.40.148.46:16001 to 87.255.1.73:1397 ulen 291 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 56 UDP: bad checksum. From 62.143.96.142:25646 to 87.255.1.23:1397 ulen 285 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 57 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 119 UDP: bad checksum. From 62.99.150.176:62759 to 87.255.1.89:1397 ulen 282 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 154 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 58 UDP: bad checksum. From 202.69.182.218:1382 to 87.255.1.51:44883 ulen 111 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 173 UDP: bad checksum. From 60.10.201.110:32791 to 87.255.1.43:45981 ulen 109 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 504 UDP: short packet: From 76.208.213.81:61808 25649/111 to 87.255.1.82:1399 UDP: short packet: From 201.43.229.150:20414 1743/1480 to 87.255.1.5:4672 UDP: bad checksum. From 60.10.201.110:32791 to 87.255.1.43:45981 ulen 109 UDP: bad checksum. From 219.156.82.150:21978 to 87.255.1.49:1400 ulen 280 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 59 UDP: bad checksum. From 91.76.96.202:1513 to 87.255.1.42:1513 ulen 40 UDP: bad checksum. From 91.76.96.202:1513 to 87.255.1.42:1513 ulen 40 UDP: bad checksum. From 86.22.10.27:13134 to 87.255.1.70:1400 ulen 280 UDP: bad checksum. From 58.51.76.78:16001 to 87.255.1.43:1036 ulen 70 UDP: bad checksum. From 60.10.201.110:37137 to 87.255.1.43:45981 ulen 109 UDP: bad checksum. From 60.10.201.110:37137 to 87.255.1.43:45981 ulen 112 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. 
From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 249 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 690 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 398 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 562 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 59.56.237.120:5153 to 87.255.1.65:20666 ulen 77 UDP: short packet: From 209.222.2.2:60526 25649/106 to 87.255.1.50:1400 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 72.155.16.125:57262 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 87.18.31.216:13100 to 87.255.1.69:6349 ulen 32 UDP: bad checksum. From 85.53.203.33:27218 to 87.255.1.95:1401 ulen 280 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 252 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 238 UDP: bad checksum. From 189.7.87.142:22118 to 87.255.1.95:1401 ulen 61 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 182 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 122.168.26.181:18092 to 87.255.1.121:31885 ulen 33 UDP: bad checksum. From 85.232.129.95:50935 to 87.255.1.6:39724 ulen 75 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 345 UDP: bad checksum. From 220.39.18.95:1422 to 87.255.1.43:1036 ulen 280 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 170 UDP: bad checksum. From 219.74.75.48:48586 to 87.255.1.41:10983 ulen 272 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 121 UDP: bad checksum. From 59.56.227.149:61068 to 87.255.1.110:11659 ulen 73 UDP: bad checksum. From 217.150.52.121:38932 to 87.255.1.73:24898 ulen 75 UDP: bad checksum. 
From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1280 UDP: bad checksum. From 76.104.23.107:6885 to 87.255.1.41:10983 ulen 43 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 222.174.114.242:6881 to 87.255.1.70:1396 ulen 70 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 346 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 60 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 193 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 173 UDP: bad checksum. From 124.125.147.168:13666 to 87.255.1.121:31885 ulen 33 UDP: bad checksum. From 58.247.245.35:10741 to 87.255.1.61:26583 ulen 276 UDP: bad checksum. From 86.6.19.199:34660 to 87.255.1.89:59708 ulen 315 UDP: bad checksum. From 81.100.196.245:49109 to 87.255.1.41:1402 ulen 282 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 81.71.46.120:29898 to 87.255.1.50:28320 ulen 76 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 297 UDP: bad checksum. From 89.215.181.50:25463 to 87.255.1.45:52103 ulen 121 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 110 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 116 UDP: bad checksum. From 82.12.187.151:54689 to 87.255.1.73:24898 ulen 285 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 346 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 300 UDP: bad checksum. From 75.118.114.187:9034 to 87.255.1.69:6349 ulen 33 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 107 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 314 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 351 UDP: bad checksum. From 58.208.75.165:4711 to 87.255.1.31:4672 ulen 43 UDP: bad checksum. 
From 86.0.167.3:50213 to 87.255.1.95:35590 ulen 1363 UDP: short packet: From 76.208.213.81:62578 25649/111 to 87.255.1.115:43661 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 368 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 286 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 89 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 91 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 171 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 242 UDP: bad checksum. From 67.101.85.111:61171 to 87.255.1.82:1402 ulen 276 UDP: bad checksum. From 74.235.10.52:60671 to 87.255.1.65:4127 ulen 33 UDP: bad checksum. From 125.127.31.238:25105 to 87.255.1.89:64485 ulen 285 UDP: bad checksum. From 59.172.152.214:6653 to 87.255.1.36:58185 ulen 212 UDP: bad checksum. From 89.179.119.121:27015 to 87.255.1.106:27005 ulen 422 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 191 UDP: bad checksum. From 24.165.7.12:14957 to 87.255.1.41:10983 ulen 195 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 294 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 60 UDP: bad checksum. From 89.41.188.204:26647 to 87.255.1.70:10965 ulen 289 UDP: bad checksum. From 85.232.140.239:52129 to 87.255.1.47:40571 ulen 75 UDP: bad checksum. From 218.166.76.153:19194 to 87.255.1.89:1404 ulen 209 UDP: bad checksum. From 72.153.182.254:59919 to 87.255.1.69:6349 ulen 33 UDP: short packet: From 222.128.33.233:1924 2188/1480 to 87.255.1.73:18459 UDP: short packet: From 125.115.14.182:63201 1761/1480 to 87.255.1.122:29105 UDP: bad checksum. From 89.179.119.121:27015 to 87.255.1.106:27005 ulen 706 UDP: short packet: From 74.212.42.2:15934 38003/71 to 87.255.1.43:51508 UDP: bad checksum. From 82.7.91.66:22831 to 87.255.1.65:1406 ulen 311 UDP: bad checksum. From 81.108.96.240:19343 to 87.255.1.23:1406 ulen 285 UDP: bad checksum. 
From 203.128.253.26:5191 to 87.255.1.65:4127 ulen 33 UDP: bad checksum. From 72.153.182.254:50288 to 87.255.1.69:6349 ulen 33 UDP: short packet: From 74.212.42.2:37835 25649/111 to 87.255.1.63:44205 UDP: bad checksum. From 217.150.52.121:61373 to 87.255.1.14:7507 ulen 33 UDP: bad checksum. From 222.137.156.201:1501 to 87.255.1.31:22845 ulen 106 UDP: bad checksum. From 219.156.82.150:21978 to 87.255.1.49:1400 ulen 106 UDP: bad checksum. From 121.56.158.2:20461 to 87.255.1.89:63197 ulen 276 UDP: short packet: From 124.120.192.111:17028 266/10 to 87.255.1.65:1401 UDP: bad checksum. From 76.104.23.107:6885 to 87.255.1.41:10983 ulen 39 UDP: bad checksum. From 221.220.217.101:13355 to 87.255.1.65:1406 ulen 306 UDP: bad checksum. From 62.143.93.246:63371 to 87.255.1.49:8332 ulen 117 UDP: bad checksum. From 77.101.2.7:10942 to 87.255.1.14:1406 ulen 311 UDP: bad checksum. From 72.204.240.75:15927 to 87.255.1.60:20475 ulen 71 UDP: bad checksum. From 68.220.96.97:61607 to 87.255.1.61:11800 ulen 33 UDP: short packet: From 220.227.26.141:51338 25649/111 to 87.255.1.89:1406 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 124.120.192.111:17028 to 87.255.1.65:4127 ulen 33 UDP: bad checksum. From 124.120.192.111:17028 to 87.255.1.61:11800 ulen 33 UDP: bad checksum. From 88.27.71.2:29780 to 87.255.1.106:1418 ulen 1328 UDP: bad checksum. From 121.16.230.115:14017 to 87.255.1.43:1420 ulen 73 UDP: bad checksum. From 87.13.134.184:35000 to 87.255.1.43:1420 ulen 288 UDP: short packet: From 222.128.33.233:1924 2432/1480 to 87.255.1.73:1406 UDP: bad checksum. From 60.241.68.45:50403 to 87.255.1.73:24898 ulen 111 UDP: short packet: From 203.101.42.197:51251 25649/111 to 87.255.1.51:44883 UDP: short packet: From 74.212.42.2:15934 38650/71 to 87.255.1.43:1420 UDP: bad checksum. 
From 74.34.14.208:36128 to 87.255.1.114:10159 ulen 35 UDP: bad checksum. From 74.34.14.208:36128 to 87.255.1.114:10159 ulen 35 UDP: bad checksum. From 74.34.14.208:36128 to 87.255.1.114:10159 ulen 35 UDP: bad checksum. From 74.34.14.208:36128 to 87.255.1.114:10159 ulen 35 UDP: bad checksum. From 203.67.201.93:26261 to 87.255.1.36:44953 ulen 145 UDP: bad checksum. From 68.103.175.217:14262 to 87.255.1.65:1401 ulen 380 UDP: bad checksum. From 24.165.7.12:14957 to 87.255.1.41:1420 ulen 36 UDP: short packet: From 121.16.230.115:13824 669/70 to 87.255.1.43:12424 UDP: bad checksum. From 24.77.131.34:6881 to 87.255.1.122:29105 ulen 99 UDP: bad checksum. From 58.20.75.114:10546 to 87.255.1.60:1422 ulen 306 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 639 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1316 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 82.200.62.66:9387 to 87.255.1.50:4672 ulen 10 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1316 UDP: bad checksum. From 213.67.210.17:5444 to 87.255.1.89:41441 ulen 50 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 1316 UDP: bad checksum. From 24.165.7.12:14957 to 87.255.1.41:1420 ulen 132 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 957 UDP: bad checksum. From 87.18.31.216:13100 to 87.255.1.35:10256 ulen 32 UDP: bad checksum. From 24.165.7.12:14957 to 87.255.1.41:1420 ulen 44 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 167 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 193 UDP: bad checksum. From 125.40.114.106:14426 to 87.255.1.36:44953 ulen 109 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 156 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 191 UDP: bad checksum. 
From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 86 UDP: bad checksum. From 213.145.133.124:29149 to 87.255.1.65:1424 ulen 127 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 487 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 443 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 301 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 356 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 782 UDP: bad checksum. From 220.141.130.122:52810 to 87.255.1.59:41231 ulen 146 UDP: bad checksum. From 61.146.67.199:65353 to 87.255.1.89:1420 ulen 145 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 60 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 390 UDP: bad checksum. From 69.246.191.217:52967 to 87.255.1.51:44883 ulen 285 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 251 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 188 UDP: bad checksum. From 218.208.196.221:53967 to 87.255.1.41:10983 ulen 1508 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 186 UDP: short packet: From 74.212.42.2:15934 57720/71 to 87.255.1.43:1425 UDP: bad checksum. From 89.208.32.226:28960 to 87.255.1.80:28960 ulen 514 UDP: short packet: From 74.212.42.2:15934 63183/50 to 87.255.1.43:1425 UDP: short packet: From 74.212.42.2:15934 58294/71 to 87.255.1.43:1425 UDP: bad checksum. From 24.165.7.12:14957 to 87.255.1.41:10983 ulen 40 UDP: bad checksum. From 24.65.16.161:61232 to 87.255.1.49:1421 ulen 319 UDP: bad checksum. From 86.9.68.129:65534 to 87.255.1.122:1425 ulen 285 UDP: bad checksum. From 87.13.134.110:45292 to 87.255.1.115:15051 ulen 111 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-30 12:37 ` Jarek Poplawski
2007-08-30 13:43 ` Badalian Vyacheslav
@ 2007-08-31 7:04 ` Badalian Vyacheslav
2007-08-31 7:59 ` Jarek Poplawski
1 sibling, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 7:04 UTC (permalink / raw)
To: Jarek Poplawski, netdev
[-- Attachment #1: Type: text/plain, Size: 1443 bytes --]

I tried your patch. I also tried adding more debug options to the kernel. I caught
(BUG: spinlock lockup on CPU#3, tc/6403, f742e200)
All the info is in the file. Ready for the next patch ;)

>> Jarek Poplawski wrote:
>>
> ...
>
>>> On the other hand disabling local interrupts shouldn't be enough here,
>>> so it's really strange... Did you get this remotely? Are you sure LOC
>>> only? (Anyway this 2.6.23-rc4 should be interesting.)
>>>
> ...
>
>> Only LOC changes... icmp answer = 50-70ms... after 1-2 hours traffic
>> level is down and SI on CPU0 and CPU2 change to above 50%. ksoftirqd
>> free CPU usage. I have this bug 3-4 times in week. If you need info what
>> i can see only in bug still processing - i may try get this info for you.
>>
>
> Any additional info could be helpful. I'm not sure if all these
> computers do similar htb processing, or it's another problem?
> As I've written before htb before 2.6.23-rc1 has a problem with
> timer lockup during qdisc_destroy, so softirqs would be hit.
> If it's htb's fault 2.6.23-rc4 or my testing patch should help.
>
> I try to find in htb code another weak points. BTW, if during
> such lockups any processes are killed 'by hand' etc., without
> restarting the whole system, please let us know.
>
>
>> maybe help:
>>
>> 1U server INTEL, mb se7501w2
>>
>> nat-new ~ # lspci
>>
>
> lspci -v (or -vv should be more usable - but with dmesg at least)
>
> Jarek P.
>
>
>

[-- Attachment #2: bug.txt --]
[-- Type: text/plain, Size: 15944 bytes --]

[ 133.592929] HTB: quantum of class 10002 is big. Consider r2q change.
[ 133.606638] HTB: quantum of class 10004 is big. Consider r2q change.
[ 133.609442] HTB: quantum of class 10005 is big. Consider r2q change. [ 133.612331] HTB: quantum of class 10007 is big. Consider r2q change. [ 133.615099] HTB: quantum of class 10008 is big. Consider r2q change. [ 133.624105] HTB: quantum of class 10002 is big. Consider r2q change. [ 133.628133] HTB: quantum of class 10004 is big. Consider r2q change. [ 133.630870] HTB: quantum of class 10005 is big. Consider r2q change. [ 133.633649] HTB: quantum of class 10007 is big. Consider r2q change. [ 133.636379] HTB: quantum of class 10008 is big. Consider r2q change. [ 133.648717] u32 classifier [ 133.648839] Performance counters on [ 133.648957] input device check on [ 133.649064] Actions configured [ 135.430122] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 135.430322] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 135.430491] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 135.430643] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 135.430801] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 135.430920] [<c02b7a40>] neigh_resolve_output+0x1f2/0x224 [ 135.431085] [<c02ceaee>] ip_output+0x28f/0x2bd [ 135.431261] [<c02cbd94>] dst_output+0x0/0x7 [ 135.431419] [<c02ce48b>] ip_build_and_send_pkt+0x1da/0x1ef [ 135.431560] [<c02cbd94>] dst_output+0x0/0x7 [ 135.431698] [<c02dfbed>] tcp_v4_send_synack+0x9f/0xf3 [ 135.431842] [<c02e140a>] tcp_v4_conn_request+0x379/0x3ae [ 135.431979] [<c02c6913>] rt_intern_hash+0x31f/0x331 [ 135.432121] [<c02da4f5>] tcp_rcv_state_process+0x62/0xad1 [ 135.432265] [<c02e090a>] tcp_v4_do_rcv+0x2be/0x311 [ 135.432405] [<c02e2938>] tcp_v4_rcv+0x86a/0x8de [ 135.432547] [<c02c9ec9>] ip_local_deliver+0x18b/0x232 [ 135.432683] [<c02c9608>] ip_local_deliver_finish+0x0/0x1b2 [ 135.432824] [<c02c9d05>] ip_rcv+0x484/0x4bd [ 135.432962] [<c02b155f>] netif_receive_skb+0x2bc/0x32b [ 135.433105] [<c023ca05>] e1000_clean_rx_irq+0x375/0x444 [ 135.433252] [<c023c690>] e1000_clean_rx_irq+0x0/0x444 [ 135.433388] [<c023baa9>] e1000_clean+0x7a/0x249 [ 
135.433528] [<c02b32e6>] net_rx_action+0x91/0x185 [ 135.433668] [<c011c8c6>] __do_softirq+0x5d/0xc1 [ 135.433806] [<c011c95c>] do_softirq+0x32/0x36 [ 135.433939] [<c01043e6>] do_IRQ+0x7e/0x90 [ 135.434075] [<c010d65d>] smp_apic_timer_interrupt+0x74/0x80 [ 135.434219] [<c010c7c5>] smp_call_function_interrupt+0x3c/0x52 [ 135.434352] [<c0102f1f>] common_interrupt+0x23/0x28 [ 135.434488] [<c0100ab1>] mwait_idle_with_hints+0x3b/0x3f [ 135.434627] [<c0100bbc>] cpu_idle+0x59/0x6e [ 135.434765] [<c0421bcd>] start_kernel+0x2ea/0x2f2 [ 135.434908] [<c0421440>] unknown_bootoption+0x0/0x202 [ 135.435044] ======================= [ 140.454198] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 140.454311] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 140.454469] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 140.454616] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 140.454764] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 140.454910] [<c02b7a40>] neigh_resolve_output+0x1f2/0x224 [ 140.455053] [<c02ceaee>] ip_output+0x28f/0x2bd [ 140.455199] [<c02cbd94>] dst_output+0x0/0x7 [ 140.455341] [<c02cc13d>] ip_push_pending_frames+0x2f2/0x3b6 [ 140.455484] [<c02cbd94>] dst_output+0x0/0x7 [ 140.455625] [<c02cde53>] ip_send_reply+0x1a2/0x1f8 [ 140.455772] [<c0303e1b>] _read_unlock_bh+0x5/0xd [ 140.455914] [<c02dfb1f>] tcp_v4_send_reset+0x10c/0x13b [ 140.456059] [<c02e2948>] tcp_v4_rcv+0x87a/0x8de [ 140.456202] [<c02c9ec9>] ip_local_deliver+0x18b/0x232 [ 140.456343] [<c02c9608>] ip_local_deliver_finish+0x0/0x1b2 [ 140.456490] [<c02c9d05>] ip_rcv+0x484/0x4bd [ 140.456633] [<c023c5cb>] e1000_alloc_rx_buffers+0x1bb/0x280 [ 140.456784] [<c02b155f>] netif_receive_skb+0x2bc/0x32b [ 140.456925] [<c023ca05>] e1000_clean_rx_irq+0x375/0x444 [ 140.457069] [<c023c690>] e1000_clean_rx_irq+0x0/0x444 [ 140.457210] [<c023baa9>] e1000_clean+0x7a/0x249 [ 140.457351] [<c02b32e6>] net_rx_action+0x91/0x185 [ 140.457493] [<c011c8c6>] __do_softirq+0x5d/0xc1 [ 140.457637] [<c011c95c>] do_softirq+0x32/0x36 
[ 140.457781] [<c01043e6>] do_IRQ+0x7e/0x90 [ 140.457924] [<c010d65d>] smp_apic_timer_interrupt+0x74/0x80 [ 140.458068] [<c010c7c5>] smp_call_function_interrupt+0x3c/0x52 [ 140.458212] [<c0102f1f>] common_interrupt+0x23/0x28 [ 140.458354] [<c0100ab1>] mwait_idle_with_hints+0x3b/0x3f [ 140.458495] [<c0100bbc>] cpu_idle+0x59/0x6e [ 140.458635] [<c0421bcd>] start_kernel+0x2ea/0x2f2 [ 140.458783] [<c0421440>] unknown_bootoption+0x0/0x202 [ 140.458925] ======================= [ 383.290939] BUG: spinlock lockup on CPU#3, tc/6403, f742e200 [ 383.291058] [<c01c5fe9>] _raw_spin_lock+0xbb/0xdc [ 383.291203] [<c02bcb1d>] qdisc_lock_tree+0x14/0x1c [ 383.291346] [<f883a22e>] htb_change_class+0x23a/0x505 [sch_htb] [ 383.291495] [<c02bda57>] tc_ctl_tclass+0x1ae/0x1fd [ 383.291633] [<c02bd8a9>] tc_ctl_tclass+0x0/0x1fd [ 383.291774] [<c02b8898>] rtnetlink_rcv_msg+0x18d/0x1a7 [ 383.291911] [<c02c2a16>] netlink_run_queue+0x65/0xdb [ 383.292055] [<c02b870b>] rtnetlink_rcv_msg+0x0/0x1a7 [ 383.292195] [<c02b86c7>] rtnetlink_rcv+0x25/0x3d [ 383.292332] [<c02c2e58>] netlink_data_ready+0x12/0x52 [ 383.292468] [<c02c1e92>] netlink_sendskb+0x1c/0x33 [ 383.292605] [<c02c2e3a>] netlink_sendmsg+0x23b/0x247 [ 383.292746] [<c02a837b>] sock_sendmsg+0xbc/0xd4 [ 383.292890] [<c0127bad>] autoremove_wake_function+0x0/0x35 [ 383.293035] [<c0127bad>] autoremove_wake_function+0x0/0x35 [ 383.293179] [<c01c44a3>] copy_from_user+0x2d/0x59 [ 383.293314] [<c02ae8d9>] verify_iovec+0x3e/0x6d [ 383.293452] [<c02a8527>] sys_sendmsg+0x194/0x1f9 [ 383.293590] [<c02a8d9b>] sys_recvmsg+0x14d/0x1cf [ 383.293727] [<c01c44a3>] copy_from_user+0x2d/0x59 [ 383.293864] [<c013bfc4>] __alloc_pages+0x63/0x297 [ 383.294004] [<c0143be0>] __handle_mm_fault+0x7bd/0x7ef [ 383.294148] [<c02ab565>] sock_def_write_space+0x15/0x8e [ 383.294284] [<c02ab054>] sock_setsockopt+0x4bb/0x4c5 [ 383.295769] [<c02a949a>] sys_socketcall+0x223/0x242 [ 383.295911] [<c030520e>] do_page_fault+0x0/0x534 [ 383.296051] [<c010250e>] 
sysenter_past_esp+0x5f/0x85 [ 383.296192] ======================= ############# SYSTEM IS FREEZE. reboot on panic not work. Reboot manual [ 30.748000] HTB: quantum of class 10002 is big. Consider r2q change. [ 30.774000] HTB: quantum of class 10004 is big. Consider r2q change. [ 30.777000] HTB: quantum of class 10005 is big. Consider r2q change. [ 30.779000] HTB: quantum of class 10007 is big. Consider r2q change. [ 30.782000] HTB: quantum of class 10008 is big. Consider r2q change. [ 30.790000] HTB: quantum of class 10002 is big. Consider r2q change. [ 30.794000] HTB: quantum of class 10004 is big. Consider r2q change. [ 30.797000] HTB: quantum of class 10005 is big. Consider r2q change. [ 30.800000] HTB: quantum of class 10007 is big. Consider r2q change. [ 30.803000] HTB: quantum of class 10008 is big. Consider r2q change. [ 30.815000] u32 classifier [ 30.815000] Performance counters on [ 30.815000] input device check on [ 30.816000] Actions configured [ 32.684000] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 32.684000] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 32.684000] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 32.684000] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 32.684000] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 32.684000] [<c02b7a40>] neigh_resolve_output+0x1f2/0x224 [ 32.684000] [<c02ceaee>] ip_output+0x28f/0x2bd [ 32.684000] [<c02cbd94>] dst_output+0x0/0x7 [ 32.684000] [<c02ce1c2>] ip_queue_xmit+0x319/0x35e [ 32.684000] [<c02cbd94>] dst_output+0x0/0x7 [ 32.684000] [<c02c700a>] __ip_route_output_key+0x6e5/0x6ff [ 32.684000] [<c02e0fd3>] tcp_v4_send_check+0x80/0xb6 [ 32.684000] [<c02dc494>] tcp_transmit_skb+0x65c/0x68f [ 32.684000] [<c02ad694>] __alloc_skb+0x49/0xf5 [ 32.684000] [<c02de876>] tcp_connect+0x2a8/0x327 [ 32.684000] [<c02e1a15>] tcp_v4_connect+0x468/0x586 [ 32.684000] [<c02ec4d5>] inet_stream_connect+0x7f/0x1ff [ 32.684000] [<c013e195>] mark_page_accessed+0x1c/0x30 [ 32.684000] [<c01c44a3>] 
copy_from_user+0x2d/0x59 [ 32.684000] [<c02a7eca>] sys_connect+0x72/0x9c [ 32.684000] [<c02a9d5f>] release_sock+0x13/0x94 [ 32.684000] [<c02e0b3f>] tcp_v4_init_sock+0x6f/0x141 [ 32.684000] [<c0303e2d>] _spin_unlock_bh+0x5/0xd [ 32.684000] [<c02ab054>] sock_setsockopt+0x4bb/0x4c5 [ 32.684000] [<c0161002>] d_instantiate+0x3f/0x4c [ 32.684000] [<c02a7e22>] sys_setsockopt+0x53/0x89 [ 32.684000] [<c02a9306>] sys_socketcall+0x8f/0x242 [ 32.684000] [<c010250e>] sysenter_past_esp+0x5f/0x85 [ 32.684000] ======================= ### System is freeze. Keyboard not work. reboot on panic not work. Reboot manual [ 127.698542] HTB: quantum of class 10002 is big. Consider r2q change. [ 127.722805] HTB: quantum of class 10004 is big. Consider r2q change. [ 127.725820] HTB: quantum of class 10005 is big. Consider r2q change. [ 127.728585] HTB: quantum of class 10007 is big. Consider r2q change. [ 127.731314] HTB: quantum of class 10008 is big. Consider r2q change. [ 127.743386] HTB: quantum of class 10002 is big. Consider r2q change. [ 127.747382] HTB: quantum of class 10004 is big. Consider r2q change. [ 127.750136] HTB: quantum of class 10005 is big. Consider r2q change. [ 127.752952] HTB: quantum of class 10007 is big. Consider r2q change. [ 127.755631] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 127.755692] HTB: quantum of class 10008 is big. Consider r2q change. 
[ 127.755853] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 127.756000] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 127.756144] [<c01d7148>] extract_entropy+0x45/0x89 [ 127.756289] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 127.756434] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 127.756574] [<c02e80d0>] arp_send+0x4c/0x64 [ 127.756720] [<c02e76d2>] arp_xmit+0x4d/0x51 [ 127.756857] [<c02e839b>] arp_process+0x2b3/0x50b [ 127.757001] [<c013a5dc>] mempool_free+0x66/0x6b [ 127.757151] [<c0303ed7>] _spin_lock_irqsave+0x9/0xd [ 127.757291] [<c01d6c48>] __add_entropy_words+0x58/0x184 [ 127.757430] [<c02e86e2>] arp_rcv+0xef/0x103 [ 127.757567] [<c02b155f>] netif_receive_skb+0x2bc/0x32b [ 127.757705] [<c023ca05>] e1000_clean_rx_irq+0x375/0x444 [ 127.757853] [<c023c690>] e1000_clean_rx_irq+0x0/0x444 [ 127.757990] [<c023baa9>] e1000_clean+0x7a/0x249 [ 127.758126] [<c02b32e6>] net_rx_action+0x91/0x185 [ 127.758264] [<c011c8c6>] __do_softirq+0x5d/0xc1 [ 127.758404] [<c011c95c>] do_softirq+0x32/0x36 [ 127.758547] [<c01043e6>] do_IRQ+0x7e/0x90 [ 127.758696] [<c010d65d>] smp_apic_timer_interrupt+0x74/0x80 [ 127.758840] [<c010c7c5>] smp_call_function_interrupt+0x3c/0x52 [ 127.758978] [<c0102f1f>] common_interrupt+0x23/0x28 [ 127.759115] [<c0100ab1>] mwait_idle_with_hints+0x3b/0x3f [ 127.759254] [<c0100bbc>] cpu_idle+0x59/0x6e [ 127.759390] [<c0421bcd>] start_kernel+0x2ea/0x2f2 [ 127.759534] [<c0421440>] unknown_bootoption+0x0/0x202 [ 127.759672] ======================= [ 127.765948] u32 classifier [ 127.766062] Performance counters on [ 127.766177] input device check on [ 127.766293] Actions configured [ 128.857330] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 128.857449] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 128.857606] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 128.857754] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 128.857901] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 128.858048] [<c02e80d0>] arp_send+0x4c/0x64 [ 128.858191] [<c02e76d2>] 
arp_xmit+0x4d/0x51 [ 128.858331] [<c02e8936>] arp_solicit+0x132/0x190 [ 128.858469] [<c02b814f>] neigh_timer_handler+0x238/0x281 [ 128.858606] [<c02b7f17>] neigh_timer_handler+0x0/0x281 [ 128.858744] [<c011f798>] run_timer_softirq+0xfa/0x15d [ 128.858887] [<c011c8c6>] __do_softirq+0x5d/0xc1 [ 128.859027] [<c011c95c>] do_softirq+0x32/0x36 [ 128.859166] [<c010d65d>] smp_apic_timer_interrupt+0x74/0x80 [ 128.859307] [<c010c7c5>] smp_call_function_interrupt+0x3c/0x52 [ 128.859449] [<c0102fdc>] apic_timer_interrupt+0x28/0x30 [ 128.859589] [<c0100ab1>] mwait_idle_with_hints+0x3b/0x3f [ 128.859727] [<c0100bbc>] cpu_idle+0x59/0x6e [ 128.859862] [<c0421bcd>] start_kernel+0x2ea/0x2f2 [ 128.860007] [<c0421440>] unknown_bootoption+0x0/0x202 [ 128.860144] ======================= [ 189.139227] megaraid: aborting-1187 cmd=28 <c=2 t=0 l=0> [ 189.139337] megaraid: 1187:11[255:0], abort from completed list ### System is freeze. Keyboard not work. reboot on panic not work. Reboot manual [ 31.131000] HTB: quantum of class 10002 is big. Consider r2q change. [ 31.156000] HTB: quantum of class 10004 is big. Consider r2q change. [ 31.159000] HTB: quantum of class 10005 is big. Consider r2q change. [ 31.162000] HTB: quantum of class 10007 is big. Consider r2q change. [ 31.165000] HTB: quantum of class 10008 is big. Consider r2q change. [ 31.177000] HTB: quantum of class 10002 is big. Consider r2q change. [ 31.181000] HTB: quantum of class 10004 is big. Consider r2q change. [ 31.183000] HTB: quantum of class 10005 is big. Consider r2q change. [ 31.186000] HTB: quantum of class 10007 is big. Consider r2q change. [ 31.189000] HTB: quantum of class 10008 is big. Consider r2q change. 
[ 31.200000] u32 classifier [ 31.200000] Performance counters on [ 31.200000] input device check on [ 31.200000] Actions configured [ 32.480000] WARNING: at net/sched/sch_htb.c:404 htb_safe_rb_erase() [ 32.480000] [<f88394c5>] htb_deactivate_prios+0x135/0x18c [sch_htb] [ 32.480000] [<f883addd>] htb_dequeue+0x468/0x6d6 [sch_htb] [ 32.480000] [<c02bcf9b>] __qdisc_run+0x1e/0x190 [ 32.480000] [<c02b352c>] dev_queue_xmit+0x152/0x266 [ 32.480000] [<c02b7a40>] neigh_resolve_output+0x1f2/0x224 [ 32.480000] [<c02ceaee>] ip_output+0x28f/0x2bd [ 32.480000] [<c02cbd94>] dst_output+0x0/0x7 [ 32.480000] [<c02ce48b>] ip_build_and_send_pkt+0x1da/0x1ef [ 32.480000] [<c02cbd94>] dst_output+0x0/0x7 [ 32.480000] [<c02dfbed>] tcp_v4_send_synack+0x9f/0xf3 [ 32.480000] [<c02e140a>] tcp_v4_conn_request+0x379/0x3ae [ 32.480000] [<c02c6913>] rt_intern_hash+0x31f/0x331 [ 32.480000] [<c02da4f5>] tcp_rcv_state_process+0x62/0xad1 [ 32.480000] [<c02e090a>] tcp_v4_do_rcv+0x2be/0x311 [ 32.480000] [<c02e2938>] tcp_v4_rcv+0x86a/0x8de [ 32.480000] [<c02c9ec9>] ip_local_deliver+0x18b/0x232 [ 32.480000] [<c02c9608>] ip_local_deliver_finish+0x0/0x1b2 [ 32.480000] [<c02c9d05>] ip_rcv+0x484/0x4bd [ 32.480000] [<c02b155f>] netif_receive_skb+0x2bc/0x32b [ 32.480000] [<c023ca05>] e1000_clean_rx_irq+0x375/0x444 [ 32.480000] [<c023c690>] e1000_clean_rx_irq+0x0/0x444 [ 32.480000] [<c023baa9>] e1000_clean+0x7a/0x249 [ 32.480000] [<c02b32e6>] net_rx_action+0x91/0x185 [ 32.480000] [<c011c8c6>] __do_softirq+0x5d/0xc1 [ 32.480000] [<c011c95c>] do_softirq+0x32/0x36 [ 32.480000] [<c01043e6>] do_IRQ+0x7e/0x90 [ 32.480000] [<c010d65d>] smp_apic_timer_interrupt+0x74/0x80 [ 32.480000] [<c010c7c5>] smp_call_function_interrupt+0x3c/0x52 [ 32.480000] [<c0102f1f>] common_interrupt+0x23/0x28 [ 32.480000] [<c0100ab1>] mwait_idle_with_hints+0x3b/0x3f [ 32.480000] [<c0100bbc>] cpu_idle+0x59/0x6e [ 32.480000] [<c0421bcd>] start_kernel+0x2ea/0x2f2 [ 32.480000] [<c0421440>] unknown_bootoption+0x0/0x202 [ 32.480000] 
======================= [ 76.104000] input: AT Translated Set 2 keyboard as /class/input/input2 ### System is freeze. Keyboard not work. reboot on panic not work. Reboot manual ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 7:04 ` Badalian Vyacheslav
@ 2007-08-31 7:59 ` Jarek Poplawski
2007-08-31 8:25 ` Badalian Vyacheslav
0 siblings, 1 reply; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 7:59 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev

On Fri, Aug 31, 2007 at 11:04:21AM +0400, Badalian Vyacheslav wrote:
>
> I tried your patch. I also tried adding more debug options to the kernel. I caught
> (BUG: spinlock lockup on CPU#3, tc/6403, f742e200)
> All the info is in the file. Ready for the next patch ;)

I have to look at this, but actually my patch wrongly added this one
warning. I hope you did this on a testing machine...

Anyway there could be something new but I need some time. If possible,
2.6.23-rc4 should be the best thing for testing yet.

Thanks,
Jarek P.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 7:59 ` Jarek Poplawski
@ 2007-08-31 8:25 ` Badalian Vyacheslav
2007-08-31 8:49 ` Jarek Poplawski
2007-08-31 9:05 ` Jarek Poplawski
0 siblings, 2 replies; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 8:25 UTC (permalink / raw)
To: Jarek Poplawski, netdev

I don't have a testing machine.
We have 2 machines and dynamic routing. If one machine goes down, all traffic
goes to the second machine.
I can test on these machines, but I need the machine under test to
reboot on kernel panic (sysctl message). No freezes =)

OK, I will try 2.6.23-rc4.

> On Fri, Aug 31, 2007 at 11:04:21AM +0400, Badalian Vyacheslav wrote:
>
>> I tried your patch. I also tried adding more debug options to the kernel. I caught
>> (BUG: spinlock lockup on CPU#3, tc/6403, f742e200)
>> All the info is in the file. Ready for the next patch ;)
>>
>
> I have to look at this, but actually my patch wrongly added this one
> warning. I hope you did this on a testing machine...
>
> Anyway there could be something new but I need some time. If possible,
> 2.6.23-rc4 should be the best thing for testing yet.
>
> Thanks,
> Jarek P.
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 8:25 ` Badalian Vyacheslav
@ 2007-08-31 8:49 ` Jarek Poplawski
2007-08-31 9:05 ` Jarek Poplawski
1 sibling, 0 replies; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 8:49 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev

On Fri, Aug 31, 2007 at 12:25:22PM +0400, Badalian Vyacheslav wrote:
> I don't have a testing machine.
> We have 2 machines and dynamic routing. If one machine goes down, all traffic
> goes to the second machine.
> I can test on these machines, but I need the machine under test to
> reboot on kernel panic (sysctl message). No freezes =)
>
> OK, I will try 2.6.23-rc4.
>
> >On Fri, Aug 31, 2007 at 11:04:21AM +0400, Badalian Vyacheslav wrote:
> >
> >>I tried your patch. I also tried adding more debug options to the kernel. I caught
> >>(BUG: spinlock lockup on CPU#3, tc/6403, f742e200)

BTW, I think after this BUG there could be something more about other CPUs.
Could you check this?

Jarek P.

PS: If possible, try not to cut too much: if there is something
confidential, mask it with some XXX or mark the cut with ... It could
matter what else is running at the same time (including other drivers'
warnings). If you think it's too much for the list you can send it to
me only.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 8:25 ` Badalian Vyacheslav
2007-08-31 8:49 ` Jarek Poplawski
@ 2007-08-31 9:05 ` Jarek Poplawski
2007-08-31 9:16 ` Jarek Poplawski
2007-08-31 9:33 ` Badalian Vyacheslav
1 sibling, 2 replies; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 9:05 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev

On Fri, Aug 31, 2007 at 12:25:22PM +0400, Badalian Vyacheslav wrote:
> I don't have a testing machine.
> We have 2 machines and dynamic routing. If one machine goes down, all traffic
> goes to the second machine.
> I can test on these machines, but I need the machine under test to
> reboot on kernel panic (sysctl message). No freezes =)
>
> OK, I will try 2.6.23-rc4.

...but without a testing machine it can be too much risk! New versions
of the kernel can break your applications (sometimes they should be at
least rebuilt).

So, maybe you would better try this, 'less testing', version of my patch:

Jarek P.

---

diff -Nurp linux-2.6.22.5-/net/sched/sch_htb.c linux-2.6.22.5/net/sched/sch_htb.c
--- linux-2.6.22.5-/net/sched/sch_htb.c	2007-07-09 01:32:17.000000000 +0200
+++ linux-2.6.22.5/net/sched/sch_htb.c	2007-08-31 08:43:45.000000000 +0200
@@ -688,7 +688,11 @@ static void htb_rate_timer(unsigned long
 
 
 	/* lock queue so that we can muck with it */
-	spin_lock_bh(&sch->dev->queue_lock);
+	if (!spin_trylock_bh(&sch->dev->queue_lock)) {
+		q->rttim.expires = jiffies + 1;
+		add_timer(&q->rttim);
+		return;
+	}
 
 	q->rttim.expires = jiffies + HZ;
 	add_timer(&q->rttim);
@@ -1306,7 +1310,8 @@ static void htb_destroy(struct Qdisc *sc
 
 	qdisc_watchdog_cancel(&q->watchdog);
 #ifdef HTB_RATECM
-	del_timer_sync(&q->rttim);
+	if (!del_timer_sync(&q->rttim))
+		del_timer(&q->rttim);
 #endif
 	/* This line used to be after htb_destroy_class call below
 	   and surprisingly it worked in 2.4. But it must precede it
^ permalink raw reply [flat|nested] 32+ messages in thread
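[Editorial sketch, not part of the original message.] The first hunk of the patch above replaces a blocking lock with a try-lock: if the queue lock is busy, the timer handler does no work and simply re-arms itself for the next tick instead of spinning. The following is only a userspace analogy of that idea in Python, not kernel code; the names `rate_timer`, `HZ`, `RETRY_TICKS` and the `recompute` callback are hypothetical, and `threading.Lock` merely stands in for the kernel spinlock.

```python
import threading

HZ = 1000          # hypothetical: number of ticks in the normal period
RETRY_TICKS = 1    # hypothetical: retry on the next tick when the lock is busy


def rate_timer(queue_lock, recompute):
    """Run one timer callback and return the delay (in ticks) until the next one.

    Mirrors the try-lock idea from the patch: never block inside the
    callback; if the lock is held elsewhere, skip the work and retry soon.
    """
    # Non-blocking attempt, standing in for spin_trylock_bh() in the patch.
    if not queue_lock.acquire(blocking=False):
        return RETRY_TICKS
    try:
        recompute()  # the periodic work done under the lock
    finally:
        queue_lock.release()
    return HZ
```

With the lock free the callback does its work and asks to be called again after the normal period; with the lock held by another party it returns immediately and asks for a quick retry, so the caller never waits on the lock.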
* Re: Tc bug (kernel crash) more info
2007-08-31 9:05 ` Jarek Poplawski
@ 2007-08-31 9:16 ` Jarek Poplawski
2007-08-31 9:33 ` Badalian Vyacheslav
1 sibling, 0 replies; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 9:16 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev

On Fri, Aug 31, 2007 at 11:05:09AM +0200, Jarek Poplawski wrote:
...
> So, maybe you would better try this, 'less testing', version of my patch:

Of course, the previous patch should be reverted (patch -p1 -R) or clean
2.6.22.5 used for this.

Jarek P.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 9:05 ` Jarek Poplawski
2007-08-31 9:16 ` Jarek Poplawski
@ 2007-08-31 9:33 ` Badalian Vyacheslav
2007-08-31 10:17 ` Jarek Poplawski
1 sibling, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 9:33 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev

>> I don't have a testing machine.
>> We have 2 machines and dynamic routing. If one machine goes down, all traffic
>> goes to the second machine.
>> I can test on these machines, but I need the machine under test to
>> reboot on kernel panic (sysctl message). No freezes =)
>>
>> OK, I will try 2.6.23-rc4.
>>
>
> ...but without a testing machine it can be too much risk! New versions
> of the kernel can break your applications (sometimes they should be at
> least rebuilt).
>
> So, maybe you would better try this, 'less testing', version of my patch:
>
>
The only risk is if the RC kernel breaks hardware. The machines use only
iptables and TC. All scripts are backed up.
But if I need to reboot the PC by hand (if it freezes) I need to drive to
the server room =)

> BTW, I think after this BUG there could be something more about other CPUs.
> Could you check this?

I sent you all the info that netconsole caught. I didn't cut anything...

Now 1 machine runs 2.6.23-rc4-git2
<http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.23-rc4-git2.bz2>
The backup machine uses 2.6.18
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Tc bug (kernel crash) more info
2007-08-31 9:33 ` Badalian Vyacheslav
@ 2007-08-31 10:17 ` Jarek Poplawski
2007-08-31 10:48 ` Badalian Vyacheslav
2007-08-31 10:50 ` Badalian Vyacheslav
0 siblings, 2 replies; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 10:17 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev
On Fri, Aug 31, 2007 at 01:33:04PM +0400, Badalian Vyacheslav wrote:
>
> >>I don't have a test machine.
> >>We have 2 machines and dynamic routing. If one machine goes down, all
> >>traffic goes to the second machine.
> >>I can test on these machines, but I need the machine under test to
> >>reboot on kernel panic (sysctl message). No freezes =)
> >>
> >>OK, I'll try 2.6.23-rc4.
> >>
> >
> >...but without a test machine it can be too much risk! New versions
> >of the kernel can break your applications (sometimes they should be at
> >least rebuilt).
> >
> >So, maybe you would better try this, 'less testing', version of my patch:
> >
> The only risk for me is if the RC kernel breaks hardware. The machines use
> only iptables and TC. All scripts are backed up.

But sometimes a new kernel can break binary compatibility with the previous
one (e.g. after data structures change) and e.g. iptables or iproute
tools stop working or work in an unpredictable way. There were a few
such changes before 2.6.20 - I don't track current changes too much.
And I'm sure your system uses much more than iptables or TC, even
without your knowledge.

> But if I have to reboot the PC by hand (if it freezes), I need to drive to
> the server room =)

Of course, if the previous kernel always boots by default, or if there is
even a possibility to use a different partition with a copy of the main
system for testing, this risk should be much smaller.

> > BTW, I think after this BUG could be something more about other CPUs.
> > Could you check this?
>
> I sent you everything that netconsole caught. I didn't cut anything...

So, it seems something was broken. But, I meant, there can sometimes be
interesting things a few lines before or after the infos too.

> Now one machine runs 2.6.23-rc4-git2
> <http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.23-rc4-git2.bz2>
> The backup machine uses 2.6.18.

BTW, -git versions are usually more risky than -rc. And, maybe, let
this 2.6.18 better stay away from this testing...

Thanks,
Jarek P.
* Re: Tc bug (kernel crash) more info
2007-08-31 10:17 ` Jarek Poplawski
@ 2007-08-31 10:48 ` Badalian Vyacheslav
2007-08-31 12:59 ` Jarek Poplawski
2007-08-31 10:50 ` Badalian Vyacheslav
1 sibling, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 10:48 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev
> But sometime a new kernel can break binary compatibility with previous
> one (e.g. after data structures change) and e.g. iptables or iproute
> tools stop working or work in an unpredictable way. There were a few
> such changes before 2.6.20 - I don't track current changes too much.
> And I'm sure your system uses much more than iptables or TC, even
> without your knowledge.
>
I can roll back =) The backup system takes over all functions. I can risk
one of the two systems to catch and fix all bugs. Dynamic routing works
fine, and automatic switchover between the systems takes 1-4 seconds.

>> I sent you everything that netconsole caught. I didn't cut anything...
>>
>
> So, it seems something was broken. But, I meant, there can be sometimes
> interesting things a few lines before or after the infos too.
>
I can only see what netconsole reports. If I look at the monitor I see only
the last lines; the last line is "====...". Scrolling doesn't work.
netconsole runs as a module and starts after the system has fully booted.
Once netconsole is up, I run the generator of tc scripts.

> BTW, -git versions are usually more risky than -rc. And, maybe, let
> this 2.6.18 better stay away from this testing...
>
I looked at the changes between 2.6.23-rc4 and 2.6.23-rc4-git2 and think
the patches look good and don't do anything critical =)

Now I'm doing many script runs to reproduce the bug. If I hit it on
2.6.23-rc4, I'll post it here. 2.6.23-rc4 doesn't have the htb_timer
function.

Badalian Vyacheslav
* Re: Tc bug (kernel crash) more info
2007-08-31 10:48 ` Badalian Vyacheslav
@ 2007-08-31 12:59 ` Jarek Poplawski
2007-08-31 14:31 ` Badalian Vyacheslav
0 siblings, 1 reply; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 12:59 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev
On Fri, Aug 31, 2007 at 02:48:31PM +0400, Badalian Vyacheslav wrote:
...
> I can only see that say netconsole. If i look to monitor i look last
> lines. last line is "====...". Scrolling not work
> netconsole run as module and start after system do full load. Then
> netconsole is up - i run generator of tc scripts.

It would be interesting to know if this bug did ever happen without
netconsole, or, if it wasn't tested this way, if it's possible to do
such a test (not necessarily today)?

> >BTW, -git versions are usually more risky than -rc. And, maybe, let
> >this 2.6.18 better stay away from this testing...
> >
> I look changes between 2.6.23-rc4 and 2.6.23-rc4-git2 and think that
> paches look good and no do any critical things =)
>
> Now i do many script runs to simulate bug. if i get it on 2.6.23-rc4 - i
> post it here. 2.6.23-rc4 not have htb_timer function.

I'll not be able to assist you until monday (but I'll try to look
into the code and maybe to prepare some new patch - but it needs
a lot of checking to not add too much of this locking as well).

I think you can stay with a kernel whichever you like - I'm not sure
any config changes or even more debugging can change much, but maybe
I'm wrong. It looks to me like some locking is missing or interrupted.
If you are working weekends and find something new, don't wait: maybe
somebody else here could be interested too.

Cheers,
Jarek P.
* Re: Tc bug (kernel crash) more info
2007-08-31 12:59 ` Jarek Poplawski
@ 2007-08-31 14:31 ` Badalian Vyacheslav
2007-08-31 14:51 ` Badalian Vyacheslav
0 siblings, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 14:31 UTC (permalink / raw)
To: Jarek Poplawski, netdev
OK =) I hope next week you'll find where the bug is and fix it!

PS. May I ask where I can read about how to interpret a kernel panic dump,
so I can try to find the faulting line in the code?
I read the dump and see that the bug is in the function "rb_insert_color"
plus some offset (in asm?), called from htb_dequeue? But htb_dequeue has
no call to rb_insert_color =( Or were some frames in the trace skipped?

This is to improve my education ;)
> I'll not be able to assist you until monday (but I'll try to look
> into the code and maybe to prepare some new patch - but it needs
> a lot of checking to not add too much of this locking as well).
>
> I think you can stay with a kernel whichever you like - I'm not sure
> any config changes or even more debugging can change much, but maybe
> I'm wrong. It looks to me like some locking is missing or interrupted.
> If you are working weekends and find something new, don't wait: maybe
> somebody else here could be interested too.
>
> Cheers,
> Jarek P.
* Re: Tc bug (kernel crash) more info
2007-08-31 14:31 ` Badalian Vyacheslav
@ 2007-08-31 14:51 ` Badalian Vyacheslav
[not found] ` <20070831215850.zf2xi256o00owk4s@mail.himki.net>
2007-09-03 7:31 ` Jarek Poplawski
0 siblings, 2 replies; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 14:51 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev
I found that the bug is in this place:

(gdb) l *0xc01c8973
0xc01c8973 is in rb_insert_color (lib/rbtree.c:80).
75
76      while ((parent = rb_parent(node)) && rb_is_red(parent))
77      {
78              gparent = rb_parent(parent);
79
80              if (parent == gparent->rb_left)
81              {
82                      {
83                              register struct rb_node *uncle = gparent->rb_right;
84                              if (uncle && rb_is_red(uncle))

If I understand the message "unable to handle kernel NULL pointer
dereference at virtual address 00000008" correctly, does it mean that
"gparent == NULL"?
Or am I chasing a mare's nest?

> OK =) I hope next week you'll find where the bug is and fix it!
>
> PS. May I ask where I can read about how to interpret a kernel panic
> dump, so I can try to find the faulting line in the code?
> I read the dump and see that the bug is in the function
> "rb_insert_color" plus some offset (in asm?), called from htb_dequeue?
> But htb_dequeue has no call to rb_insert_color =( Or were some frames
> in the trace skipped?
>
> This is to improve my education ;)
>> I'll not be able to assist you until monday (but I'll try to look
>> into the code and maybe to prepare some new patch - but it needs
>> a lot of checking to not add too much of this locking as well).
>>
>> I think you can stay with a kernel whichever you like - I'm not sure
>> any config changes or even more debugging can change much, but maybe
>> I'm wrong. It looks to me like some locking is missing or interrupted.
>> If you are working weekends and find something new, don't wait: maybe
>> somebody else here could be interested too.
>>
>> Cheers,
>> Jarek P.
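A note on the question above (is a fault at virtual address 00000008 consistent with "gparent == NULL"?). The sketch below is a user-space illustration only, not kernel code. It assumes the struct rb_node field order used by kernels of this period (rb_parent_color, rb_right, rb_left) and models the 32-bit x86 layout with uint32_t fields, so the offsets come out the same on a 64-bit build host. The names rb_node32 and rb_left_fault_addr are hypothetical. Under those assumptions, reading gparent->rb_left with gparent == NULL touches address 8, which would also fit the 2.6.23-rc4-git2 oops in this thread (esi: 00000000, and the marked bytes "<8b> 56 08", which decode to a load from 0x8(%esi)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed mirror of struct rb_node as laid out on 32-bit x86.
 * uint32_t stands in for both "unsigned long" and pointers.
 */
struct rb_node32 {
	uint32_t rb_parent_color;	/* offset 0: parent pointer + color bit */
	uint32_t rb_right;		/* offset 4 */
	uint32_t rb_left;		/* offset 8 */
};

/* Compile-time check of the assumed layout. */
static_assert(offsetof(struct rb_node32, rb_left) == 8,
	      "rb_left expected at offset 8");

/* Address the CPU would touch when reading base->rb_left. */
static uint32_t rb_left_fault_addr(uint32_t base)
{
	return base + (uint32_t)offsetof(struct rb_node32, rb_left);
}
```

With base == 0 (a NULL gparent) the helper yields 0x00000008, the address reported in the BUG line. This only shows that the two facts are consistent; it says nothing about why the parent pointer was NULL in the first place.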
[parent not found: <20070831215850.zf2xi256o00owk4s@mail.himki.net>]
* Re: Tc bug (kernel crash) more info
[not found] ` <20070831215850.zf2xi256o00owk4s@mail.himki.net>
@ 2007-09-01 10:36 ` slavon
0 siblings, 0 replies; 32+ messages in thread
From: slavon @ 2007-09-01 10:36 UTC (permalink / raw)
To: netdev
> Hi All!
> I found some other bugs in HTB.
>
> 1. HTB calculates LEVELS wrongly.
> - try running "./create_nodes.sh" from the archive and then
>   "tc -d class show dev eth0"
Hm.. I read http://luxik.cdi.cz/~devik/qos/htb/manual/theory.htm
As I understand it, if the level calculation is broken, HTB works wrongly!
I looked at sch_htb.c but I don't see a simple fix for this. Maybe someone
with experience could look at sch_htb.c?
>
> 2. HTB misses a qdisc that I add (it's not added).
> - try running ./create_nodes.sh; sh tc_rules_last2 2>/dev/null
> - I attached my output of "tc -d class show dev eth0" for you to look at
>   (tc_class.txt in the archive)
That information was not correct. "tc -d class show dev eth0 | grep -v leaf"
shows many classes without a leaf, but the QDISC for these classes is
created. How can a filter send a packet to a class that has no LEAF?
I think HTB does not work properly.

Thanks.
Badalian Vyacheslav
>
> Anyone! I want to test any patches to fix all HTB problems!
>
> Thanks
>
>> I found that bug in this place
>>
>> (gdb) l *0xc01c8973
>> 0xc01c8973 is in rb_insert_color (lib/rbtree.c:80).
...
* Re: Tc bug (kernel crash) more info
2007-08-31 14:51 ` Badalian Vyacheslav
[not found] ` <20070831215850.zf2xi256o00owk4s@mail.himki.net>
@ 2007-09-03 7:31 ` Jarek Poplawski
2007-09-03 8:05 ` Badalian Vyacheslav
1 sibling, 1 reply; 32+ messages in thread
From: Jarek Poplawski @ 2007-09-03 7:31 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev
On Fri, Aug 31, 2007 at 06:51:24PM +0400, Badalian Vyacheslav wrote:
> I found that bug in this place
>
> (gdb) l *0xc01c8973
> 0xc01c8973 is in rb_insert_color (lib/rbtree.c:80).
...
> if i not wrong understand message "unable to handle kernel NULL pointer
> dereference at virtual address 00000008" its was known that "gparent ==
> Null"?
> Or i hope or i try find a mare's-nest?

Your errors trigger in rbtree, which does indexing for HTB, but since
it's something quite rare I think there is a very small probability
that it's caused by HTB class/level handling (but it's possible, too),
but more probable (to me) these indexes are corrupted by something e.g.
like accessing them without proper locking.

Below I attach a patch for testing: it adds some lock debugging (plus
one place: htb_put is locked). There is mainly checking of locks
needed for writing to rbtree, but it doesn't check all readings yet,
so there will be still something to check if this patch doesn't help
to find anything.

It should be applied to 2.6.23-rc4, but if you prefer 2.6.22.5 version
let me know (BTW, I hope you let us know if you have to apply any
other patches/changes to these kernels...).

> >Ok =) I hope in next week you found bug place and fix it!
> >
> >PS. if you ask where i can read "kernel panic dump logic" literature
> >and try find bugline in code.
> >I read dump and see that bug in function "rb_insert_color" + some
> >shift (in asm?) that called from htb_dequeue? But in htb_dequeue not
> >have calling rb_insert_color =( Or some nodes in trace was skipped?
> >
> >Its for change up my education ;)

I didn't learn much about this, but usually objdump is enough for me.
Here are some links for kernel education:

http://kernelnewbies.org/
http://www.tux.org/lkml/
http://en.tldp.org/LDP/khg/HyperNews/get/khg.html
http://www.stardust.webpages.pl/files/handbook/tmp-en/

Regards,
Jarek P.

---

diff -Nurp linux-2.6.23-rc4-/net/sched/sch_htb.c linux-2.6.23-rc4/net/sched/sch_htb.c
--- linux-2.6.23-rc4-/net/sched/sch_htb.c	2007-08-28 19:52:25.000000000 +0200
+++ linux-2.6.23-rc4/net/sched/sch_htb.c	2007-09-02 10:34:39.000000000 +0200
@@ -52,6 +52,7 @@
     one less than their parent.
 */
 
+#define DEBUG_HTB
 #define HTB_HSIZE 16		/* classid hash size */
 #define HTB_HYSTERESIS 1	/* whether to use mode hysteresis for speedup */
 #define HTB_VER 0x30011		/* major must be matched with number suplied by TC as version */
@@ -127,6 +128,9 @@ struct htb_class {
 	int prio;		/* For parent to leaf return possible here */
 	int quantum;		/* we do backup. Finally full replacement */
 				/* of un.leaf originals should be done. */
+#ifdef DEBUG_HTB
+	struct Qdisc *sch;
+#endif
 };
 
 static inline long L2T(struct htb_class *cl, struct qdisc_rate_table *rate,
@@ -175,6 +179,23 @@ struct htb_sched {
 	long direct_pkts;
 };
 
+#ifdef DEBUG_HTB
+static inline int htb_queue_locked(struct htb_class *cl)
+{
+	if (cl->sch) {
+		if (!spin_is_locked(&cl->sch->dev->queue_lock) ||
+		    !in_softirq()) {
+			cl->sch = NULL;
+			return 0;
+		}
+	}
+	return 1;
+}
+#define DEBUG_QUEUE_LOCKED(cl) WARN_ON(!htb_queue_locked(cl))
+#else
+#define DEBUG_QUEUE_LOCKED(dev) do { } while (0)
+#endif
+
 /* compute hash of size HTB_HSIZE for given handle */
 static inline int htb_hash(u32 h)
 {
@@ -280,6 +301,7 @@ static void htb_add_to_id_tree(struct rb
 {
 	struct rb_node **p = &root->rb_node, *parent = NULL;
 
+	DEBUG_QUEUE_LOCKED(cl);
 	while (*p) {
 		struct htb_class *c;
 		parent = *p;
@@ -306,6 +328,7 @@ static void htb_add_to_wait_tree(struct
 {
 	struct rb_node **p = &q->wait_pq[cl->level].rb_node, *parent = NULL;
 
+	DEBUG_QUEUE_LOCKED(cl);
 	cl->pq_key = q->now + delay;
 	if (cl->pq_key == q->now)
 		cl->pq_key++;
@@ -378,6 +401,7 @@ static inline void htb_remove_class_from
 {
 	int m = 0;
 
+	DEBUG_QUEUE_LOCKED(cl);
 	while (mask) {
 		int prio = ffz(~mask);
 
@@ -438,6 +462,7 @@ static void htb_deactivate_prios(struct
 	struct htb_class *p = cl->parent;
 	long m, mask = cl->prio_activity;
 
+	DEBUG_QUEUE_LOCKED(cl);
 	while (cl->cmode == HTB_MAY_BORROW && p && mask) {
 		m = mask;
 		mask = 0;
@@ -668,6 +693,7 @@ static void htb_charge_class(struct htb_
 	if (toks <= -cl->mbuffer) toks = 1-cl->mbuffer; \
 	cl->T = toks
 
+	DEBUG_QUEUE_LOCKED(cl);
 	while (cl) {
 		diff = psched_tdiff_bounded(q->now, cl->t_c, cl->mbuffer);
 		if (cl->level >= level) {
@@ -724,6 +750,7 @@ static psched_time_t htb_do_events(struc
 		if (cl->pq_key > q->now)
 			return cl->pq_key;
 
+		DEBUG_QUEUE_LOCKED(cl);
 		htb_safe_rb_erase(p, q->wait_pq + level);
 		diff = psched_tdiff_bounded(q->now, cl->t_c, cl->mbuffer);
 		htb_change_class_mode(q, cl, &diff);
@@ -822,6 +849,7 @@ static struct sk_buff *htb_dequeue_tree(
 	start = cl = htb_lookup_leaf(q->row[level] + prio, prio,
 				     q->ptr[level] + prio,
 				     q->last_ptr_id[level] + prio);
+	DEBUG_QUEUE_LOCKED(cl);
 
 	do {
 next:
@@ -1197,6 +1225,7 @@ static void htb_destroy_class(struct Qdi
 {
 	struct htb_sched *q = qdisc_priv(sch);
 
+	DEBUG_QUEUE_LOCKED(cl);
 	if (!cl->level) {
 		BUG_TRAP(cl->un.leaf.q);
 		qdisc_destroy(cl->un.leaf.q);
@@ -1291,8 +1320,11 @@ static void htb_put(struct Qdisc *sch, u
 {
 	struct htb_class *cl = (struct htb_class *)arg;
 
-	if (--cl->refcnt == 0)
+	if (--cl->refcnt == 0) {
+		sch_tree_lock(sch);
 		htb_destroy_class(sch, cl);
+		sch_tree_unlock(sch);
+	}
 }
 
 static int htb_change_class(struct Qdisc *sch, u32 classid,
@@ -1367,6 +1399,9 @@ static int htb_change_class(struct Qdisc
 		for (prio = 0; prio < TC_HTB_NUMPRIO; prio++)
 			RB_CLEAR_NODE(&cl->node[prio]);
 
+#ifdef DEBUG_HTB
+		cl->sch = sch;
+#endif
 		/* create leaf qdisc early because it uses kmalloc(GFP_KERNEL)
 		   so that can't be used inside of sch_tree_lock
 		   -- thanks to Karlis Peisenieks */
* Re: Tc bug (kernel crash) more info
2007-09-03 7:31 ` Jarek Poplawski
@ 2007-09-03 8:05 ` Badalian Vyacheslav
2007-09-03 8:31 ` Badalian Vyacheslav
0 siblings, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-09-03 8:05 UTC (permalink / raw)
To: Jarek Poplawski, netdev
> Your errors trigger in rbtree, which does indexing for HTB, but since
> it's something quite rare I think there is a very small probability
> that it's caused by HTB class/level handling (but it's possible, too),
> but more probable (to me) these indexes are corrupted by something e.g.
> like accessing them without proper locking.
>
> Below I attach a patch for testing: it adds some lock debugging (plus
> one place: htb_put is locked). There is mainly checking of locks
> needed for writing to rbtree, but it doesn't check all readings yet,
> so there will be still something to check if this patch doesn't help
> to find anything.
>
> It should be applied to 2.6.23-rc4, but if you prefer 2.6.22.5 version
> let me know (BTW, I hope you let us know if you have to apply any
> other patches/changes to these kernels...).
>
OK... I've applied the patch and will see what it says... thanks...

Could you also look at bug 2 (wrong level calculation) and bug 3 (a class
that is not a leaf but has a qdisc) at
http://bugzilla.kernel.org/show_bug.cgi?id=8971
* Re: Tc bug (kernel crash) more info
2007-09-03 8:05 ` Badalian Vyacheslav
@ 2007-09-03 8:31 ` Badalian Vyacheslav
2007-09-03 9:12 ` Jarek Poplawski
0 siblings, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-09-03 8:31 UTC (permalink / raw)
To: Jarek Poplawski, netdev
Could you also look at what I need to change to fix this:

qdisc handle can >= 10 000

I have more than 10 000 qdiscs =(
* Re: Tc bug (kernel crash) more info
2007-09-03 8:31 ` Badalian Vyacheslav
@ 2007-09-03 9:12 ` Jarek Poplawski
0 siblings, 0 replies; 32+ messages in thread
From: Jarek Poplawski @ 2007-09-03 9:12 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev
On Mon, Sep 03, 2007 at 12:31:39PM +0400, Badalian Vyacheslav wrote:
> May you also see that i need change to fix this:
>
> qdisc handle can >= 10 000
>
> i have more then 10 000 qdiscs =(
>

As far as I know qdisc handle is hex, so you can have e.g.:
handle 999a (or a999 too). But, does it mean your kernel has any
changes around this? If so, please, describe them all (or send
your patch).

I've read your bugzilla log with: "2. HTB levels wrong calculate!",
but can't check this now: could you add there what are these levels
you can see after: tc -d class show dev eth0 ?

Jarek P.
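To illustrate the point about hex handles: the sketch below is a user-space illustration, not the tc parser. It assumes the major part of a qdisc handle is a 16-bit hexadecimal number, so "999a" and "a999" are valid while decimal-looking "10000" (0x10000) is out of range. The helper name parse_handle_major is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the major part of a qdisc handle
 * ("999a" or "999a:") as hexadecimal.
 * Returns the numeric value, or -1 if the text is not hex or
 * does not fit in the assumed 16 bits.
 */
static long parse_handle_major(const char *s)
{
	char *end;
	unsigned long v;

	assert(s != NULL);
	v = strtoul(s, &end, 16);

	/* No digits consumed, or trailing junk other than ':' */
	if (end == s || (*end != ':' && *end != '\0'))
		return -1;
	/* Assumed limit: the major number is 16 bits wide */
	if (v > 0xffff)
		return -1;
	return (long)v;
}
```

Under this assumption, parse_handle_major("999a") gives 39322 and parse_handle_major("a999:") gives 43417, so writing handles in hex leaves room for up to 0xffff (65535) distinct majors, well above 10 000.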
* Re: Tc bug (kernel crash) more info
2007-08-31 10:17 ` Jarek Poplawski
2007-08-31 10:48 ` Badalian Vyacheslav
@ 2007-08-31 10:50 ` Badalian Vyacheslav
2007-08-31 10:59 ` Badalian Vyacheslav
1 sibling, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 10:50 UTC (permalink / raw)
To: Jarek Poplawski, netdev
I get kernel panic on 2.6.23-rc4-git2
This is netconsole log!

[ 3931.002707] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 3931.002846] printing eip:
[ 3931.002906] c01c8973
[ 3931.002967] *pde = 00000000
[ 3931.003031] Oops: 0000 [#1]
[ 3931.003093] SMP
[ 3931.003160] Modules linked in: cls_u32 sch_sfq sch_htb netconsole xt_tcpudp iptable_filter ip_tables x_tables i2c_i801 i2c_core
[ 3931.003327] CPU: 2
[ 3931.003327] EIP: 0060:[<c01c8973>] Not tainted VLI
[ 3931.003328] EFLAGS: 00010246 (2.6.23-rc4-testing #1)
[ 3931.003526] EIP is at rb_insert_color+0x13/0xad
[ 3931.003594] eax: 00000000 ebx: e9570324 ecx: e9570324 edx: f6deac48
[ 3931.003663] esi: 00000000 edi: ef5c4124 ebp: f6dea8a0 esp: e25f5d6c
[ 3931.003731] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[ 3931.003796] Process sh (pid: 6146, ti=e25f4000 task=c268b290 task.ti=e25f4000)
[ 3931.003866] Stack: f6deac48 00000569 00000000 ef5c4000 f6dea8a0 f8862a9d f881c5db e1fda780
[ 3931.004016] 00000003 f6db6dc0 f6deac48 f6dea800 00000000 00000000 dfc3e9b2 00000000
[ 3931.004161] e25f5dd8 00000000 c02a774b 00000002 e25f5e70 f6dea930 f6dea930 00000000
[ 3931.004307] Call Trace:
[ 3931.004434] [<f8862a9d>] htb_dequeue+0x195/0x6d2 [sch_htb]
[ 3931.004510] [<f881c5db>] ipt_do_table+0x41f/0x47c [ip_tables]
[ 3931.004584] [<c02a774b>] tc_classify+0x17/0x7c
[ 3931.004658] [<f8861925>] htb_activate_prios+0x9b/0xa5 [sch_htb]
[ 3931.004730] [<c02a71af>] __qdisc_run+0x2a/0x16b
[ 3931.004798] [<c029cfc1>] dev_queue_xmit+0x18b/0x2a6
[ 3931.004874] [<c02b94e3>] ip_output+0x281/0x2ba
[ 3931.004947] [<c02b571c>] ip_forward_finish+0x0/0x2e
[ 3931.005012] [<c02b59b5>] ip_forward+0x26b/0x2c6
[ 3931.005081] [<c02b571c>] ip_forward_finish+0x0/0x2e
[ 3931.005150] [<c02b4729>] ip_rcv+0x484/0x4bd
[ 3931.005216] [<c013dcc5>] file_read_actor+0x0/0xdb
[ 3931.005293] [<c029ab9c>] netif_receive_skb+0x2cd/0x340
[ 3931.005362] [<c0234ef1>] e1000_clean_rx_irq+0x379/0x448
[ 3931.005437] [<c0234b78>] e1000_clean_rx_irq+0x0/0x448
[ 3931.005506] [<c0233f8f>] e1000_clean+0x7a/0x249
[ 3931.005574] [<c029ccad>] net_rx_action+0x91/0x17f
[ 3931.005642] [<c01225e2>] __do_softirq+0x5d/0xc1
[ 3931.005714] [<c0122678>] do_softirq+0x32/0x36
[ 3931.005779] [<c010488a>] do_IRQ+0x7e/0x90
[ 3931.005849] [<c01032eb>] common_interrupt+0x23/0x28
[ 3931.005923] =======================
[ 3931.005986] Code: 56 04 eb 07 89 56 08 eb 02 89 17 8b 03 83 e0 03 09 d0 89 03 5b 5e 5f c3 55 57 89 c7 56 53 83 ec 04 89 14 24 eb 7e 89 c6 83 e6 fc <8b> 56 08 39 d3 75 34 8b 56 04 85 d2 74 06 8b 02 a8 01 74 31 8b
[ 3931.006386] EIP: [<c01c8973>] rb_insert_color+0x13/0xad SS:ESP 0068:e25f5d6c
[ 3931.006757] Kernel panic - not syncing: Fatal exception in interrupt
[ 3931.006863] Rebooting in 3 seconds..

> BTW, -git versions are usually more risky than -rc. And, maybe, let
> this 2.6.18 better stay away from this testing...
>
> Thanks,
> Jarek P.
* Re: Tc bug (kernel crash) more info
2007-08-31 10:50 ` Badalian Vyacheslav
@ 2007-08-31 10:59 ` Badalian Vyacheslav
2007-08-31 11:28 ` Jarek Poplawski
0 siblings, 1 reply; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 10:59 UTC (permalink / raw)
To: Jarek Poplawski, netdev
Maybe this bug is the same as "[PATCH] [NET_SCHED] sch_prio.c: remove
duplicate call of tc_classify()"?

> I get kernel panic on 2.6.23-rc4-git2
> This is netconsole log!
...
* Re: Tc bug (kernel crash) more info
2007-08-31 10:59 ` Badalian Vyacheslav
@ 2007-08-31 11:28 ` Jarek Poplawski
2007-08-31 12:14 ` Badalian Vyacheslav
0 siblings, 1 reply; 32+ messages in thread
From: Jarek Poplawski @ 2007-08-31 11:28 UTC (permalink / raw)
To: Badalian Vyacheslav; +Cc: netdev
On Fri, Aug 31, 2007 at 02:59:55PM +0400, Badalian Vyacheslav wrote:
> May be this bug eq "[PATCH] [NET_SCHED] sch_prio.c: remove duplicate
> call of tc_classify()"?
>
> >I get kernel panic on 2.6.23-rc4-git2
> >This is netconsole log!
> ...
>

So, it looks like you have found a really new (unknown) HTB bug,
congratulations! We've only to find where is it hidden now...

I don't think tc_classify could do such a harm to HTB, and can't
see similar double calling as in sch_prio.

Jarek P.
* Re: Tc bug (kernel crash) more info
2007-08-31 11:28 ` Jarek Poplawski
@ 2007-08-31 12:14 ` Badalian Vyacheslav
0 siblings, 0 replies; 32+ messages in thread
From: Badalian Vyacheslav @ 2007-08-31 12:14 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev
Jarek Poplawski wrote:
> On Fri, Aug 31, 2007 at 02:59:55PM +0400, Badalian Vyacheslav wrote:
>
>> May be this bug eq "[PATCH] [NET_SCHED] sch_prio.c: remove duplicate
>> call of tc_classify()"?
>>
>>> I get kernel panic on 2.6.23-rc4-git2
>>> This is netconsole log!
>>>
>> ...
>>
>
> So, it looks like you have found a really new (unknown) HTB bug,
> congratulations! We've only to find where is it hidden now...
>
> I don't think tc_classify could do such a harm to HTB, and can't
> see similar double calling as in sch_prio.
>
> Jarek P.
>
Great! +) I'm ready to apply and test any patches to find and fix the
problem =)
I also remember that the bug shows up only when I delete HTB classes while
the machine has a lot of traffic.

Statistics for the last panic:
The script tried to delete 25278 classes,
created 8360 classes,
and added 6732 filters.
Panic after 27 restarts of the script.
end of thread, other threads:[~2007-09-03 9:10 UTC | newest]
Thread overview: 32+ messages
2007-08-29 9:34 Tc bug (kernel crash) more info Badalian Vyacheslav
2007-08-29 11:34 ` Jarek Poplawski
2007-08-29 12:14 ` Jarek Poplawski
2007-08-29 12:53 ` Badalian Vyacheslav
2007-08-29 13:30 ` Jarek Poplawski
2007-08-29 20:16 ` slavon
2007-08-30 6:31 ` Jarek Poplawski
2007-08-30 7:27 ` Jarek Poplawski
2007-08-30 9:09 ` Badalian Vyacheslav
2007-08-30 12:37 ` Jarek Poplawski
2007-08-30 13:43 ` Badalian Vyacheslav
2007-08-31 7:04 ` Badalian Vyacheslav
2007-08-31 7:59 ` Jarek Poplawski
2007-08-31 8:25 ` Badalian Vyacheslav
2007-08-31 8:49 ` Jarek Poplawski
2007-08-31 9:05 ` Jarek Poplawski
2007-08-31 9:16 ` Jarek Poplawski
2007-08-31 9:33 ` Badalian Vyacheslav
2007-08-31 10:17 ` Jarek Poplawski
2007-08-31 10:48 ` Badalian Vyacheslav
2007-08-31 12:59 ` Jarek Poplawski
2007-08-31 14:31 ` Badalian Vyacheslav
2007-08-31 14:51 ` Badalian Vyacheslav
[not found] ` <20070831215850.zf2xi256o00owk4s@mail.himki.net>
2007-09-01 10:36 ` slavon
2007-09-03 7:31 ` Jarek Poplawski
2007-09-03 8:05 ` Badalian Vyacheslav
2007-09-03 8:31 ` Badalian Vyacheslav
2007-09-03 9:12 ` Jarek Poplawski
2007-08-31 10:50 ` Badalian Vyacheslav
2007-08-31 10:59 ` Badalian Vyacheslav
2007-08-31 11:28 ` Jarek Poplawski
2007-08-31 12:14 ` Badalian Vyacheslav