* do_IRQ: stack overflow: 872..
@ 2004-12-18 6:27 Crazy AMD K7
0 siblings, 0 replies; 7+ messages in thread
From: Crazy AMD K7 @ 2004-12-18 6:27 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 15957 bytes --]
Hi!
I have found a few days ago strange messages in /var/log/messages
More than 10 times there was do_IRQ: stack overflow: (nimber).... followed
with code. If need I can send all this data. I have run
ksymoops with only first 3 cases. Here is the first, the second and
the third are in attachment.
After that oopses my system continued to work.
uname uname -a
Linux linux 2.4.28 #2 Втр Ноя 30 15:43:35 MSK 2004 i686 unknown
gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)
I have applies ebtables_brnf patch (http://bridge.sf.net) and a
small add to it. The problem occur a few days later I have installed a
new hdd.
My system works as a bridge.
Processor AMDK7-900
RAM 512Mb
HDD IDE 200+200Gb
NIC Compex RE100TX(rtl8139 chipset)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
Linux linux 2.4.28 #2 Втр Ноя 30 15:43:35 MSK 2004 i686 unknown
Gnu C 2.96
Gnu make 3.79.1
util-linux 2.11n
mount 2.11n
modutils 2.4.18
e2fsprogs 1.27
reiserfsprogs 3.x.0j
quota-tools 3.01.
PPP 2.4.1
Linux C Library 2.2.5
Dynamic linker (ldd) 2.2.5
Procps 2.0.7
Net-tools 1.60
Console-tools 0.3.3
Sh-utils 2.0.11
Modules Loaded
ksymoops 2.4.4 on i686 2.4.28. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.28/ (default)
-m /boot/System.map-2.4.28-2 (specified)
No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Dec 14 21:23:49 localhost kernel: c0262414 00000368 00000002 c0325640 00000028 c03256f0 c010a508 00000002
Dec 14 21:23:49 localhost kernel: 00000028 000001f3 c0325640 00000028 c03256f0 00000079 00000018 00000018
Dec 14 21:23:49 localhost kernel: ffffff0a c01c17ba 00000010 00000286 c01d0025 00000079 000001f3 00000028
Dec 14 21:23:49 localhost kernel: Call Trace: [<c010a508>] [<c01c17ba>] [<c01d0025>] [<c01c9c0d>] [<c01c9ee2>]
Dec 14 21:23:49 localhost kernel: [<c01ca35f>] [<c01ce520>] [<c0108079>] [<c0108208>] [<c010a508>] [<c02576bb>]
Dec 14 21:23:49 localhost kernel: [<c0208dc8>] [<c020777b>] [<c020790d>] [<c01be170>] [<c0108208>] [<c010822c>]
Dec 14 21:23:49 localhost kernel: [<c020afd5>] [<c02530e0>] [<c02146fc>] [<c020b1de>] [<c02530e0>] [<c024642c>]
Dec 14 21:23:49 localhost kernel: [<c0253199>] [<c02130b3>] [<c02530e0>] [<c02530e0>] [<c021342e>] [<c0213460>]
Dec 14 21:23:49 localhost kernel: [<c0253199>] [<c02130b3>] [<c0256e10>] [<c02530e0>] [<c0244098>] [<c02530e0>]
Dec 14 21:23:49 localhost kernel: [<c02130b3>] [<c02530e0>] [<c02530e0>] [<c021342e>] [<c02530e0>] [<c02130b3>]
Dec 14 21:23:49 localhost kernel: [<c02531b0>] [<c02531e9>] [<c02530e0>] [<c0213460>] [<c01be188>] [<c02531b0>]
Dec 14 21:23:49 localhost kernel: [<c0256cbf>] [<c02531b0>] [<c02146c4>] [<c02531b0>] [<c02130b3>] [<c02531b0>]
Dec 14 21:23:49 localhost kernel: [<c02531b0>] [<c021342e>] [<c02531b0>] [<c020777b>] [<c020790d>] [<c025322a>]
Dec 14 21:23:49 localhost kernel: [<c02531b0>] [<c02532ab>] [<c025276a>] [<c020b02e>] [<c025279d>] [<c020b255>]
Dec 14 21:23:49 localhost kernel: [<c0244098>] [<c021e9e7>] [<c02530e0>] [<c02530e0>] [<c021342e>] [<c021e940>]
Dec 14 21:23:49 localhost kernel: [<c0256f61>] [<c021e940>] [<c02130b3>] [<c021e940>] [<c021e940>] [<c021342e>]
Dec 14 21:23:49 localhost kernel: [<c021e940>] [<c0213460>] [<c021d649>] [<c021e940>] [<c02531b0>] [<c02130b3>]
Dec 14 21:23:49 localhost kernel: [<c02531b0>] [<c021eb60>] [<c0246498>] [<c021ea40>] [<c0256f61>] [<c021ea40>]
Dec 14 21:23:49 localhost kernel: [<c02130b3>] [<c021ea40>] [<c021ea40>] [<c021342e>] [<c021ea40>] [<c02146c4>]
Dec 14 21:23:49 localhost kernel: [<c021dac1>] [<c021ea40>] [<c01be188>] [<c020afd5>] [<c02146c4>] [<c024642c>]
Dec 14 21:23:49 localhost kernel: [<c0253199>] [<c020b1de>] [<c023296d>] [<c022db29>] [<c02292c1>] [<c0228b18>]
Dec 14 21:23:49 localhost kernel: [<c022e5e1>] [<c0256e10>] [<c022accb>] [<c022b929>] [<c022bd67>] [<c02530e0>]
Dec 14 21:23:49 localhost kernel: [<c02130b3>] [<c02530e0>] [<c02530e0>] [<c021342e>] [<c02530e0>] [<c02130b3>]
Dec 14 21:23:49 localhost kernel: [<c02531b0>] [<c02531e9>] [<c02530e0>] [<c0213460>] [<c02088f5>] [<c0233719>]
Dec 14 21:23:49 localhost kernel: [<c023366e>] [<c0233bc4>] [<c0244098>] [<c021acb0>] [<c02463ac>] [<c021ad67>]
Dec 14 21:23:49 localhost kernel: [<c021acb0>] [<c021acb0>] [<c021342e>] [<c021acb0>] [<c0213460>] [<c02560c0>]
Dec 14 21:23:49 localhost kernel: [<c024642c>] [<c021ade0>] [<c021a93c>] [<c021acb0>] [<c02560c0>] [<c02188fd>]
Dec 14 21:23:49 localhost kernel: [<c021ade0>] [<c021ade0>] [<c021af4b>] [<c0253b70>] [<c02568c0>] [<c021ade0>]
Dec 14 21:23:49 localhost kernel: [<c0256e3b>] [<c02130b3>] [<c021ade0>] [<c021ade0>] [<c021342e>] [<c021ade0>]
Dec 14 21:23:49 localhost kernel: [<c021d649>] [<c021e940>] [<c021ac7d>] [<c021ade0>] [<c0207212>] [<c0250f2c>]
Dec 14 21:23:49 localhost kernel: [<c0253b70>] [<c02075a1>] [<c020b7df>] [<c020b87d>] [<c020b99c>] [<c0108079>]
Dec 14 21:23:49 localhost kernel: [<c01176cb>] [<c010823c>] [<c010a508>] [<c015f13b>] [<c0158728>] [<c021ade0>]
Dec 14 21:23:49 localhost kernel: [<c021ade0>] [<c021af4b>] [<c016320b>] [<c015ec35>] [<c0132e86>] [<c015ec97>]
Dec 14 21:23:49 localhost kernel: [<c0158ab0>] [<c021ac7d>] [<c016320b>] [<c0158a53>] [<c0158b57>] [<c0158c2c>]
Dec 14 21:23:49 localhost kernel: [<c015ef55>] [<c014312e>] [<c015384a>] [<c016320b>] [<c015ec35>] [<c015f29d>]
Dec 14 21:23:49 localhost kernel: [<c0158728>] [<c0158791>] [<c01ce520>] [<c01d01d7>] [<c016320b>] [<c015ec35>]
Dec 14 21:23:49 localhost kernel: [<c0132e86>] [<c012b094>] [<c015ec97>] [<c016320b>] [<c015ec35>] [<c0155b79>]
Dec 14 21:23:49 localhost kernel: [<c0155e95>] [<c01297f5>] [<c0132bf8>] [<c0132c09>] [<c0132e86>] [<c0155d03>]
Dec 14 21:23:49 localhost kernel: [<c015656d>] [<c0132e86>] [<c01330b1>] [<c01566c9>] [<c0133616>] [<c0164f07>]
Dec 14 21:23:49 localhost kernel: [<c0133ecd>] [<c0156670>] [<c0156b1c>] [<c0156670>] [<c01297f5>] [<c012339d>]
Dec 14 21:23:49 localhost kernel: [<c0125ec5>] [<c0126320>] [<c015467f>] [<c0131175>] [<c013eea8>] [<c0116feb>]
Dec 14 21:23:49 localhost kernel: [<c0106cb3>]
Warning (Oops_read): Code line not seen, dumping what data is available
Trace; c010a508 <call_do_IRQ+5/d>
Trace; c01c17ba <ide_outb+a/10>
Trace; c01d0025 <__ide_do_rw_disk+155/520>
Trace; c01c9c0d <ide_start_request+18d/1c0>
Trace; c01c9ee2 <ide_do_request+272/2b0>
Trace; c01ca35f <ide_intr+df/100>
Trace; c01ce520 <ide_dma_intr+0/a0>
Trace; c0108079 <handle_IRQ_event+39/60>
Trace; c0108208 <do_IRQ+88/d0>
Trace; c010a508 <call_do_IRQ+5/d>
Trace; c02576bb <_mmx_memcpy+2b/150>
Trace; c0208dc8 <skb_copy_and_csum_dev+78/d0>
Trace; c020777b <kfree_skbmem+b/60>
Trace; c020790d <__kfree_skb+13d/150>
Trace; c01be170 <rtl8139_start_xmit+50/100>
Trace; c0108208 <do_IRQ+88/d0>
Trace; c010822c <do_IRQ+ac/d0>
Trace; c020afd5 <dev_queue_xmit_nit+15/a0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02146fc <qdisc_restart+4c/d0>
Trace; c020b1de <dev_queue_xmit+fe/260>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c024642c <ipt_route_hook+1c/20>
Trace; c0253199 <br_dev_queue_push_xmit+b9/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c0213460 <nf_hook_slow+e0/140>
Trace; c0253199 <br_dev_queue_push_xmit+b9/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c0256e10 <br_nf_post_routing+140/150>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c0244098 <ipt_do_table+308/450>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02531e9 <br_forward_finish+39/40>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c0213460 <nf_hook_slow+e0/140>
Trace; c01be188 <rtl8139_start_xmit+68/100>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c0256cbf <br_nf_local_out+19f/1b0>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02146c4 <qdisc_restart+14/d0>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c020777b <kfree_skbmem+b/60>
Trace; c020790d <__kfree_skb+13d/150>
Trace; c025322a <__br_deliver+3a/40>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02532ab <br_deliver+2b/50>
Trace; c025276a <__br_dev_xmit+6a/90>
Trace; c020b02e <dev_queue_xmit_nit+6e/a0>
Trace; c025279d <br_dev_xmit+d/10>
Trace; c020b255 <dev_queue_xmit+175/260>
Trace; c0244098 <ipt_do_table+308/450>
Trace; c021e9e7 <ip_finish_output2+a7/100>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c0256f61 <ip_sabotage_out+111/130>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c0213460 <nf_hook_slow+e0/140>
Trace; c021d649 <ip_output+139/150>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c021eb60 <ip_queue_xmit2+120/1f7>
Trace; c0246498 <ipt_local_hook+68/b0>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c0256f61 <ip_sabotage_out+111/130>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c02146c4 <qdisc_restart+14/d0>
Trace; c021dac1 <ip_queue_xmit+461/4b0>
Trace; c021ea40 <ip_queue_xmit2+0/1f7>
Trace; c01be188 <rtl8139_start_xmit+68/100>
Trace; c020afd5 <dev_queue_xmit_nit+15/a0>
Trace; c02146c4 <qdisc_restart+14/d0>
Trace; c024642c <ipt_route_hook+1c/20>
Trace; c0253199 <br_dev_queue_push_xmit+b9/d0>
Trace; c020b1de <dev_queue_xmit+fe/260>
Trace; c023296d <tcp_v4_send_check+6d/b0>
Trace; c022db29 <tcp_transmit_skb+549/670>
Trace; c02292c1 <tcp_clean_rtx_queue+221/320>
Trace; c0228b18 <tcp_fastretrans_alert+468/620>
Trace; c022e5e1 <tcp_write_xmit+151/290>
Trace; c0256e10 <br_nf_post_routing+140/150>
Trace; c022accb <tcp_data_queue+3bb/970>
Trace; c022b929 <__tcp_data_snd_check+49/d0>
Trace; c022bd67 <tcp_rcv_established+137/8d0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c02531b0 <br_forward_finish+0/40>
Trace; c02531e9 <br_forward_finish+39/40>
Trace; c02530e0 <br_dev_queue_push_xmit+0/d0>
Trace; c0213460 <nf_hook_slow+e0/140>
Trace; c02088f5 <skb_checksum+45/240>
Trace; c0233719 <tcp_v4_do_rcv+29/100>
Trace; c023366e <tcp_v4_checksum_init+7e/100>
Trace; c0233bc4 <tcp_v4_rcv+3d4/620>
Trace; c0244098 <ipt_do_table+308/450>
Trace; c021acb0 <ip_local_deliver_finish+0/130>
Trace; c02463ac <ipt_hook+1c/20>
Trace; c021ad67 <ip_local_deliver_finish+b7/130>
Trace; c021acb0 <ip_local_deliver_finish+0/130>
Trace; c021acb0 <ip_local_deliver_finish+0/130>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c021acb0 <ip_local_deliver_finish+0/130>
Trace; c0213460 <nf_hook_slow+e0/140>
Trace; c02560c0 <br_nf_pre_routing_finish+0/1f0>
Trace; c024642c <ipt_route_hook+1c/20>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021a93c <ip_local_deliver+17c/190>
Trace; c021acb0 <ip_local_deliver_finish+0/130>
Trace; c02560c0 <br_nf_pre_routing_finish+0/1f0>
Trace; c02188fd <ip_route_input+3d/130>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021af4b <ip_rcv_finish+16b/1a0>
Trace; c0253b70 <br_handle_frame_finish+0/110>
Trace; c02568c0 <br_nf_pre_routing+330/350>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c0256e3b <ip_sabotage_in+1b/30>
Trace; c02130b3 <nf_iterate+33/90>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021342e <nf_hook_slow+ae/140>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021d649 <ip_output+139/150>
Trace; c021e940 <ip_finish_output2+0/100>
Trace; c021ac7d <ip_rcv+32d/360>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c0207212 <sock_def_readable+22/50>
Trace; c0250f2c <packet_rcv+ec/260>
Trace; c0253b70 <br_handle_frame_finish+0/110>
Trace; c02075a1 <alloc_skb+d1/190>
Trace; c020b7df <netif_receive_skb+17f/1b0>
Trace; c020b87d <process_backlog+6d/120>
Trace; c020b99c <net_rx_action+6c/100>
Trace; c0108079 <handle_IRQ_event+39/60>
Trace; c01176cb <do_softirq+4b/90>
Trace; c010823c <do_IRQ+bc/d0>
Trace; c010a508 <call_do_IRQ+5/d>
Trace; c015f13b <journal_dirty_metadata+1b/1a0>
Trace; c0158728 <ext3_do_update_inode+328/3c0>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021ade0 <ip_rcv_finish+0/1a0>
Trace; c021af4b <ip_rcv_finish+16b/1a0>
Trace; c016320b <journal_cancel_revoke+fb/170>
Trace; c015ec35 <do_get_write_access+4b5/4e0>
Trace; c0132e86 <bread+16/80>
Trace; c015ec97 <journal_get_write_access+37/50>
Trace; c0158ab0 <ext3_reserve_inode_write+30/b0>
Trace; c021ac7d <ip_rcv+32d/360>
Trace; c016320b <journal_cancel_revoke+fb/170>
Trace; c0158a53 <ext3_mark_iloc_dirty+23/50>
Trace; c0158b57 <ext3_mark_inode_dirty+27/40>
Trace; c0158c2c <ext3_dirty_inode+bc/100>
Trace; c015ef55 <journal_get_undo_access+f5/120>
Trace; c014312e <__mark_inode_dirty+2e/90>
Trace; c015384a <ext3_new_block+aa/830>
Trace; c016320b <journal_cancel_revoke+fb/170>
Trace; c015ec35 <do_get_write_access+4b5/4e0>
Trace; c015f29d <journal_dirty_metadata+17d/1a0>
Trace; c0158728 <ext3_do_update_inode+328/3c0>
Trace; c0158791 <ext3_do_update_inode+391/3c0>
Trace; c01ce520 <ide_dma_intr+0/a0>
Trace; c01d01d7 <__ide_do_rw_disk+307/520>
Trace; c016320b <journal_cancel_revoke+fb/170>
Trace; c015ec35 <do_get_write_access+4b5/4e0>
Trace; c0132e86 <bread+16/80>
Trace; c012b094 <__alloc_pages+74/2c0>
Trace; c015ec97 <journal_get_write_access+37/50>
Trace; c016320b <journal_cancel_revoke+fb/170>
Trace; c015ec35 <do_get_write_access+4b5/4e0>
Trace; c0155b79 <ext3_alloc_block+19/20>
Trace; c0155e95 <ext3_alloc_branch+55/2d0>
Trace; c01297f5 <lru_cache_add+65/70>
Trace; c0132bf8 <getblk+28/60>
Trace; c0132c09 <getblk+39/60>
Trace; c0132e86 <bread+16/80>
Trace; c0155d03 <ext3_get_branch+53/d0>
Trace; c015656d <ext3_get_block_handle+1bd/2c0>
Trace; c0132e86 <bread+16/80>
Trace; c01330b1 <create_buffers+61/f0>
Trace; c01566c9 <ext3_get_block+59/60>
Trace; c0133616 <__block_prepare_write+e6/300>
Trace; c0164f07 <__jbd_kmalloc+27/a0>
Trace; c0133ecd <block_prepare_write+1d/40>
Trace; c0156670 <ext3_get_block+0/60>
Trace; c0156b1c <ext3_prepare_write+7c/120>
Trace; c0156670 <ext3_get_block+0/60>
Trace; c01297f5 <lru_cache_add+65/70>
Trace; c012339d <add_to_page_cache_unique+9d/b0>
Trace; c0125ec5 <do_generic_file_write+255/3e0>
Trace; c0126320 <generic_file_write+f0/110>
Trace; c015467f <ext3_file_write+1f/b0>
Trace; c0131175 <sys_write+95/f0>
Trace; c013eea8 <sys_select+468/480>
Trace; c0116feb <sys_gettimeofday+1b/a0>
Trace; c0106cb3 <system_call+33/38>
2 warnings issued. Results may not be reliable.
Thank you,
Pasha
[-- Attachment #2: sto2.out.txt.gz --]
[-- Type: application/x-gzip, Size: 2983 bytes --]
[-- Attachment #3: sto3.out.txt.gz --]
[-- Type: application/x-gzip, Size: 3017 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
[not found] <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel>
@ 2004-12-18 7:50 ` Andi Kleen
2004-12-18 11:12 ` Bart De Schuymer
2005-01-07 17:05 ` David Woodhouse
0 siblings, 2 replies; 7+ messages in thread
From: Andi Kleen @ 2004-12-18 7:50 UTC (permalink / raw)
To: Crazy AMD K7; +Cc: linux-kernel, netdev
Crazy AMD K7 <snort2004@mail.ru> writes:
> Hi!
> I have found a few days ago strange messages in /var/log/messages
> More than 10 times there was do_IRQ: stack overflow: (nimber).... followed
> with code. If need I can send all this data. I have run
> ksymoops with only first 3 cases. Here is the first, the second and
> the third are in attachment.
> After that oopses my system continued to work.
It's not really an oops, just a warning that stack space got quiet tight.
The problem seems to be that the br netfilter code is nesting far too
deeply and recursing several times. Looks like a design bug to me,
it shouldn't do that.
> uname uname -a
> Linux linux 2.4.28 #2 ÷ÔÒ îÏÑ 30 15:43:35 MSK 2004 i686 unknown
> gcc -v
> Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)
> I have applies ebtables_brnf patch (http://bridge.sf.net) and a
Don't do that then or contact the author to fix it. Unfortunately
the code is also in 2.6 mainline.
-Andi
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
2004-12-18 7:50 ` Andi Kleen
@ 2004-12-18 11:12 ` Bart De Schuymer
2004-12-18 11:14 ` Andi Kleen
2005-01-07 17:05 ` David Woodhouse
1 sibling, 1 reply; 7+ messages in thread
From: Bart De Schuymer @ 2004-12-18 11:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Crazy AMD K7, linux-kernel, netdev
Op za, 18-12-2004 te 08:50 +0100, schreef Andi Kleen:
> Crazy AMD K7 <snort2004@mail.ru> writes:
>
> > Hi!
> > I have found a few days ago strange messages in /var/log/messages
> > More than 10 times there was do_IRQ: stack overflow: (nimber).... followed
> > with code. If need I can send all this data. I have run
> > ksymoops with only first 3 cases. Here is the first, the second and
> > the third are in attachment.
> > After that oopses my system continued to work.
>
> It's not really an oops, just a warning that stack space got quiet tight.
>
> The problem seems to be that the br netfilter code is nesting far too
> deeply and recursing several times. Looks like a design bug to me,
> it shouldn't do that.
>
> > uname uname -a
> > Linux linux 2.4.28 #2 ÷ÔÒ îÏÑ 30 15:43:35 MSK 2004 i686 unknown
> > gcc -v
> > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
> > gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113)
> > I have applies ebtables_brnf patch (http://bridge.sf.net) and a
>
> Don't do that then or contact the author to fix it. Unfortunately
> the code is also in 2.6 mainline.
>
Hi.
The bridge-nf code does not use recursive function calls and there is no
long consecutive function calling. Furthermore, there is no function in
the bridge-nf code that uses a large part of the stack.
Andi, if you make such statements then please point out the code part
you have (of course) read after which you decided to make the statement.
The bridge-nf code is used by quite a few people and by commercial
companies and I have never had a report like this. AMD has been having
all sorts of strange problems for weeks now, they're all somehow related
to bridge-nf, but I doubt he is using the bridge-nf patch on a clean 2.4
kernel.
AMD, is there any chance you can use the latest 2.6 kernel, without
extra patches ?
Bart
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
2004-12-18 11:12 ` Bart De Schuymer
@ 2004-12-18 11:14 ` Andi Kleen
2004-12-18 11:51 ` Bart De Schuymer
0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2004-12-18 11:14 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: Andi Kleen, Crazy AMD K7, linux-kernel, netdev
> The bridge-nf code does not use recursive function calls and there is no
> long consecutive function calling. Furthermore, there is no function in
> the bridge-nf code that uses a large part of the stack.
Just take a look at the backtrace in the original post. It clearly
shows a problem. And it points strongly towards br-netfilter.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
2004-12-18 11:14 ` Andi Kleen
@ 2004-12-18 11:51 ` Bart De Schuymer
2004-12-18 13:53 ` Andi Kleen
0 siblings, 1 reply; 7+ messages in thread
From: Bart De Schuymer @ 2004-12-18 11:51 UTC (permalink / raw)
To: Andi Kleen; +Cc: Crazy AMD K7, linux-kernel, netdev
Op za, 18-12-2004 te 12:14 +0100, schreef Andi Kleen:
> > The bridge-nf code does not use recursive function calls and there is no
> > long consecutive function calling. Furthermore, there is no function in
> > the bridge-nf code that uses a large part of the stack.
>
> Just take a look at the backtrace in the original post. It clearly
> shows a problem. And it points strongly towards br-netfilter.
I don't doubt you are a much better reader of such backtraces than me.
However, let's count the number of times a function from
net/bridge/br_netfilter.c is in the backtrace:
1. br_nf*: 6 times
2. *sabotage*: 3 times
Seriously, out of 222 lines, only 9 from bridge-nf.
The function ip_queue_xmit, OTOH, is 8 times in the trace.
Anyway, as I already suspected weeks ago, AMD must be seeing some
incompatibility between ip_queue (he's using snort) and the bridge-nf
patch.
He is using the patch (I gave it to him) below on top of the bridge-nf
patch. Before using that patch he got a kernel panic occasionally.
However he seems not to get a message in his syslog.
Bart
--- linux-2.4.28-ebt-brnf/net/bridge/br_netfilter.c.old 2004-11-27 23:43:18.000000000 +0100
+++ linux-2.4.28-ebt-brnf/net/bridge/br_netfilter.c 2004-11-27
23:52:05.000000000 +0100
@@ -870,6 +870,10 @@ static unsigned int ip_sabotage_out(unsi
{
struct sk_buff *skb = *pskb;
+if (!skb) {
+ printk("TROUBLE IN IP_SABOTAGE_OUT: skb==NULL\n");
+ goto in_trouble;
+}
#ifdef CONFIG_SYSCTL
if (!skb->nf_bridge) {
struct vlan_ethhdr *hdr =
@@ -884,6 +888,10 @@ static unsigned int ip_sabotage_out(unsi
}
#endif
+if (!out) {
+ printk("TROUBLE IN IP_SABOTAGE_OUT: out == NULL\n");
+ goto in_trouble;
+}
if ((out->hard_start_xmit == br_dev_xmit &&
okfn != br_nf_forward_finish &&
okfn != br_nf_local_out_finish &&
@@ -920,6 +928,9 @@ static unsigned int ip_sabotage_out(unsi
}
return NF_ACCEPT;
+in_trouble:
+ dump_stack();
+ return NF_DROP;
}
/* For br_nf_local_out we need (prio = NF_BR_PRI_FIRST), to insure that
innocent
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
2004-12-18 11:51 ` Bart De Schuymer
@ 2004-12-18 13:53 ` Andi Kleen
0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2004-12-18 13:53 UTC (permalink / raw)
To: Bart De Schuymer; +Cc: Andi Kleen, Crazy AMD K7, linux-kernel, netdev
On Sat, Dec 18, 2004 at 12:51:30PM +0100, Bart De Schuymer wrote:
> >
> > Just take a look at the backtrace in the original post. It clearly
> > shows a problem. And it points strongly towards br-netfilter.
>
> I don't doubt you are a much better reader of such backtraces than me.
> However, let's count the number of times a function from
> net/bridge/br_netfilter.c is in the backtrace:
> 1. br_nf*: 6 times
> 2. *sabotage*: 3 times
> Seriously, out of 222 lines, only 9 from bridge-nf.
> The function ip_queue_xmit, OTOH, is 8 times in the trace.
Yep, but ip_queue_xmit doesn't call itself recursively. Someone
must be doing it. And that's likely the bridge code.
BTW not all of these entries are probably true, there can
be a lot of false positives.
> Anyway, as I already suspected weeks ago, AMD must be seeing some
> incompatibility between ip_queue (he's using snort) and the bridge-nf
> patch.
>
> He is using the patch (I gave it to him) below on top of the bridge-nf
> patch. Before using that patch he got a kernel panic occasionally.
> However he seems not to get a message in his syslog.
Ok, since this report seems to be for a totally non standard
severly hacked up kernel I suppose nothing from it can be concluded for
the mainline kernel. Thanks for clearing this up.
Note to the original poster: when you report a bug with a patched
kernel always mention it.
-Andi
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: do_IRQ: stack overflow: 872..
2004-12-18 7:50 ` Andi Kleen
2004-12-18 11:12 ` Bart De Schuymer
@ 2005-01-07 17:05 ` David Woodhouse
1 sibling, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2005-01-07 17:05 UTC (permalink / raw)
To: Andi Kleen; +Cc: Crazy AMD K7, linux-kernel, netdev, Stephen Hemminger
On Sat, 2004-12-18 at 08:50 +0100, Andi Kleen wrote:
> It's not really an oops, just a warning that stack space got quiet
> tight.
>
> The problem seems to be that the br netfilter code is nesting far too
> deeply and recursing several times. Looks like a design bug to me,
> it shouldn't do that.
I don't think it's recursing -- I think the stack trace is just a bit
noisy. The problem is that the bridge code, especially with br_netfilter
in the equation, is implicated in code paths which are just _too_ deep.
This happens when you're bridging packets received in an interrupt while
you were deep in journalling code, and it's also been seen with a call
trace something like nfs->sunrpc->ip->bridge->br_netfilter.
One option might be to make br_dev_xmit() just queue the packet rather
than trying to deliver it to all the slave devices immediately. Then the
actual retransmission can be handled from a context where we're _not_
short of stack; perhaps from a dedicated kernel thread.
Unfortunately that approach would introduce a lot of latency on all
packets we pass. Another option would be to have all architectures
provide a stack_available() function and for br_dev_xmit() to queue the
packet only if we're short of stack, while still sending most packets
immediately.
Proof of concept below; obviously the stack_available() is an evil hack
and would need to be done more sanely. Comments?
===== net/bridge/br_device.c 1.17 vs edited =====
--- 1.17/net/bridge/br_device.c 2004-07-29 22:40:51 +01:00
+++ edited/net/bridge/br_device.c 2005-01-07 16:54:26 +00:00
@@ -19,6 +19,13 @@
#include <asm/uaccess.h>
#include "br_private.h"
+static inline unsigned long stack_available(void)
+{
+ unsigned long esp;
+ asm volatile("movl %%esp,%0" : "=r"(esp));
+ return esp - (unsigned long)current - sizeof(struct thread_info);
+}
+
static struct net_device_stats *br_dev_get_stats(struct net_device *dev)
{
struct net_bridge *br;
@@ -34,6 +41,14 @@
const unsigned char *dest = skb->data;
struct net_bridge_fdb_entry *dst;
+ if (stack_available() < THREAD_SIZE/2) {
+ if (net_ratelimit()) {
+ printk(KERN_DEBUG "Bridge device %s queues packet due to stack shortage\n",
+ dev->name);
+ }
+ return NETDEV_TX_BUSY;
+ }
+
br->statistics.tx_packets++;
br->statistics.tx_bytes += skb->len;
@@ -104,7 +119,7 @@
SET_MODULE_OWNER(dev);
dev->stop = br_dev_stop;
dev->accept_fastpath = br_dev_accept_fastpath;
- dev->tx_queue_len = 0;
+ dev->tx_queue_len = 5;
dev->set_mac_address = NULL;
dev->priv_flags = IFF_EBRIDGE;
}
--
dwmw2
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2005-01-07 17:12 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-12-18 6:27 do_IRQ: stack overflow: 872 Crazy AMD K7
[not found] <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel>
2004-12-18 7:50 ` Andi Kleen
2004-12-18 11:12 ` Bart De Schuymer
2004-12-18 11:14 ` Andi Kleen
2004-12-18 11:51 ` Bart De Schuymer
2004-12-18 13:53 ` Andi Kleen
2005-01-07 17:05 ` David Woodhouse
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox