Kernel panic when using bridge
From: Scot Doyle @ 2011-04-08 1:20 UTC
To: netdev

This kernel panic occurs when using a bridge. I would be grateful for any
ideas on how to correct it.

The panic was captured on two servers after three or four days of minimal
use, both configured as follows:

- unpatched kernel 2.6.39-rc1 (commit ecb78ab6f30106ab72a575a25b1cdfd1633b7ca2) with default .config options
- br0 on a single Intel igb NIC (3 other NICs unused)
- br0 with an IP address on a distinct /27 subnet
- br0:1 with an IP address on a distinct /24 subnet
- br0:2 with an IP address on a distinct /24 subnet
- no iptables rules
- ebtables not installed

"net/bridge/br_netfilter.c" and "net/ipv4/ip_options.c" (in the current
2.6.39-rc2 and in net-next-2.6) are identical to the versions used to build
this kernel.

--------- Server #1

[333271.168869] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[333271.176790] IP: [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435
[333271.183142] PGD 0
[333271.185242] Oops: 0000 [#1] SMP
[333271.188564] last sysfs file: /sys/devices/virtual/net/lo/operstate
[333271.194817] CPU 0
[333271.196734] Modules linked in: tun kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac tpm_tis psmouse evdev tpm dcdbas edac_core pcspkr serio_raw processor tpm_bios thermal_sys ghes power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod dca usbcore bnx2 [last unloaded: scsi_wait_scan]
[333271.234890]
[333271.236460] Pid: 0, comm: swapper Not tainted 2.6.39-rc1+ #2 Dell Inc.
PowerEdge R510/0DPRKF [333271.244991] RIP: 0010:[<ffffffff8129fb09>] [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435 [333271.253766] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 [333271.259150] RAX: 000000000000001c RBX: ffff8804055d9500 RCX: ffff880413a5f865 [333271.266354] RDX: 000000000000001f RSI: 0000000000000000 RDI: ffffffff817e6180 [333271.273558] RBP: ffff880413a5f863 R08: ffffffffa01f5e89 R09: ffff88042f203c58 [333271.280762] R10: 00000000006f53aa R11: 0000000000000293 R12: ffff8804055d9528 [333271.287967] R13: 0000000000000027 R14: ffff880413a5f84e R15: 0000000000000027 [333271.295171] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [333271.303330] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [333271.309148] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000026e0 [333271.316353] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [333271.323557] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [333271.330763] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [333271.338920] Stack: [333271.341011] 0000000000000286 ffff88041c870e00 0000000000000000 ffffffff817e6180 [333271.348514] 0000000000000282 ffffffff810ec7a8 0000000000000282 ffff8804055d9528 [333271.356014] ffff8804055d9500 ffff880404188000 ffff880413a5f84e ffff880404188000 [333271.363515] Call Trace: [333271.366038] <IRQ> [333271.368226] [<ffffffff810ec7a8>] ? __slab_free+0x28/0x14a [333271.373789] [<ffffffffa01f9e3a>] ? br_parse_ip_options+0x133/0x1a0 [bridge] [333271.380910] [<ffffffffa01fabd8>] ? br_nf_pre_routing+0x348/0x3cb [bridge] [333271.387856] [<ffffffff81037106>] ? check_preempt_curr+0x38/0x61 [333271.393934] [<ffffffff81033dc0>] ? arch_local_irq_save+0x12/0x1b [333271.400101] [<ffffffff81298493>] ? nf_iterate+0x41/0x7e [333271.405490] [<ffffffff8103eb9c>] ? try_to_wake_up+0x16a/0x17c [333271.411396] [<ffffffffa01f5e89>] ? 
NF_HOOK.clone.4+0x56/0x56 [bridge] [333271.417997] [<ffffffffa01f5e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [333271.424595] [<ffffffff81298543>] ? nf_hook_slow+0x73/0x114 [333271.430242] [<ffffffffa01f5e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [333271.436843] [<ffffffff811042c0>] ? pollwake+0x49/0x4e [333271.442056] [<ffffffffa01f5e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [333271.448657] [<ffffffffa01f5e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [333271.455257] [<ffffffffa01f61e5>] ? br_handle_frame+0x195/0x1ac [bridge] [333271.462029] [<ffffffffa01f6050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [333271.469410] [<ffffffff8127646f>] ? __netif_receive_skb+0x2a7/0x450 [333271.475749] [<ffffffff81033dc0>] ? arch_local_irq_save+0x12/0x1b [333271.481914] [<ffffffff81276893>] ? netif_receive_skb+0x52/0x58 [333271.487906] [<ffffffff81276d95>] ? napi_gro_receive+0x1f/0x2f [333271.493812] [<ffffffff8127696a>] ? napi_skb_finish+0x1c/0x31 [333271.499634] [<ffffffffa022afcd>] ? igb_poll+0x6d9/0x9ee [igb] [333271.505539] [<ffffffff8103ebae>] ? try_to_wake_up+0x17c/0x17c [333271.511444] [<ffffffff81034724>] ? __wake_up_common+0x41/0x78 [333271.517350] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [333271.523084] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [333271.528731] [<ffffffff8106a80f>] ? tick_dev_program_event+0x2f/0xec [333271.535158] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [333271.540717] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [333271.546103] [<ffffffff8104af75>] ? irq_exit+0x3f/0x8f [333271.551317] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [333271.556356] [<ffffffff8132c993>] ? common_interrupt+0x13/0x13 [333271.562259] <EOI> [333271.564447] [<ffffffff8106132c>] ? enqueue_hrtimer+0x3f/0x53 [333271.570268] [<ffffffffa0307417>] ? arch_local_irq_enable+0x7/0x8 [processor] [333271.577476] [<ffffffffa0307fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [333271.584771] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [333271.590416] [<ffffffff8125cfe9>] ? 
cpuidle_idle_call+0xf4/0x17e [333271.596495] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [333271.601709] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [333271.607441] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [333271.613779] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f b6 c0 <8b> 96 cc 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 [333271.633283] RIP [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435 [333271.639717] RSP <ffff88042f203af0> [333271.643280] CR2: 00000000000000cc [333271.646992] ---[ end trace 1e328ecef856727a ]--- [333271.651683] BUG: scheduling while atomic: swapper/0/0x10000100 [333271.657586] Modules linked in: tun kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac tpm_tis psmouse evdev tpm dcdbas edac_core pcspkr serio_raw processor tpm_bios thermal_sys ghes power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod dca usbcore bnx2 [last unloaded: scsi_wait_scan] [333271.695561] CPU 0 [333271.697479] Modules linked in: tun kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac tpm_tis psmouse evdev tpm dcdbas edac_core pcspkr serio_raw processor tpm_bios thermal_sys ghes power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod dca usbcore bnx2 [last unloaded: scsi_wait_scan] [333271.735643] [333271.737215] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [333271.746614] RIP: 0010:[<ffffffffa0307417>] [<ffffffffa0307417>] arch_local_irq_enable+0x7/0x8 [processor] [333271.756345] RSP: 0018:ffffffff81601eb0 EFLAGS: 00000292 [333271.761729] RAX: 00000000000001a8 RBX: ffffffff8106132c RCX: 00000000000003e8 [333271.768932] RDX: 000000000000020a RSI: 0000000225c17d03 RDI: 0000000000067a4a [333271.776136] RBP: ffff88040450e000 R08: 00000000fffffffd R09: 0000000002625a00 [333271.783340] R10: 00000000006f53aa R11: 0000000000000293 R12: ffffffff8132c98e [333271.790546] R13: ffff88042f20feb0 R14: ffffffff811a2ede R15: ffff88042f20fdc8 [333271.797750] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [333271.805907] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [333271.811726] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000026e0 [333271.818930] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [333271.826134] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [333271.833340] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [333271.841498] Stack: [333271.843589] ffffffffa0307fdf 0000000000011140 ffffffff8125ded9 0000000104f6235d [333271.851087] ffff88040450e020 ffff88040450e0f0 0000000000000002 ffffffffffffffff [333271.858585] ffffffff8125cfe9 0000000000000000 ffffffff81600000 ffffffff816812d0 [333271.866082] Call Trace: [333271.868607] [<ffffffffa0307fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [333271.875901] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [333271.881548] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [333271.887627] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [333271.892839] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [333271.898572] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [333271.904910] Code: 63 1c fb 48 83 c4 38 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 09 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 48 8b 15 81 70 41 e1 48 8d 42 fd 48 83 f8 01 0f 96 c0 48 ff [333271.924428] Call Trace: [333271.926954] [<ffffffffa0307fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [333271.934247] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [333271.939893] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [333271.945973] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [333271.951184] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [333271.956915] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [333271.963513] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 [333271.971428] IP: [<ffffffffa022aca8>] igb_poll+0x3b4/0x9ee [igb] [333271.977431] PGD 40402d067 PUD 40357f067 PMD 0 [333271.981977] Oops: 0000 [#2] SMP [333271.985298] last sysfs file: /sys/devices/virtual/net/lo/operstate [333271.991550] CPU 0 [333271.993467] Modules linked in: tun kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac tpm_tis psmouse evdev tpm dcdbas edac_core pcspkr serio_raw processor tpm_bios thermal_sys ghes power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod dca usbcore bnx2 [last unloaded: scsi_wait_scan] [333272.031622] [333272.033193] Pid: 2453, comm: kvm Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [333272.042504] RIP: 0010:[<ffffffffa022aca8>] [<ffffffffa022aca8>] igb_poll+0x3b4/0x9ee [igb] [333272.050933] RSP: 0000:ffff88041d87dd70 EFLAGS: 00010203 [333272.056317] RAX: ffff880401d6b458 RBX: ffff880405017080 RCX: 0000000000000000 [333272.063523] RDX: 0000000000000040 RSI: 0000000000001043 RDI: ffff880401d6b458 [333272.070726] RBP: 0000000000000000 R08: 00000000ffffffff R09: 00000000000515cb [333272.077930] R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000000078 [333272.085135] R13: ffffc900126a2208 R14: 0000000000000000 R15: ffff8804052f50d0 [333272.092342] FS: 00007ff18dd67760(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [333272.100499] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [333272.106316] CR2: 00000000000000e0 CR3: 0000000403352000 CR4: 00000000000026e0 [333272.113522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [333272.120726] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [333272.127930] Process kvm (pid: 2453, threadinfo ffff88041d87c000, task ffff8804101d1410) [333272.135999] Stack: [333272.138090] ffffffff81103f65 00007fff22fa90c0 00007fff22fa9140 000000000000000e [333272.145589] ffffffff81053896 ffff8804052fb740 0000000000000000 00001043101d1410 [333272.153090] 000000010000000e ffff880401d6b440 ffff880400000000 ffff880404188740 [333272.160592] Call Trace: [333272.163118] [<ffffffff81103f65>] ? set_fd_set+0x31/0x38 [333272.168504] [<ffffffff81053896>] ? __dequeue_signal+0x15/0x10a [333272.174498] [<ffffffff81054b96>] ? send_sigqueue+0xd8/0xec [333272.180145] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [333272.185878] [<ffffffff8105d673>] ? posix_timer_event+0x33/0x3d [333272.191871] [<ffffffff810657f4>] ? timekeeping_get_ns+0xe/0x2e [333272.197863] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [333272.203508] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [333272.209066] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [333272.214451] [<ffffffff8104af75>] ? 
irq_exit+0x3f/0x8f [333272.219664] [<ffffffff8101f280>] ? smp_apic_timer_interrupt+0x78/0x88 [333272.226263] [<ffffffff813331d3>] ? apic_timer_interrupt+0x13/0x20 [333272.232516] [<ffffffff81332812>] ? system_call_fastpath+0x16/0x1b [333272.238766] Code: 89 74 24 3c e9 78 03 00 00 8b 54 24 70 39 54 24 44 0f 8d 75 03 00 00 ff 44 24 44 0f ae e8 49 8b 6d 00 ff 44 24 40 b9 00 00 00 00 [333272.252256] 8b 85 e0 00 00 00 49 c7 45 00 00 00 00 00 41 8b 77 0c 0f 18 [333272.259484] RIP [<ffffffffa022aca8>] igb_poll+0x3b4/0x9ee [igb] [333272.265575] RSP <ffff88041d87dd70> [333272.269138] CR2: 00000000000000e0 [333272.272529] ---[ end trace 1e328ecef856727b ]--- [333272.277223] Kernel panic - not syncing: Fatal exception in interrupt [333272.283649] Pid: 2453, comm: kvm Tainted: G D 2.6.39-rc1+ #2 [333272.290158] Call Trace: [333272.292685] [<ffffffff8132ab34>] ? panic+0x92/0x1a1 [333272.297725] [<ffffffff8132d6f6>] ? oops_end+0xa9/0xb6 [333272.302940] [<ffffffff8102ca16>] ? no_context+0x1ed/0x1fa [333272.308499] [<ffffffff8132f4e3>] ? do_page_fault+0x16b/0x308 [333272.314319] [<ffffffff81104277>] ? __pollwait+0xd6/0xd6 [333272.319705] [<ffffffff81104277>] ? __pollwait+0xd6/0xd6 [333272.325090] [<ffffffff81104277>] ? __pollwait+0xd6/0xd6 [333272.330475] [<ffffffff8132cc55>] ? page_fault+0x25/0x30 [333272.335863] [<ffffffffa022aca8>] ? igb_poll+0x3b4/0x9ee [igb] [333272.341768] [<ffffffff81103f65>] ? set_fd_set+0x31/0x38 [333272.347154] [<ffffffff81053896>] ? __dequeue_signal+0x15/0x10a [333272.353147] [<ffffffff81054b96>] ? send_sigqueue+0xd8/0xec [333272.358795] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [333272.364527] [<ffffffff8105d673>] ? posix_timer_event+0x33/0x3d [333272.370520] [<ffffffff810657f4>] ? timekeeping_get_ns+0xe/0x2e [333272.376512] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [333272.382158] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [333272.387716] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [333272.393102] [<ffffffff8104af75>] ? 
irq_exit+0x3f/0x8f
[333272.398315] [<ffffffff8101f280>] ? smp_apic_timer_interrupt+0x78/0x88
[333272.404915] [<ffffffff813331d3>] ? apic_timer_interrupt+0x13/0x20
[333272.411167] [<ffffffff81332812>] ? system_call_fastpath+0x16/0x1b

--------- Server #2

[401314.185779] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[401314.193674] IP: [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435
[401314.200009] PGD 0
[401314.202101] Oops: 0000 [#1] SMP
[401314.205412] last sysfs file: /sys/devices/virtual/block/md127/md/mismatch_cnt
[401314.212597] CPU 0
[401314.214508] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan]
[401314.252204]
[401314.253769] Pid: 0, comm: swapper Not tainted 2.6.39-rc1+ #2 Dell Inc.
PowerEdge R510/0DPRKF [401314.262276] RIP: 0010:[<ffffffff8129fb09>] [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435 [401314.271027] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 [401314.276398] RAX: 0000000000000024 RBX: ffff8804101d8d00 RCX: ffff88041db89065 [401314.283583] RDX: 0000000000000027 RSI: 0000000000000000 RDI: ffffffff817e6180 [401314.290768] RBP: ffff88041db89063 R08: ffffffffa0197e89 R09: ffff88042f203c58 [401314.297955] R10: ffff880404078340 R11: 0000000000000040 R12: ffff8804101d8d28 [401314.305140] R13: 0000000000000027 R14: ffff88041db8904e R15: 0000000000000027 [401314.312324] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [401314.320459] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [401314.326261] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000006f0 [401314.333448] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [401314.340632] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [401314.347817] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [401314.355951] Stack: [401314.358034] ffff88042ec02900 ffff8804051d8740 0000000000000000 ffffffff817e6180 [401314.365514] 0000000000000282 ffffffff810ec7a8 0000000000000282 ffff8804101d8d28 [401314.372992] ffff8804101d8d00 ffff880404078000 ffff88041db8904e ffff880404078000 [401314.380471] Call Trace: [401314.382988] <IRQ> [401314.385170] [<ffffffff810ec7a8>] ? __slab_free+0x28/0x14a [401314.390717] [<ffffffffa019be3a>] ? br_parse_ip_options+0x133/0x1a0 [bridge] [401314.397817] [<ffffffffa019cbd8>] ? br_nf_pre_routing+0x348/0x3cb [bridge] [401314.404745] [<ffffffff8119d89b>] ? cpumask_next_and+0x2b/0x3a [401314.410635] [<ffffffff81298493>] ? nf_iterate+0x41/0x7e [401314.416007] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401314.422590] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401314.429171] [<ffffffff81298543>] ? 
nf_hook_slow+0x73/0x114 [401314.434800] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401314.441383] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401314.447965] [<ffffffffa0197e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [401314.454548] [<ffffffffa01981e5>] ? br_handle_frame+0x195/0x1ac [bridge] [401314.461302] [<ffffffffa0198050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [401314.468662] [<ffffffff8127646f>] ? __netif_receive_skb+0x2a7/0x450 [401314.474985] [<ffffffff81276893>] ? netif_receive_skb+0x52/0x58 [401314.480960] [<ffffffff81276d95>] ? napi_gro_receive+0x1f/0x2f [401314.486850] [<ffffffff8127696a>] ? napi_skb_finish+0x1c/0x31 [401314.492655] [<ffffffffa01c0fcd>] ? igb_poll+0x6d9/0x9ee [igb] [401314.498546] [<ffffffff8109031f>] ? handle_irq_event+0x40/0x55 [401314.504436] [<ffffffff8132c993>] ? common_interrupt+0x13/0x13 [401314.510326] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [401314.516043] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [401314.521674] [<ffffffff8100e06d>] ? paravirt_read_tsc+0x5/0x8 [401314.527478] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [401314.533022] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [401314.538392] [<ffffffff8104af75>] ? irq_exit+0x3f/0x8f [401314.543588] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [401314.548613] [<ffffffff8132c993>] ? common_interrupt+0x13/0x13 [401314.554501] <EOI> [401314.556683] [<ffffffff8106132c>] ? enqueue_hrtimer+0x3f/0x53 [401314.562487] [<ffffffffa02c7417>] ? arch_local_irq_enable+0x7/0x8 [processor] [401314.569673] [<ffffffffa02c7fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [401314.576948] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401314.582577] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401314.588640] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401314.593839] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401314.599554] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [401314.605875] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f b6 c0 <8b> 96 cc 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 [401314.625301] RIP [<ffffffff8129fb09>] ip_options_compile+0x1c1/0x435 [401314.631718] RSP <ffff88042f203af0> [401314.635271] CR2: 00000000000000cc [401314.638981] ---[ end trace ee9f25b92857731e ]--- [401314.643661] BUG: scheduling while atomic: swapper/0/0x10000100 [401314.649551] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401314.687071] CPU 0 [401314.688981] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401314.726690] [401314.728255] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [401314.737628] RIP: 0010:[<ffffffffa02c7417>] [<ffffffffa02c7417>] arch_local_irq_enable+0x7/0x8 [processor] [401314.747333] RSP: 0018:ffffffff81601eb0 EFLAGS: 00000292 [401314.752703] RAX: 000000000001d1bc RBX: ffffffff8106132c RCX: 00000000000003e8 [401314.759889] RDX: 000000000000006e RSI: 0000000225c17d03 RDI: 00000000071b46ce [401314.767075] RBP: ffff880403a82800 R08: 00000000fffffffd R09: 0000000000000000 [401314.774261] R10: ffff880404078340 R11: 0000000000000040 R12: ffffffff8132c98e [401314.781447] R13: ffff88042f20feb0 R14: ffffffff811a2ede R15: ffff88042f20fdc8 [401314.788633] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [401314.796770] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [401314.802573] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000006f0 [401314.809761] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [401314.816947] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [401314.824133] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [401314.832269] Stack: [401314.834353] ffffffffa02c7fdf 0000000000011140 ffffffff8125ded9 0000000105fdb064 [401314.841835] ffff880403a82820 ffff880403a828f0 0000000000000002 ffffffffffffffff [401314.849317] ffffffff8125cfe9 0000000000000000 ffffffff81600000 ffffffff816812d0 [401314.856798] Call Trace: [401314.859317] [<ffffffffa02c7fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [401314.866590] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401314.872222] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401314.878286] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401314.883483] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401314.889200] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [401314.895522] Code: 63 1c fb 48 83 c4 38 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 09 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 48 8b 15 81 70 45 e1 48 8d 42 fd 48 83 f8 01 0f 96 c0 48 ff [401314.914953] Call Trace: [401314.917472] [<ffffffffa02c7fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [401314.924746] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401314.930378] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401314.936444] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401314.941642] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401314.947361] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [401314.954266] Kernel panic - not syncing: Fatal exception in interrupt [401314.960680] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc1+ #2 [401314.967265] Call Trace: [401314.969785] <IRQ> [<ffffffff8132ab34>] ? panic+0x92/0x1a1 [401314.975429] [<ffffffff8132d6f6>] ? oops_end+0xa9/0xb6 [401314.980630] [<ffffffff8102ca16>] ? no_context+0x1ed/0x1fa [401314.986177] [<ffffffff8132f4e3>] ? do_page_fault+0x16b/0x308 [401314.991982] [<ffffffff81268a03>] ? sk_wake_async+0x19/0x3c [401314.997615] [<ffffffff8126dcea>] ? skb_checksum+0x46/0x1ce [401315.003246] [<ffffffff8126befd>] ? skb_clone+0x44/0x12d [401315.008619] [<ffffffff810ec6bc>] ? kmem_cache_alloc+0x22/0xe6 [401315.014509] [<ffffffff810ea70e>] ? virt_to_head_page+0x9/0x2f [401315.020400] [<ffffffff810ec8dc>] ? kmem_cache_free+0x12/0xa4 [401315.026208] [<ffffffffa019f014>] ? br_multicast_rcv+0xbc2/0xbee [bridge] [401315.033050] [<ffffffff81271ed8>] ? arch_local_irq_save+0x12/0x1b [401315.039201] [<ffffffff8132cc55>] ? page_fault+0x25/0x30 [401315.044574] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401315.051157] [<ffffffff8129fb09>] ? ip_options_compile+0x1c1/0x435 [401315.057395] [<ffffffff810ec7a8>] ? __slab_free+0x28/0x14a [401315.062943] [<ffffffffa019be3a>] ? 
br_parse_ip_options+0x133/0x1a0 [bridge] [401315.070047] [<ffffffffa019cbd8>] ? br_nf_pre_routing+0x348/0x3cb [bridge] [401315.076976] [<ffffffff8119d89b>] ? cpumask_next_and+0x2b/0x3a [401315.082867] [<ffffffff81298493>] ? nf_iterate+0x41/0x7e [401315.088242] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401315.094825] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401315.101408] [<ffffffff81298543>] ? nf_hook_slow+0x73/0x114 [401315.107040] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401315.113624] [<ffffffffa0197e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [401315.120208] [<ffffffffa0197e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [401315.126791] [<ffffffffa01981e5>] ? br_handle_frame+0x195/0x1ac [bridge] [401315.133549] [<ffffffffa0198050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [401315.140911] [<ffffffff8127646f>] ? __netif_receive_skb+0x2a7/0x450 [401315.147233] [<ffffffff81276893>] ? netif_receive_skb+0x52/0x58 [401315.153212] [<ffffffff81276d95>] ? napi_gro_receive+0x1f/0x2f [401315.159103] [<ffffffff8127696a>] ? napi_skb_finish+0x1c/0x31 [401315.164911] [<ffffffffa01c0fcd>] ? igb_poll+0x6d9/0x9ee [igb] [401315.170804] [<ffffffff8109031f>] ? handle_irq_event+0x40/0x55 [401315.176695] [<ffffffff8132c993>] ? common_interrupt+0x13/0x13 [401315.182586] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [401315.188305] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [401315.193937] [<ffffffff8100e06d>] ? paravirt_read_tsc+0x5/0x8 [401315.199743] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [401315.205289] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [401315.210661] [<ffffffff8104af75>] ? irq_exit+0x3f/0x8f [401315.215860] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [401315.220887] [<ffffffff8132c993>] ? common_interrupt+0x13/0x13 [401315.226777] <EOI> [<ffffffff8106132c>] ? enqueue_hrtimer+0x3f/0x53 [401315.233201] [<ffffffffa02c7417>] ? arch_local_irq_enable+0x7/0x8 [processor] [401315.240390] [<ffffffffa02c7fdf>] ? 
acpi_idle_enter_bm+0x218/0x250 [processor] [401315.247664] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401315.253296] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401315.259361] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401315.264560] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401315.270279] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [401315.276602] BUG: scheduling while atomic: swapper/0/0x10000100 [401315.282491] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401315.320027] CPU 0 [401315.321938] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401315.359651] [401315.361216] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [401315.370590] RIP: 0010:[<ffffffffa02c7417>] [<ffffffffa02c7417>] arch_local_irq_enable+0x7/0x8 [processor] [401315.380296] RSP: 0018:ffffffff81601eb0 EFLAGS: 00000292 [401315.385668] RAX: 000000000001d1bc RBX: ffffffff8106132c RCX: 00000000000003e8 [401315.392856] RDX: 000000000000006e RSI: 0000000225c17d03 RDI: 00000000071b46ce [401315.400043] RBP: ffff880403a82800 R08: 00000000fffffffd R09: 0000000000000000 [401315.407230] R10: ffff880404078340 R11: 0000000000000040 R12: ffffffff8132c98e [401315.414417] R13: ffff88042f20feb0 R14: ffffffff811a2ede R15: ffff88042f20fdc8 [401315.421605] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [401315.429744] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [401315.435549] CR2: 00000000000000cc CR3: 00000004042e8000 CR4: 00000000000006f0 [401315.442736] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [401315.449923] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [401315.457110] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [401315.465248] Stack: [401315.467334] ffffffffa02c7fdf 0000000000011140 ffffffff8125ded9 0000000105fdb064 [401315.474816] ffff880403a82820 ffff880403a828f0 0000000000000002 ffffffffffffffff [401315.482300] ffffffff8125cfe9 0000000000000000 ffffffff81600000 ffffffff816812d0 [401315.489784] Call Trace: [401315.492306] [<ffffffffa02c7fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [401315.499582] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401315.505215] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401315.511280] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401315.516481] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401315.522199] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [401315.528522] Code: 63 1c fb 48 83 c4 38 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 09 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 48 8b 15 81 70 45 e1 48 8d 42 fd 48 83 f8 01 0f 96 c0 48 ff [401315.547963] Call Trace: [401315.550482] [<ffffffffa02c7fdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [401315.557758] [<ffffffff8125ded9>] ? menu_select+0x169/0x296 [401315.563392] [<ffffffff8125cfe9>] ? cpuidle_idle_call+0xf4/0x17e [401315.569456] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [401315.574655] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [401315.580374] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [401315.586862] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 [401315.594758] IP: [<ffffffffa01c0ca8>] igb_poll+0x3b4/0x9ee [igb] [401315.600745] PGD 4041c6067 PUD 40514b067 PMD 0 [401315.605276] Oops: 0000 [#2] SMP [401315.608587] last sysfs file: /sys/devices/virtual/block/md127/md/mismatch_cnt [401315.615772] CPU 0 [401315.617682] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401315.655363] [401315.656931] Pid: 1563, comm: rsyslogd Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [401315.666650] RIP: 0010:[<ffffffffa01c0ca8>] [<ffffffffa01c0ca8>] igb_poll+0x3b4/0x9ee [igb] [401315.675057] RSP: 0000:ffff8804024d7d70 EFLAGS: 00010207 [401315.680427] RAX: ffff880401f5ccd8 RBX: ffff880401f5d8c0 RCX: 0000000000000000 [401315.687612] RDX: 0000000000000040 RSI: 0000000000001043 RDI: ffff880401f5ccd8 [401315.694798] RBP: 0000000000000000 R08: 000000a47c6e7c01 R09: 00000000000000fa [401315.701983] R10: 0000000000000249 R11: ffffffff81404535 R12: 0000000000000038 [401315.709168] R13: ffffc900126a28c0 R14: 0000000000000000 R15: ffff880404d09380 [401315.716353] FS: 00007f8cf4fca700(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [401315.724489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [401315.730292] CR2: 00000000000000e0 CR3: 00000004042e8000 CR4: 00000000000006f0 [401315.737477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [401315.744662] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [401315.751847] Process rsyslogd (pid: 1563, threadinfo ffff8804024d6000, task ffff88041d9dc2e0) [401315.760330] Stack: [401315.762413] ffffffff81772040 ffffffff81051890 ffff88042f213a80 ffffffff817c5680 [401315.769892] ffffffff817c5680 0000000105fdb7ac 0000000000000286 0000104381051b4a [401315.777371] 0000000100000039 ffff880401f5ccc0 ffff880400000000 ffff880404078740 [401315.784850] Call Trace: [401315.787370] [<ffffffff81051890>] ? lock_timer_base.clone.20+0x25/0x4c [401315.793951] [<ffffffff8132c62d>] ? _raw_spin_lock_irq+0xd/0x1a [401315.799928] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [401315.805644] [<ffffffff810657f4>] ? timekeeping_get_ns+0xe/0x2e [401315.811620] [<ffffffff8104ad0a>] ? __do_softirq+0xb8/0x176 [401315.817252] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [401315.822796] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [401315.828167] [<ffffffff8104af75>] ? irq_exit+0x3f/0x8f [401315.833365] [<ffffffff8101f280>] ? 
smp_apic_timer_interrupt+0x78/0x88 [401315.839946] [<ffffffff813331d3>] ? apic_timer_interrupt+0x13/0x20 [401315.846182] [<ffffffff81332812>] ? system_call_fastpath+0x16/0x1b [401315.852416] Code: 89 74 24 3c e9 78 03 00 00 8b 54 24 70 39 54 24 44 0f 8d 75 03 00 00 ff 44 24 44 0f ae e8 49 8b 6d 00 ff 44 24 40 b9 00 00 00 00 [401315.865859] 8b 85 e0 00 00 00 49 c7 45 00 00 00 00 00 41 8b 77 0c 0f 18 [401315.873063] RIP [<ffffffffa01c0ca8>] igb_poll+0x3b4/0x9ee [igb] [401315.879136] RSP <ffff8804024d7d70> [401315.882689] CR2: 00000000000000e0 [401315.886080] ---[ end trace ee9f25b92857731f ]--- [401315.890761] BUG: scheduling while atomic: rsyslogd/1563/0x10000100 [401315.896997] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401315.934529] CPU 0 [401315.936441] Modules linked in: kvm_intel kvm bridge stp loop tpm_tis snd_pcm tpm snd_timer snd soundcore psmouse snd_page_alloc ghes pcspkr tpm_bios i7core_edac serio_raw evdev processor dcdbas edac_core thermal_sys power_meter hed button ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas igb scsi_transport_sas ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [401315.974155] [401315.975722] Pid: 1563, comm: rsyslogd Tainted: G D 2.6.39-rc1+ #2 Dell Inc. 
PowerEdge R510/0DPRKF [401315.985442] RIP: 0033:[<00007f8cf6f6b7c9>] [<00007f8cf6f6b7c9>] 0x7f8cf6f6b7c8 [401315.992814] RSP: 002b:00007f8cf4fc9d78 EFLAGS: 00000206 [401315.998185] RAX: 0000000000000000 RBX: ffffffff81332812 RCX: 0000000000000e83 [401316.005372] RDX: 00007f8cf67fd9dc RSI: 000000000000003c RDI: 00007f8cf65fc2a9 [401316.012560] RBP: 00007f8cf67fd9dc R08: 00007f8cf65fc2a9 R09: 000000000000061b [401316.019746] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff813331d3 [401316.026934] R13: ffff8804024d7f78 R14: 00007f8cf67fe880 R15: 0000000000c63b98 [401316.034122] FS: 00007f8cf4fca700(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [401316.042261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [401316.048065] CR2: 00000000000000e0 CR3: 00000004042e8000 CR4: 00000000000006f0 [401316.055253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [401316.062440] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [401316.069628] Process rsyslogd (pid: 1563, threadinfo ffff8804024d6000, task ffff88041d9dc2e0) [401316.078112] [401316.079679] Call Trace: [401316.082330] Kernel panic - not syncing: Fatal exception in interrupt [401316.088739] Pid: 1563, comm: rsyslogd Tainted: G D 2.6.39-rc1+ #2 [401316.095666] Call Trace: [401316.098185] [<ffffffff8132ab34>] ? panic+0x92/0x1a1 [401316.103213] [<ffffffff8132d6f6>] ? oops_end+0xa9/0xb6 [401316.108414] [<ffffffff8102ca16>] ? no_context+0x1ed/0x1fa [401316.113960] [<ffffffff8132f4e3>] ? do_page_fault+0x16b/0x308 [401316.119767] [<ffffffff8132cc55>] ? page_fault+0x25/0x30 [401316.125143] [<ffffffffa01c0ca8>] ? igb_poll+0x3b4/0x9ee [igb] [401316.131035] [<ffffffff81051890>] ? lock_timer_base.clone.20+0x25/0x4c [401316.137617] [<ffffffff8132c62d>] ? _raw_spin_lock_irq+0xd/0x1a [401316.143595] [<ffffffff81276ec0>] ? net_rx_action+0xa4/0x1b1 [401316.149313] [<ffffffff810657f4>] ? timekeeping_get_ns+0xe/0x2e [401316.155291] [<ffffffff8104ad0a>] ? 
__do_softirq+0xb8/0x176 [401316.160923] [<ffffffff81333a1c>] ? call_softirq+0x1c/0x30 [401316.166469] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [401316.171843] [<ffffffff8104af75>] ? irq_exit+0x3f/0x8f [401316.177044] [<ffffffff8101f280>] ? smp_apic_timer_interrupt+0x78/0x88 [401316.183626] [<ffffffff813331d3>] ? apic_timer_interrupt+0x13/0x20 [401316.189863] [<ffffffff81332812>] ? system_call_fastpath+0x16/0x1b ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-08  1:20 Kernel panic when using bridge Scot Doyle
@ 2011-04-08 13:49 ` Sebastian Nickel
  2011-04-08 14:57   ` Scot Doyle
  2011-04-08 19:17 ` Stephen Hemminger
  1 sibling, 1 reply; 56+ messages in thread
From: Sebastian Nickel @ 2011-04-08 13:49 UTC (permalink / raw)
  To: netdev

Scot Doyle <lkml@scotdoyle.com> writes:

>
> This kernel panic occurs when using a bridge. I would be grateful for
> any ideas on how to correct it.
>
> The panic was captured on two servers after three or four days of
> minimal use, both configured as follows:
> - unpatched kernel 2.6.39-rc1 (commit
> ecb78ab6f30106ab72a575a25b1cdfd1633b7ca2) with default .config options
> - br0 on single intel igb NIC (3 other NIC's unused)
> - br0 with ip address on distinct /27 subnet
> - br0:1 with ip address on distinct /24 subnet
> - br0:2 with ip address on distinct /24 subnet
> - no iptables rules
> - ebtables not installed
>
> "net/bridge/br_netfilter.c" and "net/ipv4/ip_options.c" (in the current
> 2.6.39-rc2 and in net-next-2.6) are identical to the versions used to
> build this kernel.
>

We have the same problems with kernel version 2.6.37.4 (unpatched). Almost same
stacktrace.

After some time there is a kernel panic.

-br0 contains a realtek NIC (8169) and some vnet devices used with KVM.

Any ideas would be great...

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-08 13:49 ` Sebastian Nickel
@ 2011-04-08 14:57   ` Scot Doyle
  2011-04-08 19:12     ` Pallai Roland
  0 siblings, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-08 14:57 UTC (permalink / raw)
  To: Sebastian Nickel; +Cc: netdev, Pallai Roland

On 04/08/2011 08:49 AM, Sebastian Nickel wrote:
> We have the same problems with kernel version 2.6.37.4 (unpatched). Almost same
> stacktrace.
>
> After some time there is a kernel panic.
>
> -br0 contains a realtek NIC (8169) and some vnet devices used with KVM.
>
> Any ideas would be great...

Perhaps the problem is isolated to the bridging code? Neither KVM guest nor
associated tap device were running during my second reported panic.

Here's a similar stacktrace from a third person:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620201

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-08 14:57   ` Scot Doyle
@ 2011-04-08 19:12     ` Pallai Roland
  0 siblings, 0 replies; 56+ messages in thread
From: Pallai Roland @ 2011-04-08 19:12 UTC (permalink / raw)
  To: Scot Doyle; +Cc: Sebastian Nickel, netdev

2011/4/8 Scot Doyle <lkml@scotdoyle.com>:
> Perhaps the problem is isolated to the bridging code? Neither KVM guest nor
> associated tap device were running during my second reported panic.
> Here's a similar stacktrace from a third person:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620201

Yep, I'm the third person. :)

What I can tell you is that my servers are stable after bridging has been
eliminated.

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-08  1:20 Kernel panic when using bridge Scot Doyle
  2011-04-08 13:49 ` Sebastian Nickel
@ 2011-04-08 19:17 ` Stephen Hemminger
  2011-04-09  4:51   ` Scot Doyle
  1 sibling, 1 reply; 56+ messages in thread
From: Stephen Hemminger @ 2011-04-08 19:17 UTC (permalink / raw)
  To: Scot Doyle; +Cc: netdev

On Thu, 07 Apr 2011 20:20:25 -0500
Scot Doyle <lkml@scotdoyle.com> wrote:

> This kernel panic occurs when using a bridge. I would be grateful for
> any ideas on how to correct it.
>
> The panic was captured on two servers after three or four days of
> minimal use, both configured as follows:
> - unpatched kernel 2.6.39-rc1 (commit
> ecb78ab6f30106ab72a575a25b1cdfd1633b7ca2) with default .config options
> - br0 on single intel igb NIC (3 other NIC's unused)
> - br0 with ip address on distinct /27 subnet
> - br0:1 with ip address on distinct /24 subnet
> - br0:2 with ip address on distinct /24 subnet
> - no iptables rules
> - ebtables not installed
>
> "net/bridge/br_netfilter.c" and "net/ipv4/ip_options.c" (in the current
> 2.6.39-rc2 and in net-next-2.6) are identical to the versions used to
> build this kernel.
>

Please reproduce with exactly 2.6.39-rc2 there were some bug fixes
to make sure that header was initialized.

--

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-08 19:17 ` Stephen Hemminger @ 2011-04-09 4:51 ` Scot Doyle 2011-04-09 7:19 ` Hiroaki SHIMODA 0 siblings, 1 reply; 56+ messages in thread From: Scot Doyle @ 2011-04-09 4:51 UTC (permalink / raw) To: Stephen Hemminger; +Cc: netdev On 04/08/2011 02:17 PM, Stephen Hemminger wrote: > Please reproduce with exactly 2.6.39-rc2 there were some bug fixes > to make sure that header was initialized. Hi Stephen, here's another panic with 2.6.39-rc2 (git commit bb3c90f0de7b34995b5e35cf5dc97a3d428b3761) using default kernel config options. # sysctl -a | grep bridge net.bridge.bridge-nf-call-arptables = 1 net.bridge.bridge-nf-call-iptables = 1 net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-filter-vlan-tagged = 0 net.bridge.bridge-nf-filter-pppoe-tagged = 0 # /etc/network/interfaces auto lo iface lo inet loopback auto br0 iface br0 inet static address x.y.z.237 netmask 255.255.255.224 gateway x.y.z.225 bridge_ports eth3 bridge_stp off bridge_maxwait 0 bridge_fd 0 auto br0:1 iface br0:1 inet static address 10.0.0.1 netmask 255.255.255.0 auto br0:2 iface br0:2 inet static address 10.0.1.1 netmask 255.255.255.0 ------ [ 1691.681069] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc [ 1691.688879] IP: [<ffffffff8129fb8d>] ip_options_compile+0x1c1/0x435 [ 1691.695126] PGD 0 [ 1691.697131] Oops: 0000 [#1] SMP [ 1691.700357] last sysfs file: /sys/devices/virtual/misc/kvm/uevent [ 1691.706418] CPU 0 [ 1691.708241] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: scsi_wait_scan] [ 1691.745849] [ 1691.747330] Pid: 0, comm: swapper Not tainted 2.6.39-rc2+ #3 Dell Inc. 
PowerEdge R510/0DPRKF [ 1691.755752] RIP: 0010:[<ffffffff8129fb8d>] [<ffffffff8129fb8d>] ip_options_compile+0x1c1/0x435 [ 1691.764418] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 [ 1691.769702] RAX: 0000000000000024 RBX: ffff88041c9fa900 RCX: ffff880403466865 [ 1691.776800] RDX: 0000000000000027 RSI: 0000000000000000 RDI: ffffffff817e6100 [ 1691.783899] RBP: ffff880403466863 R08: ffffffffa01ade89 R09: ffff88042f203c58 [ 1691.790997] R10: ffffe1c4ff103b40 R11: 0000000000000004 R12: ffff88041c9fa928 [ 1691.798095] R13: 0000000000000027 R14: ffff88040346684e R15: 0000000000000027 [ 1691.805194] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [ 1691.813245] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1691.818960] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000006f0 [ 1691.826058] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1691.833156] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1691.840254] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [ 1691.848303] Stack: [ 1691.850300] ffff88042ec02900 ffff8804051ac740 0000000000000000 ffffffff817e6100 [ 1691.857693] 0000000000000282 ffffffff810ec848 0000000000000282 ffff88041c9fa928 [ 1691.865085] ffff88041c9fa900 ffff8804038e8000 ffff88040346684e ffff8804038e8000 [ 1691.872480] Call Trace: [ 1691.874910] <IRQ> [ 1691.877005] [<ffffffff810ec848>] ? __slab_free+0x80/0x14a [ 1691.882465] [<ffffffffa01b1e3a>] ? br_parse_ip_options+0x133/0x1a0 [bridge] [ 1691.889480] [<ffffffffa01b2bd8>] ? br_nf_pre_routing+0x348/0x3cb [bridge] [ 1691.896324] [<ffffffff8119d88f>] ? cpumask_next_and+0x2b/0x3a [ 1691.902127] [<ffffffff81298517>] ? nf_iterate+0x41/0x7e [ 1691.907413] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 1691.913908] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 1691.920402] [<ffffffff812985c7>] ? nf_hook_slow+0x73/0x114 [ 1691.925947] [<ffffffffa01ade89>] ? 
NF_HOOK.clone.4+0x56/0x56 [bridge] [ 1691.932442] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 1691.938937] [<ffffffffa01ade6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [ 1691.945432] [<ffffffff810ee373>] ? __kmalloc_node_track_caller+0xd4/0x10d [ 1691.952274] [<ffffffffa01ae1e5>] ? br_handle_frame+0x195/0x1ac [bridge] [ 1691.958942] [<ffffffffa01ae050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [ 1691.966217] [<ffffffff812764df>] ? __netif_receive_skb+0x2a7/0x450 [ 1691.972452] [<ffffffff81276918>] ? netif_receive_skb+0x52/0x58 [ 1691.978340] [<ffffffff81276e1a>] ? napi_gro_receive+0x1f/0x2f [ 1691.984143] [<ffffffff812769ef>] ? napi_skb_finish+0x1c/0x31 [ 1691.989862] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] [ 1691.995666] [<ffffffff8103eb92>] ? try_to_wake_up+0x16a/0x17c [ 1692.001470] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 [ 1692.007275] [<ffffffff8106fc3c>] ? arch_local_irq_save+0x14/0x1d [ 1692.013338] [<ffffffff81276f45>] ? net_rx_action+0xa4/0x1b1 [ 1692.018971] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 1692.024516] [<ffffffff81333b5c>] ? call_softirq+0x1c/0x30 [ 1692.029973] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 1692.035257] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 1692.040368] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 1692.045308] [<ffffffff8132cad3>] ? common_interrupt+0x13/0x13 [ 1692.051110] <EOI> [ 1692.053204] [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 [ 1692.058922] [<ffffffffa032c417>] ? arch_local_irq_enable+0x7/0x8 [processor] [ 1692.066021] [<ffffffffa032cfdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [ 1692.073208] [<ffffffff8125df49>] ? menu_select+0x169/0x296 [ 1692.078752] [<ffffffff8125d059>] ? cpuidle_idle_call+0xf4/0x17e [ 1692.084727] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [ 1692.089838] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [ 1692.095469] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [ 1692.101703] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f b6 c0 <8b> 96 cc 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 [ 1692.121051] RIP [<ffffffff8129fb8d>] ip_options_compile+0x1c1/0x435 [ 1692.127382] RSP <ffff88042f203af0> [ 1692.130850] CR2: 00000000000000cc [ 1692.134470] ---[ end trace 0afda543b32ed72b ]--- [ 1692.139064] BUG: scheduling while atomic: swapper/0/0x10000100 [ 1692.144866] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: scsi_wait_scan] [ 1692.182294] CPU 0 [ 1692.184119] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: scsi_wait_scan] [ 1692.221718] [ 1692.223199] Pid: 0, comm: swapper Tainted: G D 2.6.39-rc2+ #3 Dell Inc. 
PowerEdge R510/0DPRKF [ 1692.232487] RIP: 0010:[<ffffffffa032c417>] [<ffffffffa032c417>] arch_local_irq_enable+0x7/0x8 [processor] [ 1692.242105] RSP: 0018:ffffffff81601eb0 EFLAGS: 00000292 [ 1692.247389] RAX: 000000000003fce5 RBX: ffffffff81061348 RCX: 00000000000003e8 [ 1692.254489] RDX: 00000000000000c5 RSI: 0000000225c17d03 RDI: 000000000f93df4d [ 1692.261588] RBP: ffff880405349800 R08: 00000000fffffffd R09: 0000000000000000 [ 1692.268689] R10: ffff88042f210ac0 R11: 0000000000000040 R12: ffffffff8132cace [ 1692.275790] R13: ffff88042f20feb0 R14: ffffffff811a2ed2 R15: ffff88042f20fdc8 [ 1692.282892] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [ 1692.290943] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1692.296660] CR2: 00000000000000cc CR3: 0000000001603000 CR4: 00000000000006f0 [ 1692.303759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1692.310860] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1692.317959] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [ 1692.326010] Stack: [ 1692.328008] ffffffffa032cfdf 0000000000011140 ffffffff8125df49 000000010005525f [ 1692.335406] ffff880405349820 ffff8804053498f0 0000000000000002 ffffffffffffffff [ 1692.342803] ffffffff8125d059 0000000000000000 ffffffff81600000 ffffffff816812d0 [ 1692.350197] Call Trace: [ 1692.352630] [<ffffffffa032cfdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [ 1692.359818] [<ffffffff8125df49>] ? menu_select+0x169/0x296 [ 1692.365362] [<ffffffff8125d059>] ? cpuidle_idle_call+0xf4/0x17e [ 1692.371339] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [ 1692.376452] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [ 1692.382084] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [ 1692.388319] Code: 63 1c fb 48 83 c4 38 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 09 0f 1f 44 00 00 c3 fa 66 0f 1f 44 00 00 c3 fb 66 0f 1f 44 00 00 <c3> 48 8b 15 81 20 3f e1 48 8d 42 fd 48 83 f8 01 0f 96 c0 48 ff [ 1692.407673] Call Trace: [ 1692.410105] [<ffffffffa032cfdf>] ? acpi_idle_enter_bm+0x218/0x250 [processor] [ 1692.417294] [<ffffffff8125df49>] ? menu_select+0x169/0x296 [ 1692.422838] [<ffffffff8125d059>] ? cpuidle_idle_call+0xf4/0x17e [ 1692.428815] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [ 1692.433928] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [ 1692.439560] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f ^M ^GMessage from[ 1692.446160] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 [ 1692.455313] IP: [<ffffffffa0226ca8>] igb_poll+0x3b4/0x9ee [igb] [ 1692.461214] PGD 404c2e067 PUD 402966067 PMD 0 [ 1692.465660] Oops: 0000 [#2] SMP [ 1692.468887] last sysfs file: /sys/devices/virtual/misc/kvm/uevent [ 1692.474947] CPU 0 [ 1692.476772] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: scsi_wait_scan] [ 1692.514362] [ 1692.515843] Pid: 1740, comm: rsyslogd Tainted: G D 2.6.39-rc2+ #3 Dell Inc. 
PowerEdge R510/0DPRKF [ 1692.525475] RIP: 0010:[<ffffffffa0226ca8>] [<ffffffffa0226ca8>] igb_poll+0x3b4/0x9ee [igb] [ 1692.533796] RSP: 0018:ffff880403081b90 EFLAGS: 00010203 [ 1692.539078] RAX: ffff880404f24e58 RBX: ffff880404f25a40 RCX: 0000000000000000 [ 1692.546176] RDX: 0000000000000040 RSI: 0000000000001043 RDI: ffff880404f24e58 [ 1692.553273] RBP: 0000000000000000 R08: 00000000f352249a R09: ffff88042f20f470 [ 1692.560373] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000000c [ 1692.567470] R13: ffffc9001267b320 R14: 0000000000000000 R15: ffff880403ff3140 [ 1692.574571] FS: 00007f4ce614f700(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [ 1692.582619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1692.588334] CR2: 00000000000000e0 CR3: 0000000403c96000 CR4: 00000000000006f0 [ 1692.595434] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1692.602532] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1692.609630] Process rsyslogd (pid: 1740, threadinfo ffff880403080000, task ffff880413b8a170) [ 1692.618025] Stack: [ 1692.620024] 0000000000000000 ffffffff8105b633 0000000000000082 ffffffff8103eb92 [ 1692.627419] ffffffff8132b45a 0000000000000082 0000000000000000 000010431d887e00 [ 1692.634813] 0000000100000015 ffff880404f24e40 ffff880400000000 ffff8804038e8740 [ 1692.642208] Call Trace: [ 1692.644640] [<ffffffff8105b633>] ? wq_worker_waking_up+0x8/0x20 [ 1692.650615] [<ffffffff8103eb92>] ? try_to_wake_up+0x16a/0x17c [ 1692.656418] [<ffffffff8132b45a>] ? schedule+0x56e/0x585 [ 1692.661703] [<ffffffff8132c775>] ? _raw_spin_lock_irq+0xd/0x1a [ 1692.667592] [<ffffffff81276f45>] ? net_rx_action+0xa4/0x1b1 [ 1692.673221] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 1692.678765] [<ffffffff81333b5c>] ? call_softirq+0x1c/0x30 [ 1692.684224] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 1692.689507] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 1692.694618] [<ffffffff8100a793>] ? 
do_IRQ+0x85/0x9e [ 1692.699556] [<ffffffff8132cad3>] ? common_interrupt+0x13/0x13 [ 1692.705360] [<ffffffff811439a0>] ? kmsg_poll+0x3a/0x3a [ 1692.710559] [<ffffffff81045ba5>] ? spin_unlock_irq.clone.1+0xe/0x10 [ 1692.716882] [<ffffffff81045e01>] ? do_syslog+0x1e2/0x430 [ 1692.722253] [<ffffffff811439e6>] ? kmsg_read+0x46/0x50 [ 1692.727451] [<ffffffff8113aefa>] ? proc_reg_read+0x6f/0x88 [ 1692.732995] [<ffffffff810f6868>] ? vfs_read+0x9f/0xf2 [ 1692.738106] [<ffffffff810f6900>] ? sys_read+0x45/0x6b [ 1692.743218] [<ffffffff81332952>] ? system_call_fastpath+0x16/0x1b [ 1692.749364] Code: 89 74 24 3c e9 78 03 00 00 8b 54 24 70 39 54 24 44 0f 8d 75 03 00 00 ff 44 24 44 0f ae e8 49 8b 6d 00 ff 44 24 40 b9 00 00 00 00 [ 1692.762725] 8b 85 e0 00 00 00 49 c7 45 00 00 00 00 00 41 8b 77 0c 0f 18 [ 1692.769844] RIP [<ffffffffa0226ca8>] igb_poll+0x3b4/0x9ee [igb] [ 1692.775829] RSP <ffff880403081b90> [ 1692.779295] CR2: 00000000000000e0 syslogd@r510-1 [ 1692.782628] ---[ end trace 0afda543b32ed72c ]--- [ 1692.788575] BUG: scheduling while atomic: rsyslogd/1740/0x10000100 at Apr 8 20:34:[ 1692.794754] Modules linked in: kvm_intel04 ...^M kernel kvm bridge:[ 1691.697131] stp loopOops: 0000 [#1] snd_pcm snd_timerSMP ^M^M ^GMessa snd soundcorege from syslogd@ snd_page_alloc tpm_tisr510-1 at Apr 8 i7core_edac psmouse 20:34:04 ...^M ghes tpm kernel:[ 1691.7 evdev edac_core00357] last sysf pcspkr serio_raws file: /sys/dev processor tpm_biosices/virtual/mis button dcdbasc/kvm/uevent ^M^M thermal_sys hed ^GMessage from power_meter ext2syslogd@r510-1 a mbcache dm_modt Apr 8 20:34:0 raid1 md_mod4 ...^M kernel: sd_mod crc_t10dif[ 1691.848303] S usb_storage uastack: ^M^M ^GMess uhci_hcd mpt2sasage from syslogd scsi_transport_sas raid_class@r510-1 at Apr ehci_hcd igb8 20:34:04 ...^M^M scsi_mod usbcore kernel:[ 1691. 
dca bnx2872480] Call Tra [last unloaded: scsi_wait_scan] ce: ^M^M ^GMessag[ 1692.865388] CPU 0 [ 1692.868564] Modules linked in:e from syslogd@r kvm_intel kvm510-1 at Apr 8 bridge stp20:34:04 ...^M loop snd_pcmkernel:[ 1691.87 snd_timer snd4910] <IRQ> ^M soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: scsi_wait_scan] [ 1692.913092] [ 1692.914574] Pid: 1740, comm: rsyslogd Tainted: G D 2.6.39-rc2+ #3 Dell Inc. PowerEdge R510/0DPRKF [ 1692.924208] RIP: 0010:[<ffffffff81045ba5>] [<ffffffff81045ba5>] spin_unlock_irq.clone.1+0xe/0x10 [ 1692.933051] RSP: 0018:ffff880403081e30 EFLAGS: 00000286 [ 1692.938336] RAX: 000000000001455c RBX: ffff880403afecc0 RCX: ffffffff817518a0 [ 1692.945435] RDX: 0000000000014520 RSI: 0000000000000003 RDI: ffffffff81751888 [ 1692.952536] RBP: 0000000000000fff R08: ffffffff811439a0 R09: 0000000000000000 [ 1692.959636] R10: 0000000000000000 R11: 0000000000000293 R12: ffffffff8132cad3 [ 1692.966737] R13: 0000000000000e33 R14: 00007f4ce7983693 R15: ffff880403081da8 [ 1692.973837] FS: 00007f4ce614f700(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [ 1692.981890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1692.987607] CR2: 00000000000000e0 CR3: 0000000403c96000 CR4: 00000000000006f0 [ 1692.994706] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1693.001805] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1693.008907] Process rsyslogd (pid: 1740, threadinfo ffff880403080000, task ffff880413b8a170) [ 1693.017303] Stack: [ 1693.019301] ffffffff81045e01 0000000003afed20 ffffffff00000020 0000000000000000 [ 1693.026696] 0000000000000001 ffff880403081fd8 0000000000000000 0002000003081e60 [ 1693.034093] ffff880404f9b090 
0000000100000000 00007f4ce7982860 0000000000000fff [ 1693.041490] Call Trace: [ 1693.043922] [<ffffffff81045e01>] ? do_syslog+0x1e2/0x430 [ 1693.049295] [<ffffffff811439e6>] ? kmsg_read+0x46/0x50 [ 1693.054493] [<ffffffff8113aefa>] ? proc_reg_read+0x6f/0x88 [ 1693.060039] [<ffffffff810f6868>] ? vfs_read+0x9f/0xf2 [ 1693.065153] [<ffffffff810f6900>] ? sys_read+0x45/0x6b [ 1693.070266] [<ffffffff81332952>] ? system_call_fastpath+0x16/0x1b [ 1693.076415] Code: 00 00 75 14 c7 05 48 be 72 00 01 00 00 00 c7 05 42 be 72 00 01 00 00 00 48 83 c4 08 c3 66 ff 05 ea bc 70 00 fb 66 0f 1f 44 00 00 <c3> c3 48 83 ec 08 48 c7 c2 a8 38 61 81 48 c7 c6 3e 2b 4c 81 48 [ 1693.095772] Call Trace: [ 1693.098205] [<ffffffff81045e01>] ? do_syslog+0x1e2/0x430 [ 1693.103579] [<ffffffff811439e6>] ? kmsg_read+0x46/0x50 [ 1693.108778] [<ffffffff8113aefa>] ? proc_reg_read+0x6f/0x88 [ 1693.114323] [<ffffffff810f6868>] ? vfs_read+0x9f/0xf2 [ 1693.119435] [<ffffffff810f6900>] ? sys_read+0x45/0x6b [ 1693.124547] [<ffffffff81332952>] ? system_call_fastpath+0x16/0x1b [ 1693.130704] Kernel panic - not syncing: Fatal exception in interrupt [ 1693.137028] Pid: 1740, comm: rsyslogd Tainted: G D 2.6.39-rc2+ #3 [ 1693.143870] Call Trace: [ 1693.146304] [<ffffffff8132ac78>] ? panic+0x92/0x1a1 [ 1693.151244] [<ffffffff8132d836>] ? oops_end+0xa9/0xb6 [ 1693.156359] [<ffffffff8102ca16>] ? no_context+0x1ed/0x1fa [ 1693.161819] [<ffffffff8132f623>] ? do_page_fault+0x16b/0x308 [ 1693.167537] [<ffffffff8132cd95>] ? page_fault+0x25/0x30 [ 1693.172824] [<ffffffffa0226ca8>] ? igb_poll+0x3b4/0x9ee [igb] [ 1693.178629] [<ffffffff8105b633>] ? wq_worker_waking_up+0x8/0x20 [ 1693.184606] [<ffffffff8103eb92>] ? try_to_wake_up+0x16a/0x17c [ 1693.190410] [<ffffffff8132b45a>] ? schedule+0x56e/0x585 [ 1693.195697] [<ffffffff8132c775>] ? _raw_spin_lock_irq+0xd/0x1a [ 1693.201588] [<ffffffff81276f45>] ? net_rx_action+0xa4/0x1b1 [ 1693.207219] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 1693.212764] [<ffffffff81333b5c>] ? 
call_softirq+0x1c/0x30 [ 1693.218222] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 1693.223507] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 1693.228619] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 1693.233560] [<ffffffff8132cad3>] ? common_interrupt+0x13/0x13 [ 1693.239365] [<ffffffff811439a0>] ? kmsg_poll+0x3a/0x3a [ 1693.244564] [<ffffffff81045ba5>] ? spin_unlock_irq.clone.1+0xe/0x10 [ 1693.250888] [<ffffffff81045e01>] ? do_syslog+0x1e2/0x430 [ 1693.256259] [<ffffffff811439e6>] ? kmsg_read+0x46/0x50 [ 1693.261459] [<ffffffff8113aefa>] ? proc_reg_read+0x6f/0x88 [ 1693.267005] [<ffffffff810f6868>] ? vfs_read+0x9f/0xf2 [ 1693.272117] [<ffffffff810f6900>] ? sys_read+0x45/0x6b [ 1693.277231] [<ffffffff81332952>] ? system_call_fastpath+0x16/0x1b ^ permalink raw reply [flat|nested] 56+ messages in thread
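In the first oops above, the "Code:" line marks the byte at the faulting RIP with angle brackets (<8b> in the ip_options_compile() dump). The following is only a minimal sketch of how the bytes from that marker onward could be pulled out of such a line for disassembly. The helper name bytes_from_rip() is hypothetical and not part of any kernel tool, and the sketch assumes the line is intact (in the igb_poll dumps in this thread the "Code:" line is split by a timestamp and the marker is missing).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy the opcode bytes of an oops "Code:" line,
 * starting at the <xx> marker (the byte at the faulting RIP), into out[].
 * Returns the number of bytes stored, or 0 if the line has no marker.
 */
static size_t bytes_from_rip(const char *code_line, unsigned char *out,
                             size_t cap)
{
    const char *p = strchr(code_line, '<');
    size_t n = 0;

    if (p == NULL)
        return 0;

    while (*p != '\0' && n < cap) {
        if (isxdigit((unsigned char)p[0]) && isxdigit((unsigned char)p[1])) {
            /* Two hex digits form one byte. */
            char pair[3] = { p[0], p[1], '\0' };

            out[n++] = (unsigned char)strtoul(pair, NULL, 16);
            p += 2;
        } else {
            p++; /* skip '<', '>' and whitespace */
        }
    }
    return n;
}
```

Fed the tail of the "Code:" line from the 2.6.39-rc2 oops, this yields the six bytes 8b 96 cc 00 00 00, which can then be handed to a disassembler.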
* Re: Kernel panic when using bridge 2011-04-09 4:51 ` Scot Doyle @ 2011-04-09 7:19 ` Hiroaki SHIMODA 2011-04-11 23:48 ` Scot Doyle 0 siblings, 1 reply; 56+ messages in thread From: Hiroaki SHIMODA @ 2011-04-09 7:19 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Scot Doyle, netdev On Fri, 08 Apr 2011 23:51:10 -0500 Scot Doyle <lkml@scotdoyle.com> wrote: > On 04/08/2011 02:17 PM, Stephen Hemminger wrote: > > Please reproduce with exactly 2.6.39-rc2 there were some bug fixes > > to make sure that header was initialized. > > Hi Stephen, here's another panic with 2.6.39-rc2 (git commit > bb3c90f0de7b34995b5e35cf5dc97a3d428b3761) using default kernel config > options. > > # sysctl -a | grep bridge > net.bridge.bridge-nf-call-arptables = 1 > net.bridge.bridge-nf-call-iptables = 1 > net.bridge.bridge-nf-call-ip6tables = 1 > net.bridge.bridge-nf-filter-vlan-tagged = 0 > net.bridge.bridge-nf-filter-pppoe-tagged = 0 > > # /etc/network/interfaces > auto lo > iface lo inet loopback > auto br0 > iface br0 inet static > address x.y.z.237 > netmask 255.255.255.224 > gateway x.y.z.225 > bridge_ports eth3 > bridge_stp off > bridge_maxwait 0 > bridge_fd 0 > auto br0:1 > iface br0:1 inet static > address 10.0.0.1 > netmask 255.255.255.0 > auto br0:2 > iface br0:2 inet static > address 10.0.1.1 > netmask 255.255.255.0 > > ------ > > [ 1691.681069] BUG: unable to handle kernel NULL pointer dereference at > 00000000000000cc > [ 1691.688879] IP: [<ffffffff8129fb8d>] ip_options_compile+0x1c1/0x435 > [ 1691.695126] PGD 0 > [ 1691.697131] Oops: 0000 [#1] SMP > [ 1691.700357] last sysfs file: /sys/devices/virtual/misc/kvm/uevent > [ 1691.706418] CPU 0 > [ 1691.708241] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm > snd_timer snd soundcore snd_page_alloc tpm_tis i7core_edac psmouse ghes > tpm evdev edac_core pcspkr serio_raw processor tpm_bios button dcdbas > thermal_sys hed power_meter ext2 mbcache dm_mod raid1 md_mod sd_mod > crc_t10dif usb_storage uas uhci_hcd mpt2sas 
scsi_transport_sas > raid_class ehci_hcd igb scsi_mod usbcore dca bnx2 [last unloaded: > scsi_wait_scan] > [ 1691.745849] > [ 1691.747330] Pid: 0, comm: swapper Not tainted 2.6.39-rc2+ #3 Dell > Inc. PowerEdge R510/0DPRKF > [ 1691.755752] RIP: 0010:[<ffffffff8129fb8d>] [<ffffffff8129fb8d>] > ip_options_compile+0x1c1/0x435 > [ 1691.764418] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 > [ 1691.769702] RAX: 0000000000000024 RBX: ffff88041c9fa900 RCX: > ffff880403466865 > [ 1691.776800] RDX: 0000000000000027 RSI: 0000000000000000 RDI: > ffffffff817e6100 > [ 1691.783899] RBP: ffff880403466863 R08: ffffffffa01ade89 R09: > ffff88042f203c58 > [ 1691.790997] R10: ffffe1c4ff103b40 R11: 0000000000000004 R12: > ffff88041c9fa928 > [ 1691.798095] R13: 0000000000000027 R14: ffff88040346684e R15: > 0000000000000027 > [ 1691.805194] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) > knlGS:0000000000000000 > [ 1691.813245] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 1691.818960] CR2: 00000000000000cc CR3: 0000000001603000 CR4: > 00000000000006f0 > [ 1691.826058] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 1691.833156] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [ 1691.840254] Process swapper (pid: 0, threadinfo ffffffff81600000, > task ffffffff8160b020) > [ 1691.848303] Stack: > [ 1691.850300] ffff88042ec02900 ffff8804051ac740 0000000000000000 > ffffffff817e6100 > [ 1691.857693] 0000000000000282 ffffffff810ec848 0000000000000282 > ffff88041c9fa928 > [ 1691.865085] ffff88041c9fa900 ffff8804038e8000 ffff88040346684e > ffff8804038e8000 > [ 1691.872480] Call Trace: > [ 1691.874910] <IRQ> > [ 1691.877005] [<ffffffff810ec848>] ? __slab_free+0x80/0x14a > [ 1691.882465] [<ffffffffa01b1e3a>] ? br_parse_ip_options+0x133/0x1a0 > [bridge] > [ 1691.889480] [<ffffffffa01b2bd8>] ? br_nf_pre_routing+0x348/0x3cb > [bridge] > [ 1691.896324] [<ffffffff8119d88f>] ? cpumask_next_and+0x2b/0x3a > [ 1691.902127] [<ffffffff81298517>] ? 
nf_iterate+0x41/0x7e > [ 1691.907413] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 1691.913908] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 1691.920402] [<ffffffff812985c7>] ? nf_hook_slow+0x73/0x114 > [ 1691.925947] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 1691.932442] [<ffffffffa01ade89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 1691.938937] [<ffffffffa01ade6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] > [ 1691.945432] [<ffffffff810ee373>] ? > __kmalloc_node_track_caller+0xd4/0x10d > [ 1691.952274] [<ffffffffa01ae1e5>] ? br_handle_frame+0x195/0x1ac [bridge] > [ 1691.958942] [<ffffffffa01ae050>] ? > br_handle_frame_finish+0x1c7/0x1c7 [bridge] > [ 1691.966217] [<ffffffff812764df>] ? __netif_receive_skb+0x2a7/0x450 > [ 1691.972452] [<ffffffff81276918>] ? netif_receive_skb+0x52/0x58 > [ 1691.978340] [<ffffffff81276e1a>] ? napi_gro_receive+0x1f/0x2f > [ 1691.984143] [<ffffffff812769ef>] ? napi_skb_finish+0x1c/0x31 > [ 1691.989862] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] > [ 1691.995666] [<ffffffff8103eb92>] ? try_to_wake_up+0x16a/0x17c > [ 1692.001470] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 > [ 1692.007275] [<ffffffff8106fc3c>] ? arch_local_irq_save+0x14/0x1d > [ 1692.013338] [<ffffffff81276f45>] ? net_rx_action+0xa4/0x1b1 > [ 1692.018971] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 > [ 1692.024516] [<ffffffff81333b5c>] ? call_softirq+0x1c/0x30 > [ 1692.029973] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 > [ 1692.035257] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f > [ 1692.040368] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e > [ 1692.045308] [<ffffffff8132cad3>] ? common_interrupt+0x13/0x13 > [ 1692.051110] <EOI> > [ 1692.053204] [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 > [ 1692.058922] [<ffffffffa032c417>] ? arch_local_irq_enable+0x7/0x8 > [processor] > [ 1692.066021] [<ffffffffa032cfdf>] ? acpi_idle_enter_bm+0x218/0x250 > [processor] > [ 1692.073208] [<ffffffff8125df49>] ? 
menu_select+0x169/0x296 > [ 1692.078752] [<ffffffff8125d059>] ? cpuidle_idle_call+0xf4/0x17e > [ 1692.084727] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 > [ 1692.089838] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 > [ 1692.095469] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f > [ 1692.101703] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f > 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f > b6 c0 <8b> 96 cc 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 > [ 1692.121051] RIP [<ffffffff8129fb8d>] ip_options_compile+0x1c1/0x435 > [ 1692.127382] RSP <ffff88042f203af0> > [ 1692.130850] CR2: 00000000000000cc > [ 1692.134470] ---[ end trace 0afda543b32ed72b ]--- It seems that the bug trap is occurred in ip_options_compile() due to rt is NULL. 8b 96 cc 00 00 00 mov 0xcc(%rsi),%edx rsi is rt, and 0xcc means rt->rt_spec_dst. So I think below code hit the bug trap. 332 if (skb) { 333 memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); <- here 334 opt->is_changed = 1; 335 } And call trace seems as follows. __netif_receive_skb() -> br_handle_frame() -> NF_HOOK() -> br_nf_pre_routing() -> br_parse_ip_options() -> ip_options_compile() br_parse_ip_options() was introduced at 462fb2a (bridge : Sanitize skb before it enters the IP stack) but ip_options_compile() or ip_options_rcv_srr() seems to be called with no rt info. Thanks. ^ permalink raw reply [flat|nested] 56+ messages in thread
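The link between the NULL rt and the faulting address can be reproduced in user space. The sketch below is only an illustration of the analysis above: struct fake_rtable and spec_dst_fault_address() are hypothetical names, and the 0xcc bytes of padding are an assumption chosen to mirror the offset in the disassembly, not the real layout of struct rtable. With RSI (rt) equal to 0 in the register dump, the address of rt->rt_spec_dst is just the member offset, which matches CR2: 00000000000000cc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's struct rtable. Only the offset of
 * rt_spec_dst matters here; the padding size is an assumption picked so the
 * member lands at 0xcc, as in "mov 0xcc(%rsi),%edx".
 */
struct fake_rtable {
	unsigned char other_fields[0xcc];
	uint32_t rt_spec_dst;
};

/*
 * Address the CPU would try to read for rt->rt_spec_dst, given the numeric
 * value of the rt pointer. No memory is actually dereferenced.
 */
static uintptr_t spec_dst_fault_address(uintptr_t rt_base)
{
	return rt_base + offsetof(struct fake_rtable, rt_spec_dst);
}
```

Calling spec_dst_fault_address(0) models the NULL rt case from the oops.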
* Re: Kernel panic when using bridge
  2011-04-09  7:19 ` Hiroaki SHIMODA
@ 2011-04-11 23:48 ` Scot Doyle
  2011-04-12  1:31 ` Stephen Hemminger
  0 siblings, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-11 23:48 UTC (permalink / raw)
  To: Hiroaki SHIMODA, Stephen Hemminger
  Cc: netdev, Sebastian Nickel, Pallai Roland

On 04/09/2011 02:19 AM, Hiroaki SHIMODA wrote:
>
> It seems that the bug trap is occurred in ip_options_compile() due to
> rt is NULL.
>
> 8b 96 cc 00 00 00 mov 0xcc(%rsi),%edx
> rsi is rt, and 0xcc means rt->rt_spec_dst. So I think below code hit
> the bug trap.
>
> 332 if (skb) {
> 333 memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);<- here
> 334 opt->is_changed = 1;
> 335 }
>
> And call trace seems as follows.
> __netif_receive_skb()
> -> br_handle_frame()
> -> NF_HOOK()
> -> br_nf_pre_routing()
> -> br_parse_ip_options()
> -> ip_options_compile()
>
> br_parse_ip_options() was introduced at 462fb2a (bridge : Sanitize
> skb before it enters the IP stack) but ip_options_compile() or
> ip_options_rcv_srr() seems to be called with no rt info.

Thanks to a tip from Sebastian, I can now reproduce this panic by
running "IP Stack Integrity Checker v0.07" from another machine on the
same subnet with command "icmpsic -s x.y.z.a -d x.y.z.b" where "x.y.z.a"
is IP address of the other machine and "x.y.z.b" is the IP address of
the target. When I enable iptables logging on the target machine, no
panic occurs. When I disable iptables logging (but otherwise leave the
same iptables rules) a panic occurs within a few seconds.

Thanks Hiroaki for the analysis of the kernel panic output. I've
confirmed that you are correct by placing a printk just before those two
lines. In every panic, the printk was triggered on line 333 of
net/ipv4/ip_options.c

The kernel panic does not occur after applying the following patch.
# diff net/ipv4/ip_options.c.original net/ipv4/ip_options.c.fix
332c332
< if (skb) {
---
> if (skb && rt) {
374c374
< if (skb) {
---
> if (skb && rt) {

What do you all think? Will it cause other problems?

^ permalink raw reply [flat|nested] 56+ messages in thread
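To make the effect of the "skb && rt" guard concrete, here is a small user-space model of the statement at line 333. It is a sketch under stated assumptions, not the kernel code: fake_rtable, fake_opts and record_spec_dst() are hypothetical names, and only the memcpy() and the is_changed assignment are taken from the lines quoted in this thread. It shows that with the guard a missing rt is skipped instead of dereferenced, at the cost of leaving the option slot unfilled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, minimal stand-ins for the kernel structures. */
struct fake_rtable {
	uint32_t rt_spec_dst;
};

struct fake_opts {
	unsigned char is_changed;
};

/*
 * Model of the guarded block: copy rt->rt_spec_dst into the slot selected
 * by the option's pointer octet (optptr[2], 1-based) only when both skb
 * and rt are present. Returns 1 if the address was written, 0 if skipped.
 */
static int record_spec_dst(unsigned char *optptr, const void *skb,
			   const struct fake_rtable *rt,
			   struct fake_opts *opt)
{
	if (skb && rt) {
		memcpy(&optptr[optptr[2] - 1], &rt->rt_spec_dst, 4);
		opt->is_changed = 1;
		return 1;
	}
	return 0;
}
```

With rt == NULL the function returns 0 and touches nothing, which is the behavior the two-line diff above produces for the bridge path.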
* Re: Kernel panic when using bridge 2011-04-11 23:48 ` Scot Doyle @ 2011-04-12 1:31 ` Stephen Hemminger 2011-04-12 3:47 ` Scot Doyle 0 siblings, 1 reply; 56+ messages in thread From: Stephen Hemminger @ 2011-04-12 1:31 UTC (permalink / raw) To: Scot Doyle; +Cc: Hiroaki SHIMODA, netdev, Sebastian Nickel, Pallai Roland On Mon, 11 Apr 2011 18:48:00 -0500 Scot Doyle <lkml@scotdoyle.com> wrote: > On 04/09/2011 02:19 AM, Hiroaki SHIMODA wrote: > > > > It seems that the bug trap is occurred in ip_options_compile() due to > > rt is NULL. > > > > 8b 96 cc 00 00 00 mov 0xcc(%rsi),%edx > > rsi is rt, and 0xcc means rt->rt_spec_dst. So I think below code hit > > the bug trap. > > > > 332 if (skb) { > > 333 memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);<- here > > 334 opt->is_changed = 1; > > 335 } > > > > And call trace seems as follows. > > __netif_receive_skb() > > -> br_handle_frame() > > -> NF_HOOK() > > -> br_nf_pre_routing() > > -> br_parse_ip_options() > > -> ip_options_compile() > > > > br_parse_ip_options() was introduced at 462fb2a (bridge : Sanitize > > skb before it enters the IP stack) but ip_options_compile() or > > ip_options_rcv_srr() seems to be called with no rt info. > > Thanks to a tip from Sebastian, I can now reproduce this panic by > running "IP Stack Integrity Checker v0.07" from another machine on the > same subnet with command "icmpsic -s x.y.z.a -d x.y.z.b" where "x.y.z.a" > is IP address of the other machine and "x.y.z.b" is the IP address of > the target. When I enable iptables logging on the target machine, no > panic occurs. When I disable iptables logging (but otherwise leave the > same iptables rules) a panic occurs within a few seconds. > > Thanks Hiroaki for the analysis of the kernel panic output. I've > confirmed that you are correct by placing a printk just before those two > lines. In every panic, the printk was triggered on line 333 of > net/ipv4/ip_options.c > > The kernel panic does not occur after applying the following patch. 
> > # diff net/ipv4/ip_options.c.original net/ipv4/ip_options.c.fix > 332c332 > < if (skb) { > --- > > if (skb && rt) { > 374c374 > < if (skb) { > --- > > if (skb && rt) { > > What do you all think? Will it cause other problems? It would help if you gave a little more context (like diff -up) next time. I think the correct fix is for the skb handed to ip_compile_options to match the layout expected by ip_compile_options. This patch is compile tested only, please validate. Subject: [PATCH] bridge: set pseudo-route table before calling ip_comple_options For some ip options, ip_compile_options assumes it can find the associated route table. The bridge to iptables code doesn't supply the necessary reference causing NULL dereference. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> --- Patch against net-next-2.6, but if validated should go to net-2.6 and stable. --- a/net/bridge/br_netfilter.c 2011-04-11 18:18:22.534837859 -0700 +++ b/net/bridge/br_netfilter.c 2011-04-11 18:25:15.427244826 -0700 @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk struct ip_options *opt; struct iphdr *iph; struct net_device *dev = skb->dev; + struct rtable *rt; u32 len; iph = ip_hdr(skb); @@ -255,6 +256,14 @@ static int br_parse_ip_options(struct sk return 0; } + /* Associate bogus bridge route table */ + rt = bridge_parent_rtable(dev); + if (!rt) { + kfree_skb(skb); + return 0; + } + skb_dst_set(skb, &rt->dst); + opt->optlen = iph->ihl*4 - sizeof(struct iphdr); if (ip_options_compile(dev_net(dev), opt, skb)) goto inhdr_error; ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 1:31 ` Stephen Hemminger @ 2011-04-12 3:47 ` Scot Doyle 2011-04-12 4:09 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Scot Doyle @ 2011-04-12 3:47 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Hiroaki SHIMODA, netdev On 04/11/2011 08:31 PM, Stephen Hemminger wrote: > > It would help if you gave a little more context (like diff -up) > next time. > > I think the correct fix is for the skb handed to ip_compile_options > to match the layout expected by ip_compile_options. > > This patch is compile tested only, please validate. > > > Subject: [PATCH] bridge: set pseudo-route table before calling ip_comple_options > > For some ip options, ip_compile_options assumes it can find the associated > route table. The bridge to iptables code doesn't supply the necessary > reference causing NULL dereference. > > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com> > > --- > Patch against net-next-2.6, but if validated should go to net-2.6 > and stable. > > --- a/net/bridge/br_netfilter.c 2011-04-11 18:18:22.534837859 -0700 > +++ b/net/bridge/br_netfilter.c 2011-04-11 18:25:15.427244826 -0700 > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk > struct ip_options *opt; > struct iphdr *iph; > struct net_device *dev = skb->dev; > + struct rtable *rt; > u32 len; > > iph = ip_hdr(skb); > @@ -255,6 +256,14 @@ static int br_parse_ip_options(struct sk > return 0; > } > > + /* Associate bogus bridge route table */ > + rt = bridge_parent_rtable(dev); > + if (!rt) { > + kfree_skb(skb); > + return 0; > + } > + skb_dst_set(skb,&rt->dst); > + > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > if (ip_options_compile(dev_net(dev), opt, skb)) > goto inhdr_error; > > Thanks for the advice on diff context, I appreciate it. 
Here's the output from the patch: [ 422.577325] ------------[ cut here ]------------ [ 422.581932] WARNING: at net/core/dst.c:278 dst_release+0x2e/0x5d() [ 422.588086] Hardware name: PowerEdge R510 [ 422.592075] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd soundcore snd_page_alloc i7core_edac psmouse pcspkr edac_core evdev serio_raw power_meter processor ghes tpm_tis dcdbas tpm tpm_bios thermal_sys button hed ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas igb ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [ 422.629510] Pid: 0, comm: swapper Not tainted 2.6.39-rc2+ #10 [ 422.635225] Call Trace: [ 422.637655] <IRQ> [<ffffffff81045635>] ? warn_slowpath_common+0x78/0x8c [ 422.644425] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 422.650918] [<ffffffff8127cd60>] ? dst_release+0x2e/0x5d [ 422.656290] [<ffffffff8126c25f>] ? skb_release_head_state+0x21/0xeb [ 422.662613] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 422.669108] [<ffffffff8126c06f>] ? __kfree_skb+0x9/0x77 [ 422.674392] [<ffffffff812985f7>] ? nf_hook_slow+0x93/0x114 [ 422.679936] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 422.686431] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 422.692927] [<ffffffffa01cbe6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [ 422.699421] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 [ 422.705225] [<ffffffffa01cc1e5>] ? br_handle_frame+0x195/0x1ac [bridge] [ 422.711892] [<ffffffffa01cc050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [ 422.719166] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 [ 422.725401] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 [ 422.731289] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f [ 422.737091] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 [ 422.742809] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] [ 422.748615] [<ffffffffa003bde2>] ? 
scsi_run_queue+0x2ce/0x30a [scsi_mod] [ 422.755371] [<ffffffffa003cb31>] ? scsi_io_completion+0x44c/0x4cf [scsi_mod] [ 422.762472] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 [ 422.768103] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 422.773647] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 [ 422.779104] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 422.784388] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 422.789499] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 422.794439] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 [ 422.800240] <EOI> [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 [ 422.806575] [<ffffffffa0310417>] ? arch_local_irq_enable+0x7/0x8 [processor] [ 422.813676] [<ffffffffa0310dab>] ? acpi_idle_enter_c1+0x86/0xa2 [processor] [ 422.820690] [<ffffffff8125d05d>] ? cpuidle_idle_call+0xf4/0x17e [ 422.826664] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [ 422.831776] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [ 422.837406] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f [ 422.843640] ---[ end trace 5d4687f8472ee50c ]--- ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 3:47 ` Scot Doyle @ 2011-04-12 4:09 ` Eric Dumazet 2011-04-12 4:22 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-12 4:09 UTC (permalink / raw) To: Scot Doyle; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev Le lundi 11 avril 2011 à 22:47 -0500, Scot Doyle a écrit : > On 04/11/2011 08:31 PM, Stephen Hemminger wrote: > > > > It would help if you gave a little more context (like diff -up) > > next time. > > > > I think the correct fix is for the skb handed to ip_compile_options > > to match the layout expected by ip_compile_options. > > > > This patch is compile tested only, please validate. > > > > > > Subject: [PATCH] bridge: set pseudo-route table before calling ip_comple_options > > > > For some ip options, ip_compile_options assumes it can find the associated > > route table. The bridge to iptables code doesn't supply the necessary > > reference causing NULL dereference. > > > > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com> > > > > --- > > Patch against net-next-2.6, but if validated should go to net-2.6 > > and stable. > > > > --- a/net/bridge/br_netfilter.c 2011-04-11 18:18:22.534837859 -0700 > > +++ b/net/bridge/br_netfilter.c 2011-04-11 18:25:15.427244826 -0700 > > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk > > struct ip_options *opt; > > struct iphdr *iph; > > struct net_device *dev = skb->dev; > > + struct rtable *rt; > > u32 len; > > > > iph = ip_hdr(skb); > > @@ -255,6 +256,14 @@ static int br_parse_ip_options(struct sk > > return 0; > > } > > > > + /* Associate bogus bridge route table */ > > + rt = bridge_parent_rtable(dev); > > + if (!rt) { > > + kfree_skb(skb); > > + return 0; > > + } > > + skb_dst_set(skb,&rt->dst); Please try skb_dst_set_noref() here instead of skb_dst_set() Or increment rt refcount. 
> > + > > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > > if (ip_options_compile(dev_net(dev), opt, skb)) > > goto inhdr_error; > > > > > Thanks for the advice on diff context, I appreciate it. Here's the > output from the patch: > > [ 422.577325] ------------[ cut here ]------------ > [ 422.581932] WARNING: at net/core/dst.c:278 dst_release+0x2e/0x5d() > [ 422.588086] Hardware name: PowerEdge R510 > [ 422.592075] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm > snd_timer snd soundcore snd_page_alloc i7core_edac psmouse pcspkr > edac_core evdev serio_raw power_meter processor ghes tpm_tis dcdbas tpm > tpm_bios thermal_sys button hed ext2 mbcache dm_mod raid1 md_mod sd_mod > crc_t10dif usb_storage uas uhci_hcd mpt2sas scsi_transport_sas igb > ehci_hcd raid_class scsi_mod usbcore bnx2 dca [last unloaded: > scsi_wait_scan] > [ 422.629510] Pid: 0, comm: swapper Not tainted 2.6.39-rc2+ #10 > [ 422.635225] Call Trace: > [ 422.637655] <IRQ> [<ffffffff81045635>] ? warn_slowpath_common+0x78/0x8c > [ 422.644425] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 422.650918] [<ffffffff8127cd60>] ? dst_release+0x2e/0x5d > [ 422.656290] [<ffffffff8126c25f>] ? skb_release_head_state+0x21/0xeb > [ 422.662613] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 422.669108] [<ffffffff8126c06f>] ? __kfree_skb+0x9/0x77 > [ 422.674392] [<ffffffff812985f7>] ? nf_hook_slow+0x93/0x114 > [ 422.679936] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 422.686431] [<ffffffffa01cbe89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 422.692927] [<ffffffffa01cbe6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] > [ 422.699421] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 > [ 422.705225] [<ffffffffa01cc1e5>] ? br_handle_frame+0x195/0x1ac [bridge] > [ 422.711892] [<ffffffffa01cc050>] ? > br_handle_frame_finish+0x1c7/0x1c7 [bridge] > [ 422.719166] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 > [ 422.725401] [<ffffffff81276928>] ? 
netif_receive_skb+0x52/0x58 > [ 422.731289] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f > [ 422.737091] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 > [ 422.742809] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] > [ 422.748615] [<ffffffffa003bde2>] ? scsi_run_queue+0x2ce/0x30a [scsi_mod] > [ 422.755371] [<ffffffffa003cb31>] ? scsi_io_completion+0x44c/0x4cf > [scsi_mod] > [ 422.762472] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 > [ 422.768103] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 > [ 422.773647] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 > [ 422.779104] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 > [ 422.784388] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f > [ 422.789499] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e > [ 422.794439] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 > [ 422.800240] <EOI> [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 > [ 422.806575] [<ffffffffa0310417>] ? arch_local_irq_enable+0x7/0x8 > [processor] > [ 422.813676] [<ffffffffa0310dab>] ? acpi_idle_enter_c1+0x86/0xa2 > [processor] > [ 422.820690] [<ffffffff8125d05d>] ? cpuidle_idle_call+0xf4/0x17e > [ 422.826664] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 > [ 422.831776] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 > [ 422.837406] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f > [ 422.843640] ---[ end trace 5d4687f8472ee50c ]--- > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-12  4:09 ` Eric Dumazet
@ 2011-04-12  4:22 ` Eric Dumazet
  2011-04-12  5:17 ` Scot Doyle
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 4:22 UTC (permalink / raw)
  To: Scot Doyle; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev

On Tuesday 12 April 2011 at 06:09 +0200, Eric Dumazet wrote:
> On Monday 11 April 2011 at 22:47 -0500, Scot Doyle wrote:
> > On 04/11/2011 08:31 PM, Stephen Hemminger wrote:
> > >
> > > It would help if you gave a little more context (like diff -up)
> > > next time.
> > >
> > > I think the correct fix is for the skb handed to ip_compile_options
> > > to match the layout expected by ip_compile_options.
> > >
> > > This patch is compile tested only, please validate.
> > >
> > >
> > > Subject: [PATCH] bridge: set pseudo-route table before calling ip_comple_options
> > >
> > > For some ip options, ip_compile_options assumes it can find the associated
> > > route table. The bridge to iptables code doesn't supply the necessary
> > > reference causing NULL dereference.
> > >
> > > Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
> > >
> > > ---
> > > Patch against net-next-2.6, but if validated should go to net-2.6
> > > and stable.
> > >
> > > --- a/net/bridge/br_netfilter.c 2011-04-11 18:18:22.534837859 -0700
> > > +++ b/net/bridge/br_netfilter.c 2011-04-11 18:25:15.427244826 -0700
> > > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk
> > >   struct ip_options *opt;
> > >   struct iphdr *iph;
> > >   struct net_device *dev = skb->dev;
> > > + struct rtable *rt;
> > >   u32 len;
> > >
> > >   iph = ip_hdr(skb);
> > > @@ -255,6 +256,14 @@ static int br_parse_ip_options(struct sk
> > >   return 0;
> > >   }
> > >
> > > + /* Associate bogus bridge route table */
> > > + rt = bridge_parent_rtable(dev);
> > > + if (!rt) {
> > > + kfree_skb(skb);
> > > + return 0;
> > > + }
> > > + skb_dst_set(skb,&rt->dst);
>
> Please try skb_dst_set_noref() here instead of skb_dst_set()
>
> Or increment rt refcount.

Also, I would first check if skb->dst already set to not leak a dst

	if (!skb->dst) {
		rt = bridge_parent_rtable(dev);
		if (!rt) {
			kfree_skb(skb);
			return 0;
		}
		skb_dst_set_noref(skb,&rt->dst);
	}

^ permalink raw reply [flat|nested] 56+ messages in thread
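The difference between skb_dst_set() and skb_dst_set_noref(), and why the first one led to the WARNING in dst_release() reported earlier, can be sketched with a user-space refcount model. Everything below is an assumption made for illustration: the model_* names are hypothetical, the "one reference held by the owner" starting count is chosen for the example, and the low-pointer-bit flag only imitates the idea of a borrowed (not refcounted) dst. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Low pointer bit marks a dst that the skb borrows without owning a ref. */
#define MODEL_DST_NOREF ((uintptr_t)1)

struct model_dst {
	int refcnt;
	int underflow_warnings;	/* counts what would be a WARN_ON() */
};

struct model_skb {
	uintptr_t refdst;	/* dst pointer plus optional NOREF bit */
};

/* Accessor: returns the attached dst, or NULL (0) when none is set. */
static struct model_dst *model_skb_dst(const struct model_skb *skb)
{
	return (struct model_dst *)(skb->refdst & ~MODEL_DST_NOREF);
}

/* Owning variant: assumes the caller hands over a reference it holds. */
static void model_dst_set(struct model_skb *skb, struct model_dst *dst)
{
	skb->refdst = (uintptr_t)dst;
}

/* Borrowing variant: no reference is transferred or dropped later. */
static void model_dst_set_noref(struct model_skb *skb, struct model_dst *dst)
{
	skb->refdst = (uintptr_t)dst | MODEL_DST_NOREF;
}

static void model_dst_release(struct model_dst *dst)
{
	if (--dst->refcnt < 0)
		dst->underflow_warnings++;
}

/* Freeing an skb drops its dst reference only if it owned one. */
static void model_skb_free(struct model_skb *skb)
{
	struct model_dst *dst = model_skb_dst(skb);

	if (dst && !(skb->refdst & MODEL_DST_NOREF))
		model_dst_release(dst);
	skb->refdst = 0;
}
```

In this model, attaching a shared dst with model_dst_set() without first taking a reference makes every freed skb drop a reference it never took, so the count eventually goes negative. model_dst_set_noref() (or taking a reference before model_dst_set()) keeps the count balanced, and checking model_skb_dst() first avoids overwriting a dst that is already attached.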
* Re: Kernel panic when using bridge
  2011-04-12  4:22 ` Eric Dumazet
@ 2011-04-12  5:17 ` Scot Doyle
  2011-04-12  5:51 ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-12 5:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev

On 04/11/2011 11:22 PM, Eric Dumazet wrote:
> Also, I would first check if skb->dst already set to not leak a dst
>
> 	if (!skb->dst) {
> 		rt = bridge_parent_rtable(dev);
> 		if (!rt) {
> 			kfree_skb(skb);
> 			return 0;
> 		}
> 		skb_dst_set_noref(skb,&rt->dst);
> 	}

Thank you for the idea. Here is the compiler output referring to the
first line above.

net/bridge/br_netfilter.c: In function 'br_parse_ip_options':
net/bridge/br_netfilter.c:260:10: error: 'struct sk_buff' has no member named 'dst'

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-12  5:17 ` Scot Doyle
@ 2011-04-12  5:51 ` Eric Dumazet
  2011-04-12  7:02 ` Scot Doyle
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 5:51 UTC (permalink / raw)
  To: Scot Doyle; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev

On Tuesday 12 April 2011 at 00:17 -0500, Scot Doyle wrote:
> On 04/11/2011 11:22 PM, Eric Dumazet wrote:
> > Also, I would first check if skb->dst already set to not leak a dst
> >
> > 	if (!skb->dst) {

Oh well, sorry (not enough time these days to even test patches)

	if (!skb_dst(skb)) {

> > 		rt = bridge_parent_rtable(dev);
> > 		if (!rt) {
> > 			kfree_skb(skb);
> > 			return 0;
> > 		}
> > 		skb_dst_set_noref(skb,&rt->dst);
> > 	}
>
> Thank you for the idea. Here is the compiler output referring to the
> first line above.
>
> net/bridge/br_netfilter.c: In function 'br_parse_ip_options':
> net/bridge/br_netfilter.c:260:10: error: 'struct sk_buff' has no member
> named 'dst'
>

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 5:51 ` Eric Dumazet @ 2011-04-12 7:02 ` Scot Doyle 2011-04-12 7:31 ` Eric Dumazet 2011-04-12 11:49 ` Kernel panic when using bridge Eric Dumazet 0 siblings, 2 replies; 56+ messages in thread From: Scot Doyle @ 2011-04-12 7:02 UTC (permalink / raw) To: Eric Dumazet, Stephen Hemminger; +Cc: Hiroaki SHIMODA, netdev On 04/12/2011 12:51 AM, Eric Dumazet wrote: > > Oh well, sorry (not enough time these days to even test patches) > > if (!skb_dst(skb)) { --- br_netfilter.c.a 2011-04-01 02:37:53.000000000 -0500 +++ br_netfilter.c.b 2011-04-12 00:29:00.000000000 -0500 @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk struct ip_options *opt; struct iphdr *iph; struct net_device *dev = skb->dev; + struct rtable *rt; u32 len; iph = ip_hdr(skb); @@ -255,6 +256,16 @@ static int br_parse_ip_options(struct sk return 0; } + /* Associate bogus bridge route table */ + if (!skb_dst(skb)) { + rt = bridge_parent_rtable(dev); + if (!rt) { + kfree_skb(skb); + return 0; + } + skb_dst_set_noref(skb,&rt->dst); + } + opt->optlen = iph->ihl*4 - sizeof(struct iphdr); if (ip_options_compile(dev_net(dev), opt, skb)) goto inhdr_error; Now we are making progress! With the patch above from Stephen and Eric, I cannot make the kernel panic when sending packets to the IP address of the bridge. However, if a guest virtual machine is sharing the bridge with the host via a tap device, I can cause a host panic by targeting the IP address of the guest. Is this an unrelated problem? Here are two kernel panics. The guest virtual machine was pingable before being attacked with IP Stack Checker's tcpsic command. Spanning Tree Protocol was off during the first panic and on during the second. 
------------ [ 606.921739] br0: port 2(tap0) entering forwarding state [ 636.058941] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff812c2781 [ 636.058942] [ 636.069789] Pid: 2261, comm: kvm Tainted: G W 2.6.39-rc2+ #11 [ 636.076292] Call Trace: [ 636.078725] <IRQ> [<ffffffff8132ad78>] ? panic+0x92/0x1a1 [ 636.084287] [<ffffffff8104abe8>] ? _local_bh_enable_ip.clone.8+0x20/0x8c [ 636.091044] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 [ 636.096418] [<ffffffff810454e5>] ? __stack_chk_fail+0x17/0x17 [ 636.102221] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 [ 636.107595] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 636.112883] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 636.118172] [<ffffffffa017b0d4>] ? br_flood+0xc8/0xc8 [bridge] [ 636.124065] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] [ 636.130302] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 [ 636.135850] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] [ 636.142089] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.148586] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] [ 636.154826] [<ffffffffa017b186>] ? NF_HOOK.clone.5+0x3c/0x56 [bridge] [ 636.161323] [<ffffffffa017bfe1>] ? br_handle_frame_finish+0x158/0x1c7 [bridge] [ 636.168601] [<ffffffffa0180689>] ? br_nf_pre_routing_finish+0x1d4/0x1e1 [bridge] [ 636.176052] [<ffffffffa017fc76>] ? NF_HOOK_THRESH+0x3b/0x55 [bridge] [ 636.182463] [<ffffffffa0180c84>] ? br_nf_pre_routing+0x3be/0x3cb [bridge] [ 636.189307] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 [ 636.194852] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 636.200139] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.206637] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.213133] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 [ 636.218679] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.225177] [<ffffffffa017bfe1>] ? 
br_handle_frame_finish+0x158/0x1c7 [bridge] [ 636.232455] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.238954] [<ffffffffa017be6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [ 636.245452] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 [ 636.251258] [<ffffffffa017c1e5>] ? br_handle_frame+0x195/0x1ac [bridge] [ 636.257928] [<ffffffffa017c050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [ 636.265204] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 [ 636.271443] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 [ 636.277335] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f [ 636.283139] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 [ 636.288865] [<ffffffffa0241fcd>] ? igb_poll+0x6d9/0x9ee [igb] [ 636.294673] [<ffffffffa003bde2>] ? scsi_run_queue+0x2ce/0x30a [scsi_mod] [ 636.301431] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 636.307930] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 [ 636.314168] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 [ 636.319800] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 636.325346] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 [ 636.330807] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 636.336092] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 636.341204] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 636.346146] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 [ 636.351949] <EOI> [<ffffffff81271f58>] ? arch_local_irq_save+0x12/0x1b [ 636.358629] [<ffffffff8100a9f2>] ? arch_local_irq_restore+0x2/0x8 [ 636.364781] [<ffffffff8127680d>] ? netif_rx_ni+0x1e/0x27 [ 636.370154] [<ffffffffa01557d2>] ? tun_get_user+0x3a3/0x3cb [tun] [ 636.376305] [<ffffffffa0155bd8>] ? tun_get_socket+0x3b/0x3b [tun] [ 636.382457] [<ffffffffa0155c36>] ? tun_chr_aio_write+0x5e/0x79 [tun] [ 636.388869] [<ffffffff810f6b07>] ? do_sync_readv_writev+0x9a/0xd5 [ 636.395021] [<ffffffff810371f3>] ? need_resched+0x1a/0x23 [ 636.400481] [<ffffffff8132b725>] ? _cond_resched+0x9/0x20 [ 636.405941] [<ffffffff810f5f77>] ? 
copy_from_user+0x18/0x30 [ 636.411573] [<ffffffff8115fbf6>] ? security_file_permission+0x18/0x33 [ 636.418068] [<ffffffff810f6d55>] ? do_readv_writev+0xa4/0x11a [ 636.423873] [<ffffffff810f7913>] ? fput+0x1a/0x1a2 [ 636.428726] [<ffffffff810f6f39>] ? sys_writev+0x45/0x90 [ 636.434012] [<ffffffff81332a52>] ? system_call_fastpath+0x16/0x1b ------------ [ 110.442839] br0: port 2(tap0) entering forwarding state [ 136.948700] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff812c2781 [ 136.948702] [ 136.959561] Pid: 1093, comm: md123_resync Not tainted 2.6.39-rc2+ #11 [ 136.965977] Call Trace: [ 136.968408] <IRQ> [<ffffffff8132ad78>] ? panic+0x92/0x1a1 [ 136.973970] [<ffffffff8104abe8>] ? _local_bh_enable_ip.clone.8+0x20/0x8c [ 136.980727] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 [ 136.986102] [<ffffffff810454e5>] ? __stack_chk_fail+0x17/0x17 [ 136.991906] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 [ 136.997281] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 137.002570] [<ffffffffa0198fe1>] ? br_handle_frame_finish+0x158/0x1c7 [bridge] [ 137.009847] [<ffffffffa019d689>] ? br_nf_pre_routing_finish+0x1d4/0x1e1 [bridge] [ 137.017297] [<ffffffffa019cc76>] ? NF_HOOK_THRESH+0x3b/0x55 [bridge] [ 137.023707] [<ffffffffa019dc84>] ? br_nf_pre_routing+0x3be/0x3cb [bridge] [ 137.030551] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 137.035837] [<ffffffff8103704d>] ? test_tsk_need_resched+0xe/0x17 [ 137.041991] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 137.048488] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 137.054984] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 [ 137.060531] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 137.067028] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 137.073526] [<ffffffffa0198e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [ 137.080023] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 [ 137.085830] [<ffffffffa01991e5>] ? 
br_handle_frame+0x195/0x1ac [bridge] [ 137.092500] [<ffffffffa0199050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [ 137.099776] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 [ 137.106013] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 [ 137.111906] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f [ 137.117713] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 [ 137.123438] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] [ 137.129243] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 [ 137.135049] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 [ 137.140854] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 [ 137.146487] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 137.152034] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 [ 137.157494] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 137.162779] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 137.167893] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 137.172833] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 [ 137.178636] <EOI> [<ffffffff8106fc1a>] ? arch_local_irq_restore+0x2/0x8 [ 137.185408] [<ffffffffa0050fca>] ? _scsih_qcmd+0x54f/0x561 [mpt2sas] [ 137.191823] [<ffffffffa01e452f>] ? scsi_dispatch_cmd+0x180/0x219 [scsi_mod] [ 137.198841] [<ffffffffa01ea385>] ? scsi_request_fn+0x3e6/0x413 [scsi_mod] [ 137.205683] [<ffffffff81187470>] ? elv_rqhash_add.clone.15+0x26/0x4c [ 137.212095] [<ffffffff8118bde2>] ? __blk_run_queue+0x5e/0x84 [ 137.217814] [<ffffffff8118d63c>] ? __make_request+0x273/0x28f [ 137.223619] [<ffffffff8118b569>] ? generic_make_request+0x267/0x2e1 [ 137.229943] [<ffffffff8105eb49>] ? remove_wait_queue+0x11/0x4d [ 137.235837] [<ffffffffa0002417>] ? raise_barrier+0x162/0x16f [raid1] [ 137.242246] [<ffffffff8103eba4>] ? try_to_wake_up+0x17c/0x17c [ 137.248052] [<ffffffffa0002f2f>] ? sync_request+0x567/0x583 [raid1] [ 137.254379] [<ffffffffa00bd834>] ? md_do_sync+0x776/0xb8e [md_mod] [ 137.260617] [<ffffffff8100e537>] ? sched_clock+0x5/0x8 [ 137.265819] [<ffffffffa00bde83>] ? 
md_thread+0xfa/0x118 [md_mod] [ 137.271886] [<ffffffffa00bdd89>] ? md_rdev_init+0x8f/0x8f [md_mod] [ 137.278124] [<ffffffffa00bdd89>] ? md_rdev_init+0x8f/0x8f [md_mod] [ 137.284362] [<ffffffff8105e497>] ? kthread+0x7a/0x82 [ 137.289390] [<ffffffff81333b64>] ? kernel_thread_helper+0x4/0x10 [ 137.295454] [<ffffffff8105e41d>] ? kthread_worker_fn+0x149/0x149 [ 137.301519] [<ffffffff81333b60>] ? gs_change+0x13/0x13 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 7:02 ` Scot Doyle @ 2011-04-12 7:31 ` Eric Dumazet 2011-04-12 8:39 ` [PATCH] inetpeer: reduce stack usage Eric Dumazet 2011-04-12 11:49 ` Kernel panic when using bridge Eric Dumazet 1 sibling, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-12 7:31 UTC (permalink / raw) To: Scot Doyle; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev Le mardi 12 avril 2011 à 02:02 -0500, Scot Doyle a écrit : > On 04/12/2011 12:51 AM, Eric Dumazet wrote: > > > > Oh well, sorry (not enough time these days to even test patches) > > > > if (!skb_dst(skb)) { > > --- br_netfilter.c.a 2011-04-01 02:37:53.000000000 -0500 > +++ br_netfilter.c.b 2011-04-12 00:29:00.000000000 -0500 > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk > struct ip_options *opt; > struct iphdr *iph; > struct net_device *dev = skb->dev; > + struct rtable *rt; > u32 len; > > iph = ip_hdr(skb); > @@ -255,6 +256,16 @@ static int br_parse_ip_options(struct sk > return 0; > } > > + /* Associate bogus bridge route table */ > + if (!skb_dst(skb)) { > + rt = bridge_parent_rtable(dev); > + if (!rt) { > + kfree_skb(skb); > + return 0; > + } > + skb_dst_set_noref(skb,&rt->dst); > + } > + > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > if (ip_options_compile(dev_net(dev), opt, skb)) > goto inhdr_error; > > > Now we are making progress! With the patch above from Stephen and Eric, > I cannot make the kernel panic when sending packets to the IP address of > the bridge. > > However, if a guest virtual machine is sharing the bridge with the host > via a tap device, I can cause a host panic by targeting the IP address > of the guest. Is this an unrelated problem? > > Here are two kernel panics. The guest virtual machine was pingable > before being attacked with IP Stack Checker's tcpsic command. Spanning > Tree Protocol was off during the first panic and on during the second. > I wonder if you are not running out of free stack space... 
And it might be because of inet_getpeer() calling cleanup_once()

# objdump64 -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x0317 cleanup_once [inetpeer.o]:	344
0x03d6 cleanup_once [inetpeer.o]:	344
0x0680 inet_getpeer [inetpeer.o]:	344
0x071d inet_getpeer [inetpeer.o]:	344
0x0004 inet_initpeers [inetpeer.o]:	112

^ permalink raw reply	[flat|nested] 56+ messages in thread
* [PATCH] inetpeer: reduce stack usage
  2011-04-12  7:31 ` Eric Dumazet
@ 2011-04-12  8:39   ` Eric Dumazet
  2011-04-12 14:51     ` Hiroaki SHIMODA
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12  8:39 UTC (permalink / raw)
  To: Scot Doyle, David Miller; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev

On 64bit arches, we use 752 bytes of stack when cleanup_once() is called
from inet_getpeer().

Lets share the avl stack to save ~376 bytes.

Before patch :

# objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl

0x000006c3 unlink_from_pool [inetpeer.o]:	376
0x00000721 unlink_from_pool [inetpeer.o]:	376
0x00000cb1 inet_getpeer [inetpeer.o]:		376
0x00000e6d inet_getpeer [inetpeer.o]:		376
0x0004 inet_initpeers [inetpeer.o]:		112
# size net/ipv4/inetpeer.o
   text	   data	    bss	    dec	    hex	filename
   5320	    432	     21	   5773	   168d	net/ipv4/inetpeer.o

After patch :

objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x00000c11 inet_getpeer [inetpeer.o]:		376
0x00000dcd inet_getpeer [inetpeer.o]:		376
0x00000ab9 peer_check_expire [inetpeer.o]:	328
0x00000b7f peer_check_expire [inetpeer.o]:	328
0x0004 inet_initpeers [inetpeer.o]:		112
# size net/ipv4/inetpeer.o
   text	   data	    bss	    dec	    hex	filename
   5163	    432	     21	   5616	   15f0	net/ipv4/inetpeer.o

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Scot Doyle <lkml@scotdoyle.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
---
 net/ipv4/inetpeer.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index dd1b20e..9df4e63 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head)
 }
 
 /* May be called with local BH enabled. */
-static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
+static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base,
+			     struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	int do_free;
 
@@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
 	 * We use refcnt=-1 to alert lockless readers this entry is deleted.
 	 */
 	if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
-		struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 		struct inet_peer __rcu ***stackptr, ***delp;
 		if (lookup(&p->daddr, stack, base) != p)
 			BUG();
@@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct inet_peer *p)
 }
 
 /* May be called with local BH enabled. */
-static int cleanup_once(unsigned long ttl)
+static int cleanup_once(unsigned long ttl, struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	struct inet_peer *p = NULL;
 
@@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl)
 		 * happen because of entry limits in route cache. */
 		return -1;
 
-	unlink_from_pool(p, peer_to_base(p));
+	unlink_from_pool(p, peer_to_base(p), stack);
 	return 0;
 }
 
@@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
 
 	if (base->total >= inet_peer_threshold)
 		/* Remove one less-recently-used entry. */
-		cleanup_once(0);
+		cleanup_once(0, stack);
 
 	return p;
 }
@@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy)
 {
 	unsigned long now = jiffies;
 	int ttl, total;
+	struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 
 	total = compute_total();
 	if (total >= inet_peer_threshold)
@@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy)
 		ttl = inet_peer_maxttl
 				- (inet_peer_maxttl - inet_peer_minttl) / HZ *
 					total / inet_peer_threshold * HZ;
-	while (!cleanup_once(ttl)) {
+	while (!cleanup_once(ttl, stack)) {
 		if (jiffies != now)
 			break;
 	}

^ permalink raw reply related	[flat|nested] 56+ messages in thread
* Re: [PATCH] inetpeer: reduce stack usage 2011-04-12 8:39 ` [PATCH] inetpeer: reduce stack usage Eric Dumazet @ 2011-04-12 14:51 ` Hiroaki SHIMODA 2011-04-12 14:55 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Hiroaki SHIMODA @ 2011-04-12 14:51 UTC (permalink / raw) To: Eric Dumazet, David Miller; +Cc: Scot Doyle, Stephen Hemminger, netdev On Tue, 12 Apr 2011 10:39:40 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > On 64bit arches, we use 752 bytes of stack when cleanup_once() is called > from inet_getpeer(). > > Lets share the avl stack to save ~376 bytes. > > Before patch : > > # objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl > > 0x000006c3 unlink_from_pool [inetpeer.o]: 376 > 0x00000721 unlink_from_pool [inetpeer.o]: 376 > 0x00000cb1 inet_getpeer [inetpeer.o]: 376 > 0x00000e6d inet_getpeer [inetpeer.o]: 376 > 0x0004 inet_initpeers [inetpeer.o]: 112 > # size net/ipv4/inetpeer.o > text data bss dec hex filename > 5320 432 21 5773 168d net/ipv4/inetpeer.o > > After patch : > > objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl > 0x00000c11 inet_getpeer [inetpeer.o]: 376 > 0x00000dcd inet_getpeer [inetpeer.o]: 376 > 0x00000ab9 peer_check_expire [inetpeer.o]: 328 > 0x00000b7f peer_check_expire [inetpeer.o]: 328 > 0x0004 inet_initpeers [inetpeer.o]: 112 > # size net/ipv4/inetpeer.o > text data bss dec hex filename > 5163 432 21 5616 15f0 net/ipv4/inetpeer.o > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > Cc: Scot Doyle <lkml@scotdoyle.com> > Cc: Stephen Hemminger <shemminger@vyatta.com> > Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> I couldn't understand that actually cleanup_once() was called from inet_getpeer() and then the stack overflow was hit, but this patch surely reduces stack usage. Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Thanks. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] inetpeer: reduce stack usage
  2011-04-12 14:51 ` Hiroaki SHIMODA
@ 2011-04-12 14:55   ` Eric Dumazet
  2011-04-12 20:58     ` David Miller
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 14:55 UTC (permalink / raw)
  To: Hiroaki SHIMODA; +Cc: David Miller, Scot Doyle, Stephen Hemminger, netdev

On Tuesday, 12 April 2011 at 23:51 +0900, Hiroaki SHIMODA wrote:

> I couldn't understand that actually cleanup_once() was called
> from inet_getpeer() and then the stack overflow was hit,
> but this patch surely reduces stack usage.
>
> Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
>

Well, I don't believe we actually hit a stack overflow in Scot Doyle's
reported crashes, but it certainly is better to use a bit less stack
anyway ;)

Thanks for reviewing!

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: [PATCH] inetpeer: reduce stack usage
  2011-04-12 14:55 ` Eric Dumazet
@ 2011-04-12 20:58   ` David Miller
  0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-04-12 20:58 UTC (permalink / raw)
  To: eric.dumazet; +Cc: shimoda.hiroaki, lkml, shemminger, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 12 Apr 2011 16:55:23 +0200

> On Tuesday, 12 April 2011 at 23:51 +0900, Hiroaki SHIMODA wrote:
>
>> I couldn't understand that actually cleanup_once() was called
>> from inet_getpeer() and then the stack overflow was hit,
>> but this patch surely reduces stack usage.
>>
>> Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
>>
>
> Well, I don't believe we actually hit a stack overflow in Scot Doyle's
> reported crashes, but it certainly is better to use a bit less stack
> anyway ;)
>
> Thanks for reviewing!

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 56+ messages in thread
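For readers following the patch above, the stack-sharing idea can be sketched in userspace C. This is only an illustration under stated assumptions: every demo_ name and the DEMO_MAXDEPTH bound are hypothetical, the tree is a plain unbalanced binary tree rather than the kernel's AVL tree, and there is no RCU or locking. The point it shows is that the outermost caller declares the one scratch array and lends it to the callee, instead of each function on the call chain placing its own copy on the stack.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXDEPTH 40	/* hypothetical depth bound for this sketch */

struct demo_node {
	int key;
	struct demo_node *left, *right;
};

/*
 * Walk from *rootp towards key, recording every link visited in the
 * caller-supplied array.  Returns the number of links recorded.
 */
size_t demo_lookup(struct demo_node **rootp, int key,
		   struct demo_node **stack[DEMO_MAXDEPTH])
{
	struct demo_node **link = rootp;
	size_t depth = 0;

	assert(rootp != NULL && stack != NULL);
	while (*link && depth < DEMO_MAXDEPTH) {
		stack[depth++] = link;
		if ((*link)->key == key)
			break;
		link = key < (*link)->key ? &(*link)->left : &(*link)->right;
	}
	return depth;
}

/* Callee: borrows the caller's array instead of declaring its own. */
int demo_unlink_leaf(struct demo_node **rootp, int key,
		     struct demo_node **stack[DEMO_MAXDEPTH])
{
	size_t depth = demo_lookup(rootp, key, stack);
	struct demo_node **link;

	if (depth == 0)
		return -1;		/* empty tree */
	link = stack[depth - 1];
	if ((*link)->key != key || (*link)->left || (*link)->right)
		return -1;		/* not found, or not a leaf */
	*link = NULL;			/* detach the leaf from its parent */
	return 0;
}

/* Outermost caller: owns the only scratch array on this call chain. */
int demo_get_then_cleanup(struct demo_node **rootp, int get_key, int victim)
{
	struct demo_node **stack[DEMO_MAXDEPTH];
	size_t depth = demo_lookup(rootp, get_key, stack);
	int found = depth && (*stack[depth - 1])->key == get_key;

	/* The lookup result is already consumed, so the array is reusable. */
	demo_unlink_leaf(rootp, victim, stack);
	return found;
}
```

With a pointer array of DEMO_MAXDEPTH entries on a 64-bit machine, each private copy would cost 320 bytes of stack in this sketch, which is why lending one array down the call chain is attractive in deep softirq paths.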
* Re: Kernel panic when using bridge 2011-04-12 7:02 ` Scot Doyle 2011-04-12 7:31 ` Eric Dumazet @ 2011-04-12 11:49 ` Eric Dumazet 2011-04-12 13:02 ` Jan Lübbe 1 sibling, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-12 11:49 UTC (permalink / raw) To: Scot Doyle; +Cc: Stephen Hemminger, Hiroaki SHIMODA, netdev, Jan Luebbe Le mardi 12 avril 2011 à 02:02 -0500, Scot Doyle a écrit : > On 04/12/2011 12:51 AM, Eric Dumazet wrote: > > > > Oh well, sorry (not enough time these days to even test patches) > > > > if (!skb_dst(skb)) { > > --- br_netfilter.c.a 2011-04-01 02:37:53.000000000 -0500 > +++ br_netfilter.c.b 2011-04-12 00:29:00.000000000 -0500 > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk > struct ip_options *opt; > struct iphdr *iph; > struct net_device *dev = skb->dev; > + struct rtable *rt; > u32 len; > > iph = ip_hdr(skb); > @@ -255,6 +256,16 @@ static int br_parse_ip_options(struct sk > return 0; > } > > + /* Associate bogus bridge route table */ > + if (!skb_dst(skb)) { > + rt = bridge_parent_rtable(dev); > + if (!rt) { > + kfree_skb(skb); > + return 0; > + } > + skb_dst_set_noref(skb,&rt->dst); > + } > + > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > if (ip_options_compile(dev_net(dev), opt, skb)) > goto inhdr_error; > > > Now we are making progress! With the patch above from Stephen and Eric, > I cannot make the kernel panic when sending packets to the IP address of > the bridge. > > However, if a guest virtual machine is sharing the bridge with the host > via a tap device, I can cause a host panic by targeting the IP address > of the guest. Is this an unrelated problem? > > Here are two kernel panics. The guest virtual machine was pingable > before being attacked with IP Stack Checker's tcpsic command. Spanning > Tree Protocol was off during the first panic and on during the second. 
> > ------------ > > [ 606.921739] br0: port 2(tap0) entering forwarding state > [ 636.058941] Kernel panic - not syncing: stack-protector: Kernel stack > is corrupted in: ffffffff812c2781 > [ 636.058942] > [ 636.069789] Pid: 2261, comm: kvm Tainted: G W 2.6.39-rc2+ #11 > [ 636.076292] Call Trace: > [ 636.078725] <IRQ> [<ffffffff8132ad78>] ? panic+0x92/0x1a1 > [ 636.084287] [<ffffffff8104abe8>] ? _local_bh_enable_ip.clone.8+0x20/0x8c > [ 636.091044] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 > [ 636.096418] [<ffffffff810454e5>] ? __stack_chk_fail+0x17/0x17 > [ 636.102221] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 > [ 636.107595] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e > [ 636.112883] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e > [ 636.118172] [<ffffffffa017b0d4>] ? br_flood+0xc8/0xc8 [bridge] > [ 636.124065] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] > [ 636.130302] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 > [ 636.135850] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] > [ 636.142089] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.148586] [<ffffffffa017b250>] ? __br_deliver+0xb0/0xb0 [bridge] > [ 636.154826] [<ffffffffa017b186>] ? NF_HOOK.clone.5+0x3c/0x56 [bridge] > [ 636.161323] [<ffffffffa017bfe1>] ? > br_handle_frame_finish+0x158/0x1c7 [bridge] > [ 636.168601] [<ffffffffa0180689>] ? > br_nf_pre_routing_finish+0x1d4/0x1e1 [bridge] > [ 636.176052] [<ffffffffa017fc76>] ? NF_HOOK_THRESH+0x3b/0x55 [bridge] > [ 636.182463] [<ffffffffa0180c84>] ? br_nf_pre_routing+0x3be/0x3cb > [bridge] > [ 636.189307] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 > [ 636.194852] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e > [ 636.200139] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.206637] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.213133] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 > [ 636.218679] [<ffffffffa017be89>] ? 
NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.225177] [<ffffffffa017bfe1>] ? > br_handle_frame_finish+0x158/0x1c7 [bridge] > [ 636.232455] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.238954] [<ffffffffa017be6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] > [ 636.245452] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 > [ 636.251258] [<ffffffffa017c1e5>] ? br_handle_frame+0x195/0x1ac [bridge] > [ 636.257928] [<ffffffffa017c050>] ? > br_handle_frame_finish+0x1c7/0x1c7 [bridge] > [ 636.265204] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 > [ 636.271443] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 > [ 636.277335] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f > [ 636.283139] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 > [ 636.288865] [<ffffffffa0241fcd>] ? igb_poll+0x6d9/0x9ee [igb] > [ 636.294673] [<ffffffffa003bde2>] ? scsi_run_queue+0x2ce/0x30a [scsi_mod] > [ 636.301431] [<ffffffffa017be89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 636.307930] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 > [ 636.314168] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 > [ 636.319800] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 > [ 636.325346] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 > [ 636.330807] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 > [ 636.336092] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f > [ 636.341204] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e > [ 636.346146] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 > [ 636.351949] <EOI> [<ffffffff81271f58>] ? arch_local_irq_save+0x12/0x1b > [ 636.358629] [<ffffffff8100a9f2>] ? arch_local_irq_restore+0x2/0x8 > [ 636.364781] [<ffffffff8127680d>] ? netif_rx_ni+0x1e/0x27 > [ 636.370154] [<ffffffffa01557d2>] ? tun_get_user+0x3a3/0x3cb [tun] > [ 636.376305] [<ffffffffa0155bd8>] ? tun_get_socket+0x3b/0x3b [tun] > [ 636.382457] [<ffffffffa0155c36>] ? tun_chr_aio_write+0x5e/0x79 [tun] > [ 636.388869] [<ffffffff810f6b07>] ? 
do_sync_readv_writev+0x9a/0xd5 > [ 636.395021] [<ffffffff810371f3>] ? need_resched+0x1a/0x23 > [ 636.400481] [<ffffffff8132b725>] ? _cond_resched+0x9/0x20 > [ 636.405941] [<ffffffff810f5f77>] ? copy_from_user+0x18/0x30 > [ 636.411573] [<ffffffff8115fbf6>] ? security_file_permission+0x18/0x33 > [ 636.418068] [<ffffffff810f6d55>] ? do_readv_writev+0xa4/0x11a > [ 636.423873] [<ffffffff810f7913>] ? fput+0x1a/0x1a2 > [ 636.428726] [<ffffffff810f6f39>] ? sys_writev+0x45/0x90 > [ 636.434012] [<ffffffff81332a52>] ? system_call_fastpath+0x16/0x1b > > ------------ > > [ 110.442839] br0: port 2(tap0) entering forwarding state > [ 136.948700] Kernel panic - not syncing: stack-protector: Kernel stack > is corrupted in: ffffffff812c2781 > [ 136.948702] > [ 136.959561] Pid: 1093, comm: md123_resync Not tainted 2.6.39-rc2+ #11 > [ 136.965977] Call Trace: > [ 136.968408] <IRQ> [<ffffffff8132ad78>] ? panic+0x92/0x1a1 > [ 136.973970] [<ffffffff8104abe8>] ? _local_bh_enable_ip.clone.8+0x20/0x8c > [ 136.980727] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 > [ 136.986102] [<ffffffff810454e5>] ? __stack_chk_fail+0x17/0x17 > [ 136.991906] [<ffffffff812c2781>] ? icmp_send+0x337/0x349 > [ 136.997281] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e > [ 137.002570] [<ffffffffa0198fe1>] ? > br_handle_frame_finish+0x158/0x1c7 [bridge] > [ 137.009847] [<ffffffffa019d689>] ? > br_nf_pre_routing_finish+0x1d4/0x1e1 [bridge] > [ 137.017297] [<ffffffffa019cc76>] ? NF_HOOK_THRESH+0x3b/0x55 [bridge] > [ 137.023707] [<ffffffffa019dc84>] ? br_nf_pre_routing+0x3be/0x3cb > [bridge] > [ 137.030551] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e > [ 137.035837] [<ffffffff8103704d>] ? test_tsk_need_resched+0xe/0x17 > [ 137.041991] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 137.048488] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 137.054984] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 > [ 137.060531] [<ffffffffa0198e89>] ? 
NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 137.067028] [<ffffffffa0198e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] > [ 137.073526] [<ffffffffa0198e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] > [ 137.080023] [<ffffffff812a7d8e>] ? tcp_gro_receive+0xa1/0x204 > [ 137.085830] [<ffffffffa01991e5>] ? br_handle_frame+0x195/0x1ac [bridge] > [ 137.092500] [<ffffffffa0199050>] ? > br_handle_frame_finish+0x1c7/0x1c7 [bridge] > [ 137.099776] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 > [ 137.106013] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 > [ 137.111906] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f > [ 137.117713] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 > [ 137.123438] [<ffffffffa0226fcd>] ? igb_poll+0x6d9/0x9ee [igb] > [ 137.129243] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 > [ 137.135049] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 > [ 137.140854] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 > [ 137.146487] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 > [ 137.152034] [<ffffffff81333c5c>] ? call_softirq+0x1c/0x30 > [ 137.157494] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 > [ 137.162779] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f > [ 137.167893] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e > [ 137.172833] [<ffffffff8132cbd3>] ? common_interrupt+0x13/0x13 > [ 137.178636] <EOI> [<ffffffff8106fc1a>] ? arch_local_irq_restore+0x2/0x8 > [ 137.185408] [<ffffffffa0050fca>] ? _scsih_qcmd+0x54f/0x561 [mpt2sas] > [ 137.191823] [<ffffffffa01e452f>] ? scsi_dispatch_cmd+0x180/0x219 > [scsi_mod] > [ 137.198841] [<ffffffffa01ea385>] ? scsi_request_fn+0x3e6/0x413 > [scsi_mod] > [ 137.205683] [<ffffffff81187470>] ? elv_rqhash_add.clone.15+0x26/0x4c > [ 137.212095] [<ffffffff8118bde2>] ? __blk_run_queue+0x5e/0x84 > [ 137.217814] [<ffffffff8118d63c>] ? __make_request+0x273/0x28f > [ 137.223619] [<ffffffff8118b569>] ? generic_make_request+0x267/0x2e1 > [ 137.229943] [<ffffffff8105eb49>] ? remove_wait_queue+0x11/0x4d > [ 137.235837] [<ffffffffa0002417>] ? 
raise_barrier+0x162/0x16f [raid1] > [ 137.242246] [<ffffffff8103eba4>] ? try_to_wake_up+0x17c/0x17c > [ 137.248052] [<ffffffffa0002f2f>] ? sync_request+0x567/0x583 [raid1] > [ 137.254379] [<ffffffffa00bd834>] ? md_do_sync+0x776/0xb8e [md_mod] > [ 137.260617] [<ffffffff8100e537>] ? sched_clock+0x5/0x8 > [ 137.265819] [<ffffffffa00bde83>] ? md_thread+0xfa/0x118 [md_mod] > [ 137.271886] [<ffffffffa00bdd89>] ? md_rdev_init+0x8f/0x8f [md_mod] > [ 137.278124] [<ffffffffa00bdd89>] ? md_rdev_init+0x8f/0x8f [md_mod] > [ 137.284362] [<ffffffff8105e497>] ? kthread+0x7a/0x82 > [ 137.289390] [<ffffffff81333b64>] ? kernel_thread_helper+0x4/0x10 > [ 137.295454] [<ffffffff8105e41d>] ? kthread_worker_fn+0x149/0x149 > [ 137.301519] [<ffffffff81333b60>] ? gs_change+0x13/0x13 > Considering recent changes in ip_options_echo() I would suggest to add following patch and/or revert commit 8628bd8af7c4c14f40 (ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()) Thanks diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index 28a736f..35f2bf9 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -200,6 +200,11 @@ int ip_options_echo(struct ip_options * dopt, struct sk_buff * skb) *dptr++ = IPOPT_END; dopt->optlen++; } + if (unlikely(dopt->optlen > 40)) { + pr_err("ip_options_echo() fatal error optlen=%u > 40\n", dopt->optlen); + print_hex_dump(KERN_ERR, "ip options: ", DUMP_PREFIX_OFFSET, + 16, 1, dopt->__data, dopt->optlen, false); + } return 0; } ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 11:49 ` Kernel panic when using bridge Eric Dumazet @ 2011-04-12 13:02 ` Jan Lübbe 2011-04-12 13:15 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Jan Lübbe @ 2011-04-12 13:02 UTC (permalink / raw) To: Eric Dumazet; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev Hi! On Tue, 2011-04-12 at 13:49 +0200, Eric Dumazet wrote: > Considering recent changes in ip_options_echo() I would suggest to add > following patch and/or revert commit 8628bd8af7c4c14f40 > (ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in > ip_options_echo()) I've read this thread, but I'm not sure why my patch is related to these kernel panics. The behavior is only changed for packets with the timestamp option in prespecified addresses mode. Even then, it shouldn't cause ip_options_build to write to unallocated memory later. > diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c > index 28a736f..35f2bf9 100644 > --- a/net/ipv4/ip_options.c > +++ b/net/ipv4/ip_options.c > @@ -200,6 +200,11 @@ int ip_options_echo(struct ip_options * dopt, struct sk_buff * skb) > *dptr++ = IPOPT_END; > dopt->optlen++; > } > + if (unlikely(dopt->optlen > 40)) { > + pr_err("ip_options_echo() fatal error optlen=%u > 40\n", dopt->optlen); > + print_hex_dump(KERN_ERR, "ip options: ", DUMP_PREFIX_OFFSET, > + 16, 1, dopt->__data, dopt->optlen, false); > + } > return 0; > } Here you check dopt->optlen, which certainly should be 40 at most. The calculation of dopt->optlen wasn't changed by my patch, though. Regards, Jan ^ permalink raw reply [flat|nested] 56+ messages in thread
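The 40-byte bound discussed here follows from the IPv4 header layout: IHL is a 4-bit field counting 32-bit words, so a header is at most 15 * 4 = 60 bytes, and 20 of those are the fixed header. A minimal sketch of that arithmetic, mirroring the `iph->ihl*4 - sizeof(struct iphdr)` expression from the bridge code quoted earlier in the thread; the DEMO_ constants and demo_optlen_from_ihl() are hypothetical names, not kernel symbols:

```c
#include <assert.h>

enum {
	DEMO_IPHDR_FIXED = 20,	/* IPv4 header bytes when no options are present */
	DEMO_IHL_MIN     = 5,	/* 5 words * 4 bytes = 20-byte header */
	DEMO_IHL_MAX     = 15,	/* largest value a 4-bit field can hold */
	DEMO_OPT_MAX     = DEMO_IHL_MAX * 4 - DEMO_IPHDR_FIXED
};

/* Compile-time check of the bound quoted in the thread. */
static_assert(DEMO_OPT_MAX == 40, "IPv4 options can occupy at most 40 bytes");

/* Returns the option length in bytes, or -1 if ihl is out of range. */
int demo_optlen_from_ihl(unsigned int ihl)
{
	if (ihl < DEMO_IHL_MIN || ihl > DEMO_IHL_MAX)
		return -1;
	return (int)(ihl * 4) - DEMO_IPHDR_FIXED;
}
```

Any optlen above 40 therefore cannot have come from a well-formed header, which is what the debugging check in the proposed patch tests for.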
* Re: Kernel panic when using bridge 2011-04-12 13:02 ` Jan Lübbe @ 2011-04-12 13:15 ` Eric Dumazet 2011-04-12 14:19 ` Jan Lübbe 0 siblings, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-12 13:15 UTC (permalink / raw) To: Jan Lübbe; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev Le mardi 12 avril 2011 à 15:02 +0200, Jan Lübbe a écrit : > Hi! > > On Tue, 2011-04-12 at 13:49 +0200, Eric Dumazet wrote: > > Considering recent changes in ip_options_echo() I would suggest to add > > following patch and/or revert commit 8628bd8af7c4c14f40 > > (ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in > > ip_options_echo()) > > I've read this thread, but I'm not sure why my patch is related to these > kernel panics. The behavior is only changed for packets with the > timestamp option in prespecified addresses mode. Even then, it shouldn't > cause ip_options_build to write to unallocated memory later. > > > diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c > > index 28a736f..35f2bf9 100644 > > --- a/net/ipv4/ip_options.c > > +++ b/net/ipv4/ip_options.c > > @@ -200,6 +200,11 @@ int ip_options_echo(struct ip_options * dopt, struct sk_buff * skb) > > *dptr++ = IPOPT_END; > > dopt->optlen++; > > } > > + if (unlikely(dopt->optlen > 40)) { > > + pr_err("ip_options_echo() fatal error optlen=%u > 40\n", dopt->optlen); > > + print_hex_dump(KERN_ERR, "ip options: ", DUMP_PREFIX_OFFSET, > > + 16, 1, dopt->__data, dopt->optlen, false); > > + } > > return 0; > > } > > Here you check dopt->optlen, which certainly should be 40 at most. The > calculation of dopt->optlen wasn't changed by my patch, though. Check again the thread Jan. Scot is using a tool (IP Stack Checker's tcpsic) to forge random tcp packets. Maybe your patch is fine but requires a change in a previous function, to make sure we deny some crazy packet before generating an ip_options with more than 40 bytes, in an icmp_send() reply. 
I took a look at this ip_options stuff and must say it's really hard to
even _read_ the code. Understanding it might need several days or a new
brain?

I cannot Ack or Nack your patch, I must admit it. Isn't it frightening?

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 13:15 ` Eric Dumazet @ 2011-04-12 14:19 ` Jan Lübbe 2011-04-12 14:49 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Jan Lübbe @ 2011-04-12 14:19 UTC (permalink / raw) To: Eric Dumazet; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev On Tue, 2011-04-12 at 15:15 +0200, Eric Dumazet wrote: > Le mardi 12 avril 2011 à 15:02 +0200, Jan Lübbe a écrit : > > Here you check dopt->optlen, which certainly should be 40 at most. The > > calculation of dopt->optlen wasn't changed by my patch, though. > > Check again the thread Jan. > > Scot is using a tool (IP Stack Checker's tcpsic) to forge random tcp > packets. > Maybe your patch is fine but requires a change in a previous function, > to make sure we deny some crazy packet before generating an ip_options > with more than 40 bytes, in an icmp_send() reply. One thing which could expose a problem is that it now will timestamp the packet in the last 'slot', too. (which it didn't before) In general, there is not a lot of error-checking in the options stuff. > I took a look at this ip_options stuff and must say its really hard to > even _read_ the code. Understanding it might need several days or a new > brain ? It took me some days do even figure out how it is supposed to fit together... > I cannot Ack or Nack your patch, I must admit it. Isnt it frightening? David Miller already declared this code as 'officially terrible'... Your patch should catch those forged packets before more harmful things can go wrong, but even before my patch, i think forged packets could cause trouble... Regards, Jan ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-12 14:19 ` Jan Lübbe
@ 2011-04-12 14:49   ` Eric Dumazet
  2011-04-12 15:13     ` Jan Lübbe
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 14:49 UTC (permalink / raw)
  To: Jan Lübbe; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev

On Tuesday, 12 April 2011 at 16:19 +0200, Jan Lübbe wrote:

> Your patch should catch those forged packets before more harmful things
> can go wrong, but even before my patch, i think forged packets could
> cause trouble...

Well, this is a debugging aid and won't avoid a crash later (since we
already made an out-of-bounds write, generally on a stack slot).

Of course, this might be a complete shot in the dark, but a
stackprotector fault in icmp_send() really sounds like a problem in
ip_options_echo() [ or bad input data given to this function ]

Other related changes (but as old as v2.6.22):

commit 11a03f78fbf15a866ba
([NetLabel]: core network changes)

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 14:49 ` Eric Dumazet @ 2011-04-12 15:13 ` Jan Lübbe 2011-04-12 16:14 ` Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Jan Lübbe @ 2011-04-12 15:13 UTC (permalink / raw) To: Eric Dumazet; +Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: > Of course, this might be a complete shot in the dark, but a > stackprotector fault in icmp_send() really sounds like a problem in > ip_options_echo() [ or bad input data given to this function ] It was my understanding that all IP options given to ip_options_echo are either from local sources or have gone through ip_options_compile, which seems to verify that the sum of the individual option lengths do not exceed the ip header. So there wouldn't need to be additional checks in ip_options_echo. If this is not the case, we need size checks in ip_options_echo before copying over each option. > Other related changes (but as old as v2.6.22) : > > commit 11a03f78fbf15a866ba > ([NetLabel]: core network changes) When investigating the problem I had with timestamps, i found that most of the lines in ip_options_echo and _compile have not been changed since before 2.2 (some even before 2.0). The newer changes have all been updates for changed API elsewhere in the stack. Regards, Jan ^ permalink raw reply [flat|nested] 56+ messages in thread
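The "size checks before copying over each option" suggested here can be sketched as a bounded append into a fixed 40-byte buffer. This is a userspace illustration only, not the kernel's ip_options_echo(): struct demo_opts, demo_opt_append() and DEMO_OPT_MAX are hypothetical names, and the real function builds its reply from several option types rather than from opaque byte runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_OPT_MAX 40		/* maximum bytes of IPv4 options */

struct demo_opts {
	unsigned char data[DEMO_OPT_MAX];
	size_t optlen;		/* bytes of data[] currently in use */
};

/*
 * Append one option to dst, refusing the copy if it would run past the
 * end of the buffer.  Returns 0 on success, -1 if the option is rejected.
 */
int demo_opt_append(struct demo_opts *dst, const unsigned char *src, size_t len)
{
	assert(dst != NULL && dst->optlen <= DEMO_OPT_MAX);

	/* Compare against the remaining room so the sum cannot wrap. */
	if (src == NULL || len > DEMO_OPT_MAX - dst->optlen)
		return -1;

	memcpy(dst->data + dst->optlen, src, len);
	dst->optlen += len;
	return 0;
}
```

Checking before each copy turns an oversized or inconsistent input into a rejected packet, whereas a check placed after the copy, as in the debugging patch, can only report that the out-of-bounds write has already happened.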
* Re: Kernel panic when using bridge 2011-04-12 15:13 ` Jan Lübbe @ 2011-04-12 16:14 ` Eric Dumazet 2011-04-12 16:20 ` Stephen Hemminger 2011-04-12 16:32 ` Kernel panic when using bridge Bandan Das 0 siblings, 2 replies; 56+ messages in thread From: Eric Dumazet @ 2011-04-12 16:14 UTC (permalink / raw) To: Jan Lübbe Cc: Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA, netdev, Bandan Das Le mardi 12 avril 2011 à 17:13 +0200, Jan Lübbe a écrit : > On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: > > Of course, this might be a complete shot in the dark, but a > > stackprotector fault in icmp_send() really sounds like a problem in > > ip_options_echo() [ or bad input data given to this function ] > > It was my understanding that all IP options given to ip_options_echo are > either from local sources or have gone through ip_options_compile, which > seems to verify that the sum of the individual option lengths do not > exceed the ip header. So there wouldn't need to be additional checks in > ip_options_echo. > > If this is not the case, we need size checks in ip_options_echo before > copying over each option. > > > Other related changes (but as old as v2.6.22) : > > > > commit 11a03f78fbf15a866ba > > ([NetLabel]: core network changes) > > When investigating the problem I had with timestamps, i found that most > of the lines in ip_options_echo and _compile have not been changed since > before 2.2 (some even before 2.0). The newer changes have all been > updates for changed API elsewhere in the stack. > commit 462fb2af9788a82 might be the problem. 
(bridge : Sanitize skb before it enters the IP stack)

We are supposed to provide a zeroed ip_options to ip_options_compile()

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..f3bc322 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
 		return 0;
-	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))

^ permalink raw reply related	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge 2011-04-12 16:14 ` Eric Dumazet @ 2011-04-12 16:20 ` Stephen Hemminger 2011-04-12 16:35 ` Eric Dumazet 2011-04-12 16:32 ` Kernel panic when using bridge Bandan Das 1 sibling, 1 reply; 56+ messages in thread From: Stephen Hemminger @ 2011-04-12 16:20 UTC (permalink / raw) To: Eric Dumazet Cc: Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das On Tue, 12 Apr 2011 18:14:11 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > Le mardi 12 avril 2011 à 17:13 +0200, Jan Lübbe a écrit : > > On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote: > > > Of course, this might be a complete shot in the dark, but a > > > stackprotector fault in icmp_send() really sounds like a problem in > > > ip_options_echo() [ or bad input data given to this function ] > > > > It was my understanding that all IP options given to ip_options_echo are > > either from local sources or have gone through ip_options_compile, which > > seems to verify that the sum of the individual option lengths do not > > exceed the ip header. So there wouldn't need to be additional checks in > > ip_options_echo. > > > > If this is not the case, we need size checks in ip_options_echo before > > copying over each option. > > > > > Other related changes (but as old as v2.6.22) : > > > > > > commit 11a03f78fbf15a866ba > > > ([NetLabel]: core network changes) > > > > When investigating the problem I had with timestamps, i found that most > > of the lines in ip_options_echo and _compile have not been changed since > > before 2.2 (some even before 2.0). The newer changes have all been > > updates for changed API elsewhere in the stack. > > > > commit 462fb2af9788a82 might be the problem. 
> (bridge : Sanitize skb before it enters the IP stack) > > We are supposed to provide a zeroed ip_options to ip_options_compile() > > diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c > index 008ff6c..f3bc322 100644 > --- a/net/bridge/br_netfilter.c > +++ b/net/bridge/br_netfilter.c > @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb) > goto drop; > } > > - /* Zero out the CB buffer if no options present */ > - if (iph->ihl == 5) { > - memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); > + memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); > + if (iph->ihl == 5) > return 0; > - } > > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > if (ip_options_compile(dev_net(dev), opt, skb)) I think the confusion is that IPCB(skb) is not the IP header but scratch space used during IP header processing. Before the sanitize patch the CB was cleared. Acked-by: Stephen Hemminger <shemminger@vyatta.com> ^ permalink raw reply [flat|nested] 56+ messages in thread
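The point about scratch space can be illustrated with a small userspace analogue. This is a sketch under stated assumptions: struct demo_cb stands in for the per-packet control block, demo_parse_options() is a hypothetical parser that only ever sets fields and never clears them, and nothing here is kernel API. It shows why the clear has to happen on every path, not only when the header carries no options.

```c
#include <assert.h>
#include <string.h>

struct demo_cb {			/* stand-in for per-packet scratch space */
	unsigned char optlen;
	unsigned char ts_needed;
};

struct demo_pkt {
	unsigned int ihl;		/* header length in 32-bit words */
	unsigned char has_timestamp;	/* pretend result of scanning the options */
	struct demo_cb cb;		/* may hold leftovers from another layer */
};

/* Returns 0 on success, -1 on a malformed header length. */
int demo_parse_options(struct demo_pkt *pkt)
{
	assert(pkt != NULL);
	if (pkt->ihl < 5 || pkt->ihl > 15)
		return -1;

	/* Clear unconditionally: the parser below assumes a zeroed block. */
	memset(&pkt->cb, 0, sizeof(pkt->cb));
	if (pkt->ihl == 5)
		return 0;		/* no options, nothing more to do */

	pkt->cb.optlen = (unsigned char)(pkt->ihl * 4 - 20);
	if (pkt->has_timestamp)
		pkt->cb.ts_needed = 1;	/* fields are only ever set, never reset */
	return 0;
}
```

If the memset ran only in the `ihl == 5` branch, a packet with options would keep whatever an earlier user of the scratch area left behind, and later code would act on those stale fields.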
* Re: Kernel panic when using bridge
  2011-04-12 16:20 ` Stephen Hemminger
@ 2011-04-12 16:35   ` Eric Dumazet
  2011-04-12 16:45     ` Bandan Das
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 16:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das

On Tuesday, 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:

> I think the confusion is that IPCB(skb) is not the IP header but
> scratch space used during IP header processing. Before the sanitize
> patch the CB was cleared.
>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Should we clear it also in br_nf_dev_queue_xmit(), since we did this
prior to commit 462fb2af9788a8?

Thanks!

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-12 16:35 ` Eric Dumazet
@ 2011-04-12 16:45 ` Bandan Das
  2011-04-12 16:54 ` Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Bandan Das @ 2011-04-12 16:45 UTC
To: Eric Dumazet
Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das

On 0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:
>
> > I think the confusion is that IPCB(skb) is not the IP header but
> > scratch space used during IP header processing. Before the sanitize
> > patch the CB was cleared.
> >
> > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
>
> Should we clear it also in br_nf_dev_queue_xmit(), since we did this
> prior to commit 462fb2af9788a8 ?
>
> Thanks !
>

Wouldn't that clear out any valid IP options if they were there? I think
that was the whole point of adding br_parse_ip_options:

	/* BUG: Should really parse the IP options here. */
	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));

--
Bandan
* Re: Kernel panic when using bridge
  2011-04-12 16:45 ` Bandan Das
@ 2011-04-12 16:54 ` Eric Dumazet
  2011-04-12 17:18 ` [PATCH] bridge: reset IPCB in br_parse_ip_options Eric Dumazet
  0 siblings, 1 reply; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 16:54 UTC
To: Bandan Das
Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev

On Tuesday 12 April 2011 at 12:45 -0400, Bandan Das wrote:
> On 0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tuesday 12 April 2011 at 09:20 -0700, Stephen Hemminger wrote:
> >
> > > I think the confusion is that IPCB(skb) is not the IP header but
> > > scratch space used during IP header processing. Before the sanitize
> > > patch the CB was cleared.
> > >
> > > Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> >
> > Should we clear it also in br_nf_dev_queue_xmit(), since we did this
> > prior to commit 462fb2af9788a8 ?
> >
> > Thanks !
> >
> Wouldn't that clear out any valid IP options if they were there? I think
> that was the whole point of adding br_parse_ip_options:
>
> 	/* BUG: Should really parse the IP options here. */
> 	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
>

Oh yes, I missed that br_nf_dev_queue_xmit() calls br_parse_ip_options()
and not ip_options_compile().

I'll submit an official patch, thanks!
* [PATCH] bridge: reset IPCB in br_parse_ip_options
  2011-04-12 16:54 ` Eric Dumazet
@ 2011-04-12 17:18 ` Eric Dumazet
  2011-04-12 20:39 ` David Miller
  2011-04-12 23:55 ` Scot Doyle
  0 siblings, 2 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-04-12 17:18 UTC
To: David Miller
Cc: Stephen Hemminger, Jan Lübbe, Scot Doyle, Hiroaki SHIMODA, netdev, Bandan Das

Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
stack), missed one IPCB init before calling ip_options_compile()

Thanks to Scot Doyle for his tests and bug reports.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jan Lübbe <jluebbe@debian.org>
---
 net/bridge/br_netfilter.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..b353f7c 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
 		return 0;
-	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
  2011-04-12 17:18 ` [PATCH] bridge: reset IPCB in br_parse_ip_options Eric Dumazet
@ 2011-04-12 20:39 ` David Miller
  2011-04-12 23:55 ` Scot Doyle
  1 sibling, 0 replies; 56+ messages in thread
From: David Miller @ 2011-04-12 20:39 UTC
To: eric.dumazet
Cc: shemminger, jluebbe, lkml, shimoda.hiroaki, netdev, bandan.das

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 12 Apr 2011 19:18:40 +0200

> Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
> stack), missed one IPCB init before calling ip_options_compile()
>
> Thanks to Scot Doyle for his tests and bug reports.
>
> Reported-by: Scot Doyle <lkml@scotdoyle.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
> Acked-by: Bandan Das <bandan.das@stratus.com>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Jan Lübbe <jluebbe@debian.org>

Applied, thanks everyone.
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
  2011-04-12 17:18 ` [PATCH] bridge: reset IPCB in br_parse_ip_options Eric Dumazet
  2011-04-12 20:39 ` David Miller
@ 2011-04-12 23:55 ` Scot Doyle
  2011-04-13 4:12 ` Scot Doyle
  1 sibling, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-12 23:55 UTC
To: Eric Dumazet
Cc: David Miller, Stephen Hemminger, Jan Lübbe, Hiroaki SHIMODA, netdev, Bandan Das

On 04/12/2011 12:18 PM, Eric Dumazet wrote:
> Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP
> stack), missed one IPCB init before calling ip_options_compile()
>
> Thanks to Scot Doyle for his tests and bug reports.
>
> Reported-by: Scot Doyle <lkml@scotdoyle.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
> Acked-by: Bandan Das <bandan.das@stratus.com>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Jan Lübbe <jluebbe@debian.org>
> ---
>  net/bridge/br_netfilter.c |    6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 008ff6c..b353f7c 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
>  		goto drop;
>  	}
>
> -	/* Zero out the CB buffer if no options present */
> -	if (iph->ihl == 5) {
> -		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	if (iph->ihl == 5)
>  		return 0;
> -	}
>
>  	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
>  	if (ip_options_compile(dev_net(dev), opt, skb))
>

Here's the output after pulling 2.6.39-rc3, applying the patches listed
below, doing a "make clean" and hitting the bridge's assigned IP address
with the IP Stack Checker tcpsic command. Maybe I should also be
applying the patch from yesterday? I'll try that next.

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..b9bdff9 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
-		return 0;
-	}
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
+		return 0;
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))

diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index dd1b20e..9df4e63 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head)
 }
 
 /* May be called with local BH enabled. */
-static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
+static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base,
+			     struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	int do_free;
 
@@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
 	 * We use refcnt=-1 to alert lockless readers this entry is deleted.
 	 */
 	if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
-		struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 		struct inet_peer __rcu ***stackptr, ***delp;
 		if (lookup(&p->daddr, stack, base) != p)
 			BUG();
@@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct inet_peer *p)
 }
 
 /* May be called with local BH enabled. */
-static int cleanup_once(unsigned long ttl)
+static int cleanup_once(unsigned long ttl, struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	struct inet_peer *p = NULL;
 
@@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl)
 		 * happen because of entry limits in route cache. */
 		return -1;
 
-	unlink_from_pool(p, peer_to_base(p));
+	unlink_from_pool(p, peer_to_base(p), stack);
 	return 0;
 }
 
@@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
 
 	if (base->total >= inet_peer_threshold)
 		/* Remove one less-recently-used entry. */
-		cleanup_once(0);
+		cleanup_once(0, stack);
 
 	return p;
 }
@@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy)
 {
 	unsigned long now = jiffies;
 	int ttl, total;
+	struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 
 	total = compute_total();
 	if (total >= inet_peer_threshold)
@@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy)
 		ttl = inet_peer_maxttl
 				- (inet_peer_maxttl - inet_peer_minttl) / HZ *
 					total / inet_peer_threshold * HZ;
-	while (!cleanup_once(ttl)) {
+	while (!cleanup_once(ttl, stack)) {
 		if (jiffies != now)
 			break;
 	}

diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 28a736f..dea9947 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -200,6 +200,11 @@ int ip_options_echo(struct ip_options * dopt, struct sk_buff * skb)
 		*dptr++ = IPOPT_END;
 		dopt->optlen++;
 	}
+	if (unlikely(dopt->optlen > 40)) {
+		pr_err("ip_options_echo() fatal error optlen=%u > 40\n", dopt->optlen);
+		print_hex_dump(KERN_ERR, "ip options: ", DUMP_PREFIX_OFFSET,
+			       16, 1, dopt->__data, dopt->optlen, false);
+	}
 	return 0;
 }
 

------------

[ 761.720393] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 761.728206] IP: [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435
[ 761.734452] PGD 0
[ 761.736459] Oops: 0000 [#1] SMP
[ 761.739683] last sysfs file: /sys/devices/virtual/misc/kvm/uevent
[ 761.745744] CPU 0
[ 761.747570] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd tpm_tis tpm tpm_bios soundcore psmouse snd_page_alloc processor ghes thermal_sys i7core_edac evdev pcspkr serio_raw edac_core dcdbas power_meter button hed ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd ehci_hcd mpt2sas scsi_transport_sas raid_class igb scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan]
[ 761.785171]
[ 761.786651] Pid: 0, comm: swapper Not tainted 2.6.39-rc3+ #14 Dell Inc. PowerEdge R510/0DPRKF
[ 761.795157] RIP: 0010:[<ffffffff8129fbe9>] [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435
[ 761.803823] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286
[ 761.809106] RAX: 0000000000000017 RBX: ffff8804027b3600 RCX: ffff88040466a864
[ 761.816205] RDX: 000000000000001a RSI: 0000000000000000 RDI: ffffffff817e6100
[ 761.823304] RBP: ffff88040466a862 R08: ffffffffa01d6e89 R09: ffff88042f203c58
[ 761.830402] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8804027b3628
[ 761.837501] R13: 000000000000001d R14: ffff88040466a84e R15: 0000000000000024
[ 761.844601] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000
[ 761.852650] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 761.858365] CR2: 00000000000000d0 CR3: 0000000001603000 CR4: 00000000000006f0
[ 761.865463] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 761.872562] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 761.879661] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020)
[ 761.887710] Stack:
[ 761.889708] 0000000000000000 ffffffff81276928 0000000000000000 ffffffff817e6100
[ 761.897102] 000000000000004e ffff88040500e600 ffff88040500e600 ffff8804027b3600
[ 761.904496] ffff880404fc0000 ffff8804027b3628 0000000000000000 ffff880404fc0000
[ 761.911889] Call Trace:
[ 761.914319] <IRQ>
[ 761.916413] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58
[ 761.922306] [<ffffffffa01dae3b>] ? br_parse_ip_options+0x134/0x1a8 [bridge]
[ 761.929319] [<ffffffffa01dbbe0>] ? br_nf_pre_routing+0x348/0x3cb [bridge]
[ 761.936160] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e
[ 761.941444] [<ffffffff8104afaa>] ? irq_exit+0x58/0x8f
[ 761.946556] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
[ 761.953052] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
[ 761.959546] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114
[ 761.965089] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
[ 761.971583] [<ffffffff8126d097>] ? __netdev_alloc_skb+0x15/0x2f
[ 761.977561] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
[ 761.984055] [<ffffffffa01d6e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge]
[ 761.990551] [<ffffffff812a7dde>] ? tcp_gro_receive+0xa1/0x204
[ 761.996355] [<ffffffffa01d71e5>] ? br_handle_frame+0x195/0x1ac [bridge]
[ 762.003022] [<ffffffffa01d7050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge]
[ 762.010294] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450
[ 762.016530] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58
[ 762.022420] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f
[ 762.028222] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31
[ 762.033941] [<ffffffffa024afcd>] ? igb_poll+0x6d9/0x9ee [igb]
[ 762.039744] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55
[ 762.045547] [<ffffffff8106fc3c>] ? arch_local_irq_save+0x14/0x1d
[ 762.051609] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1
[ 762.057239] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176
[ 762.062783] [<ffffffff81333cdc>] ? call_softirq+0x1c/0x30
[ 762.068241] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84
[ 762.073524] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f
[ 762.078635] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e
[ 762.083575] [<ffffffff8132cc53>] ? common_interrupt+0x13/0x13
[ 762.089375] <EOI>
[ 762.091469] [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53
[ 762.097188] [<ffffffffa0430417>] ? arch_local_irq_enable+0x7/0x8 [processor]
[ 762.104288] [<ffffffffa0430dab>] ? acpi_idle_enter_c1+0x86/0xa2 [processor]
[ 762.111303] [<ffffffff8125d05d>] ? cpuidle_idle_call+0xf4/0x17e
[ 762.117277] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4
[ 762.122388] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4
[ 762.128018] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f
[ 762.134253] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f b6 c0 <8b> 96 d0 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80
[ 762.153593] RIP [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435
[ 762.159923] RSP <ffff88042f203af0>
[ 762.163391] CR2: 00000000000000d0
[ 762.167017] ---[ end trace e15d7b082f680b62 ]---
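The `optlen > 40` bound in the debugging patch above, and the `iph->ihl*4 - sizeof(struct iphdr)` expression in br_parse_ip_options(), both follow from the IPv4 header layout: the IHL field is four bits wide and counts 32-bit words, and the fixed part of the header is 20 bytes. A minimal sketch of that arithmetic, with the hypothetical helper `toy_ipv4_optlen()` standing in for the kernel expression:

```c
#include <assert.h>

#define TOY_MIN_IHL   5u   /* 5 words = 20 bytes, header without options */
#define TOY_MAX_IHL  15u   /* largest value a 4-bit field can hold */
#define TOY_BASE_HDR 20    /* size of the fixed IPv4 header in bytes */

/*
 * Return the number of option bytes implied by an IHL value,
 * or -1 if the value cannot describe a valid IPv4 header.
 */
int toy_ipv4_optlen(unsigned int ihl)
{
	int optlen;

	if (ihl < TOY_MIN_IHL || ihl > TOY_MAX_IHL)
		return -1;

	optlen = (int)(ihl * 4u) - TOY_BASE_HDR;
	assert(optlen >= 0 && optlen <= 40);
	return optlen;
}
```

An IHL of 5 gives 0 option bytes and the maximum IHL of 15 gives 40, so any larger `optlen` indicates corrupted state rather than a legitimately long header.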
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options
  2011-04-12 23:55 ` Scot Doyle
@ 2011-04-13 4:12 ` Scot Doyle
  2011-04-13 15:10 ` Scot Doyle
  0 siblings, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-13 4:12 UTC
To: Eric Dumazet
Cc: David Miller, Stephen Hemminger, Jan Lübbe, Hiroaki SHIMODA, netdev, Bandan Das

On 04/12/2011 06:55 PM, Scot Doyle wrote:
> Here's the output after pulling 2.6.39-rc3, applying the patches listed
> below, doing a "make clean" and hitting the bridge's assigned ip address
> with the IP Stack Checker tcpsic command. Maybe I should also be
> applying the patch from yesterday too? I'll try that next.
>
> [...]

Good news! I cannot create any kernel panics with the following patches
to 2.6.39-rc3 commit a6360dd37e1a144ed11e6548371bade559a1e4df while
targeting either the host's bridged IP address or the guest virtual
machine bridged IP addresses with the IP Stack Checker tools.

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..cdb4423 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk_buff *skb)
 	struct ip_options *opt;
 	struct iphdr *iph;
 	struct net_device *dev = skb->dev;
+	struct rtable *rt;
 	u32 len;
 
 	iph = ip_hdr(skb);
@@ -249,10 +250,18 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
-		return 0;
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
+		return 0;
+
+	/* Associate bogus bridge route table */
+	if (!skb_dst(skb)) {
+		rt = bridge_parent_rtable(dev);
+		if (!rt) {
+			kfree_skb(skb);
+			return 0;
+		}
+		skb_dst_set_noref(skb,&rt->dst);
 	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);

diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index dd1b20e..9df4e63 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head)
 }
 
 /* May be called with local BH enabled. */
-static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
+static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base,
+			     struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	int do_free;
 
@@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
 	 * We use refcnt=-1 to alert lockless readers this entry is deleted.
 	 */
 	if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
-		struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 		struct inet_peer __rcu ***stackptr, ***delp;
 		if (lookup(&p->daddr, stack, base) != p)
 			BUG();
@@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct inet_peer *p)
 }
 
 /* May be called with local BH enabled. */
-static int cleanup_once(unsigned long ttl)
+static int cleanup_once(unsigned long ttl, struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	struct inet_peer *p = NULL;
 
@@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl)
 		 * happen because of entry limits in route cache. */
 		return -1;
 
-	unlink_from_pool(p, peer_to_base(p));
+	unlink_from_pool(p, peer_to_base(p), stack);
 	return 0;
 }
 
@@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
 
 	if (base->total >= inet_peer_threshold)
 		/* Remove one less-recently-used entry. */
-		cleanup_once(0);
+		cleanup_once(0, stack);
 
 	return p;
 }
@@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy)
 {
 	unsigned long now = jiffies;
 	int ttl, total;
+	struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 
 	total = compute_total();
 	if (total >= inet_peer_threshold)
@@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy)
 		ttl = inet_peer_maxttl
 				- (inet_peer_maxttl - inet_peer_minttl) / HZ *
 					total / inet_peer_threshold * HZ;
-	while (!cleanup_once(ttl)) {
+	while (!cleanup_once(ttl, stack)) {
 		if (jiffies != now)
 			break;
 	}
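The br_netfilter.c change in the patch above attaches a route to the skb when none is set, before the options are parsed. The oops earlier in the thread shows RSI equal to zero and a fault address of 0xd0, which is consistent with a read at a small offset from a NULL pointer; the working patch suggests, though the thread does not state it outright, that this pointer is the route. A userspace sketch of the guard under that assumption, where `toy_route`, `toy_pkt` and both functions are hypothetical names and not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a route entry. */
struct toy_route {
	uint32_t spec_dst;  /* address an option handler would copy out */
};

/* Hypothetical stand-in for a packet on the bridge path. */
struct toy_pkt {
	const struct toy_route *route;  /* may be NULL when the packet arrives */
	uint32_t recorded;              /* slot filled by the option handler */
};

/*
 * Make sure the packet has a route before option parsing.
 * Returns 0 if a route is attached, -1 if the packet should be dropped.
 */
int toy_ensure_route(struct toy_pkt *pkt, const struct toy_route *fallback)
{
	if (pkt->route != NULL)
		return 0;
	if (fallback == NULL)
		return -1;
	pkt->route = fallback;
	return 0;
}

/* Models an option handler that reads through the route pointer. */
void toy_record_route(struct toy_pkt *pkt)
{
	assert(pkt->route != NULL);  /* the guard above must have run */
	pkt->recorded = pkt->route->spec_dst;
}
```

The point of the sketch is the ordering: the fallback is attached (or the packet rejected) before any code that dereferences the route can run.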
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 4:12 ` Scot Doyle @ 2011-04-13 15:10 ` Scot Doyle 2011-04-13 15:24 ` Stephen Hemminger 2011-04-13 15:28 ` Eric Dumazet 0 siblings, 2 replies; 56+ messages in thread From: Scot Doyle @ 2011-04-13 15:10 UTC (permalink / raw) To: Eric Dumazet, Stephen Hemminger; +Cc: David Miller, Hiroaki SHIMODA, netdev On 04/12/2011 11:12 PM, Scot Doyle wrote: > On 04/12/2011 06:55 PM, Scot Doyle wrote: >> On 04/12/2011 12:18 PM, Eric Dumazet wrote: >>> Commit 462fb2af9788a82 (bridge : Sanitize skb before it enters the IP >>> stack), missed one IPCB init before calling ip_options_compile() >>> >>> Thanks to Scot Doyle for his tests and bug reports. >>> >>> Reported-by: Scot Doyle<lkml@scotdoyle.com> >>> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com> >>> Cc: Hiroaki SHIMODA<shimoda.hiroaki@gmail.com> >>> Acked-by: Bandan Das<bandan.das@stratus.com> >>> Acked-by: Stephen Hemminger<shemminger@vyatta.com> >>> Cc: Jan Lübbe<jluebbe@debian.org> >>> --- >>> net/bridge/br_netfilter.c | 6 ++---- >>> 1 file changed, 2 insertions(+), 4 deletions(-) >>> >>> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c >>> index 008ff6c..b353f7c 100644 >>> --- a/net/bridge/br_netfilter.c >>> +++ b/net/bridge/br_netfilter.c >>> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb) >>> goto drop; >>> } >>> >>> - /* Zero out the CB buffer if no options present */ >>> - if (iph->ihl == 5) { >>> - memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); >>> + memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); >>> + if (iph->ihl == 5) >>> return 0; >>> - } >>> >>> opt->optlen = iph->ihl*4 - sizeof(struct iphdr); >>> if (ip_options_compile(dev_net(dev), opt, skb)) >>> >>> >> >> >> Here's the output after pulling 2.6.39-rc3, applying the patches listed >> below, doing a "make clean" and hitting the bridge's assigned ip address >> with the IP Stack Checker tcpsic command. 
Maybe I should also be >> applying the patch from yesterday too? I'll try that next. >> >> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c >> index 008ff6c..b9bdff9 100644 >> --- a/net/bridge/br_netfilter.c >> +++ b/net/bridge/br_netfilter.c >> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb) >> goto drop; >> } >> >> - /* Zero out the CB buffer if no options present */ >> - if (iph->ihl == 5) { >> - memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); >> - return 0; >> - } >> + memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); >> + if (iph->ihl == 5) >> + return 0; >> >> opt->optlen = iph->ihl*4 - sizeof(struct iphdr); >> if (ip_options_compile(dev_net(dev), opt, skb)) >> >> diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c >> index dd1b20e..9df4e63 100644 >> --- a/net/ipv4/inetpeer.c >> +++ b/net/ipv4/inetpeer.c >> @@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head) >> } >> >> /* May be called with local BH enabled. */ >> -static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base >> *base) >> +static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base >> *base, >> + struct inet_peer __rcu **stack[PEER_MAXDEPTH]) >> { >> int do_free; >> >> @@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, >> struct inet_peer_base *base) >> * We use refcnt=-1 to alert lockless readers this entry is deleted. >> */ >> if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) { >> - struct inet_peer __rcu **stack[PEER_MAXDEPTH]; >> struct inet_peer __rcu ***stackptr, ***delp; >> if (lookup(&p->daddr, stack, base) != p) >> BUG(); >> @@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct >> inet_peer *p) >> } >> >> /* May be called with local BH enabled. 
*/ >> -static int cleanup_once(unsigned long ttl) >> +static int cleanup_once(unsigned long ttl, struct inet_peer __rcu >> **stack[PEER_MAXDEPTH]) >> { >> struct inet_peer *p = NULL; >> >> @@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl) >> * happen because of entry limits in route cache. */ >> return -1; >> >> - unlink_from_pool(p, peer_to_base(p)); >> + unlink_from_pool(p, peer_to_base(p), stack); >> return 0; >> } >> >> @@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr >> *daddr, int create) >> >> if (base->total >= inet_peer_threshold) >> /* Remove one less-recently-used entry. */ >> - cleanup_once(0); >> + cleanup_once(0, stack); >> >> return p; >> } >> @@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy) >> { >> unsigned long now = jiffies; >> int ttl, total; >> + struct inet_peer __rcu **stack[PEER_MAXDEPTH]; >> >> total = compute_total(); >> if (total >= inet_peer_threshold) >> @@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy) >> ttl = inet_peer_maxttl >> - (inet_peer_maxttl - inet_peer_minttl) / HZ * >> total / inet_peer_threshold * HZ; >> - while (!cleanup_once(ttl)) { >> + while (!cleanup_once(ttl, stack)) { >> if (jiffies != now) >> break; >> } >> >> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c >> index 28a736f..dea9947 100644 >> --- a/net/ipv4/ip_options.c >> +++ b/net/ipv4/ip_options.c >> @@ -200,6 +200,11 @@ int ip_options_echo(struct ip_options * dopt, >> struct sk_buff * skb) >> *dptr++ = IPOPT_END; >> dopt->optlen++; >> } >> + if (unlikely(dopt->optlen > 40)) { >> + pr_err("ip_options_echo() fatal error optlen=%u > 40\n", dopt->optlen); >> + print_hex_dump(KERN_ERR, "ip options: ", DUMP_PREFIX_OFFSET, >> + 16, 1, dopt->__data, dopt->optlen, false); >> + } >> return 0; >> } >> >> >> ------------ >> >> >> [ 761.720393] BUG: unable to handle kernel NULL pointer dereference at >> 00000000000000d0 >> [ 761.728206] IP: [<ffffffff8129fbe9>] 
ip_options_compile+0x1c1/0x435 >> [ 761.734452] PGD 0 >> [ 761.736459] Oops: 0000 [#1] SMP >> [ 761.739683] last sysfs file: /sys/devices/virtual/misc/kvm/uevent >> [ 761.745744] CPU 0 >> [ 761.747570] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm >> snd_timer snd tpm_tis tpm tpm_bios soundcore psmouse snd_page_alloc >> processor ghes thermal_sys >> i7core_edac evdev pcspkr serio_raw edac_core dcdbas power_meter button >> hed ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas >> uhci_hcd ehci_hcd mpt2sas >> scsi_transport_sas raid_class igb scsi_mod usbcore bnx2 dca [last >> unloaded: scsi_wait_scan] >> [ 761.785171] >> [ 761.786651] Pid: 0, comm: swapper Not tainted 2.6.39-rc3+ #14 Dell >> Inc. PowerEdge R510/0DPRKF >> [ 761.795157] RIP: 0010:[<ffffffff8129fbe9>] [<ffffffff8129fbe9>] >> ip_options_compile+0x1c1/0x435 >> [ 761.803823] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 >> [ 761.809106] RAX: 0000000000000017 RBX: ffff8804027b3600 RCX: >> ffff88040466a864 >> [ 761.816205] RDX: 000000000000001a RSI: 0000000000000000 RDI: >> ffffffff817e6100 >> [ 761.823304] RBP: ffff88040466a862 R08: ffffffffa01d6e89 R09: >> ffff88042f203c58 >> [ 761.830402] R10: 0000000000000000 R11: 0000000000000000 R12: >> ffff8804027b3628 >> [ 761.837501] R13: 000000000000001d R14: ffff88040466a84e R15: >> 0000000000000024 >> [ 761.844601] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) >> knlGS:0000000000000000 >> [ 761.852650] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> [ 761.858365] CR2: 00000000000000d0 CR3: 0000000001603000 CR4: >> 00000000000006f0 >> [ 761.865463] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [ 761.872562] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: >> 0000000000000400 >> [ 761.879661] Process swapper (pid: 0, threadinfo ffffffff81600000, task >> ffffffff8160b020) >> [ 761.887710] Stack: >> [ 761.889708] 0000000000000000 ffffffff81276928 0000000000000000 >> ffffffff817e6100 >> [ 761.897102] 
000000000000004e ffff88040500e600 ffff88040500e600 >> ffff8804027b3600 >> [ 761.904496] ffff880404fc0000 ffff8804027b3628 0000000000000000 >> ffff880404fc0000 >> [ 761.911889] Call Trace: >> [ 761.914319] <IRQ> >> [ 761.916413] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 >> [ 761.922306] [<ffffffffa01dae3b>] ? br_parse_ip_options+0x134/0x1a8 >> [bridge] >> [ 761.929319] [<ffffffffa01dbbe0>] ? br_nf_pre_routing+0x348/0x3cb >> [bridge] >> [ 761.936160] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e >> [ 761.941444] [<ffffffff8104afaa>] ? irq_exit+0x58/0x8f >> [ 761.946556] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] >> [ 761.953052] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] >> [ 761.959546] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 >> [ 761.965089] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] >> [ 761.971583] [<ffffffff8126d097>] ? __netdev_alloc_skb+0x15/0x2f >> [ 761.977561] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] >> [ 761.984055] [<ffffffffa01d6e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] >> [ 761.990551] [<ffffffff812a7dde>] ? tcp_gro_receive+0xa1/0x204 >> [ 761.996355] [<ffffffffa01d71e5>] ? br_handle_frame+0x195/0x1ac [bridge] >> [ 762.003022] [<ffffffffa01d7050>] ? br_handle_frame_finish+0x1c7/0x1c7 >> [bridge] >> [ 762.010294] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 >> [ 762.016530] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 >> [ 762.022420] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f >> [ 762.028222] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 >> [ 762.033941] [<ffffffffa024afcd>] ? igb_poll+0x6d9/0x9ee [igb] >> [ 762.039744] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 >> [ 762.045547] [<ffffffff8106fc3c>] ? arch_local_irq_save+0x14/0x1d >> [ 762.051609] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 >> [ 762.057239] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 >> [ 762.062783] [<ffffffff81333cdc>] ? 
call_softirq+0x1c/0x30 >> [ 762.068241] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 >> [ 762.073524] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f >> [ 762.078635] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e >> [ 762.083575] [<ffffffff8132cc53>] ? common_interrupt+0x13/0x13 >> [ 762.089375] <EOI> >> [ 762.091469] [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 >> [ 762.097188] [<ffffffffa0430417>] ? arch_local_irq_enable+0x7/0x8 >> [processor] >> [ 762.104288] [<ffffffffa0430dab>] ? acpi_idle_enter_c1+0x86/0xa2 >> [processor] >> [ 762.111303] [<ffffffff8125d05d>] ? cpuidle_idle_call+0xf4/0x17e >> [ 762.117277] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 >> [ 762.122388] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 >> [ 762.128018] [<ffffffff8169d3c6>] ? x86_64_start_kernel+0x102/0x10f >> [ 762.134253] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f >> 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f >> b6 c0 <8b> 96 d0 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 >> [ 762.153593] RIP [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435 >> [ 762.159923] RSP <ffff88042f203af0> >> [ 762.163391] CR2: 00000000000000d0 >> [ 762.167017] ---[ end trace e15d7b082f680b62 ]--- >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > Good news! I cannot create any kernel panics with the following patches > to 2.6.39-rc3 commit a6360dd37e1a144ed11e6548371bade559a1e4df while > targeting either the host's bridged IP address or the guest virtual > machine bridged IP addresses with the IP Stack Checker tools. 
> > diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c > index 008ff6c..cdb4423 100644 > --- a/net/bridge/br_netfilter.c > +++ b/net/bridge/br_netfilter.c > @@ -221,6 +221,7 @@ static int br_parse_ip_options(struct sk_buff *skb) > struct ip_options *opt; > struct iphdr *iph; > struct net_device *dev = skb->dev; > + struct rtable *rt; > u32 len; > > iph = ip_hdr(skb); > @@ -249,10 +250,18 @@ static int br_parse_ip_options(struct sk_buff *skb) > goto drop; > } > > - /* Zero out the CB buffer if no options present */ > - if (iph->ihl == 5) { > - memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); > - return 0; > + memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); > + if (iph->ihl == 5) > + return 0; > + > + /* Associate bogus bridge route table */ > + if (!skb_dst(skb)) { > + rt = bridge_parent_rtable(dev); > + if (!rt) { > + kfree_skb(skb); > + return 0; > + } > + skb_dst_set_noref(skb,&rt->dst); > } > > opt->optlen = iph->ihl*4 - sizeof(struct iphdr); > diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c > index dd1b20e..9df4e63 100644 > --- a/net/ipv4/inetpeer.c > +++ b/net/ipv4/inetpeer.c > @@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head) > } > > /* May be called with local BH enabled. */ > -static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base > *base) > +static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base > *base, > + struct inet_peer __rcu **stack[PEER_MAXDEPTH]) > { > int do_free; > > @@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, > struct inet_peer_base *base) > * We use refcnt=-1 to alert lockless readers this entry is deleted. 
> */ > if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) { > - struct inet_peer __rcu **stack[PEER_MAXDEPTH]; > struct inet_peer __rcu ***stackptr, ***delp; > if (lookup(&p->daddr, stack, base) != p) > BUG(); > @@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct > inet_peer *p) > } > > /* May be called with local BH enabled. */ > -static int cleanup_once(unsigned long ttl) > +static int cleanup_once(unsigned long ttl, struct inet_peer __rcu > **stack[PEER_MAXDEPTH]) > { > struct inet_peer *p = NULL; > > @@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl) > * happen because of entry limits in route cache. */ > return -1; > > - unlink_from_pool(p, peer_to_base(p)); > + unlink_from_pool(p, peer_to_base(p), stack); > return 0; > } > > @@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr > *daddr, int create) > > if (base->total >= inet_peer_threshold) > /* Remove one less-recently-used entry. */ > - cleanup_once(0); > + cleanup_once(0, stack); > > return p; > } > @@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy) > { > unsigned long now = jiffies; > int ttl, total; > + struct inet_peer __rcu **stack[PEER_MAXDEPTH]; > > total = compute_total(); > if (total >= inet_peer_threshold) > @@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy) > ttl = inet_peer_maxttl > - (inet_peer_maxttl - inet_peer_minttl) / HZ * > total / inet_peer_threshold * HZ; > - while (!cleanup_once(ttl)) { > + while (!cleanup_once(ttl, stack)) { > if (jiffies != now) > break; > } > The net effect is that three patches are required to eliminate the panics. These two patches have been accepted by David: http://article.gmane.org/gmane.linux.network/192015 http://article.gmane.org/gmane.linux.network/192055 This patch, incrementally authored by Stephen and Eric and compiled by me, is also required: http://article.gmane.org/gmane.linux.network/192007 What should happen for this third patch to be included upstream? 
^ permalink raw reply [flat|nested] 56+ messages in thread
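The br_netfilter.c hunk above moves the memset() of the control buffer ahead of the "no options" early return, so that the options path never sees stale data. A minimal userspace sketch of that ordering follows. The names cb_sketch, pkt_sketch and parse_header_sketch are hypothetical stand-ins, not kernel types or functions, and the model ignores everything else br_parse_ip_options() does.

```c
#include <string.h>

/* Hypothetical stand-in for the per-packet control buffer (IPCB in the kernel). */
struct cb_sketch {
    unsigned char optlen;
    unsigned char flags;
};

/* Hypothetical stand-in for an skb: a control buffer plus the IPv4 header length field. */
struct pkt_sketch {
    struct cb_sketch cb;
    unsigned int ihl;           /* header length in 32-bit words, 5..15 */
};

/*
 * Returns 0 on success, -1 if the header length is out of range.
 * The control buffer is cleared before either success path, so a
 * packet with options never inherits stale bytes from an earlier user.
 */
static int parse_header_sketch(struct pkt_sketch *pkt)
{
    if (pkt->ihl < 5 || pkt->ihl > 15)
        return -1;

    /* Clear unconditionally, not only when ihl == 5. */
    memset(&pkt->cb, 0, sizeof(pkt->cb));
    if (pkt->ihl == 5)
        return 0;

    /* Options present: 4 bytes per word beyond the fixed 20-byte header. */
    pkt->cb.optlen = (unsigned char)(pkt->ihl * 4 - 20);
    return 0;
}
```

With the memset() left inside the ihl == 5 branch, as in the code being patched, the second success path would run with whatever bytes the previous owner of the buffer had written.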
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 15:10 ` Scot Doyle @ 2011-04-13 15:24 ` Stephen Hemminger 2011-04-13 15:54 ` Scot Doyle 2011-04-13 15:28 ` Eric Dumazet 1 sibling, 1 reply; 56+ messages in thread From: Stephen Hemminger @ 2011-04-13 15:24 UTC (permalink / raw) To: Scot Doyle; +Cc: Eric Dumazet, David Miller, Hiroaki SHIMODA, netdev > The net effect is that three patches are required to eliminate the panics. > > These two patches have been accepted by David: > http://article.gmane.org/gmane.linux.network/192015 > http://article.gmane.org/gmane.linux.network/192055 > > This patch, incrementally authored by Stephen and Eric and compiled by > me, is also required: > http://article.gmane.org/gmane.linux.network/192007 > > What should happen for this third patch to be included upstream? Although making IP more robust is good, I still think bridging shouldn't give bad packets to IP. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 15:24 ` Stephen Hemminger @ 2011-04-13 15:54 ` Scot Doyle 0 siblings, 0 replies; 56+ messages in thread From: Scot Doyle @ 2011-04-13 15:54 UTC (permalink / raw) To: Stephen Hemminger; +Cc: Eric Dumazet, netdev On 04/13/2011 10:24 AM, Stephen Hemminger wrote: >> The net effect is that three patches are required to eliminate the panics. >> >> These two patches have been accepted by David: >> http://article.gmane.org/gmane.linux.network/192015 >> http://article.gmane.org/gmane.linux.network/192055 >> >> This patch, incrementally authored by Stephen and Eric and compiled by >> me, is also required: >> http://article.gmane.org/gmane.linux.network/192007 >> >> What should happen for this third patch to be included upstream? > Although making IP more robust is a good. > I still think bridging shouldn't give bad packets to IP. I'm definitely willing to test alternative patches. The output without that third patch is: [ 761.720393] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0 [ 761.728206] IP: [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435 [ 761.734452] PGD 0 [ 761.736459] Oops: 0000 [#1] SMP [ 761.739683] last sysfs file: /sys/devices/virtual/misc/kvm/uevent [ 761.745744] CPU 0 [ 761.747570] Modules linked in: kvm_intel kvm bridge stp loop snd_pcm snd_timer snd tpm_tis tpm tpm_bios soundcore psmouse snd_page_alloc processor ghes thermal_sys i7core_edac evdev pcspkr serio_raw edac_core dcdbas power_meter button hed ext2 mbcache dm_mod raid1 md_mod sd_mod crc_t10dif usb_storage uas uhci_hcd ehci_hcd mpt2sas scsi_transport_sas raid_class igb scsi_mod usbcore bnx2 dca [last unloaded: scsi_wait_scan] [ 761.785171] [ 761.786651] Pid: 0, comm: swapper Not tainted 2.6.39-rc3+ #14 Dell Inc. 
PowerEdge R510/0DPRKF [ 761.795157] RIP: 0010:[<ffffffff8129fbe9>] [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435 [ 761.803823] RSP: 0018:ffff88042f203af0 EFLAGS: 00010286 [ 761.809106] RAX: 0000000000000017 RBX: ffff8804027b3600 RCX: ffff88040466a864 [ 761.816205] RDX: 000000000000001a RSI: 0000000000000000 RDI: ffffffff817e6100 [ 761.823304] RBP: ffff88040466a862 R08: ffffffffa01d6e89 R09: ffff88042f203c58 [ 761.830402] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8804027b3628 [ 761.837501] R13: 000000000000001d R14: ffff88040466a84e R15: 0000000000000024 [ 761.844601] FS: 0000000000000000(0000) GS:ffff88042f200000(0000) knlGS:0000000000000000 [ 761.852650] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 761.858365] CR2: 00000000000000d0 CR3: 0000000001603000 CR4: 00000000000006f0 [ 761.865463] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 761.872562] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 761.879661] Process swapper (pid: 0, threadinfo ffffffff81600000, task ffffffff8160b020) [ 761.887710] Stack: [ 761.889708] 0000000000000000 ffffffff81276928 0000000000000000 ffffffff817e6100 [ 761.897102] 000000000000004e ffff88040500e600 ffff88040500e600 ffff8804027b3600 [ 761.904496] ffff880404fc0000 ffff8804027b3628 0000000000000000 ffff880404fc0000 [ 761.911889] Call Trace: [ 761.914319] <IRQ> [ 761.916413] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 [ 761.922306] [<ffffffffa01dae3b>] ? br_parse_ip_options+0x134/0x1a8 [bridge] [ 761.929319] [<ffffffffa01dbbe0>] ? br_nf_pre_routing+0x348/0x3cb [bridge] [ 761.936160] [<ffffffff81298527>] ? nf_iterate+0x41/0x7e [ 761.941444] [<ffffffff8104afaa>] ? irq_exit+0x58/0x8f [ 761.946556] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 761.953052] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 761.959546] [<ffffffff812985d7>] ? nf_hook_slow+0x73/0x114 [ 761.965089] [<ffffffffa01d6e89>] ? 
NF_HOOK.clone.4+0x56/0x56 [bridge] [ 761.971583] [<ffffffff8126d097>] ? __netdev_alloc_skb+0x15/0x2f [ 761.977561] [<ffffffffa01d6e89>] ? NF_HOOK.clone.4+0x56/0x56 [bridge] [ 761.984055] [<ffffffffa01d6e6f>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge] [ 761.990551] [<ffffffff812a7dde>] ? tcp_gro_receive+0xa1/0x204 [ 761.996355] [<ffffffffa01d71e5>] ? br_handle_frame+0x195/0x1ac [bridge] [ 762.003022] [<ffffffffa01d7050>] ? br_handle_frame_finish+0x1c7/0x1c7 [bridge] [ 762.010294] [<ffffffff812764ef>] ? __netif_receive_skb+0x2a7/0x450 [ 762.016530] [<ffffffff81276928>] ? netif_receive_skb+0x52/0x58 [ 762.022420] [<ffffffff81276e2a>] ? napi_gro_receive+0x1f/0x2f [ 762.028222] [<ffffffff812769ff>] ? napi_skb_finish+0x1c/0x31 [ 762.033941] [<ffffffffa024afcd>] ? igb_poll+0x6d9/0x9ee [igb] [ 762.039744] [<ffffffff8109034f>] ? handle_irq_event+0x40/0x55 [ 762.045547] [<ffffffff8106fc3c>] ? arch_local_irq_save+0x14/0x1d [ 762.051609] [<ffffffff81276f55>] ? net_rx_action+0xa4/0x1b1 [ 762.057239] [<ffffffff8104ad26>] ? __do_softirq+0xb8/0x176 [ 762.062783] [<ffffffff81333cdc>] ? call_softirq+0x1c/0x30 [ 762.068241] [<ffffffff8100aa57>] ? do_softirq+0x3f/0x84 [ 762.073524] [<ffffffff8104af91>] ? irq_exit+0x3f/0x8f [ 762.078635] [<ffffffff8100a793>] ? do_IRQ+0x85/0x9e [ 762.083575] [<ffffffff8132cc53>] ? common_interrupt+0x13/0x13 [ 762.089375] <EOI> [ 762.091469] [<ffffffff81061348>] ? enqueue_hrtimer+0x3f/0x53 [ 762.097188] [<ffffffffa0430417>] ? arch_local_irq_enable+0x7/0x8 [processor] [ 762.104288] [<ffffffffa0430dab>] ? acpi_idle_enter_c1+0x86/0xa2 [processor] [ 762.111303] [<ffffffff8125d05d>] ? cpuidle_idle_call+0xf4/0x17e [ 762.117277] [<ffffffff81008298>] ? cpu_idle+0xa2/0xc4 [ 762.122388] [<ffffffff8169db60>] ? start_kernel+0x3b9/0x3c4 [ 762.128018] [<ffffffff8169d3c6>] ? 
x86_64_start_kernel+0x102/0x10f [ 762.134253] Code: 4d 02 3c 03 0f 86 59 02 00 00 0f b6 d0 44 39 ea 7f 32 83 c2 03 44 39 ea 0f 8f 45 02 00 00 48 85 db 74 18 48 8b 74 24 10 0f b6 c0 <8b> 96 d0 00 00 00 89 54 05 ff 41 80 4c 24 08 04 80 01 04 41 80 [ 762.153593] RIP [<ffffffff8129fbe9>] ip_options_compile+0x1c1/0x435 [ 762.159923] RSP <ffff88042f203af0> [ 762.163391] CR2: 00000000000000d0 [ 762.167017] ---[ end trace e15d7b082f680b62 ]--- ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 15:10 ` Scot Doyle 2011-04-13 15:24 ` Stephen Hemminger @ 2011-04-13 15:28 ` Eric Dumazet 2011-04-13 21:48 ` David Miller 1 sibling, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-13 15:28 UTC (permalink / raw) To: Scot Doyle; +Cc: Stephen Hemminger, David Miller, Hiroaki SHIMODA, netdev On Wednesday, 13 April 2011 at 10:10 -0500, Scot Doyle wrote: > The net effect is that three patches are required to eliminate the panics. > > These two patches have been accepted by David: > http://article.gmane.org/gmane.linux.network/192015 > http://article.gmane.org/gmane.linux.network/192055 > > This patch, incrementally authored by Stephen and Eric and compiled by > me, is also required: > http://article.gmane.org/gmane.linux.network/192007 > > What should happen for this third patch to be included upstream? Don't worry, Stephen or I will send it ASAP. Thanks ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 15:28 ` Eric Dumazet @ 2011-04-13 21:48 ` David Miller 2011-04-14 0:03 ` Stephen Hemminger 2011-04-14 2:31 ` Eric Dumazet 0 siblings, 2 replies; 56+ messages in thread From: David Miller @ 2011-04-13 21:48 UTC (permalink / raw) To: eric.dumazet; +Cc: lkml, shemminger, shimoda.hiroaki, netdev From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 13 Apr 2011 17:28:07 +0200 > Dont worry, Stephen or me will send it asap. I'm looking forward to it :) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 21:48 ` David Miller @ 2011-04-14 0:03 ` Stephen Hemminger 2011-04-14 0:05 ` David Miller 2011-04-14 2:31 ` Eric Dumazet 1 sibling, 1 reply; 56+ messages in thread From: Stephen Hemminger @ 2011-04-14 0:03 UTC (permalink / raw) To: David Miller; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev On Wed, 13 Apr 2011 14:48:12 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Wed, 13 Apr 2011 17:28:07 +0200 > > > Dont worry, Stephen or me will send it asap. > > I'm looking forward to it :) You applied the clear of ipcb already. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-14 0:03 ` Stephen Hemminger @ 2011-04-14 0:05 ` David Miller 2011-04-14 0:08 ` Stephen Hemminger 0 siblings, 1 reply; 56+ messages in thread From: David Miller @ 2011-04-14 0:05 UTC (permalink / raw) To: shemminger; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev From: Stephen Hemminger <shemminger@vyatta.com> Date: Wed, 13 Apr 2011 17:03:51 -0700 > On Wed, 13 Apr 2011 14:48:12 -0700 (PDT) > David Miller <davem@davemloft.net> wrote: > >> From: Eric Dumazet <eric.dumazet@gmail.com> >> Date: Wed, 13 Apr 2011 17:28:07 +0200 >> >> > Dont worry, Stephen or me will send it asap. >> >> I'm looking forward to it :) > > You applied the clear of ipcb already. There are other patches involved, I think. The one with the NULL route handling, for one. Please follow back in this thread for the details, the IPCB clear wasn't sufficient to get rid of all of the reporter's OOPS's. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-14 0:05 ` David Miller @ 2011-04-14 0:08 ` Stephen Hemminger 0 siblings, 0 replies; 56+ messages in thread From: Stephen Hemminger @ 2011-04-14 0:08 UTC (permalink / raw) To: David Miller; +Cc: eric.dumazet, lkml, shimoda.hiroaki, netdev On Wed, 13 Apr 2011 17:05:03 -0700 (PDT) David Miller <davem@davemloft.net> wrote: > From: Stephen Hemminger <shemminger@vyatta.com> > Date: Wed, 13 Apr 2011 17:03:51 -0700 > > > On Wed, 13 Apr 2011 14:48:12 -0700 (PDT) > > David Miller <davem@davemloft.net> wrote: > > > >> From: Eric Dumazet <eric.dumazet@gmail.com> > >> Date: Wed, 13 Apr 2011 17:28:07 +0200 > >> > >> > Dont worry, Stephen or me will send it asap. > >> > >> I'm looking forward to it :) > > > > You applied the clear of ipcb already. > > There are other patches involved, I think. > > The one with the NULL route handling, for one. > > Please follow back in this thread for the details, the IPCB clear > wasn't sufficient to get rid of all of the reporter's OOPS's. Agreed, it is not the complete fix. -- ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-13 21:48 ` David Miller 2011-04-14 0:03 ` Stephen Hemminger @ 2011-04-14 2:31 ` Eric Dumazet 2011-04-14 2:54 ` Stephen Hemminger 1 sibling, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-14 2:31 UTC (permalink / raw) To: David Miller; +Cc: lkml, shemminger, shimoda.hiroaki, netdev On Wednesday, 13 April 2011 at 14:48 -0700, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Wed, 13 Apr 2011 17:28:07 +0200 > > > Dont worry, Stephen or me will send it asap. > > I'm looking forward to it :) I was considering another way to handle this problem: patching ip_options_compile() to cope with a NULL skb_dst() in the slow path instead. Which approach would be best? diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index 28a736f..c10ad63 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -329,7 +329,7 @@ int ip_options_compile(struct net *net, pp_ptr = optptr + 2; goto error; } - if (skb) { + if (rt) { memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); opt->is_changed = 1; } @@ -371,7 +371,7 @@ int ip_options_compile(struct net *net, goto error; } opt->ts = optptr - iph; - if (skb) { + if (rt) { memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); timeptr = (__be32*)&optptr[optptr[2]+3]; } ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [PATCH] bridge: reset IPCB in br_parse_ip_options 2011-04-14 2:31 ` Eric Dumazet @ 2011-04-14 2:54 ` Stephen Hemminger 2011-04-14 3:03 ` [PATCH] ip: ip_options_compile() resilient to NULL skb route Eric Dumazet 0 siblings, 1 reply; 56+ messages in thread From: Stephen Hemminger @ 2011-04-14 2:54 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, lkml, shimoda.hiroaki, netdev On Thu, 14 Apr 2011 04:31:16 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > Le mercredi 13 avril 2011 à 14:48 -0700, David Miller a écrit : > > From: Eric Dumazet <eric.dumazet@gmail.com> > > Date: Wed, 13 Apr 2011 17:28:07 +0200 > > > > > Dont worry, Stephen or me will send it asap. > > > > I'm looking forward to it :) > > I was considering another way to handle this problem, > patching ip_options_compile() to take care of null skb_dst() in slow > path instead ? > > What would be the best thing ? > > diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c > index 28a736f..c10ad63 100644 > --- a/net/ipv4/ip_options.c > +++ b/net/ipv4/ip_options.c > @@ -329,7 +329,7 @@ int ip_options_compile(struct net *net, > pp_ptr = optptr + 2; > goto error; > } > - if (skb) { > + if (rt) { > memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); > opt->is_changed = 1; > } > @@ -371,7 +371,7 @@ int ip_options_compile(struct net *net, > goto error; > } > opt->ts = optptr - iph; > - if (skb) { > + if (rt) { > memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); > timeptr = (__be32*)&optptr[optptr[2]+3]; > } > I like this because it lets the bridge be transparent. The existing options code adds an entry in the record route option, which is not desirable. ^ permalink raw reply [flat|nested] 56+ messages in thread
* [PATCH] ip: ip_options_compile() resilient to NULL skb route 2011-04-14 2:54 ` Stephen Hemminger @ 2011-04-14 3:03 ` Eric Dumazet 2011-04-14 3:30 ` Hiroaki SHIMODA 0 siblings, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-14 3:03 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David Miller, lkml, shimoda.hiroaki, netdev Le mercredi 13 avril 2011 à 19:54 -0700, Stephen Hemminger a écrit : > I like this because it lets the bridge be transparent. > The existing options code adds entry in record route, and which > is not desirable. OK then, I realize I should have submitted a full patch, here it is. Thanks ! [PATCH] ip: ip_options_compile() resilient to NULL skb route Scot Doyle demonstrated ip_options_compile() could be called with an skb without an attached route, using a setup involving a bridge, netfilter, and forged IP packets. Let's make ip_options_compile() a bit more robust, instead of changing bridge/netfilter code. Reported-by: Scot Doyle <lkml@scotdoyle.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> --- net/ipv4/ip_options.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index 28a736f..c10ad63 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -329,7 +329,7 @@ int ip_options_compile(struct net *net, pp_ptr = optptr + 2; goto error; } - if (skb) { + if (rt) { memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); opt->is_changed = 1; } @@ -371,7 +371,7 @@ int ip_options_compile(struct net *net, goto error; } opt->ts = optptr - iph; - if (skb) { + if (rt) { memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4); timeptr = (__be32*)&optptr[optptr[2]+3]; } ^ permalink raw reply related [flat|nested] 56+ messages in thread
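The core of the patch above is the switch from testing skb to testing rt before dereferencing rt->rt_spec_dst. A small userspace model of that guard for a Record Route option might look like the sketch below. The types rtable_sketch and skb_sketch and the function rr_fill_slot are hypothetical, and the behaviour is simplified: unlike the kernel function, this model leaves the option untouched when no route is attached.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's struct rtable and struct sk_buff. */
struct rtable_sketch {
    uint32_t rt_spec_dst;
};

struct skb_sketch {
    struct rtable_sketch *rt;   /* NULL when no route is attached */
};

/*
 * Fill the next free slot of a Record Route option.
 * opt[0] = type, opt[1] = total option length, opt[2] = 1-based
 * pointer to the next free slot.
 * Returns 1 if an address was written, 0 if the option was left
 * untouched, -1 if the option is malformed.
 */
static int rr_fill_slot(unsigned char *opt, const struct skb_sketch *skb)
{
    const struct rtable_sketch *rt = skb ? skb->rt : NULL;
    unsigned int optlen = opt[1];
    unsigned int ptr = opt[2];

    if (optlen < 3 || ptr < 4)
        return -1;
    if (ptr > optlen)
        return 0;               /* option already full */
    if (ptr + 3 > optlen)
        return -1;              /* no room for a whole address */

    /* Guard on the route, not on the skb: an skb can exist without one. */
    if (!rt)
        return 0;

    memcpy(&opt[ptr - 1], &rt->rt_spec_dst, 4);
    opt[2] = (unsigned char)(ptr + 4);
    return 1;
}
```

Testing skb alone is what allowed the bridged, route-less packet in the reports to reach the memcpy() with rt == NULL.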
* Re: [PATCH] ip: ip_options_compile() resilient to NULL skb route 2011-04-14 3:03 ` [PATCH] ip: ip_options_compile() resilient to NULL skb route Eric Dumazet @ 2011-04-14 3:30 ` Hiroaki SHIMODA 2011-04-14 3:37 ` Eric Dumazet 2011-04-14 15:55 ` [PATCH v2] " Eric Dumazet 0 siblings, 2 replies; 56+ messages in thread From: Hiroaki SHIMODA @ 2011-04-14 3:30 UTC (permalink / raw) To: Eric Dumazet; +Cc: Stephen Hemminger, David Miller, lkml, netdev On Thu, 14 Apr 2011 05:03:34 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > Le mercredi 13 avril 2011 à 19:54 -0700, Stephen Hemminger a écrit : > > > I like this because it lets the bridge be transparent. > > The existing options code adds entry in record route, and which > > is not desirable. > > OK then, I realize I should have submitted a full patch, here it is. > > Thanks ! > > [PATCH] ip: ip_options_compile() resilient to NULL skb route > > Scot Doyle demonstrated ip_options_compile() could be called with an skb > without an attached route, using a setup involving a bridge, netfilter, > and forged IP packets. > > Let's make ip_options_compile() a bit more robust, instead of changing > bridge/netfilter code. ip_options_rcv_srr(), which is called from br_parse_ip_options(), also expects an skb with an attached route, so is the patch below needed as well? diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index 28a736f..3af1968 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -603,7 +603,7 @@ int ip_options_rcv_srr(struct sk_buff *skb) unsigned long orefdst; int err; - if (!opt->srr) + if (!opt->srr || !rt) return 0; if (skb->pkt_type != PACKET_HOST) Thanks. ^ permalink raw reply related [flat|nested] 56+ messages in thread
* Re: [PATCH] ip: ip_options_compile() resilient to NULL skb route 2011-04-14 3:30 ` Hiroaki SHIMODA @ 2011-04-14 3:37 ` Eric Dumazet 2011-04-14 4:15 ` Hiroaki SHIMODA 2011-04-14 15:55 ` [PATCH v2] " Eric Dumazet 1 sibling, 1 reply; 56+ messages in thread From: Eric Dumazet @ 2011-04-14 3:37 UTC (permalink / raw) To: Hiroaki SHIMODA; +Cc: Stephen Hemminger, David Miller, lkml, netdev Le jeudi 14 avril 2011 à 12:30 +0900, Hiroaki SHIMODA a écrit : > On Thu, 14 Apr 2011 05:03:34 +0200 > Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > Le mercredi 13 avril 2011 à 19:54 -0700, Stephen Hemminger a écrit : > > > > > I like this because it lets the bridge be transparent. > > > The existing options code adds entry in record route, and which > > > is not desirable. > > > > OK then, I realize I should have submitted a full patch, here it is. > > > > Thanks ! > > > > [PATCH] ip: ip_options_compile() resilient to NULL skb route > > > > Scot Doyle demonstrated ip_options_compile() could be called with an skb > > without an attached route, using a setup involving a bridge, netfilter, > > and forged IP packets. > > > > Let's make ip_options_compile() a bit more robust, instead of changing > > bridge/netfilter code. > > And ip_options_rcv_srr() in br_parse_ip_options() also > expects an skb with attached route, so below patch is needed ? > > diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c > index 28a736f..3af1968 100644 > --- a/net/ipv4/ip_options.c > +++ b/net/ipv4/ip_options.c > @@ -603,7 +603,7 @@ int ip_options_rcv_srr(struct sk_buff *skb) > unsigned long orefdst; > int err; > > - if (!opt->srr) > + if (!opt->srr || !rt) > return 0; > > if (skb->pkt_type != PACKET_HOST) > > Thanks. Indeed good catch, but should we return 0 or -EINVAL so that caller can drop packet ? 
@@ -606,7 +606,7 @@ int ip_options_rcv_srr(struct sk_buff *skb) if (!opt->srr) return 0; - if (skb->pkt_type != PACKET_HOST) + if (skb->pkt_type != PACKET_HOST || !rt) return -EINVAL; if (rt->rt_type == RTN_UNICAST) { if (!opt->is_strictroute) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [PATCH] ip: ip_options_compile() resilient to NULL skb route 2011-04-14 3:37 ` Eric Dumazet @ 2011-04-14 4:15 ` Hiroaki SHIMODA 2011-04-14 13:34 ` Scot Doyle 0 siblings, 1 reply; 56+ messages in thread From: Hiroaki SHIMODA @ 2011-04-14 4:15 UTC (permalink / raw) To: Eric Dumazet; +Cc: Stephen Hemminger, David Miller, lkml, netdev On Thu, 14 Apr 2011 05:37:43 +0200 Eric Dumazet <eric.dumazet@gmail.com> wrote: > Indeed good catch, but should we return 0 or -EINVAL so that caller can > drop packet ? > > @@ -606,7 +606,7 @@ int ip_options_rcv_srr(struct sk_buff *skb) > if (!opt->srr) > return 0; > > - if (skb->pkt_type != PACKET_HOST) > + if (skb->pkt_type != PACKET_HOST || !rt) > return -EINVAL; > if (rt->rt_type == RTN_UNICAST) { > if (!opt->is_strictroute) > With your patch, an skb without rt is not treated as an error in the bridge/netfilter context, so I think returning 0 would be the better choice. But since ip_options_rcv_srr() is also called from other contexts, would adding an extra check in br_parse_ip_options() be safer? diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c index f3bc322..10ac127 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -263,7 +263,7 @@ static int br_parse_ip_options(struct sk_buff *skb) if (in_dev && !IN_DEV_SOURCE_ROUTE(in_dev)) goto drop; - if (ip_options_rcv_srr(skb)) + if (skb_rtable(skb) && ip_options_rcv_srr(skb)) goto drop; } Thanks. ^ permalink raw reply related [flat|nested] 56+ messages in thread
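The caller-side guard proposed above leaves the callee's contract alone (it may keep assuming a route is attached) and only skips the call when the bridge has no route to offer. A rough userspace sketch of that shape is below. Every name in it (route_sketch, pkt_sketch, srr_process_sketch, bridge_should_drop_sketch and the ROUTE_* constants) is hypothetical, and the callee's logic is a drastic simplification of what ip_options_rcv_srr() really does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical route types, loosely modelled on RTN_LOCAL and "anything else". */
enum { ROUTE_LOCAL_SKETCH = 1, ROUTE_OTHER_SKETCH = 2 };

struct route_sketch {
    int rt_type;
};

struct pkt_sketch {
    struct route_sketch *rt;    /* NULL when no route is attached */
    int has_srr;                /* non-zero if a source-route option is present */
};

/*
 * Callee: assumes pkt->rt is valid whenever has_srr is set, just as the
 * unpatched function in the thread assumes an attached route.
 * Returns 0 to accept, -EINVAL to ask the caller to drop.
 */
static int srr_process_sketch(const struct pkt_sketch *pkt)
{
    if (!pkt->has_srr)
        return 0;
    return pkt->rt->rt_type == ROUTE_LOCAL_SKETCH ? 0 : -EINVAL;
}

/*
 * Caller: only invokes the callee when a route exists, so a route-less
 * bridged packet is passed through instead of being dropped or crashing.
 * Returns 1 if the packet should be dropped, 0 otherwise.
 */
static int bridge_should_drop_sketch(const struct pkt_sketch *pkt)
{
    return pkt->rt && srr_process_sketch(pkt) != 0;
}
```

The trade-off discussed in the thread is visible here: the guard in the caller keeps the bridge transparent for route-less packets, while a check inside the callee would protect every other caller as well.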
* Re: [PATCH] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14  4:15     ` Hiroaki SHIMODA
@ 2011-04-14 13:34       ` Scot Doyle
  0 siblings, 0 replies; 56+ messages in thread
From: Scot Doyle @ 2011-04-14 13:34 UTC (permalink / raw)
  To: Hiroaki SHIMODA, Eric Dumazet; +Cc: Stephen Hemminger, David Miller, netdev

I tested the three patches linked below, plus the two patches previously
accepted by David in this thread, with 2.6.39-rc3 commit
85f2e689a5c8fb6ed8fdbee00109e7f6e5fefcb6. No panics :-)

http://article.gmane.org/gmane.linux.network/192293
http://article.gmane.org/gmane.linux.network/192299
http://article.gmane.org/gmane.linux.network/192301

------------

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 008ff6c..10ac127 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		goto drop;
 	}
 
-	/* Zero out the CB buffer if no options present */
-	if (iph->ihl == 5) {
-		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	if (iph->ihl == 5)
 		return 0;
-	}
 
 	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
 	if (ip_options_compile(dev_net(dev), opt, skb))
@@ -265,7 +263,7 @@ static int br_parse_ip_options(struct sk_buff *skb)
 		if (in_dev && !IN_DEV_SOURCE_ROUTE(in_dev))
 			goto drop;
 
-		if (ip_options_rcv_srr(skb))
+		if (skb_rtable(skb) && ip_options_rcv_srr(skb))
 			goto drop;
 	}
 
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index dd1b20e..9df4e63 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -354,7 +354,8 @@ static void inetpeer_free_rcu(struct rcu_head *head)
 }
 
 /* May be called with local BH enabled. */
-static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
+static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base,
+			     struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	int do_free;
 
@@ -368,7 +369,6 @@ static void unlink_from_pool(struct inet_peer *p, struct inet_peer_base *base)
 	 * We use refcnt=-1 to alert lockless readers this entry is deleted.
 	 */
 	if (atomic_cmpxchg(&p->refcnt, 1, -1) == 1) {
-		struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 		struct inet_peer __rcu ***stackptr, ***delp;
 		if (lookup(&p->daddr, stack, base) != p)
 			BUG();
@@ -422,7 +422,7 @@ static struct inet_peer_base *peer_to_base(struct inet_peer *p)
 }
 
 /* May be called with local BH enabled. */
-static int cleanup_once(unsigned long ttl)
+static int cleanup_once(unsigned long ttl, struct inet_peer __rcu **stack[PEER_MAXDEPTH])
 {
 	struct inet_peer *p = NULL;
 
@@ -454,7 +454,7 @@ static int cleanup_once(unsigned long ttl)
 		 * happen because of entry limits in route cache. */
 		return -1;
 
-	unlink_from_pool(p, peer_to_base(p));
+	unlink_from_pool(p, peer_to_base(p), stack);
 	return 0;
 }
 
@@ -524,7 +524,7 @@ struct inet_peer *inet_getpeer(struct inetpeer_addr *daddr, int create)
 
 	if (base->total >= inet_peer_threshold)
 		/* Remove one less-recently-used entry. */
-		cleanup_once(0);
+		cleanup_once(0, stack);
 
 	return p;
 }
@@ -540,6 +540,7 @@ static void peer_check_expire(unsigned long dummy)
 {
 	unsigned long now = jiffies;
 	int ttl, total;
+	struct inet_peer __rcu **stack[PEER_MAXDEPTH];
 
 	total = compute_total();
 	if (total >= inet_peer_threshold)
@@ -548,7 +549,7 @@ static void peer_check_expire(unsigned long dummy)
 		ttl = inet_peer_maxttl
 				- (inet_peer_maxttl - inet_peer_minttl) / HZ *
 					total / inet_peer_threshold * HZ;
-	while (!cleanup_once(ttl)) {
+	while (!cleanup_once(ttl, stack)) {
 		if (jiffies != now)
 			break;
 	}
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 28a736f..546dd02 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -329,7 +329,7 @@ int ip_options_compile(struct net *net,
 					pp_ptr = optptr + 2;
 					goto error;
 				}
-				if (skb) {
+				if (rt) {
 					memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4);
 					opt->is_changed = 1;
 				}
@@ -371,7 +371,7 @@ int ip_options_compile(struct net *net,
 						goto error;
 					}
 					opt->ts = optptr - iph;
-					if (skb) {
+					if (rt)  {
 						memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4);
 						timeptr = (__be32*)&optptr[optptr[2]+3];
 					}
@@ -606,7 +606,7 @@ int ip_options_rcv_srr(struct sk_buff *skb)
 	if (!opt->srr)
 		return 0;
 
-	if (skb->pkt_type != PACKET_HOST)
+	if (skb->pkt_type != PACKET_HOST || !rt)
 		return -EINVAL;
 	if (rt->rt_type == RTN_UNICAST) {
 		if (!opt->is_strictroute)

^ permalink raw reply related	[flat|nested] 56+ messages in thread
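Editorial sketch: the inetpeer.c part of the combined diff above moves the `stack[PEER_MAXDEPTH]` scratch array out of unlink_from_pool() and has callers pass it down, so nested calls do not each reserve their own copy on the kernel stack. The user-space C below shows the same idea on a plain binary search tree. `struct node`, `MAX_DEPTH`, and both function names are hypothetical and only mirror the shape of the change; this is not the kernel's AVL code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 32	/* hypothetical stand-in for PEER_MAXDEPTH */

struct node {
	int key;
	struct node *left, *right;
};

/*
 * Walk from *root toward key, recording each visited link in the
 * caller-provided scratch array. Returns the number of links recorded.
 * The callee declares no large array of its own.
 */
static size_t record_path(struct node **root, int key,
			  struct node **stack[MAX_DEPTH])
{
	size_t depth = 0;
	struct node **link = root;

	assert(root != NULL);
	while (*link && depth < MAX_DEPTH) {
		stack[depth++] = link;
		if ((*link)->key == key)
			break;
		link = key < (*link)->key ? &(*link)->left : &(*link)->right;
	}
	return depth;
}

/* The caller owns one scratch array and reuses it for every lookup. */
static size_t total_path_length(struct node **root, const int *keys, size_t n)
{
	struct node **stack[MAX_DEPTH];
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += record_path(root, keys[i], stack);
	return total;
}
```

The trade-off is a slightly wider function signature in exchange for a single buffer per call chain, which matters on a small fixed-size stack.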
* [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14  3:30 ` Hiroaki SHIMODA
  2011-04-14  3:37   ` Eric Dumazet
@ 2011-04-14 15:55   ` Eric Dumazet
  2011-04-14 22:02     ` Scot Doyle
  2011-04-14 23:20     ` Hiroaki SHIMODA
  1 sibling, 2 replies; 56+ messages in thread
From: Eric Dumazet @ 2011-04-14 15:55 UTC (permalink / raw)
  To: Hiroaki SHIMODA; +Cc: Stephen Hemminger, David Miller, lkml, netdev

Scot Doyle demonstrated ip_options_compile() could be called with an skb
without an attached route, using a setup involving a bridge, netfilter,
and forged IP packets.

Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
robust, instead of changing bridge/netfilter code.

With help from Hiroaki SHIMODA.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
---
v2: ip_options_rcv_srr() fix as well, from Hiroaki

 net/ipv4/ip_options.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 28a736f..2391b24 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -329,7 +329,7 @@ int ip_options_compile(struct net *net,
 					pp_ptr = optptr + 2;
 					goto error;
 				}
-				if (skb) {
+				if (rt) {
 					memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4);
 					opt->is_changed = 1;
 				}
@@ -371,7 +371,7 @@ int ip_options_compile(struct net *net,
 						goto error;
 					}
 					opt->ts = optptr - iph;
-					if (skb) {
+					if (rt)  {
 						memcpy(&optptr[optptr[2]-1], &rt->rt_spec_dst, 4);
 						timeptr = (__be32*)&optptr[optptr[2]+3];
 					}
@@ -603,7 +603,7 @@ int ip_options_rcv_srr(struct sk_buff *skb)
 	unsigned long orefdst;
 	int err;
 
-	if (!opt->srr)
+	if (!opt->srr || !rt)
 		return 0;
 
 	if (skb->pkt_type != PACKET_HOST)

^ permalink raw reply related	[flat|nested] 56+ messages in thread
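Editorial sketch: the v2 patch above replaces tests on `skb` with tests on `rt`, because a non-NULL skb does not imply a non-NULL route. A minimal user-space illustration of that guard follows. `struct mock_rtable`, `struct mock_skb`, and `record_spec_dst()` are hypothetical names; only the field name `rt_spec_dst` is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's struct rtable and struct sk_buff. */
struct mock_rtable {
	uint32_t rt_spec_dst;	/* address that would be written into the option */
};

struct mock_skb {
	struct mock_rtable *rt;	/* NULL when no route is attached */
};

/*
 * Copy the route's address into a 4-byte option slot.
 * Returns 1 if the slot was written, 0 if it was left untouched.
 */
static int record_spec_dst(const struct mock_skb *skb, unsigned char slot[4])
{
	const struct mock_rtable *rt = skb ? skb->rt : NULL;

	assert(slot != NULL);

	/* Checking only "skb" here would let a NULL rt reach the memcpy. */
	if (!rt)
		return 0;

	memcpy(slot, &rt->rt_spec_dst, 4);
	return 1;
}
```

In this sketch a route-less skb simply leaves the option bytes unmodified, which matches the earlier remark in the thread that the bridge should stay transparent rather than add a record-route entry.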
* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14 15:55   ` [PATCH v2] " Eric Dumazet
@ 2011-04-14 22:02     ` Scot Doyle
  2011-04-14 22:04       ` David Miller
  2011-04-14 23:20     ` Hiroaki SHIMODA
  1 sibling, 1 reply; 56+ messages in thread
From: Scot Doyle @ 2011-04-14 22:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hiroaki SHIMODA, Stephen Hemminger, David Miller, netdev

On 04/14/2011 10:55 AM, Eric Dumazet wrote:
> Scot Doyle demonstrated ip_options_compile() could be called with an skb
> without an attached route, using a setup involving a bridge, netfilter,
> and forged IP packets.
>
> Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
> robust, instead of changing bridge/netfilter code.
>
> With help from Hiroaki SHIMODA.
>
> Reported-by: Scot Doyle<lkml@scotdoyle.com>
> Signed-off-by: Eric Dumazet<eric.dumazet@gmail.com>
> Cc: Stephen Hemminger<shemminger@vyatta.com>
> Cc: Hiroaki SHIMODA<shimoda.hiroaki@gmail.com>
> ---
> v2: ip_options_rcv_srr() fix as well, from Hiroaki
>
>  net/ipv4/ip_options.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
> index 28a736f..2391b24 100644
> --- a/net/ipv4/ip_options.c
> +++ b/net/ipv4/ip_options.c
> @@ -329,7 +329,7 @@ int ip_options_compile(struct net *net,
>  					pp_ptr = optptr + 2;
>  					goto error;
>  				}
> -				if (skb) {
> +				if (rt) {
>  					memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);
>  					opt->is_changed = 1;
>  				}
> @@ -371,7 +371,7 @@ int ip_options_compile(struct net *net,
>  						goto error;
>  					}
>  					opt->ts = optptr - iph;
> -					if (skb) {
> +					if (rt)  {
>  						memcpy(&optptr[optptr[2]-1],&rt->rt_spec_dst, 4);
>  						timeptr = (__be32*)&optptr[optptr[2]+3];
>  					}
> @@ -603,7 +603,7 @@ int ip_options_rcv_srr(struct sk_buff *skb)
>  	unsigned long orefdst;
>  	int err;
>
> -	if (!opt->srr)
> +	if (!opt->srr || !rt)
>  		return 0;
>
>  	if (skb->pkt_type != PACKET_HOST)

The 2.6.39-rc3 kernel, plus this patch and the two patches previously
accepted by David in this thread, didn't panic when tested with the IP
Stack Checker tool hitting either the assigned bridge IP address or a
guest virtual machine IP address sharing that bridge.

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14 22:02     ` Scot Doyle
@ 2011-04-14 22:04       ` David Miller
  0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-04-14 22:04 UTC (permalink / raw)
  To: lkml; +Cc: eric.dumazet, shimoda.hiroaki, shemminger, netdev

From: Scot Doyle <lkml@scotdoyle.com>
Date: Thu, 14 Apr 2011 17:02:48 -0500

> The 2.6.39-rc3 kernel, plus this patch and the two patches previously
> accepted by David in this thread, didn't panic when tested with the IP
> Stack Checker tool hitting either the assigned bridge IP address or a
> guest virtual machine IP address sharing that bridge.

Thank you for testing.

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14 15:55   ` [PATCH v2] " Eric Dumazet
  2011-04-14 22:02     ` Scot Doyle
@ 2011-04-14 23:20     ` Hiroaki SHIMODA
  2011-04-15  6:26       ` David Miller
  1 sibling, 1 reply; 56+ messages in thread
From: Hiroaki SHIMODA @ 2011-04-14 23:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Stephen Hemminger, David Miller, lkml, netdev

On Thu, 14 Apr 2011 17:55:37 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Scot Doyle demonstrated ip_options_compile() could be called with an skb
> without an attached route, using a setup involving a bridge, netfilter,
> and forged IP packets.
>
> Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
> robust, instead of changing bridge/netfilter code.
>
> With help from Hiroaki SHIMODA.
>
> Reported-by: Scot Doyle <lkml@scotdoyle.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>

Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>

Thanks.

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: [PATCH v2] ip: ip_options_compile() resilient to NULL skb route
  2011-04-14 23:20     ` Hiroaki SHIMODA
@ 2011-04-15  6:26       ` David Miller
  0 siblings, 0 replies; 56+ messages in thread
From: David Miller @ 2011-04-15 6:26 UTC (permalink / raw)
  To: shimoda.hiroaki; +Cc: eric.dumazet, shemminger, lkml, netdev

From: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Date: Fri, 15 Apr 2011 08:20:22 +0900

> On Thu, 14 Apr 2011 17:55:37 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> Scot Doyle demonstrated ip_options_compile() could be called with an skb
>> without an attached route, using a setup involving a bridge, netfilter,
>> and forged IP packets.
>>
>> Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
>> robust, instead of changing bridge/netfilter code.
>>
>> With help from Hiroaki SHIMODA.
>>
>> Reported-by: Scot Doyle <lkml@scotdoyle.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> Cc: Stephen Hemminger <shemminger@vyatta.com>
>> Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
> Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>

Applied, thanks everyone.

^ permalink raw reply	[flat|nested] 56+ messages in thread
* Re: Kernel panic when using bridge
  2011-04-12 16:14 ` Eric Dumazet
  2011-04-12 16:20   ` Stephen Hemminger
@ 2011-04-12 16:32   ` Bandan Das
  1 sibling, 0 replies; 56+ messages in thread
From: Bandan Das @ 2011-04-12 16:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jan Lübbe, Scot Doyle, Stephen Hemminger, Hiroaki SHIMODA,
	netdev, Bandan Das

On  0, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tuesday, 12 April 2011 at 17:13 +0200, Jan Lübbe wrote:
> > On Tue, 2011-04-12 at 16:49 +0200, Eric Dumazet wrote:
> > > Of course, this might be a complete shot in the dark, but a
> > > stackprotector fault in icmp_send() really sounds like a problem in
> > > ip_options_echo() [ or bad input data given to this function ]
> >
> > It was my understanding that all IP options given to ip_options_echo are
> > either from local sources or have gone through ip_options_compile, which
> > seems to verify that the sum of the individual option lengths do not
> > exceed the ip header. So there wouldn't need to be additional checks in
> > ip_options_echo.
> >
> > If this is not the case, we need size checks in ip_options_echo before
> > copying over each option.
> >
> > > Other related changes (but as old as v2.6.22) :
> > >
> > > commit 11a03f78fbf15a866ba
> > > ([NetLabel]: core network changes)
> >
> > When investigating the problem I had with timestamps, i found that most
> > of the lines in ip_options_echo and _compile have not been changed since
> > before 2.2 (some even before 2.0). The newer changes have all been
> > updates for changed API elsewhere in the stack.
> >
>
> commit 462fb2af9788a82 might be the problem.
> (bridge : Sanitize skb before it enters the IP stack)
>
> We are supposed to provide a zeroed ip_options to ip_options_compile()
>
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 008ff6c..f3bc322 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -249,11 +249,9 @@ static int br_parse_ip_options(struct sk_buff *skb)
>  		goto drop;
>  	}
>
> -	/* Zero out the CB buffer if no options present */
> -	if (iph->ihl == 5) {
> -		memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +	if (iph->ihl == 5)
>  		return 0;
> -	}
>
>  	opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
>  	if (ip_options_compile(dev_net(dev), opt, skb))
>

Looks good to me. The CB area should be cleared out anyways before
handing over the packet. Thank you for spotting this!

Acked-by: Bandan Das <bandan.das@stratus.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread
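Editorial sketch: the quoted patch moves the memset() of the control block ahead of the `ihl == 5` test, so that the option parser always starts from zeroed state instead of whatever an earlier layer left in the shared buffer. The user-space C below contrasts the two orderings. `struct mock_opts`, `struct mock_skb`, and both function names are hypothetical, and the parser itself is reduced to setting `optlen`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, tiny stand-in for the options state kept in the skb's cb[]. */
struct mock_opts {
	unsigned char optlen;	/* option bytes following the 20-byte header */
	unsigned char srr;	/* offset of a source-route option, 0 if none */
	unsigned char ts;	/* offset of a timestamp option, 0 if none */
};

struct mock_skb {
	unsigned char ihl;	/* header length in 32-bit words; 5 = no options */
	struct mock_opts cb;	/* scratch area shared between layers */
};

/* Ordering before the patch: cb is cleared only when there are no options. */
static void parse_options_old(struct mock_skb *skb)
{
	assert(skb->ihl >= 5);
	if (skb->ihl == 5) {
		memset(&skb->cb, 0, sizeof(skb->cb));
		return;
	}
	/* Stale srr/ts values left by an earlier user of cb survive here. */
	skb->cb.optlen = (unsigned char)(skb->ihl * 4 - 20);
}

/* Ordering after the patch: cb is always cleared before parsing starts. */
static void parse_options_new(struct mock_skb *skb)
{
	assert(skb->ihl >= 5);
	memset(&skb->cb, 0, sizeof(skb->cb));
	if (skb->ihl == 5)
		return;
	skb->cb.optlen = (unsigned char)(skb->ihl * 4 - 20);
}
```

With `ihl == 6` and a stale nonzero `srr` already in `cb`, the old ordering keeps the stale offset while the new ordering reports no source route, which is the state a parser would expect to begin from.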