netdev.vger.kernel.org archive mirror
* net-next kernel NULL pointer dereference at fib_rules_tclass
@ 2012-07-10  7:16 Or Gerlitz
  2012-07-10  8:42 ` Lin Ming
  2012-07-10 16:44 ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Or Gerlitz @ 2012-07-10  7:16 UTC
  To: David Miller; +Cc: netdev, Shlomo Pongratz, Amir Vadai, Erez Shitrit

Hi Dave,

Using the latest net-next (061a5c316b6526dbc729049a16243ec27937cc31), I
get the crash below during boot. It happens on a set of nodes that use
igb for their onboard 1G NIC, as soon as the device goes up. Another
group of nodes, in a second lab, uses bnx2 for the 1G NIC and does not
hit this crash, but the kernel there is built with a different .config.

Or.

Bringing up loopback interface:  [  OK  ]
Bringing up interface eth1:
Determining IP information for eth1...IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
PGD 223171067 PUD 22353e067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in:
 ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr rng_core ioatdma dca shpchp dm_mod button sr_mod ext3 jbd sd_mod usb_storage ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last unloaded: scsi_wait_scan]

Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12540-g061a5c3-dirty #94 Supermicro X7DWU/X7DWU
RIP: 0010:[<ffffffff81320393>]  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
RSP: 0018:ffff88022fc03a30  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88022fc03b54 RCX: 0000000000000050
RDX: 0000000000000020 RSI: 0000000000000001 RDI: ffff88022fc03a40
RBP: ffff88022fc03a30 R08: ffff88022fc03a70 R09: ffff88022fc03a40
R10: 0000000000000020 R11: ffff880225390a80 R12: 0000000000000001
R13: ffff88021cc7a000 R14: 0000000000000000 R15: ffff8802269c26c0
FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000ac CR3: 0000000222aeb000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81613410)
Stack:
 ffff88022fc03ac0 ffffffff81318956 ffff8802fd010010 ffff8802232d5a80
 ffff880222add880 ffff880223269a98 0000000000000020 ffff880200000000
 0000000100000000 ffff000000000000 12311eac2540eaf0 ffff88027e001eac
Call Trace:
 <IRQ>

 [<ffffffff81318956>] fib_validate_source+0x170/0x2a5
 [<ffffffff812e6603>] ip_route_input_common+0x6fe/0xd12
 [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
 [<ffffffff812e8461>] ip_rcv_finish+0x151/0x457
 [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
 [<ffffffff812e89a1>] ip_rcv+0x23a/0x260
 [<ffffffff812beae7>] __netif_receive_skb+0x3ac/0x415
 [<ffffffff812be86f>] ? __netif_receive_skb+0x134/0x415
 [<ffffffff81312ae5>] ? inet_gro_receive+0x81/0x23f
 [<ffffffff812b68da>] ? skb_free_head+0x47/0x49
 [<ffffffff812c035d>] netif_receive_skb+0xee/0xf7
 [<ffffffff812c071d>] ? dev_gro_receive+0x15f/0x2fb
 [<ffffffff812c063a>] ? dev_gro_receive+0x7c/0x2fb
 [<ffffffff81065644>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff812c044c>] napi_skb_finish+0x24/0x56
 [<ffffffff812c0bf0>] napi_gro_receive+0x10f/0x11e
 [<ffffffffa0216e85>] igb_poll+0x843/0xae5 [igb]
 [<ffffffff812c0e01>] ? net_rx_action+0x14c/0x1ee
 [<ffffffff812c0d76>] net_rx_action+0xc1/0x1ee
 [<ffffffff8102f746>] __do_softirq+0xff/0x1de
 [<ffffffff813631cc>] call_softirq+0x1c/0x26
 [<ffffffff81003090>] do_softirq+0x38/0x80
 [<ffffffff8102f41f>] irq_exit+0x4e/0x83
 [<ffffffff810028f9>] do_IRQ+0x98/0xaf
 [<ffffffff8135b52c>] common_interrupt+0x6c/0x6c
 <EOI>

 [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
 [<ffffffff810083e3>] ? mwait_idle+0x133/0x208
 [<ffffffff810088d1>] cpu_idle+0x6e/0xab
 [<ffffffff81343e13>] rest_init+0xc7/0xce
 [<ffffffff81343d4c>] ? csum_partial_copy_generic+0x16c/0x16c
 [<ffffffff8167fbf3>] start_kernel+0x332/0x33f
 [<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
 [<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
 [<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
Code: 81 31 c0 e8 a5 bb dd ff 48 83 c4 28 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 90 90 90 48 8b 57 20 55 31 c0 48 89 e5 48 85 d2 74 06 <8b> 82 8c 00 00 00 c9 c3 8b 47 7c 33 46 14 85 87 80 00 00 00 55
RIP  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
 RSP <ffff88022fc03a30>
CR2: 00000000000000ac
---[ end trace e7c6714b8de1c341 ]---
Kernel panic - not syncing: Fatal exception in interrupt
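
For readers decoding the report: the "Oops: 0000" field is the x86
page-fault error code in hex. Below is a minimal sketch of how its low
bits are usually read. The names pf_info and decode_pf_error are made
up for this illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Low bits of the x86 page-fault error code. */
enum {
	PF_PROT  = 1u << 0,	/* 0: page not present, 1: protection violation */
	PF_WRITE = 1u << 1,	/* 0: read access, 1: write access */
	PF_USER  = 1u << 2	/* 0: kernel mode, 1: user mode */
};

/* Hypothetical helper type, for illustration only. */
struct pf_info {
	bool page_present;
	bool is_write;
	bool from_user;
};

struct pf_info decode_pf_error(unsigned int code)
{
	struct pf_info info = {
		.page_present = (code & PF_PROT) != 0,
		.is_write     = (code & PF_WRITE) != 0,
		.from_user    = (code & PF_USER) != 0,
	};
	return info;
}

/* Usage: the code reported above, "Oops: 0000". */
void check_reported_oops(void)
{
	struct pf_info info = decode_pf_error(0x0000);

	assert(!info.page_present);	/* no page mapped at the address */
	assert(!info.is_write);		/* the access was a read */
	assert(!info.from_user);	/* it happened in kernel mode */
}
```

With all three bits clear, the fault is a kernel-mode read of an
address with no page mapped, which is consistent with
CR2 = 00000000000000ac lying in the unmapped first page.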

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10  7:16 net-next kernel NULL pointer dereference at fib_rules_tclass Or Gerlitz
@ 2012-07-10  8:42 ` Lin Ming
  2012-07-10  9:00   ` David Miller
  2012-07-10 16:44 ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Lin Ming @ 2012-07-10  8:42 UTC
  To: Or Gerlitz
  Cc: David Miller, netdev, Shlomo Pongratz, Amir Vadai, Erez Shitrit

On Tue, Jul 10, 2012 at 3:16 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi Dave,
>
> Using latest net-next (061a5c316b6526dbc729049a16243ec27937cc31) I
> get the below crash during the boot cycle. The crash happens on a set of
> nodes which use igb for their onboard 1g nic, as soon as the device goes
> up. Another group, that uses a 2nd lab, where the nodes use bnx2 for 1g
> NIC doesn't get this crash, but the kernel there is built by a different
> .config .

Hi,

I got a similar panic, but not at boot time.
I'll look for the cause.

Regards,
Lin Ming

>
> Or.
>
> Bringing up loopback interface:  [  OK  ]
> Bringing up interface eth1:
> Determining IP information for eth1...IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
> igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
> IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
> PGD 223171067 PUD 22353e067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 0
> Modules linked in:
>  ipv6 dm_mirror dm_region_hash dm_log uinput igb ptp pps_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core sg kvm_intel kvm microcode pcspkr rng_core ioatdma dca shpchp dm_mod button sr_mod ext3 jbd sd_mod usb_storage ata_piix libata scsi_mod ehci_hcd uhci_hcd floppy [last unloaded: scsi_wait_scan]
>
> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc5-12540-g061a5c3-dirty #94 Supermicro X7DWU/X7DWU
> RIP: 0010:[<ffffffff81320393>]  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
> RSP: 0018:ffff88022fc03a30  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff88022fc03b54 RCX: 0000000000000050
> RDX: 0000000000000020 RSI: 0000000000000001 RDI: ffff88022fc03a40
> RBP: ffff88022fc03a30 R08: ffff88022fc03a70 R09: ffff88022fc03a40
> R10: 0000000000000020 R11: ffff880225390a80 R12: 0000000000000001
> R13: ffff88021cc7a000 R14: 0000000000000000 R15: ffff8802269c26c0
> FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000000ac CR3: 0000000222aeb000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81613410)
> Stack:
>  ffff88022fc03ac0 ffffffff81318956 ffff8802fd010010 ffff8802232d5a80
>  ffff880222add880 ffff880223269a98 0000000000000020 ffff880200000000
>  0000000100000000 ffff000000000000 12311eac2540eaf0 ffff88027e001eac
> Call Trace:
>  <IRQ>
>
>  [<ffffffff81318956>] fib_validate_source+0x170/0x2a5
>  [<ffffffff812e6603>] ip_route_input_common+0x6fe/0xd12
>  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
>  [<ffffffff812e8461>] ip_rcv_finish+0x151/0x457
>  [<ffffffff812e8380>] ? ip_rcv_finish+0x70/0x457
>  [<ffffffff812e89a1>] ip_rcv+0x23a/0x260
>  [<ffffffff812beae7>] __netif_receive_skb+0x3ac/0x415
>  [<ffffffff812be86f>] ? __netif_receive_skb+0x134/0x415
>  [<ffffffff81312ae5>] ? inet_gro_receive+0x81/0x23f
>  [<ffffffff812b68da>] ? skb_free_head+0x47/0x49
>  [<ffffffff812c035d>] netif_receive_skb+0xee/0xf7
>  [<ffffffff812c071d>] ? dev_gro_receive+0x15f/0x2fb
>  [<ffffffff812c063a>] ? dev_gro_receive+0x7c/0x2fb
>  [<ffffffff81065644>] ? trace_hardirqs_on+0xd/0xf
>  [<ffffffff812c044c>] napi_skb_finish+0x24/0x56
>  [<ffffffff812c0bf0>] napi_gro_receive+0x10f/0x11e
>  [<ffffffffa0216e85>] igb_poll+0x843/0xae5 [igb]
>  [<ffffffff812c0e01>] ? net_rx_action+0x14c/0x1ee
>  [<ffffffff812c0d76>] net_rx_action+0xc1/0x1ee
>  [<ffffffff8102f746>] __do_softirq+0xff/0x1de
>  [<ffffffff813631cc>] call_softirq+0x1c/0x26
>  [<ffffffff81003090>] do_softirq+0x38/0x80
>  [<ffffffff8102f41f>] irq_exit+0x4e/0x83
>  [<ffffffff810028f9>] do_IRQ+0x98/0xaf
>  [<ffffffff8135b52c>] common_interrupt+0x6c/0x6c
>  <EOI>
>
>  [<ffffffff810083ec>] ? mwait_idle+0x13c/0x208
>  [<ffffffff810083e3>] ? mwait_idle+0x133/0x208
>  [<ffffffff810088d1>] cpu_idle+0x6e/0xab
>  [<ffffffff81343e13>] rest_init+0xc7/0xce
>  [<ffffffff81343d4c>] ? csum_partial_copy_generic+0x16c/0x16c
>  [<ffffffff8167fbf3>] start_kernel+0x332/0x33f
>  [<ffffffff8167f6f6>] ? kernel_init+0x19d/0x19d
>  [<ffffffff8167f2b4>] x86_64_start_reservations+0xb8/0xbd
>  [<ffffffff8167f3a6>] x86_64_start_kernel+0xed/0xf4
> Code: 81 31 c0 e8 a5 bb dd ff 48 83 c4 28 31 c0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 90 90 90 48 8b 57 20 55 31 c0 48 89 e5 48 85 d2 74 06 <8b> 82 8c 00 00 00 c9 c3 8b 47 7c 33 46 14 85 87 80 00 00 00 55
> RIP  [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
>  RSP <ffff88022fc03a30>
> CR2: 00000000000000ac
> ---[ end trace e7c6714b8de1c341 ]---
> Kernel panic - not syncing: Fatal exception in interrupt

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10  8:42 ` Lin Ming
@ 2012-07-10  9:00   ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2012-07-10  9:00 UTC
  To: mlin; +Cc: ogerlitz, netdev, shlomop, amirv, erezsh

From: Lin Ming <mlin@ss.pku.edu.cn>
Date: Tue, 10 Jul 2012 16:42:29 +0800

> On Tue, Jul 10, 2012 at 3:16 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> Hi Dave,
>>
>> Using latest net-next (061a5c316b6526dbc729049a16243ec27937cc31) I
>> get the below crash during the boot cycle. The crash happens on a set of
>> nodes which use igb for their onboard 1g nic, as soon as the device goes
>> up. Another group, that uses a 2nd lab, where the nodes use bnx2 for 1g
>> NIC doesn't get this crash, but the kernel there is built by a different
>> .config .
> 
> Hi,
> 
> I got similar panic, but not at boot time.
> I'll look for the cause.

Don't worry about it, I am sure that I added this bug and therefore
I will fix it.

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10  7:16 net-next kernel NULL pointer dereference at fib_rules_tclass Or Gerlitz
  2012-07-10  8:42 ` Lin Ming
@ 2012-07-10 16:44 ` David Miller
  2012-07-10 17:25   ` Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2012-07-10 16:44 UTC
  To: ogerlitz; +Cc: netdev, shlomop, amirv, erezsh

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Tue, 10 Jul 2012 10:16:55 +0300

> Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
> IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17

Ok, fib_rules_tclass() checks for res->r being NULL and only
dereferences it if it is not.

fib4_rule->tclassid has offset ~0x8c on x86-64, and this fault
address (0xac) is 0x20 bytes off.
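
A hedged reading of the oops, not something stated in the thread: the
faulting bytes "<8b> 82 8c 00 00 00" in the Code: line decode to a
32-bit load from 0x8c(%rdx), and the register dump shows
RDX = 0000000000000020. The sketch below only redoes that address
arithmetic; effective_address and check_fault_address are illustrative
names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of a load of the form "mov disp32(%reg), %eax":
 * the base register plus the sign-extended 32-bit displacement.
 */
uint64_t effective_address(uint64_t base, int32_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}

/* Usage: values copied from the oops above. */
void check_fault_address(void)
{
	const uint64_t rdx  = 0x20;	/* RDX in the register dump */
	const int32_t  disp = 0x8c;	/* displacement in 8b 82 8c 00 00 00 */
	const uint64_t cr2  = 0xac;	/* reported fault address */

	assert(effective_address(rdx, disp) == cr2);
}
```

Under this reading the pointer tested by the NULL check held the small
non-NULL value 0x20, so the check passed and the load at offset 0x8c
faulted.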

Does this patch fix the problem?

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 539c672..000c467 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -230,6 +230,7 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp,
 			     struct fib_result *res)
 {
 	if (!net->ipv4.fib_has_custom_rules) {
+		res->r = NULL;
 		if (net->ipv4.fib_local &&
 		    !fib_table_lookup(net->ipv4.fib_local, flp, res,
 				      FIB_LOOKUP_NOREF))
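
The pattern the patch appears to address can be sketched outside the
kernel as follows. This is an illustration under assumptions, not the
kernel code: struct rule, struct result, rules_tclass and
lookup_fast_path are simplified stand-ins, and the stale pointer is
simulated with a valid object so that the example stays well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's structure definitions. */
struct rule {
	unsigned int tclassid;
};

struct result {
	struct rule *r;
};

/* Mirrors the NULL check described for fib_rules_tclass(). */
unsigned int rules_tclass(const struct result *res)
{
	return res->r ? res->r->tclassid : 0;
}

/*
 * Stand-in for the lookup fast path taken when no custom rules exist.
 * clear_rule selects the patched behaviour (res->r = NULL).
 */
void lookup_fast_path(struct result *res, int clear_rule)
{
	if (clear_rule)
		res->r = NULL;
	/* A table lookup would fill other result fields here, never r. */
}

/* Usage: a caller whose result still holds a leftover pointer. */
void demo(void)
{
	struct rule stale = { .tclassid = 0xdead };
	struct result res = { .r = &stale };	/* simulated stale value */

	lookup_fast_path(&res, 0);		/* unpatched: r left alone */
	assert(rules_tclass(&res) == 0xdead);	/* NULL check passes, stale read */

	lookup_fast_path(&res, 1);		/* patched: r cleared */
	assert(rules_tclass(&res) == 0);
}
```

In the real crash the leftover value was not a valid pointer, so the
read faulted instead of returning stale data.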

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10 16:44 ` David Miller
@ 2012-07-10 17:25   ` Eric Dumazet
  2012-07-10 18:14     ` Greg Rose
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2012-07-10 17:25 UTC
  To: David Miller; +Cc: ogerlitz, netdev, shlomop, amirv, erezsh

On Tue, 2012-07-10 at 09:44 -0700, David Miller wrote:
> From: Or Gerlitz <ogerlitz@mellanox.com>
> Date: Tue, 10 Jul 2012 10:16:55 +0300
> 
> > Starting system logger: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ac
> > IP: [<ffffffff81320393>] fib_rules_tclass+0xf/0x17
> 
> Ok, fib_rules_tclass() checks for res->r being NULL and only
> dereferences it if it is not.
> 
> fib4_rule->tclassid has offset ~0x8c on x86-64, and this fault
> address is 0x10 bytes off.
> 
> Does this patch fix the problem?
> 
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 539c672..000c467 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -230,6 +230,7 @@ static inline int fib_lookup(struct net *net, struct flowi4 *flp,
>  			     struct fib_result *res)
>  {
>  	if (!net->ipv4.fib_has_custom_rules) {
> +		res->r = NULL;
>  		if (net->ipv4.fib_local &&
>  		    !fib_table_lookup(net->ipv4.fib_local, flp, res,
>  				      FIB_LOOKUP_NOREF))

It does here, thanks

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10 17:25   ` Eric Dumazet
@ 2012-07-10 18:14     ` Greg Rose
  2012-07-11  1:05       ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Greg Rose @ 2012-07-10 18:14 UTC
  To: Eric Dumazet; +Cc: David Miller, ogerlitz, netdev, shlomop, amirv, erezsh

On Tue, 10 Jul 2012 19:25:01 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Tue, 2012-07-10 at 09:44 -0700, David Miller wrote:
> > From: Or Gerlitz <ogerlitz@mellanox.com>
> > Date: Tue, 10 Jul 2012 10:16:55 +0300
> > 
> > > Starting system logger: BUG: unable to handle kernel NULL pointer
> > > dereference at 00000000000000ac IP: [<ffffffff81320393>]
> > > fib_rules_tclass+0xf/0x17
> > 
> > Ok, fib_rules_tclass() checks for res->r being NULL and only
> > dereferences it if it is not.
> > 
> > fib4_rule->tclassid has offset ~0x8c on x86-64, and this fault
> > address is 0x10 bytes off.
> > 
> > Does this patch fix the problem?
> > 
> > diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> > index 539c672..000c467 100644
> > --- a/include/net/ip_fib.h
> > +++ b/include/net/ip_fib.h
> > @@ -230,6 +230,7 @@ static inline int fib_lookup(struct net *net,
> > struct flowi4 *flp, struct fib_result *res)
> >  {
> >  	if (!net->ipv4.fib_has_custom_rules) {
> > +		res->r = NULL;
> >  		if (net->ipv4.fib_local &&
> >  		    !fib_table_lookup(net->ipv4.fib_local, flp,
> > res, FIB_LOOKUP_NOREF))
> 
> It does here, thanks

Works for me too.

Thanks,


* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-10 18:14     ` Greg Rose
@ 2012-07-11  1:05       ` David Miller
  2012-07-11  7:42         ` Or Gerlitz
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2012-07-11  1:05 UTC
  To: gregory.v.rose; +Cc: eric.dumazet, ogerlitz, netdev, shlomop, amirv, erezsh

From: Greg Rose <gregory.v.rose@intel.com>
Date: Tue, 10 Jul 2012 11:14:34 -0700

> On Tue, 10 Jul 2012 19:25:01 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> On Tue, 2012-07-10 at 09:44 -0700, David Miller wrote:
>> > From: Or Gerlitz <ogerlitz@mellanox.com>
>> > Date: Tue, 10 Jul 2012 10:16:55 +0300
>> > 
>> > > Starting system logger: BUG: unable to handle kernel NULL pointer
>> > > dereference at 00000000000000ac IP: [<ffffffff81320393>]
>> > > fib_rules_tclass+0xf/0x17
>> > 
>> > Ok, fib_rules_tclass() checks for res->r being NULL and only
>> > dereferences it if it is not.
>> > 
>> > fib4_rule->tclassid has offset ~0x8c on x86-64, and this fault
>> > address is 0x10 bytes off.
>> > 
>> > Does this patch fix the problem?
>> > 
>> > diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
>> > index 539c672..000c467 100644
>> > --- a/include/net/ip_fib.h
>> > +++ b/include/net/ip_fib.h
>> > @@ -230,6 +230,7 @@ static inline int fib_lookup(struct net *net,
>> > struct flowi4 *flp, struct fib_result *res)
>> >  {
>> >  	if (!net->ipv4.fib_has_custom_rules) {
>> > +		res->r = NULL;
>> >  		if (net->ipv4.fib_local &&
>> >  		    !fib_table_lookup(net->ipv4.fib_local, flp,
>> > res, FIB_LOOKUP_NOREF))
>> 
>> It does here, thanks
> 
> Works for me too.

Great, pushed out to net-next, thanks everyone.

* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
  2012-07-11  1:05       ` David Miller
@ 2012-07-11  7:42         ` Or Gerlitz
  0 siblings, 0 replies; 9+ messages in thread
From: Or Gerlitz @ 2012-07-11  7:42 UTC
  To: David Miller; +Cc: gregory.v.rose, eric.dumazet, netdev, shlomop, amirv, erezsh

On 7/11/2012 4:05 AM, David Miller wrote:
> Great, pushed out to net-next, thanks everyone.

Works here too, no more crashes (on that one...)

Or.

Thread overview: 9+ messages
2012-07-10  7:16 net-next kernel NULL pointer dereference at fib_rules_tclass Or Gerlitz
2012-07-10  8:42 ` Lin Ming
2012-07-10  9:00   ` David Miller
2012-07-10 16:44 ` David Miller
2012-07-10 17:25   ` Eric Dumazet
2012-07-10 18:14     ` Greg Rose
2012-07-11  1:05       ` David Miller
2012-07-11  7:42         ` Or Gerlitz
  -- strict thread matches above, loose matches on Subject: below --
2012-07-10  7:29 Or Gerlitz