netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RFC: crash in fib6_clean_all() while loading ipv6 module
@ 2013-09-09 10:05 Michal Kubecek
  2013-09-09 17:27 ` David Miller
  0 siblings, 1 reply; 4+ messages in thread
From: Michal Kubecek @ 2013-09-09 10:05 UTC (permalink / raw)
  To: netdev

Hello,

Two of our customers encountered a crash in fib6_clean_all() when
booting their system:

[   12.408421] BUG: unable to handle kernel NULL pointer dereference at        (null)
[   12.408424] IP: [<ffffffffa02566b4>] fib6_clean_all+0x34/0xd0 [ipv6_lib]
[   12.408434] PGD c3590f067 PUD c29073067 PMD 0 
[   12.408436] Oops: 0000 [#1] SMP 
[   12.408439] CPU 1 
[   12.408440] Modules linked in: processor(+) ipv6(+) thermal_sys ipv6_lib
ahci(+) ixgbe(+) ehci_hcd libahci hwmon igb(+) libata usbcore i2c_i801 dca
iTCO_wdt pcspkr scsi_mod e1000e i2c_core usb_common iTCO_vendor_support
rtc_cmos container ptp pps_core button mdio
[   12.408449] Supported: Yes
[   12.408449] 
[   12.408451] Pid: 211, comm: modprobe Not tainted 3.0.76-0.11-default #1 ...
[   12.408453] RIP: 0010:[<ffffffffa02566b4>]  [<ffffffffa02566b4>] fib6_clean_all+0x34/0xd0 [ipv6_lib]
[   12.408460] RSP: 0018:ffff880c1e983b98  EFLAGS: 00010246
[   12.408462] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 0000000000000000
[   12.408463] RDX: 0000000000000000 RSI: ffffffffa0254f90 RDI: ffffffff81a72280
[   12.408464] RBP: ffffffff81a72280 R08: 0000000000000000 R09: 000000003520aec8
[   12.408466] R10: 000000005f000c28 R11: 000000003520aec8 R12: 0000000000000000
[   12.408467] R13: ffffffff81a72280 R14: 0000000000000000 R15: ffffffffa0254f90
[   12.408469] FS:  00007fb2039f1700(0000) GS:ffff880c7f040000(0000) knlGS:0000000000000000
[   12.408470] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   12.408471] CR2: 0000000000000000 CR3: 0000000c2904e000 CR4: 00000000001407e0
[   12.408473] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   12.408474] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   12.408476] Process modprobe (pid: 211, threadinfo ffff880c1e982000, task ffff880c3242c540)
[   12.408477] Stack:
[   12.408478]  0000000000000000 0000000000000000 0000000000000000 ffffffff8146544e
[   12.408480]  000000000006a9d0 0000000000000000 000000003520aec8 000000005f000c28
[   12.408483]  000000003520aec8 0000000000000000 0000000000000000 0000000000000000
[   12.408485] Call Trace:
[   12.408507]  [<ffffffffa02567a3>] fib6_run_gc+0x53/0xf0 [ipv6_lib]
[   12.408524]  [<ffffffffa025c5c6>] ndisc_netdev_event+0x186/0x190 [ipv6_lib]
[   12.408541]  [<ffffffff81460437>] notifier_call_chain+0x37/0x70
[   12.408547]  [<ffffffff813a6e52>] dev_addr_add+0x52/0x90
[   12.408556]  [<ffffffffa032e3e9>] ixgbe_probe+0x8c9/0xd10 [ixgbe]
[   12.408565]  [<ffffffff8127d5d4>] local_pci_probe+0x54/0xe0
[   12.408570]  [<ffffffff8127d740>] __pci_device_probe+0xe0/0xf0
[   12.408573]  [<ffffffff8127e953>] pci_device_probe+0x33/0x60
[   12.408578]  [<ffffffff8132ddda>] really_probe+0x7a/0x260
[   12.408581]  [<ffffffff8132e023>] driver_probe_device+0x63/0xc0
[   12.408585]  [<ffffffff8132e113>] __driver_attach+0x93/0xa0
[   12.408588]  [<ffffffff8132d468>] bus_for_each_dev+0x88/0xb0
[   12.408591]  [<ffffffff8132cbb5>] bus_add_driver+0x155/0x2a0
[   12.408595]  [<ffffffff8132e769>] driver_register+0x79/0x170
[   12.408599]  [<ffffffff8127ebe8>] __pci_register_driver+0x58/0xe0
[   12.408604]  [<ffffffff810001cb>] do_one_initcall+0x3b/0x180
[   12.408610]  [<ffffffff810a02df>] sys_init_module+0xcf/0x240
[   12.408615]  [<ffffffff81464592>] system_call_fastpath+0x16/0x1b

The crash is caused by fib6_clean_all() dereferencing a null pointer in
init_net.ipv6.fib_table_hash because ndisc_netdev_event() handler which
calls fib6_run_gc() is registered by ndisc_init() but the relevant data
structures aren't initialized until ip6_route_init() is called. If a
device intialization falls into this window, it emits NETDEV_CHANGEADDR,
leading to a crash.

This could be prevented by setting a flag when ip6_route_init() is
complete and not calling fib6_run_gc() from ndisc_netdev_event() until
the flag is set. However, I don't like the idea of adding a test which
will be useful only in a short window while loading ipv6 module.

The only other actions ndisc_netdev_event() does are flushing the neigh
table (which should be empty at the moment) and calling
ndisc_send_unsol_na() (which should do nothing as there is no IPv6
address assigned yet). Thus I believe taking the netdev event handler
registration call out of ndisc_init() into a separate function, say
ndisc_late_init(), which would be called after ip6_route_init() should
be a safe and efficient way to avoid the race condition.

Before I submit the patch, I wanted to ask if someone can see a problem
with this solution (or a better solution).

                                                         Michal Kubecek

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: RFC: crash in fib6_clean_all() while loading ipv6 module
  2013-09-09 10:05 RFC: crash in fib6_clean_all() while loading ipv6 module Michal Kubecek
@ 2013-09-09 17:27 ` David Miller
  2013-09-09 19:45   ` [PATCH net] ipv6: don't call fib6_run_gc() until routing is ready Michal Kubecek
  0 siblings, 1 reply; 4+ messages in thread
From: David Miller @ 2013-09-09 17:27 UTC (permalink / raw)
  To: mkubecek; +Cc: netdev

From: Michal Kubecek <mkubecek@suse.cz>
Date: Mon, 9 Sep 2013 12:05:15 +0200

> This could be prevented by setting a flag when ip6_route_init() is
> complete and not calling fib6_run_gc() from ndisc_netdev_event() until
> the flag is set. However, I don't like the idea of adding a test which
> will be useful only in a short window while loading ipv6 module.

Please just initialize the parts of ipv6 in the correct order necessary
to prevent this problem.

I stronly dislike using flags when it's simply an initialization
ordering problem.  It's just like registering a device interrupt
handle before the driver's data structures are properly setup.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net] ipv6: don't call fib6_run_gc() until routing is ready
  2013-09-09 17:27 ` David Miller
@ 2013-09-09 19:45   ` Michal Kubecek
  2013-09-11 21:05     ` David Miller
  0 siblings, 1 reply; 4+ messages in thread
From: Michal Kubecek @ 2013-09-09 19:45 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().

Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 include/net/ndisc.h |  2 ++
 net/ipv6/af_inet6.c |  6 ++++++
 net/ipv6/ndisc.c    | 18 +++++++++++-------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 3c4211f..ea0cc26 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -190,7 +190,9 @@ static inline struct neighbour *__ipv6_neigh_lookup(struct net_device *dev, cons
 }
 
 extern int			ndisc_init(void);
+extern int			ndisc_late_init(void);
 
+extern void			ndisc_late_cleanup(void);
 extern void			ndisc_cleanup(void);
 
 extern int			ndisc_rcv(struct sk_buff *skb);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 136fe55..7c96100 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -915,6 +915,9 @@ static int __init inet6_init(void)
 	err = ip6_route_init();
 	if (err)
 		goto ip6_route_fail;
+	err = ndisc_late_init();
+	if (err)
+		goto ndisc_late_fail;
 	err = ip6_flowlabel_init();
 	if (err)
 		goto ip6_flowlabel_fail;
@@ -981,6 +984,8 @@ ipv6_exthdrs_fail:
 addrconf_fail:
 	ip6_flowlabel_cleanup();
 ip6_flowlabel_fail:
+	ndisc_late_cleanup();
+ndisc_late_fail:
 	ip6_route_cleanup();
 ip6_route_fail:
 #ifdef CONFIG_PROC_FS
@@ -1043,6 +1048,7 @@ static void __exit inet6_exit(void)
 	ipv6_exthdrs_exit();
 	addrconf_cleanup();
 	ip6_flowlabel_cleanup();
+	ndisc_late_cleanup();
 	ip6_route_cleanup();
 #ifdef CONFIG_PROC_FS
 
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 1217945..f8a55ff 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1727,24 +1727,28 @@ int __init ndisc_init(void)
 	if (err)
 		goto out_unregister_pernet;
 #endif
-	err = register_netdevice_notifier(&ndisc_netdev_notifier);
-	if (err)
-		goto out_unregister_sysctl;
 out:
 	return err;
 
-out_unregister_sysctl:
 #ifdef CONFIG_SYSCTL
-	neigh_sysctl_unregister(&nd_tbl.parms);
 out_unregister_pernet:
-#endif
 	unregister_pernet_subsys(&ndisc_net_ops);
 	goto out;
+#endif
 }
 
-void ndisc_cleanup(void)
+int __init ndisc_late_init(void)
+{
+	return register_netdevice_notifier(&ndisc_netdev_notifier);
+}
+
+void ndisc_late_cleanup(void)
 {
 	unregister_netdevice_notifier(&ndisc_netdev_notifier);
+}
+
+void ndisc_cleanup(void)
+{
 #ifdef CONFIG_SYSCTL
 	neigh_sysctl_unregister(&nd_tbl.parms);
 #endif
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] ipv6: don't call fib6_run_gc() until routing is ready
  2013-09-09 19:45   ` [PATCH net] ipv6: don't call fib6_run_gc() until routing is ready Michal Kubecek
@ 2013-09-11 21:05     ` David Miller
  0 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2013-09-11 21:05 UTC (permalink / raw)
  To: mkubecek; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber

From: Michal Kubecek <mkubecek@suse.cz>
Date: Mon,  9 Sep 2013 21:45:04 +0200 (CEST)

> When loading the ipv6 module, ndisc_init() is called before
> ip6_route_init(). As the former registers a handler calling
> fib6_run_gc(), this opens a window to run the garbage collector
> before necessary data structures are initialized. If a network
> device is initialized in this window, adding MAC address to it
> triggers a NETDEV_CHANGEADDR event, leading to a crash in
> fib6_clean_all().
> 
> Take the event handler registration out of ndisc_init() into a
> separate function ndisc_late_init() and move it after
> ip6_route_init().
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>

Looks good, applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-09-11 21:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-09 10:05 RFC: crash in fib6_clean_all() while loading ipv6 module Michal Kubecek
2013-09-09 17:27 ` David Miller
2013-09-09 19:45   ` [PATCH net] ipv6: don't call fib6_run_gc() until routing is ready Michal Kubecek
2013-09-11 21:05     ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).