* ipv6_addrconf: WARNING about suspicious RCU usage
@ 2018-01-20 11:57 Heiner Kallweit
2018-01-20 13:37 ` Ido Schimmel
0 siblings, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 11:57 UTC (permalink / raw)
To: netdev@vger.kernel.org
Since some time (didn't bisect it yet) I get the following warning.
Is it a known issue?
[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
other info that might help us debug this:
[86220.125755]
rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312] dump_stack+0x70/0x9e
[86220.126337] lockdep_rcu_suspicious+0xce/0xf0
[86220.126365] ___might_sleep+0x1d3/0x240
[86220.126390] __might_sleep+0x45/0x80
[86220.126416] kmem_cache_alloc_trace+0x53/0x250
[86220.126458] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498] ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623] addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708] addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738] process_one_work+0x258/0x680
[86220.126765] worker_thread+0x35/0x3f0
[86220.126790] kthread+0x124/0x140
[86220.126813] ? process_one_work+0x680/0x680
[86220.126839] ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869] ? umh_complete+0x40/0x40
[86220.126893] ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926] ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591] dump_stack+0x70/0x9e
[86220.127616] ___might_sleep+0x14d/0x240
[86220.127644] __might_sleep+0x45/0x80
[86220.127672] kmem_cache_alloc_trace+0x53/0x250
[86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032] process_one_work+0x258/0x680
[86220.128063] worker_thread+0x35/0x3f0
[86220.128091] kthread+0x124/0x140
[86220.128117] ? process_one_work+0x680/0x680
[86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180] ? umh_complete+0x40/0x40
[86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243] ret_from_fork+0x4b/0x60
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
@ 2018-01-20 13:37 ` Ido Schimmel
2018-01-20 18:49 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 13:37 UTC (permalink / raw)
To: Heiner Kallweit; +Cc: netdev@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2103 bytes --]
On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> Since some time (didn't bisect it yet) I get the following warning.
> Is it a known issue?
[...]
> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> [86220.127082] 4 locks held by kworker/0:2/1003:
> [86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> [86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> [86220.127568] Call Trace:
> [86220.127591] dump_stack+0x70/0x9e
> [86220.127616] ___might_sleep+0x14d/0x240
> [86220.127644] __might_sleep+0x45/0x80
> [86220.127672] kmem_cache_alloc_trace+0x53/0x250
> [86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
> [86220.128032] process_one_work+0x258/0x680
> [86220.128063] worker_thread+0x35/0x3f0
> [86220.128091] kthread+0x124/0x140
> [86220.128117] ? process_one_work+0x680/0x680
> [86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
> [86220.128180] ? umh_complete+0x40/0x40
> [86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
> [86220.128243] ret_from_fork+0x4b/0x60
Can you please try attached patch (untested)?
[-- Attachment #2: 0001-ipv6-Do-not-sleep-in-RCU-read-side-critical-section.patch --]
[-- Type: text/plain, Size: 853 bytes --]
>From 70e4bc2a8fe08fb30251f786990a91d3ed2232e6 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Sat, 20 Jan 2018 15:29:33 +0200
Subject: [PATCH] ipv6: Do not sleep in RCU read-side critical section
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
net/ipv6/addrconf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7c..9ad1365c91ca 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,7 +4356,8 @@ static void addrconf_verify_rtnl(void)
spin_lock(&ifpub->lock);
ifpub->regen_count = 0;
spin_unlock(&ifpub->lock);
- ipv6_create_tempaddr(ifpub, ifp, true);
+ ipv6_create_tempaddr(ifpub, ifp,
+ false);
in6_ifa_put(ifpub);
in6_ifa_put(ifp);
goto restart;
--
2.14.3
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-20 13:37 ` Ido Schimmel
@ 2018-01-20 18:49 ` Eric Dumazet
2018-01-20 19:19 ` Ido Schimmel
0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-20 18:49 UTC (permalink / raw)
To: Ido Schimmel, Heiner Kallweit; +Cc: netdev@vger.kernel.org
On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > Since some time (didn't bisect it yet) I get the following warning.
> > Is it a known issue?
>
> [...]
>
> > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > [86220.127082] 4 locks held by kworker/0:2/1003:
> > [86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > [86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > [86220.127568] Call Trace:
> > [86220.127591] dump_stack+0x70/0x9e
> > [86220.127616] ___might_sleep+0x14d/0x240
> > [86220.127644] __might_sleep+0x45/0x80
> > [86220.127672] kmem_cache_alloc_trace+0x53/0x250
> > [86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
> > [86220.128032] process_one_work+0x258/0x680
> > [86220.128063] worker_thread+0x35/0x3f0
> > [86220.128091] kthread+0x124/0x140
> > [86220.128117] ? process_one_work+0x680/0x680
> > [86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
> > [86220.128180] ? umh_complete+0x40/0x40
> > [86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
> > [86220.128243] ret_from_fork+0x4b/0x60
>
> Can you please try attached patch (untested)?
I would also/instead break rcu section.
Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
if we have thousands of IPv6 addresses.
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
spin_lock(&ifpub->lock);
ifpub->regen_count = 0;
spin_unlock(&ifpub->lock);
+ rcu_read_unlock_bh();
ipv6_create_tempaddr(ifpub, ifp, true);
in6_ifa_put(ifpub);
in6_ifa_put(ifp);
+ rcu_read_lock_bh();
goto restart;
}
} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-20 18:49 ` Eric Dumazet
@ 2018-01-20 19:19 ` Ido Schimmel
2018-01-20 19:48 ` Heiner Kallweit
2018-01-21 21:22 ` Heiner Kallweit
0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 19:19 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Heiner Kallweit, netdev@vger.kernel.org
On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> > On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > > Since some time (didn't bisect it yet) I get the following warning.
> > > Is it a known issue?
> >
> > [...]
> >
> > > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > > [86220.127082] 4 locks held by kworker/0:2/1003:
> > > [86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > > [86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > > [86220.127568] Call Trace:
> > > [86220.127591] dump_stack+0x70/0x9e
> > > [86220.127616] ___might_sleep+0x14d/0x240
> > > [86220.127644] __might_sleep+0x45/0x80
> > > [86220.127672] kmem_cache_alloc_trace+0x53/0x250
> > > [86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
> > > [86220.128032] process_one_work+0x258/0x680
> > > [86220.128063] worker_thread+0x35/0x3f0
> > > [86220.128091] kthread+0x124/0x140
> > > [86220.128117] ? process_one_work+0x680/0x680
> > > [86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
> > > [86220.128180] ? umh_complete+0x40/0x40
> > > [86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
> > > [86220.128243] ret_from_fork+0x4b/0x60
> >
> > Can you please try attached patch (untested)?
>
>
>
> I would also/instead break rcu section.
Thanks Eric, this should work. We can continue to block in
ipv6_create_tempaddr().
Heiner, can you try Eric's patch instead?
>
> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
> if we have thousands of IPv6 addresses.
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
> spin_lock(&ifpub->lock);
> ifpub->regen_count = 0;
> spin_unlock(&ifpub->lock);
> + rcu_read_unlock_bh();
> ipv6_create_tempaddr(ifpub, ifp, true);
> in6_ifa_put(ifpub);
> in6_ifa_put(ifp);
> + rcu_read_lock_bh();
> goto restart;
> }
> } else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>
>
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-20 19:19 ` Ido Schimmel
@ 2018-01-20 19:48 ` Heiner Kallweit
2018-01-21 21:22 ` Heiner Kallweit
1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 19:48 UTC (permalink / raw)
To: Ido Schimmel, Eric Dumazet; +Cc: netdev@vger.kernel.org
Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591] dump_stack+0x70/0x9e
>>>> [86220.127616] ___might_sleep+0x14d/0x240
>>>> [86220.127644] __might_sleep+0x45/0x80
>>>> [86220.127672] kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032] process_one_work+0x258/0x680
>>>> [86220.128063] worker_thread+0x35/0x3f0
>>>> [86220.128091] kthread+0x124/0x140
>>>> [86220.128117] ? process_one_work+0x680/0x680
>>>> [86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180] ? umh_complete+0x40/0x40
>>>> [86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243] ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
>
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
>
> Heiner, can you try Eric's patch instead?
>
Sure, thanks for the quick response to both of you.
>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>> spin_lock(&ifpub->lock);
>> ifpub->regen_count = 0;
>> spin_unlock(&ifpub->lock);
>> + rcu_read_unlock_bh();
>> ipv6_create_tempaddr(ifpub, ifp, true);
>> in6_ifa_put(ifpub);
>> in6_ifa_put(ifp);
>> + rcu_read_lock_bh();
>> goto restart;
>> }
>> } else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-20 19:19 ` Ido Schimmel
2018-01-20 19:48 ` Heiner Kallweit
@ 2018-01-21 21:22 ` Heiner Kallweit
2018-01-23 18:01 ` Ido Schimmel
1 sibling, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-21 21:22 UTC (permalink / raw)
To: Ido Schimmel, Eric Dumazet; +Cc: netdev@vger.kernel.org
Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591] dump_stack+0x70/0x9e
>>>> [86220.127616] ___might_sleep+0x14d/0x240
>>>> [86220.127644] __might_sleep+0x45/0x80
>>>> [86220.127672] kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032] process_one_work+0x258/0x680
>>>> [86220.128063] worker_thread+0x35/0x3f0
>>>> [86220.128091] kthread+0x124/0x140
>>>> [86220.128117] ? process_one_work+0x680/0x680
>>>> [86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180] ? umh_complete+0x40/0x40
>>>> [86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243] ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
>
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
>
> Heiner, can you try Eric's patch instead?
>
So far everything looks good with Eric's patch. The warning didn't show up again.
>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>> spin_lock(&ifpub->lock);
>> ifpub->regen_count = 0;
>> spin_unlock(&ifpub->lock);
>> + rcu_read_unlock_bh();
>> ipv6_create_tempaddr(ifpub, ifp, true);
>> in6_ifa_put(ifpub);
>> in6_ifa_put(ifp);
>> + rcu_read_lock_bh();
>> goto restart;
>> }
>> } else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-21 21:22 ` Heiner Kallweit
@ 2018-01-23 18:01 ` Ido Schimmel
2018-01-23 18:43 ` Eric Dumazet
2018-01-26 22:15 ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-23 18:01 UTC (permalink / raw)
To: Heiner Kallweit, eric.dumazet; +Cc: netdev@vger.kernel.org
On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> So far everything looks good with Eric's patch. The warning didn't show up again.
Eric, can you please submit your patch?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-23 18:01 ` Ido Schimmel
@ 2018-01-23 18:43 ` Eric Dumazet
2018-01-27 0:10 ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
2018-01-26 22:15 ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-23 18:43 UTC (permalink / raw)
To: Ido Schimmel, Heiner Kallweit; +Cc: netdev@vger.kernel.org
On Tue, 2018-01-23 at 20:01 +0200, Ido Schimmel wrote:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> > So far everything looks good with Eric's patch. The warning didn't show up again.
>
> Eric, can you please submit your patch?
Sure, will do today. Thanks for the reminder.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
2018-01-23 18:43 ` Eric Dumazet
@ 2018-01-27 0:10 ` Eric Dumazet
2018-01-27 8:52 ` Ido Schimmel
2018-01-29 19:24 ` David Miller
0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-01-27 0:10 UTC (permalink / raw)
To: Ido Schimmel, Heiner Kallweit, David Miller; +Cc: netdev@vger.kernel.org
From: Eric Dumazet <edumazet@google.com>
Heiner reported a lockdep splat [1]
This is caused by attempting GFP_KERNEL allocation while RCU lock is
held and BH blocked.
We believe that addrconf_verify_rtnl() could run for a long period,
so instead of using GFP_ATOMIC here as Ido suggested, we should break
the critical section and restart it after the allocation.
[1]
[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
other info that might help us debug this:
[86220.125755]
rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312] dump_stack+0x70/0x9e
[86220.126337] lockdep_rcu_suspicious+0xce/0xf0
[86220.126365] ___might_sleep+0x1d3/0x240
[86220.126390] __might_sleep+0x45/0x80
[86220.126416] kmem_cache_alloc_trace+0x53/0x250
[86220.126458] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498] ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623] addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708] addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738] process_one_work+0x258/0x680
[86220.126765] worker_thread+0x35/0x3f0
[86220.126790] kthread+0x124/0x140
[86220.126813] ? process_one_work+0x680/0x680
[86220.126839] ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869] ? umh_complete+0x40/0x40
[86220.126893] ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926] ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107] #0: ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179] #1: ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242] #2: (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300] #3: (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591] dump_stack+0x70/0x9e
[86220.127616] ___might_sleep+0x14d/0x240
[86220.127644] __might_sleep+0x45/0x80
[86220.127672] kmem_cache_alloc_trace+0x53/0x250
[86220.127717] ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762] ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807] ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854] ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903] addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950] ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998] addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032] process_one_work+0x258/0x680
[86220.128063] worker_thread+0x35/0x3f0
[86220.128091] kthread+0x124/0x140
[86220.128117] ? process_one_work+0x680/0x680
[86220.128146] ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180] ? umh_complete+0x40/0x40
[86220.128207] ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243] ret_from_fork+0x4b/0x60
Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f49bd7897e95f15a381e4700660991f2d3c3fed4..10facd174210974ac82b2304211061b90714aac8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4352,9 +4352,11 @@ static void addrconf_verify_rtnl(void)
spin_lock(&ifpub->lock);
ifpub->regen_count = 0;
spin_unlock(&ifpub->lock);
+ rcu_read_unlock_bh();
ipv6_create_tempaddr(ifpub, ifp, true);
in6_ifa_put(ifpub);
in6_ifa_put(ifp);
+ rcu_read_lock_bh();
goto restart;
}
} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
2018-01-27 0:10 ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
@ 2018-01-27 8:52 ` Ido Schimmel
2018-01-29 19:24 ` David Miller
1 sibling, 0 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-27 8:52 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Heiner Kallweit, David Miller, netdev@vger.kernel.org
On Fri, Jan 26, 2018 at 04:10:43PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Heiner reported a lockdep splat [1]
>
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
>
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.
[...]
> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Thanks!
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
2018-01-27 0:10 ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
2018-01-27 8:52 ` Ido Schimmel
@ 2018-01-29 19:24 ` David Miller
1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2018-01-29 19:24 UTC (permalink / raw)
To: eric.dumazet; +Cc: idosch, hkallweit1, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 26 Jan 2018 16:10:43 -0800
> From: Eric Dumazet <edumazet@google.com>
>
> Heiner reported a lockdep splat [1]
>
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
>
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.
>
>
> [1]
...
> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Applied and queued up for v4.15 -stable, thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ipv6_addrconf: WARNING about suspicious RCU usage
2018-01-23 18:01 ` Ido Schimmel
2018-01-23 18:43 ` Eric Dumazet
@ 2018-01-26 22:15 ` Heiner Kallweit
1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-26 22:15 UTC (permalink / raw)
To: eric.dumazet; +Cc: Ido Schimmel, netdev@vger.kernel.org
Am 23.01.2018 um 19:01 schrieb Ido Schimmel:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
>> So far everything looks good with Eric's patch. The warning didn't show up again.
>
> Eric, can you please submit your patch?
> .
>
Soon the merge window starts, can the patch be submitted before?
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-01-29 19:24 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
2018-01-20 13:37 ` Ido Schimmel
2018-01-20 18:49 ` Eric Dumazet
2018-01-20 19:19 ` Ido Schimmel
2018-01-20 19:48 ` Heiner Kallweit
2018-01-21 21:22 ` Heiner Kallweit
2018-01-23 18:01 ` Ido Schimmel
2018-01-23 18:43 ` Eric Dumazet
2018-01-27 0:10 ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
2018-01-27 8:52 ` Ido Schimmel
2018-01-29 19:24 ` David Miller
2018-01-26 22:15 ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).