netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ipv6_addrconf: WARNING about suspicious RCU usage
@ 2018-01-20 11:57 Heiner Kallweit
  2018-01-20 13:37 ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 11:57 UTC (permalink / raw)
  To: netdev@vger.kernel.org

Since some time (didn't bisect it yet) I get the following warning.
Is it a known issue?

[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
               other info that might help us debug this:

[86220.125755]
               rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
               stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312]  dump_stack+0x70/0x9e
[86220.126337]  lockdep_rcu_suspicious+0xce/0xf0
[86220.126365]  ___might_sleep+0x1d3/0x240
[86220.126390]  __might_sleep+0x45/0x80
[86220.126416]  kmem_cache_alloc_trace+0x53/0x250
[86220.126458]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738]  process_one_work+0x258/0x680
[86220.126765]  worker_thread+0x35/0x3f0
[86220.126790]  kthread+0x124/0x140
[86220.126813]  ? process_one_work+0x680/0x680
[86220.126839]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869]  ? umh_complete+0x40/0x40
[86220.126893]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926]  ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591]  dump_stack+0x70/0x9e
[86220.127616]  ___might_sleep+0x14d/0x240
[86220.127644]  __might_sleep+0x45/0x80
[86220.127672]  kmem_cache_alloc_trace+0x53/0x250
[86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032]  process_one_work+0x258/0x680
[86220.128063]  worker_thread+0x35/0x3f0
[86220.128091]  kthread+0x124/0x140
[86220.128117]  ? process_one_work+0x680/0x680
[86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180]  ? umh_complete+0x40/0x40
[86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243]  ret_from_fork+0x4b/0x60

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
@ 2018-01-20 13:37 ` Ido Schimmel
  2018-01-20 18:49   ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 13:37 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: netdev@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 2103 bytes --]

On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> Since some time (didn't bisect it yet) I get the following warning.
> Is it a known issue?

[...]

> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> [86220.127082] 4 locks held by kworker/0:2/1003:
> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> [86220.127568] Call Trace:
> [86220.127591]  dump_stack+0x70/0x9e
> [86220.127616]  ___might_sleep+0x14d/0x240
> [86220.127644]  __might_sleep+0x45/0x80
> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> [86220.128032]  process_one_work+0x258/0x680
> [86220.128063]  worker_thread+0x35/0x3f0
> [86220.128091]  kthread+0x124/0x140
> [86220.128117]  ? process_one_work+0x680/0x680
> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> [86220.128180]  ? umh_complete+0x40/0x40
> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> [86220.128243]  ret_from_fork+0x4b/0x60

Can you please try attached patch (untested)?

[-- Attachment #2: 0001-ipv6-Do-not-sleep-in-RCU-read-side-critical-section.patch --]
[-- Type: text/plain, Size: 853 bytes --]

>From 70e4bc2a8fe08fb30251f786990a91d3ed2232e6 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Sat, 20 Jan 2018 15:29:33 +0200
Subject: [PATCH] ipv6: Do not sleep in RCU read-side critical section

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
---
 net/ipv6/addrconf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7c..9ad1365c91ca 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,7 +4356,8 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
-						ipv6_create_tempaddr(ifpub, ifp, true);
+						ipv6_create_tempaddr(ifpub, ifp,
+								     false);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
 						goto restart;
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 13:37 ` Ido Schimmel
@ 2018-01-20 18:49   ` Eric Dumazet
  2018-01-20 19:19     ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-20 18:49 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit; +Cc: netdev@vger.kernel.org

On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > Since some time (didn't bisect it yet) I get the following warning.
> > Is it a known issue?
> 
> [...]
> 
> > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > [86220.127082] 4 locks held by kworker/0:2/1003:
> > [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > [86220.127568] Call Trace:
> > [86220.127591]  dump_stack+0x70/0x9e
> > [86220.127616]  ___might_sleep+0x14d/0x240
> > [86220.127644]  __might_sleep+0x45/0x80
> > [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> > [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> > [86220.128032]  process_one_work+0x258/0x680
> > [86220.128063]  worker_thread+0x35/0x3f0
> > [86220.128091]  kthread+0x124/0x140
> > [86220.128117]  ? process_one_work+0x680/0x680
> > [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> > [86220.128180]  ? umh_complete+0x40/0x40
> > [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> > [86220.128243]  ret_from_fork+0x4b/0x60
> 
> Can you please try attached patch (untested)?



I would also/instead break rcu section.

Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
if we have thousands of IPv6 addresses.

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
+						rcu_read_unlock_bh();
 						ipv6_create_tempaddr(ifpub, ifp, true);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
+						rcu_read_lock_bh();
 						goto restart;
 					}
 				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 18:49   ` Eric Dumazet
@ 2018-01-20 19:19     ` Ido Schimmel
  2018-01-20 19:48       ` Heiner Kallweit
  2018-01-21 21:22       ` Heiner Kallweit
  0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-20 19:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Heiner Kallweit, netdev@vger.kernel.org

On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
> > On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
> > > Since some time (didn't bisect it yet) I get the following warning.
> > > Is it a known issue?
> > 
> > [...]
> > 
> > > [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
> > > [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
> > > [86220.127082] 4 locks held by kworker/0:2/1003:
> > > [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
> > > [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
> > > [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
> > > [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
> > > [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
> > > [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
> > > [86220.127568] Call Trace:
> > > [86220.127591]  dump_stack+0x70/0x9e
> > > [86220.127616]  ___might_sleep+0x14d/0x240
> > > [86220.127644]  __might_sleep+0x45/0x80
> > > [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
> > > [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
> > > [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
> > > [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
> > > [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
> > > [86220.128032]  process_one_work+0x258/0x680
> > > [86220.128063]  worker_thread+0x35/0x3f0
> > > [86220.128091]  kthread+0x124/0x140
> > > [86220.128117]  ? process_one_work+0x680/0x680
> > > [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
> > > [86220.128180]  ? umh_complete+0x40/0x40
> > > [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
> > > [86220.128243]  ret_from_fork+0x4b/0x60
> > 
> > Can you please try attached patch (untested)?
> 
> 
> 
> I would also/instead break rcu section.

Thanks Eric, this should work. We can continue to block in
ipv6_create_tempaddr().

Heiner, can you try Eric's patch instead?

> 
> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
> if we have thousands of IPv6 addresses.
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>  						spin_lock(&ifpub->lock);
>  						ifpub->regen_count = 0;
>  						spin_unlock(&ifpub->lock);
> +						rcu_read_unlock_bh();
>  						ipv6_create_tempaddr(ifpub, ifp, true);
>  						in6_ifa_put(ifpub);
>  						in6_ifa_put(ifp);
> +						rcu_read_lock_bh();
>  						goto restart;
>  					}
>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 19:19     ` Ido Schimmel
@ 2018-01-20 19:48       ` Heiner Kallweit
  2018-01-21 21:22       ` Heiner Kallweit
  1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-20 19:48 UTC (permalink / raw)
  To: Ido Schimmel, Eric Dumazet; +Cc: netdev@vger.kernel.org

Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591]  dump_stack+0x70/0x9e
>>>> [86220.127616]  ___might_sleep+0x14d/0x240
>>>> [86220.127644]  __might_sleep+0x45/0x80
>>>> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032]  process_one_work+0x258/0x680
>>>> [86220.128063]  worker_thread+0x35/0x3f0
>>>> [86220.128091]  kthread+0x124/0x140
>>>> [86220.128117]  ? process_one_work+0x680/0x680
>>>> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180]  ? umh_complete+0x40/0x40
>>>> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243]  ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
> 
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
> 
> Heiner, can you try Eric's patch instead?
> 
Sure, thanks for the quick response to both of you.

>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>>  						spin_lock(&ifpub->lock);
>>  						ifpub->regen_count = 0;
>>  						spin_unlock(&ifpub->lock);
>> +						rcu_read_unlock_bh();
>>  						ipv6_create_tempaddr(ifpub, ifp, true);
>>  						in6_ifa_put(ifpub);
>>  						in6_ifa_put(ifp);
>> +						rcu_read_lock_bh();
>>  						goto restart;
>>  					}
>>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-20 19:19     ` Ido Schimmel
  2018-01-20 19:48       ` Heiner Kallweit
@ 2018-01-21 21:22       ` Heiner Kallweit
  2018-01-23 18:01         ` Ido Schimmel
  1 sibling, 1 reply; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-21 21:22 UTC (permalink / raw)
  To: Ido Schimmel, Eric Dumazet; +Cc: netdev@vger.kernel.org

Am 20.01.2018 um 20:19 schrieb Ido Schimmel:
> On Sat, Jan 20, 2018 at 10:49:03AM -0800, Eric Dumazet wrote:
>> On Sat, 2018-01-20 at 15:37 +0200, Ido Schimmel wrote:
>>> On Sat, Jan 20, 2018 at 12:57:01PM +0100, Heiner Kallweit wrote:
>>>> Since some time (didn't bisect it yet) I get the following warning.
>>>> Is it a known issue?
>>>
>>> [...]
>>>
>>>> [86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
>>>> [86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
>>>> [86220.127082] 4 locks held by kworker/0:2/1003:
>>>> [86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
>>>> [86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
>>>> [86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
>>>> [86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
>>>> [86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
>>>> [86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
>>>> [86220.127568] Call Trace:
>>>> [86220.127591]  dump_stack+0x70/0x9e
>>>> [86220.127616]  ___might_sleep+0x14d/0x240
>>>> [86220.127644]  __might_sleep+0x45/0x80
>>>> [86220.127672]  kmem_cache_alloc_trace+0x53/0x250
>>>> [86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
>>>> [86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
>>>> [86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
>>>> [86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
>>>> [86220.128032]  process_one_work+0x258/0x680
>>>> [86220.128063]  worker_thread+0x35/0x3f0
>>>> [86220.128091]  kthread+0x124/0x140
>>>> [86220.128117]  ? process_one_work+0x680/0x680
>>>> [86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
>>>> [86220.128180]  ? umh_complete+0x40/0x40
>>>> [86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
>>>> [86220.128243]  ret_from_fork+0x4b/0x60
>>>
>>> Can you please try attached patch (untested)?
>>
>>
>>
>> I would also/instead break rcu section.
> 
> Thanks Eric, this should work. We can continue to block in
> ipv6_create_tempaddr().
> 
> Heiner, can you try Eric's patch instead?
> 
So far everything looks good with Eric's patch. The warning didn't show up again.

>>
>> Holding RCU (and BH) for whole hash traversal is a recipe for disaster,
>> if we have thousands of IPv6 addresses.
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index ab99cb641b7cccdda0ad4ae553c09274d7dbc047..adda73466ae1dd0f3b700b3db5fbf3065e4d3f7f 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4356,9 +4356,11 @@ static void addrconf_verify_rtnl(void)
>>  						spin_lock(&ifpub->lock);
>>  						ifpub->regen_count = 0;
>>  						spin_unlock(&ifpub->lock);
>> +						rcu_read_unlock_bh();
>>  						ipv6_create_tempaddr(ifpub, ifp, true);
>>  						in6_ifa_put(ifpub);
>>  						in6_ifa_put(ifp);
>> +						rcu_read_lock_bh();
>>  						goto restart;
>>  					}
>>  				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))
>>
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-21 21:22       ` Heiner Kallweit
@ 2018-01-23 18:01         ` Ido Schimmel
  2018-01-23 18:43           ` Eric Dumazet
  2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
  0 siblings, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-23 18:01 UTC (permalink / raw)
  To: Heiner Kallweit, eric.dumazet; +Cc: netdev@vger.kernel.org

On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> So far everything looks good with Eric's patch. The warning didn't show up again.

Eric, can you please submit your patch?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-23 18:01         ` Ido Schimmel
@ 2018-01-23 18:43           ` Eric Dumazet
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
  2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2018-01-23 18:43 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit; +Cc: netdev@vger.kernel.org

On Tue, 2018-01-23 at 20:01 +0200, Ido Schimmel wrote:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
> > So far everything looks good with Eric's patch. The warning didn't show up again.
> 
> Eric, can you please submit your patch?

Sure, will do today. Thanks for the reminder.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: ipv6_addrconf: WARNING about suspicious RCU usage
  2018-01-23 18:01         ` Ido Schimmel
  2018-01-23 18:43           ` Eric Dumazet
@ 2018-01-26 22:15           ` Heiner Kallweit
  1 sibling, 0 replies; 12+ messages in thread
From: Heiner Kallweit @ 2018-01-26 22:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: Ido Schimmel, netdev@vger.kernel.org

Am 23.01.2018 um 19:01 schrieb Ido Schimmel:
> On Sun, Jan 21, 2018 at 10:22:16PM +0100, Heiner Kallweit wrote:
>> So far everything looks good with Eric's patch. The warning didn't show up again.
> 
> Eric, can you please submit your patch?
> .
> 
Soon the merge window starts, can the patch be submitted before?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-23 18:43           ` Eric Dumazet
@ 2018-01-27  0:10             ` Eric Dumazet
  2018-01-27  8:52               ` Ido Schimmel
  2018-01-29 19:24               ` David Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2018-01-27  0:10 UTC (permalink / raw)
  To: Ido Schimmel, Heiner Kallweit, David Miller; +Cc: netdev@vger.kernel.org

From: Eric Dumazet <edumazet@google.com>

Heiner reported a lockdep splat [1]

This is caused by attempting GFP_KERNEL allocation while RCU lock is
held and BH blocked.

We believe that addrconf_verify_rtnl() could run for a long period,
so instead of using GFP_ATOMIC here as Ido suggested, we should break
the critical section and restart it after the allocation.


[1]
[86220.125562] =============================
[86220.125586] WARNING: suspicious RCU usage
[86220.125612] 4.15.0-rc7-next-20180110+ #7 Not tainted
[86220.125641] -----------------------------
[86220.125666] kernel/sched/core.c:6026 Illegal context switch in RCU-bh read-side critical section!
[86220.125711]
               other info that might help us debug this:

[86220.125755]
               rcu_scheduler_active = 2, debug_locks = 1
[86220.125792] 4 locks held by kworker/0:2/1003:
[86220.125817]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125895]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.125959]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.126017]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.126111]
               stack backtrace:
[86220.126142] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.126185] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.126250] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.126288] Call Trace:
[86220.126312]  dump_stack+0x70/0x9e
[86220.126337]  lockdep_rcu_suspicious+0xce/0xf0
[86220.126365]  ___might_sleep+0x1d3/0x240
[86220.126390]  __might_sleep+0x45/0x80
[86220.126416]  kmem_cache_alloc_trace+0x53/0x250
[86220.126458]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126498]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.126538]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126580]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.126623]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126664]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.126708]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.126738]  process_one_work+0x258/0x680
[86220.126765]  worker_thread+0x35/0x3f0
[86220.126790]  kthread+0x124/0x140
[86220.126813]  ? process_one_work+0x680/0x680
[86220.126839]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.126869]  ? umh_complete+0x40/0x40
[86220.126893]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.126926]  ret_from_fork+0x4b/0x60
[86220.126999] BUG: sleeping function called from invalid context at mm/slab.h:420
[86220.127041] in_atomic(): 1, irqs_disabled(): 0, pid: 1003, name: kworker/0:2
[86220.127082] 4 locks held by kworker/0:2/1003:
[86220.127107]  #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127179]  #1:  ((addr_chk_work).work){+.+.}, at: [<00000000da8e9b73>] process_one_work+0x1de/0x680
[86220.127242]  #2:  (rtnl_mutex){+.+.}, at: [<00000000b06d9510>] rtnl_lock+0x12/0x20
[86220.127300]  #3:  (rcu_read_lock_bh){....}, at: [<00000000aef52299>] addrconf_verify_rtnl+0x1e/0x510 [ipv6]
[86220.127414] CPU: 0 PID: 1003 Comm: kworker/0:2 Not tainted 4.15.0-rc7-next-20180110+ #7
[86220.127463] Hardware name: ZOTAC ZBOX-CI321NANO/ZBOX-CI321NANO, BIOS B246P105 06/01/2015
[86220.127528] Workqueue: ipv6_addrconf addrconf_verify_work [ipv6]
[86220.127568] Call Trace:
[86220.127591]  dump_stack+0x70/0x9e
[86220.127616]  ___might_sleep+0x14d/0x240
[86220.127644]  __might_sleep+0x45/0x80
[86220.127672]  kmem_cache_alloc_trace+0x53/0x250
[86220.127717]  ? ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127762]  ipv6_add_addr+0xfe/0x6e0 [ipv6]
[86220.127807]  ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127854]  ? ipv6_create_tempaddr+0x24d/0x430 [ipv6]
[86220.127903]  addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127950]  ? addrconf_verify_rtnl+0x339/0x510 [ipv6]
[86220.127998]  addrconf_verify_work+0xe/0x20 [ipv6]
[86220.128032]  process_one_work+0x258/0x680
[86220.128063]  worker_thread+0x35/0x3f0
[86220.128091]  kthread+0x124/0x140
[86220.128117]  ? process_one_work+0x680/0x680
[86220.128146]  ? kthread_create_worker_on_cpu+0x40/0x40
[86220.128180]  ? umh_complete+0x40/0x40
[86220.128207]  ? call_usermodehelper_exec_async+0x12a/0x160
[86220.128243]  ret_from_fork+0x4b/0x60

Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 net/ipv6/addrconf.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f49bd7897e95f15a381e4700660991f2d3c3fed4..10facd174210974ac82b2304211061b90714aac8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4352,9 +4352,11 @@ static void addrconf_verify_rtnl(void)
 						spin_lock(&ifpub->lock);
 						ifpub->regen_count = 0;
 						spin_unlock(&ifpub->lock);
+						rcu_read_unlock_bh();
 						ipv6_create_tempaddr(ifpub, ifp, true);
 						in6_ifa_put(ifpub);
 						in6_ifa_put(ifp);
+						rcu_read_lock_bh();
 						goto restart;
 					}
 				} else if (time_before(ifp->tstamp + ifp->prefered_lft * HZ - regen_advance * HZ, next))

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
@ 2018-01-27  8:52               ` Ido Schimmel
  2018-01-29 19:24               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Ido Schimmel @ 2018-01-27  8:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Heiner Kallweit, David Miller, netdev@vger.kernel.org

On Fri, Jan 26, 2018 at 04:10:43PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Heiner reported a lockdep splat [1]
> 
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
> 
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.

[...]

> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>

Reviewed-by: Ido Schimmel <idosch@mellanox.com>

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl()
  2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
  2018-01-27  8:52               ` Ido Schimmel
@ 2018-01-29 19:24               ` David Miller
  1 sibling, 0 replies; 12+ messages in thread
From: David Miller @ 2018-01-29 19:24 UTC (permalink / raw)
  To: eric.dumazet; +Cc: idosch, hkallweit1, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 26 Jan 2018 16:10:43 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Heiner reported a lockdep splat [1]
> 
> This is caused by attempting GFP_KERNEL allocation while RCU lock is
> held and BH blocked.
> 
> We believe that addrconf_verify_rtnl() could run for a long period,
> so instead of using GFP_ATOMIC here as Ido suggested, we should break
> the critical section and restart it after the allocation.
> 
> 
> [1]
 ...
> Fixes: f3d9832e56c4 ("ipv6: addrconf: cleanup locking in ipv6_add_addr")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied and queued up for v4.15 -stable, thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-01-29 19:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-20 11:57 ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit
2018-01-20 13:37 ` Ido Schimmel
2018-01-20 18:49   ` Eric Dumazet
2018-01-20 19:19     ` Ido Schimmel
2018-01-20 19:48       ` Heiner Kallweit
2018-01-21 21:22       ` Heiner Kallweit
2018-01-23 18:01         ` Ido Schimmel
2018-01-23 18:43           ` Eric Dumazet
2018-01-27  0:10             ` [PATCH net] ipv6: addrconf: break critical section in addrconf_verify_rtnl() Eric Dumazet
2018-01-27  8:52               ` Ido Schimmel
2018-01-29 19:24               ` David Miller
2018-01-26 22:15           ` ipv6_addrconf: WARNING about suspicious RCU usage Heiner Kallweit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).