public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv()
@ 2025-06-23 10:53 Paolo Abeni
  2025-06-23 14:23 ` Stanislav Fomichev
  2025-06-24 23:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Paolo Abeni @ 2025-06-23 10:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Simon Horman,
	Stanislav Fomichev

While configuring a vxlan tunnel in a system with a i40e NIC driver, I
observe the following deadlock:

 WARNING: possible recursive locking detected
 6.16.0-rc2.net-next-6.16_92d87230d899+ #13 Tainted: G            E
 --------------------------------------------
 kworker/u256:4/1125 is trying to acquire lock:
 ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e

 but task is already holding lock:
 ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&utn->lock);
   lock(&utn->lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 4 locks held by kworker/u256:4/1125:
 #0: ffff8892910ca158 ((wq_completion)udp_tunnel_nic){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3213)
 #1: ffffc900244efd30 ((work_completion)(&utn->work)){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3214)
 #2: ffffffff9a14e290 (rtnl_mutex){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:737) udp_tunnel
 #3: ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel

 stack backtrace:
 Hardware name: Dell Inc. PowerEdge R7525/0YHMCJ, BIOS 2.2.5 04/08/2021
i
 Call Trace:
  <TASK>
 dump_stack_lvl (/home/pabeni/net-next/lib/dump_stack.c:123)
 print_deadlock_bug (/home/pabeni/net-next/kernel/locking/lockdep.c:3047)
 validate_chain (/home/pabeni/net-next/kernel/locking/lockdep.c:3901)
 __lock_acquire (/home/pabeni/net-next/kernel/locking/lockdep.c:5240)
 lock_acquire.part.0 (/home/pabeni/net-next/kernel/locking/lockdep.c:473 /home/pabeni/net-next/kernel/locking/lockdep.c:5873)
 __mutex_lock (/home/pabeni/net-next/kernel/locking/mutex.c:604 /home/pabeni/net-next/kernel/locking/mutex.c:747)
 i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
 udp_tunnel_nic_device_sync_by_port (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:230 /home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:249) udp_tunnel
 __udp_tunnel_nic_device_sync.part.0 (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:292) udp_tunnel
 udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:742) udp_tunnel
 process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3243)
 worker_thread (/home/pabeni/net-next/kernel/workqueue.c:3315 /home/pabeni/net-next/kernel/workqueue.c:3402)
 kthread (/home/pabeni/net-next/kernel/kthread.c:464)

AFAICS all the existing callsites of udp_tunnel_nic_set_port_priv() are
already under the utn lock scope, avoid (re-)acquiring it in such a
function.

Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/udp_tunnel.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index cbd3a43074bd..9acef2fbd2fd 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -339,9 +339,8 @@ udp_tunnel_nic_set_port_priv(struct net_device *dev, unsigned int table,
 			     unsigned int idx, u8 priv)
 {
 	if (udp_tunnel_nic_ops) {
-		udp_tunnel_nic_ops->lock(dev);
+		udp_tunnel_nic_ops->assert_locked(dev);
 		udp_tunnel_nic_ops->set_port_priv(dev, table, idx, priv);
-		udp_tunnel_nic_ops->unlock(dev);
 	}
 }
 
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv()
  2025-06-23 10:53 [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv() Paolo Abeni
@ 2025-06-23 14:23 ` Stanislav Fomichev
  2025-06-24 23:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Stanislav Fomichev @ 2025-06-23 14:23 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Simon Horman, Stanislav Fomichev

On 06/23, Paolo Abeni wrote:
> While configuring a vxlan tunnel in a system with a i40e NIC driver, I
> observe the following deadlock:
> 
>  WARNING: possible recursive locking detected
>  6.16.0-rc2.net-next-6.16_92d87230d899+ #13 Tainted: G            E
>  --------------------------------------------
>  kworker/u256:4/1125 is trying to acquire lock:
>  ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
> 
>  but task is already holding lock:
>  ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
> 
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&utn->lock);
>    lock(&utn->lock);
> 
>   *** DEADLOCK ***
> 
>   May be due to missing lock nesting notation
> 
>  4 locks held by kworker/u256:4/1125:
>  #0: ffff8892910ca158 ((wq_completion)udp_tunnel_nic){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3213)
>  #1: ffffc900244efd30 ((work_completion)(&utn->work)){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3214)
>  #2: ffffffff9a14e290 (rtnl_mutex){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:737) udp_tunnel
>  #3: ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
> 
>  stack backtrace:
>  Hardware name: Dell Inc. PowerEdge R7525/0YHMCJ, BIOS 2.2.5 04/08/2021
> i
>  Call Trace:
>   <TASK>
>  dump_stack_lvl (/home/pabeni/net-next/lib/dump_stack.c:123)
>  print_deadlock_bug (/home/pabeni/net-next/kernel/locking/lockdep.c:3047)
>  validate_chain (/home/pabeni/net-next/kernel/locking/lockdep.c:3901)
>  __lock_acquire (/home/pabeni/net-next/kernel/locking/lockdep.c:5240)
>  lock_acquire.part.0 (/home/pabeni/net-next/kernel/locking/lockdep.c:473 /home/pabeni/net-next/kernel/locking/lockdep.c:5873)
>  __mutex_lock (/home/pabeni/net-next/kernel/locking/mutex.c:604 /home/pabeni/net-next/kernel/locking/mutex.c:747)
>  i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
>  udp_tunnel_nic_device_sync_by_port (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:230 /home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:249) udp_tunnel
>  __udp_tunnel_nic_device_sync.part.0 (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:292) udp_tunnel
>  udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:742) udp_tunnel
>  process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3243)
>  worker_thread (/home/pabeni/net-next/kernel/workqueue.c:3315 /home/pabeni/net-next/kernel/workqueue.c:3402)
>  kthread (/home/pabeni/net-next/kernel/kthread.c:464)
> 
> AFAICS all the existing callsites of udp_tunnel_nic_set_port_priv() are
> already under the utn lock scope, avoid (re-)acquiring it in such a
> function.
> 
> Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Ouch, how did I miss that it's running from .set_port :-( Thanks!

Acked-by: Stanislav Fomichev <sdf@fomichev.me>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv()
  2025-06-23 10:53 [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv() Paolo Abeni
  2025-06-23 14:23 ` Stanislav Fomichev
@ 2025-06-24 23:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-06-24 23:40 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev, davem, edumazet, kuba, horms, sdf

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 23 Jun 2025 12:53:55 +0200 you wrote:
> While configuring a vxlan tunnel in a system with a i40e NIC driver, I
> observe the following deadlock:
> 
>  WARNING: possible recursive locking detected
>  6.16.0-rc2.net-next-6.16_92d87230d899+ #13 Tainted: G            E
>  --------------------------------------------
>  kworker/u256:4/1125 is trying to acquire lock:
>  ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
> 
> [...]

Here is the summary with links:
  - [net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv()
    https://git.kernel.org/netdev/net-next/c/c9e78afa688a

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-06-24 23:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-23 10:53 [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv() Paolo Abeni
2025-06-23 14:23 ` Stanislav Fomichev
2025-06-24 23:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox