From: Stanislav Fomichev <stfomichev@gmail.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>
Subject: Re: [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv()
Date: Mon, 23 Jun 2025 07:23:13 -0700
Message-ID: <aFljUZx1bgn5D9D7@mini-arch>
In-Reply-To: <95a827621ec78c12d1564ec3209e549774f9657d.1750675978.git.pabeni@redhat.com>

On 06/23, Paolo Abeni wrote:
> While configuring a vxlan tunnel in a system with an i40e NIC driver, I
> observe the following deadlock:
> 
>  WARNING: possible recursive locking detected
>  6.16.0-rc2.net-next-6.16_92d87230d899+ #13 Tainted: G            E
>  --------------------------------------------
>  kworker/u256:4/1125 is trying to acquire lock:
>  ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
> 
>  but task is already holding lock:
>  ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
> 
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&utn->lock);
>    lock(&utn->lock);
> 
>   *** DEADLOCK ***
> 
>   May be due to missing lock nesting notation
> 
>  4 locks held by kworker/u256:4/1125:
>  #0: ffff8892910ca158 ((wq_completion)udp_tunnel_nic){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3213)
>  #1: ffffc900244efd30 ((work_completion)(&utn->work)){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3214)
>  #2: ffffffff9a14e290 (rtnl_mutex){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:737) udp_tunnel
>  #3: ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
> 
>  stack backtrace:
>  Hardware name: Dell Inc. PowerEdge R7525/0YHMCJ, BIOS 2.2.5 04/08/2021
>
>  Call Trace:
>   <TASK>
>  dump_stack_lvl (/home/pabeni/net-next/lib/dump_stack.c:123)
>  print_deadlock_bug (/home/pabeni/net-next/kernel/locking/lockdep.c:3047)
>  validate_chain (/home/pabeni/net-next/kernel/locking/lockdep.c:3901)
>  __lock_acquire (/home/pabeni/net-next/kernel/locking/lockdep.c:5240)
>  lock_acquire.part.0 (/home/pabeni/net-next/kernel/locking/lockdep.c:473 /home/pabeni/net-next/kernel/locking/lockdep.c:5873)
>  __mutex_lock (/home/pabeni/net-next/kernel/locking/mutex.c:604 /home/pabeni/net-next/kernel/locking/mutex.c:747)
>  i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
>  udp_tunnel_nic_device_sync_by_port (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:230 /home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:249) udp_tunnel
>  __udp_tunnel_nic_device_sync.part.0 (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:292) udp_tunnel
>  udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:742) udp_tunnel
>  process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3243)
>  worker_thread (/home/pabeni/net-next/kernel/workqueue.c:3315 /home/pabeni/net-next/kernel/workqueue.c:3402)
>  kthread (/home/pabeni/net-next/kernel/kthread.c:464)
> 
> AFAICS all the existing callsites of udp_tunnel_nic_set_port_priv() are
> already under the utn lock scope, so avoid (re-)acquiring the lock in
> that function.
> 
> Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Ouch, how did I miss that it's running from .set_port :-( Thanks!

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
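
For readers following along, here is a minimal userspace sketch of the pattern
lockdep flagged in the quoted splat: a work function takes a lock, then calls
back into a setter that tries to take the same lock again. It is not the
kernel code. Every demo_* name is hypothetical, and a POSIX error-checking
mutex stands in for the kernel mutex so that the second acquisition returns
EDEADLK instead of hanging. The "held" variant mirrors the approach described
in the patch, where the setter relies on its callers already holding the lock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device tunnel state (not the kernel struct). */
struct demo_utn {
	pthread_mutex_t lock;
	void *port_priv;
};

/* Use an error-checking mutex so a relock by the owner reports EDEADLK. */
static int demo_utn_init(struct demo_utn *utn)
{
	pthread_mutexattr_t attr;
	int err;

	utn->port_priv = NULL;
	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&utn->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Problematic shape: the setter acquires the lock itself. */
static int demo_set_priv_locking(struct demo_utn *utn, void *priv)
{
	int err = pthread_mutex_lock(&utn->lock);

	if (err)
		return err;	/* EDEADLK when the caller already holds the lock */
	utn->port_priv = priv;
	pthread_mutex_unlock(&utn->lock);
	return 0;
}

/* Fixed shape: the caller must already hold utn->lock. */
static int demo_set_priv_held(struct demo_utn *utn, void *priv)
{
	utn->port_priv = priv;
	return 0;
}

/* Models the sync work: take the lock, then invoke the driver callback. */
static int demo_sync_work(struct demo_utn *utn,
			  int (*set_priv)(struct demo_utn *, void *),
			  void *priv)
{
	int err = pthread_mutex_lock(&utn->lock);

	if (err)
		return err;
	err = set_priv(utn, priv);
	pthread_mutex_unlock(&utn->lock);
	return err;
}
```

Passing demo_set_priv_locking to demo_sync_work() reproduces the recursive
acquisition and leaves port_priv untouched. Passing demo_set_priv_held
completes normally because the setter only runs inside the caller's lock scope.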

Thread overview: 3+ messages
2025-06-23 10:53 [PATCH net-next] udp_tunnel: fix deadlock in udp_tunnel_nic_set_port_priv() Paolo Abeni
2025-06-23 14:23 ` Stanislav Fomichev [this message]
2025-06-24 23:40 ` patchwork-bot+netdevbpf
