From: Stanislav Fomichev <stfomichev@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, skalluru@marvell.com, manishc@marvell.com,
	andrew+netdev@lunn.ch, michael.chan@broadcom.com,
	pavan.chebbi@broadcom.com, ajit.khaparde@broadcom.com,
	sriharsha.basavapatna@broadcom.com, somnath.kotur@broadcom.com,
	anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
	tariqt@nvidia.com, saeedm@nvidia.com, louis.peens@corigine.com,
	shshaikh@marvell.com, GR-Linux-NIC-Dev@marvell.com,
	ecree.xilinx@gmail.com, horms@kernel.org, dsahern@kernel.org,
	ruanjinjie@huawei.com, mheib@redhat.com,
	linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	linux-rdma@vger.kernel.org, oss-drivers@corigine.com,
	linux-net-drivers@amd.com, leon@kernel.org
Subject: Re: [Intel-wired-lan] [PATCH net-next 2/3] udp_tunnel: remove rtnl_lock dependency
Date: Wed, 21 May 2025 09:54:03 -0700
Message-ID: <aC4FK0fmUoaXYt4k@mini-arch>
In-Reply-To: <20250521073401.67fbd1bc@kernel.org>

On 05/21, Jakub Kicinski wrote:
> On Tue, 20 May 2025 13:36:13 -0700 Stanislav Fomichev wrote:
> > Drivers that are using ops lock and don't depend on RTNL lock
> > still need to manage it because of udp_tunnel's RTNL dependency.
> > Introduce new udp_tunnel_nic_lock and use it instead of
> > rtnl_lock. Drop non-UDP_TUNNEL_NIC_INFO_MAY_SLEEP mode from
> > udp_tunnel infra (udp_tunnel_nic_device_sync_work needs to
> > grab udp_tunnel_nic_lock mutex and might sleep).
> 
> There is a netdevsim-based test for this that needs to be fixed up.

Oh, I did not see that one, let me try to find and run it.

> > diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
> > index 2df3b8344eb5..7f5537fdf2c9 100644
> > --- a/include/net/udp_tunnel.h
> > +++ b/include/net/udp_tunnel.h
> > @@ -221,19 +221,17 @@ static inline void udp_tunnel_encap_enable(struct sock *sk)
> >  #define UDP_TUNNEL_NIC_MAX_TABLES	4
> >  
> >  enum udp_tunnel_nic_info_flags {
> > -	/* Device callbacks may sleep */
> > -	UDP_TUNNEL_NIC_INFO_MAY_SLEEP	= BIT(0),
> 
> Could we use a different lock for sleeping and non-sleeping drivers?

We can probably do it if we reorder the locks (as you ask/suggest
below). Overall, I'm not sure I understand why we want to have two
paths here. If we can do everything via work queue, why have a separate
path for the non-sleepable callback? (more code -> more bugs)

> > @@ -554,11 +543,11 @@ static void __udp_tunnel_nic_reset_ntf(struct net_device *dev)
> >  	struct udp_tunnel_nic *utn;
> >  	unsigned int i, j;
> >  
> > -	ASSERT_RTNL();
> > +	mutex_lock(&udp_tunnel_nic_lock);
> >  
> >  	utn = dev->udp_tunnel_nic;
> 
> utn and info's lifetimes are tied to the lifetime of the device
> I think their existence can remain protected by the external locks

SG, will move the lock down a bit.

> >  	if (!utn)
> > -		return;
> > +		goto unlock;
> >  
> >  	utn->need_sync = false;
> >  	for (i = 0; i < utn->n_tables; i++)
> 
> > -	rtnl_lock();
> > +	mutex_lock(&udp_tunnel_nic_lock);
> >  	utn->work_pending = 0;
> >  	__udp_tunnel_nic_device_sync(utn->dev, utn);
> >  
> > -	if (utn->need_replay)
> > +	if (utn->need_replay) {
> > +		rtnl_lock();
> >  		udp_tunnel_nic_replay(utn->dev, utn);
> > -	rtnl_unlock();
> > +		rtnl_unlock();
> > +	}
> > +	mutex_unlock(&udp_tunnel_nic_lock);
> >  }
> 
> What's the lock ordering between the new lock and rtnl lock?

From the ops-locked path, we'll get: ops lock -> tunnel_lock (__udp_tunnel_nic_reset_ntf)
From the non-ops-locked path, we'll get: rtnl -> tunnel_lock

I see your point: the work handler in this hunk takes tunnel_lock -> rtnl,
so we need to do rtnl -> tunnel_lock here as well.
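
To make the ordering concrete, here is a rough userspace sketch of the work
handler taking the locks as rtnl -> tunnel_lock. It is not the kernel code:
pthread mutexes stand in for rtnl_lock() and the new mutex, and every fake_*
name below is hypothetical and only there for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for rtnl_lock and the new tunnel lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fake_tunnel_lock = PTHREAD_MUTEX_INITIALIZER;
static bool fake_rtnl_held;	/* only written while fake_rtnl is held */

struct fake_utn {
	bool work_pending;
	bool need_replay;
	int replays;
};

/* Stand-in for the replay step; expects the rtnl stand-in to be held. */
static void fake_replay(struct fake_utn *utn)
{
	assert(fake_rtnl_held);	/* similar in spirit to ASSERT_RTNL() */
	utn->need_replay = false;
	utn->replays++;
}

/* Work handler taking the locks in one fixed order: rtnl, then tunnel_lock. */
static void fake_sync_work(struct fake_utn *utn)
{
	pthread_mutex_lock(&fake_rtnl);
	fake_rtnl_held = true;
	pthread_mutex_lock(&fake_tunnel_lock);

	utn->work_pending = false;
	/* the device sync would run here, under both locks */
	if (utn->need_replay)
		fake_replay(utn);

	pthread_mutex_unlock(&fake_tunnel_lock);
	fake_rtnl_held = false;
	pthread_mutex_unlock(&fake_rtnl);
}
```

With this shape, every path that holds both locks acquires them in the same
order, so the work handler can't deadlock against the rtnl -> tunnel_lock
callers. The cost is that the handler takes rtnl even when no replay is
needed.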

> BTW the lock could live in utn, right? We can't use the instance
> lock because of sharing, but we could put the lock in utn?

I was thinking that there is some global state besides udp_tunnel_nic,
but I don't see any, will move the lock, thanks!
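
Roughly what I have in mind for the per-utn lock, again as a userspace sketch
and not the actual patch: the mutex lives in the object, the NULL check stays
outside it (existence is protected by whatever the caller already holds), and
the lock only covers the state update. struct utn_sketch and its helpers are
made-up names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct udp_tunnel_nic carrying its own lock. */
struct utn_sketch {
	pthread_mutex_t lock;	/* protects the fields below */
	bool need_sync;
	unsigned int n_tables;
};

static struct utn_sketch *utn_sketch_alloc(unsigned int n_tables)
{
	struct utn_sketch *utn = calloc(1, sizeof(*utn));

	if (!utn)
		return NULL;
	if (pthread_mutex_init(&utn->lock, NULL)) {
		free(utn);
		return NULL;
	}
	utn->n_tables = n_tables;
	return utn;
}

static void utn_sketch_free(struct utn_sketch *utn)
{
	pthread_mutex_destroy(&utn->lock);
	free(utn);
}

/* Reset path: existence check first, then lock only around the state. */
static bool utn_sketch_reset(struct utn_sketch *utn)
{
	if (!utn)
		return false;	/* nothing to lock, nothing to unlock */

	pthread_mutex_lock(&utn->lock);
	utn->need_sync = false;
	pthread_mutex_unlock(&utn->lock);
	return true;
}
```

Moving the lock below the NULL check also makes the "goto unlock" label in the
early-return path unnecessary.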
