netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
@ 2025-12-10 12:59 Toke Høiland-Jørgensen
  2025-12-10 13:28 ` Eelco Chaudron
  0 siblings, 1 reply; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-12-10 12:59 UTC (permalink / raw)
  To: Aaron Conole, Eelco Chaudron, Ilya Maximets, Alexei Starovoitov,
	Jesse Gross
  Cc: Toke Høiland-Jørgensen, Adrian Moreno, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
	dev

The openvswitch teardown code will immediately call
ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
It will then start the dp_notify_work workqueue, which will later end up
calling the vport destroy() callback. This callback takes the RTNL to do
another ovs_netdev_detach_port(), which in this case is unnecessary.
This causes extra pressure on the RTNL, in some cases leading to
"unregister_netdevice: waiting for XX to become free" warnings on
teardown.

We can straight-forwardly avoid the extra RTNL lock acquisition by
checking the device flags before taking the lock, and skip the locking
altogether if the IFF_OVS_DATAPATH flag has already been unset.

Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/openvswitch/vport-netdev.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 91a11067e458..519f038526f9 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
 
 static void netdev_destroy(struct vport *vport)
 {
-	rtnl_lock();
-	if (netif_is_ovs_port(vport->dev))
-		ovs_netdev_detach_dev(vport);
-	rtnl_unlock();
+	if (netif_is_ovs_port(vport->dev)) {
+		rtnl_lock();
+		/* check again while holding the lock */
+		if (netif_is_ovs_port(vport->dev))
+			ovs_netdev_detach_dev(vport);
+		rtnl_unlock();
+	}
 
 	call_rcu(&vport->rcu, vport_netdev_free);
 }
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-10 12:59 [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen
@ 2025-12-10 13:28 ` Eelco Chaudron
  2025-12-10 15:12   ` Adrián Moreno
  0 siblings, 1 reply; 6+ messages in thread
From: Eelco Chaudron @ 2025-12-10 13:28 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross,
	Adrian Moreno, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, dev



On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:

> The openvswitch teardown code will immediately call
> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> It will then start the dp_notify_work workqueue, which will later end up
> calling the vport destroy() callback. This callback takes the RTNL to do
> another ovs_netdev_detach_port(), which in this case is unnecessary.
> This causes extra pressure on the RTNL, in some cases leading to
> "unregister_netdevice: waiting for XX to become free" warnings on
> teardown.
>
> We can straight-forwardly avoid the extra RTNL lock acquisition by
> checking the device flags before taking the lock, and skip the locking
> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>
> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> Tested-by: Adrian Moreno <amorenoz@redhat.com>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  net/openvswitch/vport-netdev.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> index 91a11067e458..519f038526f9 100644
> --- a/net/openvswitch/vport-netdev.c
> +++ b/net/openvswitch/vport-netdev.c
> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
>
>  static void netdev_destroy(struct vport *vport)
>  {
> -	rtnl_lock();
> -	if (netif_is_ovs_port(vport->dev))
> -		ovs_netdev_detach_dev(vport);
> -	rtnl_unlock();
> +	if (netif_is_ovs_port(vport->dev)) {

Hi Toke,

Thanks for digging into this!

The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock.
For example:

/* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */

Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above.

> +		rtnl_lock();
> +		/* check again while holding the lock */
> +		if (netif_is_ovs_port(vport->dev))
> +			ovs_netdev_detach_dev(vport);
> +		rtnl_unlock();
> +	}
>
>  	call_rcu(&vport->rcu, vport_netdev_free);
>  }
> -- 
> 2.52.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-10 13:28 ` Eelco Chaudron
@ 2025-12-10 15:12   ` Adrián Moreno
  2025-12-10 15:30     ` Eelco Chaudron
  0 siblings, 1 reply; 6+ messages in thread
From: Adrián Moreno @ 2025-12-10 15:12 UTC (permalink / raw)
  To: Eelco Chaudron
  Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets,
	Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev

On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote:
>
>
> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:
>
> > The openvswitch teardown code will immediately call
> > ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> > It will then start the dp_notify_work workqueue, which will later end up
> > calling the vport destroy() callback. This callback takes the RTNL to do
> > another ovs_netdev_detach_port(), which in this case is unnecessary.
> > This causes extra pressure on the RTNL, in some cases leading to
> > "unregister_netdevice: waiting for XX to become free" warnings on
> > teardown.
> >
> > We can straight-forwardly avoid the extra RTNL lock acquisition by
> > checking the device flags before taking the lock, and skip the locking
> > altogether if the IFF_OVS_DATAPATH flag has already been unset.
> >
> > Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> > Tested-by: Adrian Moreno <amorenoz@redhat.com>
> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > ---
> >  net/openvswitch/vport-netdev.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> > index 91a11067e458..519f038526f9 100644
> > --- a/net/openvswitch/vport-netdev.c
> > +++ b/net/openvswitch/vport-netdev.c
> > @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
> >
> >  static void netdev_destroy(struct vport *vport)
> >  {
> > -	rtnl_lock();
> > -	if (netif_is_ovs_port(vport->dev))
> > -		ovs_netdev_detach_dev(vport);
> > -	rtnl_unlock();
> > +	if (netif_is_ovs_port(vport->dev)) {
>
> Hi Toke,
>
> Thanks for digging into this!
>
> The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock.
> For example:
>
> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */
>
> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above.

In the case of netdevs being unregistered outside of OVS, the
ovs_dp_device_notifier gets called which then runs
"ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that
callback would be problematic since the rest of the OVS code assumes
ovs_lock is nested outside of RTNL.

So this could race with a ovs_vport_cmd_del AFAICS.

Adrián

>
> > +		rtnl_lock();
> > +		/* check again while holding the lock */
> > +		if (netif_is_ovs_port(vport->dev))
> > +			ovs_netdev_detach_dev(vport);
> > +		rtnl_unlock();
> > +	}
> >
> >  	call_rcu(&vport->rcu, vport_netdev_free);
> >  }
> > --
> > 2.52.0
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-10 15:12   ` Adrián Moreno
@ 2025-12-10 15:30     ` Eelco Chaudron
  2025-12-10 15:59       ` Adrián Moreno
  0 siblings, 1 reply; 6+ messages in thread
From: Eelco Chaudron @ 2025-12-10 15:30 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets,
	Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev



On 10 Dec 2025, at 16:12, Adrián Moreno wrote:

> On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote:
>>
>>
>> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:
>>
>>> The openvswitch teardown code will immediately call
>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>> It will then start the dp_notify_work workqueue, which will later end up
>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>> another ovs_netdev_detach_port(), which in this case is unnecessary.
>>> This causes extra pressure on the RTNL, in some cases leading to
>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>> teardown.
>>>
>>> We can straight-forwardly avoid the extra RTNL lock acquisition by
>>> checking the device flags before taking the lock, and skip the locking
>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>
>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>> ---
>>>  net/openvswitch/vport-netdev.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
>>> index 91a11067e458..519f038526f9 100644
>>> --- a/net/openvswitch/vport-netdev.c
>>> +++ b/net/openvswitch/vport-netdev.c
>>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
>>>
>>>  static void netdev_destroy(struct vport *vport)
>>>  {
>>> -	rtnl_lock();
>>> -	if (netif_is_ovs_port(vport->dev))
>>> -		ovs_netdev_detach_dev(vport);
>>> -	rtnl_unlock();
>>> +	if (netif_is_ovs_port(vport->dev)) {
>>
>> Hi Toke,
>>
>> Thanks for digging into this!
>>
>> The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock.
>> For example:
>>
>> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */
>>
>> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above.
>
> In the case of netdevs being unregistered outside of OVS, the
> ovs_dp_device_notifier gets called which then runs
> "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that
> callback would be problematic since the rest of the OVS code assumes
> ovs_lock is nested outside of RTNL.
>
> So this could race with a ovs_vport_cmd_del AFAICS.

Not fully sure I understand the code path you are referring to, but if it’s through ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it takes the ovs_lock().

By the way: in your testing, did you see the expected improvement, i.e., no more “unregister” delays?

//Eelco

>>
>>> +		rtnl_lock();
>>> +		/* check again while holding the lock */
>>> +		if (netif_is_ovs_port(vport->dev))
>>> +			ovs_netdev_detach_dev(vport);
>>> +		rtnl_unlock();
>>> +	}
>>>
>>>  	call_rcu(&vport->rcu, vport_netdev_free);
>>>  }
>>> --
>>> 2.52.0
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-10 15:30     ` Eelco Chaudron
@ 2025-12-10 15:59       ` Adrián Moreno
  2025-12-10 16:40         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 6+ messages in thread
From: Adrián Moreno @ 2025-12-10 15:59 UTC (permalink / raw)
  To: Eelco Chaudron
  Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets,
	Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev

On Wed, Dec 10, 2025 at 04:30:13PM +0100, Eelco Chaudron wrote:
>
>
> On 10 Dec 2025, at 16:12, Adrián Moreno wrote:
>
> > On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote:
> >>
> >>
> >> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:
> >>
> >>> The openvswitch teardown code will immediately call
> >>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> >>> It will then start the dp_notify_work workqueue, which will later end up
> >>> calling the vport destroy() callback. This callback takes the RTNL to do
> >>> another ovs_netdev_detach_port(), which in this case is unnecessary.
> >>> This causes extra pressure on the RTNL, in some cases leading to
> >>> "unregister_netdevice: waiting for XX to become free" warnings on
> >>> teardown.
> >>>
> >>> We can straight-forwardly avoid the extra RTNL lock acquisition by
> >>> checking the device flags before taking the lock, and skip the locking
> >>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
> >>>
> >>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> >>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
> >>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >>> ---
> >>>  net/openvswitch/vport-netdev.c | 11 +++++++----
> >>>  1 file changed, 7 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> >>> index 91a11067e458..519f038526f9 100644
> >>> --- a/net/openvswitch/vport-netdev.c
> >>> +++ b/net/openvswitch/vport-netdev.c
> >>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
> >>>
> >>>  static void netdev_destroy(struct vport *vport)
> >>>  {
> >>> -	rtnl_lock();
> >>> -	if (netif_is_ovs_port(vport->dev))
> >>> -		ovs_netdev_detach_dev(vport);
> >>> -	rtnl_unlock();
> >>> +	if (netif_is_ovs_port(vport->dev)) {
> >>
> >> Hi Toke,
> >>
> >> Thanks for digging into this!
> >>
> >> The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock.
> >> For example:
> >>
> >> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */
> >>
> >> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above.
> >
> > In the case of netdevs being unregistered outside of OVS, the
> > ovs_dp_device_notifier gets called which then runs
> > "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that
> > callback would be problematic since the rest of the OVS code assumes
> > ovs_lock is nested outside of RTNL.
> >
> > So this could race with a ovs_vport_cmd_del AFAICS.
>
> Not fully sure I understand the code path you are referring to, but if it’s through ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it takes the ovs_lock().

The codepath described by Toke is:
(netdev gets unregistered outside of OVS) ->
dp_device_event (under RTNL) -> ovs_netdev_detach_dev()
(IFF_OVS_DATAPATH is cleared)

Then dp_notify_work is scheduled and it does what you mention:
ovs_dp_notify_wq (lock ovs_mutex) -> dp_detach_port_notify -> ovs_dp_detach_port
    -> ovs_vport_del -> netdev_destroy (at this point
netif_is_ovs_port is false)

The first part of this codepath (NETDEV_UNREGISTER notification) happens
under RTNL, not under ovs_mutex and it manipulates vport->dev->priv_flags.

So in theory we could receive the netdev notification while we process a
ovs_vport_cmd_del() command from userspace, which also ends up calling
netdev_destroy.

>
> By the way: in your testing, did you see the expected improvement, i.e., no more “unregister” delays?

I did see a reduction in the use of RTNL, which is obvious. I have not
been able to reproduce the "unregister_netdevice: waiting ..." spat yet.

In such a high RTNL-contented scenario, I still don't know how much that
extra rtnl_lock is slowing things up or whether the optimization will be
enough to reduce the spat in all cases, I guess not.

I will try simulating the contention with delay-kfunc.

What I have tried is some manual concurrent manipulation of netdevs and
also ran the OVS kernel unit tests.

Thanks.
Adrián

>
> //Eelco
>
> >>
> >>> +		rtnl_lock();
> >>> +		/* check again while holding the lock */
> >>> +		if (netif_is_ovs_port(vport->dev))
> >>> +			ovs_netdev_detach_dev(vport);
> >>> +		rtnl_unlock();
> >>> +	}
> >>>
> >>>  	call_rcu(&vport->rcu, vport_netdev_free);
> >>>  }
> >>> --
> >>> 2.52.0
> >>
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-10 15:59       ` Adrián Moreno
@ 2025-12-10 16:40         ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-12-10 16:40 UTC (permalink / raw)
  To: Adrián Moreno, Eelco Chaudron
  Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, dev

Adrián Moreno <amorenoz@redhat.com> writes:

>> Not fully sure I understand the code path you are referring to, but
>> if it’s through
>> ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it
>> takes the ovs_lock().
>
> The codepath described by Toke is:
> (netdev gets unregistered outside of OVS) ->
> dp_device_event (under RTNL) -> ovs_netdev_detach_dev()
> (IFF_OVS_DATAPATH is cleared)
>
> Then dp_notify_work is scheduled and it does what you mention:
> ovs_dp_notify_wq (lock ovs_mutex) -> dp_detach_port_notify -> ovs_dp_detach_port
>     -> ovs_vport_del -> netdev_destroy (at this point
> netif_is_ovs_port is false)
>
> The first part of this codepath (NETDEV_UNREGISTER notification) happens
> under RTNL, not under ovs_mutex and it manipulates vport->dev->priv_flags.
>
> So in theory we could receive the netdev notification while we process a
> ovs_vport_cmd_del() command from userspace, which also ends up calling
> netdev_destroy.

Yeah, I agree, it's not guaranteed that reading the flags outside the
lock will be race free, so re-checking seems safer here (and is also
quite cheap).

There does seem to be other uses of netif_is_ovs_port() that are outside
the RTNL, though, so maybe not such a likely race in practice?

Anyway, I'll respin with a comment.

-Toke


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-12-10 16:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-10 12:59 [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen
2025-12-10 13:28 ` Eelco Chaudron
2025-12-10 15:12   ` Adrián Moreno
2025-12-10 15:30     ` Eelco Chaudron
2025-12-10 15:59       ` Adrián Moreno
2025-12-10 16:40         ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).