netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
@ 2025-12-11 11:50 Toke Høiland-Jørgensen
  2025-12-15  8:41 ` Eelco Chaudron
  2025-12-15 13:14 ` Aaron Conole
  0 siblings, 2 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-12-11 11:50 UTC (permalink / raw)
  To: Aaron Conole, Eelco Chaudron, Ilya Maximets, Alexei Starovoitov,
	Jesse Gross
  Cc: Toke Høiland-Jørgensen, Adrian Moreno, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
	dev

The openvswitch teardown code will immediately call
ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
It will then start the dp_notify_work workqueue, which will later end up
calling the vport destroy() callback. This callback takes the RTNL to do
another ovs_netdev_detach_port(), which in this case is unnecessary.
This causes extra pressure on the RTNL, in some cases leading to
"unregister_netdevice: waiting for XX to become free" warnings on
teardown.

We can straight-forwardly avoid the extra RTNL lock acquisition by
checking the device flags before taking the lock, and skip the locking
altogether if the IFF_OVS_DATAPATH flag has already been unset.

Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
v2:
- Expand comments explaining the logic

 net/openvswitch/vport-netdev.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 91a11067e458..6574f9bcdc02 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -160,10 +160,19 @@ void ovs_netdev_detach_dev(struct vport *vport)
 
 static void netdev_destroy(struct vport *vport)
 {
-	rtnl_lock();
-	if (netif_is_ovs_port(vport->dev))
-		ovs_netdev_detach_dev(vport);
-	rtnl_unlock();
+	/* When called from ovs_db_notify_wq() after a dp_device_event(), the
+	 * port has already been detached, so we can avoid taking the RTNL by
+	 * checking this first.
+	 */
+	if (netif_is_ovs_port(vport->dev)) {
+		rtnl_lock();
+		/* Check again while holding the lock to ensure we don't race
+		 * with the netdev notifier and detach twice.
+		 */
+		if (netif_is_ovs_port(vport->dev))
+			ovs_netdev_detach_dev(vport);
+		rtnl_unlock();
+	}
 
 	call_rcu(&vport->rcu, vport_netdev_free);
 }
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-11 11:50 [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen
@ 2025-12-15  8:41 ` Eelco Chaudron
  2025-12-15 11:58   ` Toke Høiland-Jørgensen
  2025-12-15 13:14 ` Aaron Conole
  1 sibling, 1 reply; 8+ messages in thread
From: Eelco Chaudron @ 2025-12-15  8:41 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Adrian Moreno
  Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, dev



On 11 Dec 2025, at 12:50, Toke Høiland-Jørgensen wrote:

> The openvswitch teardown code will immediately call
> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> It will then start the dp_notify_work workqueue, which will later end up
> calling the vport destroy() callback. This callback takes the RTNL to do
> another ovs_netdev_detach_port(), which in this case is unnecessary.
> This causes extra pressure on the RTNL, in some cases leading to
> "unregister_netdevice: waiting for XX to become free" warnings on
> teardown.
>
> We can straight-forwardly avoid the extra RTNL lock acquisition by
> checking the device flags before taking the lock, and skip the locking
> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>
> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> Tested-by: Adrian Moreno <amorenoz@redhat.com>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

Guess the change looks good, but I’m waiting for some feedback from Adrian to see if this change makes sense.
Any luck reproducing the issue it’s supposed to fix?

Cheers,

Eelco

> ---
> v2:
> - Expand comments explaining the logic
>
>  net/openvswitch/vport-netdev.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> index 91a11067e458..6574f9bcdc02 100644
> --- a/net/openvswitch/vport-netdev.c
> +++ b/net/openvswitch/vport-netdev.c
> @@ -160,10 +160,19 @@ void ovs_netdev_detach_dev(struct vport *vport)
>
>  static void netdev_destroy(struct vport *vport)
>  {
> -	rtnl_lock();
> -	if (netif_is_ovs_port(vport->dev))
> -		ovs_netdev_detach_dev(vport);
> -	rtnl_unlock();
> +	/* When called from ovs_db_notify_wq() after a dp_device_event(), the
> +	 * port has already been detached, so we can avoid taking the RTNL by
> +	 * checking this first.
> +	 */
> +	if (netif_is_ovs_port(vport->dev)) {
> +		rtnl_lock();
> +		/* Check again while holding the lock to ensure we don't race
> +		 * with the netdev notifier and detach twice.
> +		 */
> +		if (netif_is_ovs_port(vport->dev))
> +			ovs_netdev_detach_dev(vport);
> +		rtnl_unlock();
> +	}
>
>  	call_rcu(&vport->rcu, vport_netdev_free);
>  }
> -- 
> 2.52.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-15  8:41 ` Eelco Chaudron
@ 2025-12-15 11:58   ` Toke Høiland-Jørgensen
  2025-12-15 12:31     ` Eelco Chaudron
  0 siblings, 1 reply; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-12-15 11:58 UTC (permalink / raw)
  To: Eelco Chaudron, Adrian Moreno
  Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, dev

Eelco Chaudron <echaudro@redhat.com> writes:

> On 11 Dec 2025, at 12:50, Toke Høiland-Jørgensen wrote:
>
>> The openvswitch teardown code will immediately call
>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>> It will then start the dp_notify_work workqueue, which will later end up
>> calling the vport destroy() callback. This callback takes the RTNL to do
>> another ovs_netdev_detach_port(), which in this case is unnecessary.
>> This causes extra pressure on the RTNL, in some cases leading to
>> "unregister_netdevice: waiting for XX to become free" warnings on
>> teardown.
>>
>> We can straight-forwardly avoid the extra RTNL lock acquisition by
>> checking the device flags before taking the lock, and skip the locking
>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>
>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> Guess the change looks good, but I’m waiting for some feedback from
> Adrian to see if this change makes sense.

OK.

> Any luck reproducing the issue it’s supposed to fix?

We got a report from the customer that originally reported it (who had
their own reproducer) that this patch fixes their issue to the point
where they can now delete ~2000 pods/node without triggering the
unregister_netdevice warning at all (where before it triggered at around
~500 pod deletions). So that's encouraging :)

-Toke


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-15 11:58   ` Toke Høiland-Jørgensen
@ 2025-12-15 12:31     ` Eelco Chaudron
  2025-12-22 11:29       ` Paolo Abeni
  0 siblings, 1 reply; 8+ messages in thread
From: Eelco Chaudron @ 2025-12-15 12:31 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Adrian Moreno, Aaron Conole, Ilya Maximets, Alexei Starovoitov,
	Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, dev



On 15 Dec 2025, at 12:58, Toke Høiland-Jørgensen wrote:

> Eelco Chaudron <echaudro@redhat.com> writes:
>
>> On 11 Dec 2025, at 12:50, Toke Høiland-Jørgensen wrote:
>>
>>> The openvswitch teardown code will immediately call
>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>> It will then start the dp_notify_work workqueue, which will later end up
>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>> another ovs_netdev_detach_port(), which in this case is unnecessary.
>>> This causes extra pressure on the RTNL, in some cases leading to
>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>> teardown.
>>>
>>> We can straight-forwardly avoid the extra RTNL lock acquisition by
>>> checking the device flags before taking the lock, and skip the locking
>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>
>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>
>> Guess the change looks good, but I’m waiting for some feedback from
>> Adrian to see if this change makes sense.
>
> OK.
>
>> Any luck reproducing the issue it’s supposed to fix?
>
> We got a report from the customer that originally reported it (who had
> their own reproducer) that this patch fixes their issue to the point
> where they can now delete ~2000 pods/node without triggering the
> unregister_netdevice warning at all (where before it triggered at around
> ~500 pod deletions). So that's encouraging :)

That’s good news; just wanted to make sure we are not chasing a red herring :)

Acked-by: Eelco Chaudron echaudro@redhat.com


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-11 11:50 [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen
  2025-12-15  8:41 ` Eelco Chaudron
@ 2025-12-15 13:14 ` Aaron Conole
       [not found]   ` <CAKDHXDHsdfRfaDyfzoTymsPkm=mPdFtJOA=GHb6HGx6TjvYA7w@mail.gmail.com>
  1 sibling, 1 reply; 8+ messages in thread
From: Aaron Conole @ 2025-12-15 13:14 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Eelco Chaudron, Ilya Maximets, Alexei Starovoitov, Jesse Gross,
	Adrian Moreno, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, dev

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> The openvswitch teardown code will immediately call
> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> It will then start the dp_notify_work workqueue, which will later end up
> calling the vport destroy() callback. This callback takes the RTNL to do
> another ovs_netdev_detach_port(), which in this case is unnecessary.
> This causes extra pressure on the RTNL, in some cases leading to
> "unregister_netdevice: waiting for XX to become free" warnings on
> teardown.
>
> We can straight-forwardly avoid the extra RTNL lock acquisition by
> checking the device flags before taking the lock, and skip the locking
> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>
> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> Tested-by: Adrian Moreno <amorenoz@redhat.com>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---

LGTM,

Acked-by: Aaron Conole <aconole@redhat.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
       [not found]   ` <CAKDHXDHsdfRfaDyfzoTymsPkm=mPdFtJOA=GHb6HGx6TjvYA7w@mail.gmail.com>
@ 2025-12-16 11:59     ` Adrián Moreno
  0 siblings, 0 replies; 8+ messages in thread
From: Adrián Moreno @ 2025-12-16 11:59 UTC (permalink / raw)
  To: Vladimir Shebordaev
  Cc: Aaron Conole, Toke Høiland-Jørgensen, Eelco Chaudron,
	Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
	dev

On Tue, Dec 16, 2025 at 04:31:14AM +0300, Vladimir Shebordaev wrote:
> Perhaps, ovs_netdev_tunnel_destroy() is also worth to be modified this way.
> It is invoked as a vport destructor in similar code paths and even has a
> nice comment about its double duty.
>

In this case OVS owns the underlying tunnel device so it's less likely
that someone will delete it by other means. Still, I think you're
right and we could spare one RTNL lock in such case.

We can fix this in an independent patch so that Fixes points to the
right commit.

Thanks.
Adrián

> --
> Regards,
> Vladimir
>
>
> On Mon, Dec 15, 2025 at 4:16 PM Aaron Conole <aconole@redhat.com> wrote:
>
> > Toke Høiland-Jørgensen <toke@redhat.com> writes:
> >
> > > The openvswitch teardown code will immediately call
> > > ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
> > > It will then start the dp_notify_work workqueue, which will later end up
> > > calling the vport destroy() callback. This callback takes the RTNL to do
> > > another ovs_netdev_detach_port(), which in this case is unnecessary.
> > > This causes extra pressure on the RTNL, in some cases leading to
> > > "unregister_netdevice: waiting for XX to become free" warnings on
> > > teardown.
> > >
> > > We can straight-forwardly avoid the extra RTNL lock acquisition by
> > > checking the device flags before taking the lock, and skip the locking
> > > altogether if the IFF_OVS_DATAPATH flag has already been unset.
> > >
> > > Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
> > > Tested-by: Adrian Moreno <amorenoz@redhat.com>
> > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> > > ---
> >
> > LGTM,
> >
> > Acked-by: Aaron Conole <aconole@redhat.com>
> >
> >
> >


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-15 12:31     ` Eelco Chaudron
@ 2025-12-22 11:29       ` Paolo Abeni
  2025-12-22 11:44         ` Eelco Chaudron
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2025-12-22 11:29 UTC (permalink / raw)
  To: Eelco Chaudron, Toke Høiland-Jørgensen
  Cc: Adrian Moreno, Aaron Conole, Ilya Maximets, Alexei Starovoitov,
	Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Simon Horman, netdev, dev

On 12/15/25 1:31 PM, Eelco Chaudron wrote:
> On 15 Dec 2025, at 12:58, Toke Høiland-Jørgensen wrote:
>> Eelco Chaudron <echaudro@redhat.com> writes:
>>> On 11 Dec 2025, at 12:50, Toke Høiland-Jørgensen wrote:
>>>> The openvswitch teardown code will immediately call
>>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>>> It will then start the dp_notify_work workqueue, which will later end up
>>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>>> another ovs_netdev_detach_port(), which in this case is unnecessary.
>>>> This causes extra pressure on the RTNL, in some cases leading to
>>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>>> teardown.
>>>>
>>>> We can straight-forwardly avoid the extra RTNL lock acquisition by
>>>> checking the device flags before taking the lock, and skip the locking
>>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>>
>>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>
>>> Guess the change looks good, but I’m waiting for some feedback from
>>> Adrian to see if this change makes sense.
>>
>> OK.
>>
>>> Any luck reproducing the issue it’s supposed to fix?
>>
>> We got a report from the customer that originally reported it (who had
>> their own reproducer) that this patch fixes their issue to the point
>> where they can now delete ~2000 pods/node without triggering the
>> unregister_netdevice warning at all (where before it triggered at around
>> ~500 pod deletions). So that's encouraging :)
> 
> That’s good news; just wanted to make sure we are not chasing a red herring :)
> 
> Acked-by: Eelco Chaudron echaudro@redhat.com

@Eelco: your SoB above is lacking the required <> around the email
address. I'm fixing that while applying the patch, but please take care
of it in the next reviews.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
  2025-12-22 11:29       ` Paolo Abeni
@ 2025-12-22 11:44         ` Eelco Chaudron
  0 siblings, 0 replies; 8+ messages in thread
From: Eelco Chaudron @ 2025-12-22 11:44 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Toke Høiland-Jørgensen, Adrian Moreno, Aaron Conole,
	Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Simon Horman, netdev, dev



On 22 Dec 2025, at 12:29, Paolo Abeni wrote:

> On 12/15/25 1:31 PM, Eelco Chaudron wrote:
>> On 15 Dec 2025, at 12:58, Toke Høiland-Jørgensen wrote:
>>> Eelco Chaudron <echaudro@redhat.com> writes:
>>>> On 11 Dec 2025, at 12:50, Toke Høiland-Jørgensen wrote:
>>>>> The openvswitch teardown code will immediately call
>>>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>>>> It will then start the dp_notify_work workqueue, which will later end up
>>>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>>>> another ovs_netdev_detach_port(), which in this case is unnecessary.
>>>>> This causes extra pressure on the RTNL, in some cases leading to
>>>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>>>> teardown.
>>>>>
>>>>> We can straight-forwardly avoid the extra RTNL lock acquisition by
>>>>> checking the device flags before taking the lock, and skip the locking
>>>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>>>
>>>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>>>> Tested-by: Adrian Moreno <amorenoz@redhat.com>
>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>>>>
>>>> Guess the change looks good, but I’m waiting for some feedback from
>>>> Adrian to see if this change makes sense.
>>>
>>> OK.
>>>
>>>> Any luck reproducing the issue it’s supposed to fix?
>>>
>>> We got a report from the customer that originally reported it (who had
>>> their own reproducer) that this patch fixes their issue to the point
>>> where they can now delete ~2000 pods/node without triggering the
>>> unregister_netdevice warning at all (where before it triggered at around
>>> ~500 pod deletions). So that's encouraging :)
>>
>> That’s good news; just wanted to make sure we are not chasing a red herring :)
>>
>> Acked-by: Eelco Chaudron echaudro@redhat.com
>
> @Eelco: your SoB above is lacking the required <> around the email
> address. I'm fixing that while applying the patch, but please take care
> of it in the next reviews.

Thanks Paolo, no idea what happened here :(
>
> Thanks,
>
> Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-12-22 11:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-11 11:50 [PATCH v2] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen
2025-12-15  8:41 ` Eelco Chaudron
2025-12-15 11:58   ` Toke Høiland-Jørgensen
2025-12-15 12:31     ` Eelco Chaudron
2025-12-22 11:29       ` Paolo Abeni
2025-12-22 11:44         ` Eelco Chaudron
2025-12-15 13:14 ` Aaron Conole
     [not found]   ` <CAKDHXDHsdfRfaDyfzoTymsPkm=mPdFtJOA=GHb6HGx6TjvYA7w@mail.gmail.com>
2025-12-16 11:59     ` Adrián Moreno

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).