* [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy
@ 2025-12-10 12:59 Toke Høiland-Jørgensen
2025-12-10 13:28 ` Eelco Chaudron
0 siblings, 1 reply; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2025-12-10 12:59 UTC (permalink / raw)
To: Aaron Conole, Eelco Chaudron, Ilya Maximets, Alexei Starovoitov,
Jesse Gross
Cc: Toke Høiland-Jørgensen, Adrian Moreno, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
dev
The openvswitch teardown code will immediately call
ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
It will then start the dp_notify_work workqueue, which will later end up
calling the vport destroy() callback. This callback takes the RTNL to do
another ovs_netdev_detach_port(), which in this case is unnecessary.
This causes extra pressure on the RTNL, in some cases leading to
"unregister_netdevice: waiting for XX to become free" warnings on
teardown.
We can straight-forwardly avoid the extra RTNL lock acquisition by
checking the device flags before taking the lock, and skip the locking
altogether if the IFF_OVS_DATAPATH flag has already been unset.
Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
net/openvswitch/vport-netdev.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 91a11067e458..519f038526f9 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
static void netdev_destroy(struct vport *vport)
{
- rtnl_lock();
- if (netif_is_ovs_port(vport->dev))
- ovs_netdev_detach_dev(vport);
- rtnl_unlock();
+ if (netif_is_ovs_port(vport->dev)) {
+ rtnl_lock();
+ /* check again while holding the lock */
+ if (netif_is_ovs_port(vport->dev))
+ ovs_netdev_detach_dev(vport);
+ rtnl_unlock();
+ }
call_rcu(&vport->rcu, vport_netdev_free);
}
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-10 12:59 [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen @ 2025-12-10 13:28 ` Eelco Chaudron 2025-12-10 15:12 ` Adrián Moreno 0 siblings, 1 reply; 6+ messages in thread From: Eelco Chaudron @ 2025-12-10 13:28 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross, Adrian Moreno, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote: > The openvswitch teardown code will immediately call > ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification. > It will then start the dp_notify_work workqueue, which will later end up > calling the vport destroy() callback. This callback takes the RTNL to do > another ovs_netdev_detach_port(), which in this case is unnecessary. > This causes extra pressure on the RTNL, in some cases leading to > "unregister_netdevice: waiting for XX to become free" warnings on > teardown. > > We can straight-forwardly avoid the extra RTNL lock acquisition by > checking the device flags before taking the lock, and skip the locking > altogether if the IFF_OVS_DATAPATH flag has already been unset. > > Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister") > Tested-by: Adrian Moreno <amorenoz@redhat.com> > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > --- > net/openvswitch/vport-netdev.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c > index 91a11067e458..519f038526f9 100644 > --- a/net/openvswitch/vport-netdev.c > +++ b/net/openvswitch/vport-netdev.c > @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport) > > static void netdev_destroy(struct vport *vport) > { > - rtnl_lock(); > - if (netif_is_ovs_port(vport->dev)) > - ovs_netdev_detach_dev(vport); > - rtnl_unlock(); > + if (netif_is_ovs_port(vport->dev)) { Hi Toke, Thanks for digging into this! The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock. For example: /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */ Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above. > + rtnl_lock(); > + /* check again while holding the lock */ > + if (netif_is_ovs_port(vport->dev)) > + ovs_netdev_detach_dev(vport); > + rtnl_unlock(); > + } > > call_rcu(&vport->rcu, vport_netdev_free); > } > -- > 2.52.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-10 13:28 ` Eelco Chaudron @ 2025-12-10 15:12 ` Adrián Moreno 2025-12-10 15:30 ` Eelco Chaudron 0 siblings, 1 reply; 6+ messages in thread From: Adrián Moreno @ 2025-12-10 15:12 UTC (permalink / raw) To: Eelco Chaudron Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote: > > > On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote: > > > The openvswitch teardown code will immediately call > > ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification. > > It will then start the dp_notify_work workqueue, which will later end up > > calling the vport destroy() callback. This callback takes the RTNL to do > > another ovs_netdev_detach_port(), which in this case is unnecessary. > > This causes extra pressure on the RTNL, in some cases leading to > > "unregister_netdevice: waiting for XX to become free" warnings on > > teardown. > > > > We can straight-forwardly avoid the extra RTNL lock acquisition by > > checking the device flags before taking the lock, and skip the locking > > altogether if the IFF_OVS_DATAPATH flag has already been unset. > > > > Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister") > > Tested-by: Adrian Moreno <amorenoz@redhat.com> > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > > --- > > net/openvswitch/vport-netdev.c | 11 +++++++---- > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c > > index 91a11067e458..519f038526f9 100644 > > --- a/net/openvswitch/vport-netdev.c > > +++ b/net/openvswitch/vport-netdev.c > > @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport) > > > > static void netdev_destroy(struct vport *vport) > > { > > - rtnl_lock(); > > - if (netif_is_ovs_port(vport->dev)) > > - ovs_netdev_detach_dev(vport); > > - rtnl_unlock(); > > + if (netif_is_ovs_port(vport->dev)) { > > Hi Toke, > > Thanks for digging into this! > > The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock. > For example: > > /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */ > > Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above. In the case of netdevs being unregistered outside of OVS, the ovs_dp_device_notifier gets called which then runs "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that callback would be problematic since the rest of the OVS code assumes ovs_lock is nested outside of RTNL. So this could race with a ovs_vport_cmd_del AFAICS. Adrián > > > + rtnl_lock(); > > + /* check again while holding the lock */ > > + if (netif_is_ovs_port(vport->dev)) > > + ovs_netdev_detach_dev(vport); > > + rtnl_unlock(); > > + } > > > > call_rcu(&vport->rcu, vport_netdev_free); > > } > > -- > > 2.52.0 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-10 15:12 ` Adrián Moreno @ 2025-12-10 15:30 ` Eelco Chaudron 2025-12-10 15:59 ` Adrián Moreno 0 siblings, 1 reply; 6+ messages in thread From: Eelco Chaudron @ 2025-12-10 15:30 UTC (permalink / raw) To: Adrián Moreno Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev On 10 Dec 2025, at 16:12, Adrián Moreno wrote: > On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote: >> >> >> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote: >> >>> The openvswitch teardown code will immediately call >>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification. >>> It will then start the dp_notify_work workqueue, which will later end up >>> calling the vport destroy() callback. This callback takes the RTNL to do >>> another ovs_netdev_detach_port(), which in this case is unnecessary. >>> This causes extra pressure on the RTNL, in some cases leading to >>> "unregister_netdevice: waiting for XX to become free" warnings on >>> teardown. >>> >>> We can straight-forwardly avoid the extra RTNL lock acquisition by >>> checking the device flags before taking the lock, and skip the locking >>> altogether if the IFF_OVS_DATAPATH flag has already been unset. >>> >>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister") >>> Tested-by: Adrian Moreno <amorenoz@redhat.com> >>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> >>> --- >>> net/openvswitch/vport-netdev.c | 11 +++++++---- >>> 1 file changed, 7 insertions(+), 4 deletions(-) >>> >>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c >>> index 91a11067e458..519f038526f9 100644 >>> --- a/net/openvswitch/vport-netdev.c >>> +++ b/net/openvswitch/vport-netdev.c >>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport) >>> >>> static void netdev_destroy(struct vport *vport) >>> { >>> - rtnl_lock(); >>> - if (netif_is_ovs_port(vport->dev)) >>> - ovs_netdev_detach_dev(vport); >>> - rtnl_unlock(); >>> + if (netif_is_ovs_port(vport->dev)) { >> >> Hi Toke, >> >> Thanks for digging into this! >> >> The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock. >> For example: >> >> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */ >> >> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above. > > In the case of netdevs being unregistered outside of OVS, the > ovs_dp_device_notifier gets called which then runs > "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that > callback would be problematic since the rest of the OVS code assumes > ovs_lock is nested outside of RTNL. > > So this could race with a ovs_vport_cmd_del AFAICS. Not fully sure I understand the code path you are referring to, but if it’s through ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it takes the ovs_lock(). By the way: in your testing, did you see the expected improvement, i.e., no more “unregister” delays? //Eelco >> >>> + rtnl_lock(); >>> + /* check again while holding the lock */ >>> + if (netif_is_ovs_port(vport->dev)) >>> + ovs_netdev_detach_dev(vport); >>> + rtnl_unlock(); >>> + } >>> >>> call_rcu(&vport->rcu, vport_netdev_free); >>> } >>> -- >>> 2.52.0 >> ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-10 15:30 ` Eelco Chaudron @ 2025-12-10 15:59 ` Adrián Moreno 2025-12-10 16:40 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 6+ messages in thread From: Adrián Moreno @ 2025-12-10 15:59 UTC (permalink / raw) To: Eelco Chaudron Cc: Toke Høiland-Jørgensen, Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev On Wed, Dec 10, 2025 at 04:30:13PM +0100, Eelco Chaudron wrote: > > > On 10 Dec 2025, at 16:12, Adrián Moreno wrote: > > > On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote: > >> > >> > >> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote: > >> > >>> The openvswitch teardown code will immediately call > >>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification. > >>> It will then start the dp_notify_work workqueue, which will later end up > >>> calling the vport destroy() callback. This callback takes the RTNL to do > >>> another ovs_netdev_detach_port(), which in this case is unnecessary. > >>> This causes extra pressure on the RTNL, in some cases leading to > >>> "unregister_netdevice: waiting for XX to become free" warnings on > >>> teardown. > >>> > >>> We can straight-forwardly avoid the extra RTNL lock acquisition by > >>> checking the device flags before taking the lock, and skip the locking > >>> altogether if the IFF_OVS_DATAPATH flag has already been unset. > >>> > >>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister") > >>> Tested-by: Adrian Moreno <amorenoz@redhat.com> > >>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > >>> --- > >>> net/openvswitch/vport-netdev.c | 11 +++++++---- > >>> 1 file changed, 7 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c > >>> index 91a11067e458..519f038526f9 100644 > >>> --- a/net/openvswitch/vport-netdev.c > >>> +++ b/net/openvswitch/vport-netdev.c > >>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport) > >>> > >>> static void netdev_destroy(struct vport *vport) > >>> { > >>> - rtnl_lock(); > >>> - if (netif_is_ovs_port(vport->dev)) > >>> - ovs_netdev_detach_dev(vport); > >>> - rtnl_unlock(); > >>> + if (netif_is_ovs_port(vport->dev)) { > >> > >> Hi Toke, > >> > >> Thanks for digging into this! > >> > >> The patch looks technically correct to me, but maybe we should add a comment here explaining why we can do it this way, i.e., why we can call netif_is_ovs_port() without the lock. > >> For example: > >> > >> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is set/cleared in either netdev_create()/netdev_destroy(), which are both called under the global ovs_lock(). */ > >> > >> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is not required due to the above. > > > > In the case of netdevs being unregistered outside of OVS, the > > ovs_dp_device_notifier gets called which then runs > > "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that > > callback would be problematic since the rest of the OVS code assumes > > ovs_lock is nested outside of RTNL. > > > > So this could race with a ovs_vport_cmd_del AFAICS. > > Not fully sure I understand the code path you are referring to, but if it’s through ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it takes the ovs_lock(). The codepath described by Toke is: (netdev gets unregistered outside of OVS) -> dp_device_event (under RTNL) -> ovs_netdev_detach_dev() (IFF_OVS_DATAPATH is cleared) Then dp_notify_work is scheduled and it does what you mention: ovs_dp_notify_wq (lock ovs_mutex) -> dp_detach_port_notify -> ovs_dp_detach_port -> ovs_vport_del -> netdev_destroy (at this point netif_is_ovs_port is false) The first part of this codepath (NETDEV_UNREGISTER notification) happens under RTNL, not under ovs_mutex and it manipulates vport->dev->priv_flags. So in theory we could receive the netdev notification while we process a ovs_vport_cmd_del() command from userspace, which also ends up calling netdev_destroy. > > By the way: in your testing, did you see the expected improvement, i.e., no more “unregister” delays? I did see a reduction in the use of RTNL, which is obvious. I have not been able to reproduce the "unregister_netdevice: waiting ..." spat yet. In such a high RTNL-contented scenario, I still don't know how much that extra rtnl_lock is slowing things up or whether the optimization will be enough to reduce the spat in all cases, I guess not. I will try simulating the contention with delay-kfunc. What I have tried is some manual concurrent manipulation of netdevs and also ran the OVS kernel unit tests. Thanks. Adrián > > //Eelco > > >> > >>> + rtnl_lock(); > >>> + /* check again while holding the lock */ > >>> + if (netif_is_ovs_port(vport->dev)) > >>> + ovs_netdev_detach_dev(vport); > >>> + rtnl_unlock(); > >>> + } > >>> > >>> call_rcu(&vport->rcu, vport_netdev_free); > >>> } > >>> -- > >>> 2.52.0 > >> > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-10 15:59 ` Adrián Moreno @ 2025-12-10 16:40 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2025-12-10 16:40 UTC (permalink / raw) To: Adrián Moreno, Eelco Chaudron Cc: Aaron Conole, Ilya Maximets, Alexei Starovoitov, Jesse Gross, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, dev Adrián Moreno <amorenoz@redhat.com> writes: >> Not fully sure I understand the code path you are referring to, but >> if it’s through >> ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it >> takes the ovs_lock(). > > The codepath described by Toke is: > (netdev gets unregistered outside of OVS) -> > dp_device_event (under RTNL) -> ovs_netdev_detach_dev() > (IFF_OVS_DATAPATH is cleared) > > Then dp_notify_work is scheduled and it does what you mention: > ovs_dp_notify_wq (lock ovs_mutex) -> dp_detach_port_notify -> ovs_dp_detach_port > -> ovs_vport_del -> netdev_destroy (at this point > netif_is_ovs_port is false) > > The first part of this codepath (NETDEV_UNREGISTER notification) happens > under RTNL, not under ovs_mutex and it manipulates vport->dev->priv_flags. > > So in theory we could receive the netdev notification while we process a > ovs_vport_cmd_del() command from userspace, which also ends up calling > netdev_destroy. Yeah, I agree, it's not guaranteed that reading the flags outside the lock will be race free, so re-checking seems safer here (and is also quite cheap). There does seem to be other uses of netif_is_ovs_port() that are outside the RTNL, though, so maybe not such a likely race in practice? Anyway, I'll respin with a comment. -Toke ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-12-10 16:40 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-12-10 12:59 [PATCH net] net: openvswitch: Avoid needlessly taking the RTNL on vport destroy Toke Høiland-Jørgensen 2025-12-10 13:28 ` Eelco Chaudron 2025-12-10 15:12 ` Adrián Moreno 2025-12-10 15:30 ` Eelco Chaudron 2025-12-10 15:59 ` Adrián Moreno 2025-12-10 16:40 ` Toke Høiland-Jørgensen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).