* [PATCH net 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports
2026-04-29 15:16 [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
@ 2026-04-29 15:16 ` Ilya Maximets
2026-04-29 15:16 ` [PATCH net 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
2026-04-29 17:46 ` [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports Eelco Chaudron
2 siblings, 0 replies; 5+ messages in thread
From: Ilya Maximets @ 2026-04-29 15:16 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, Ilya Maximets,
stable
vports are used concurrently and protected by RCU, so netdev_put()
must happen after the RCU grace period. So, either in an RCU call or
after the synchronize_net(). The rtnl_delete_link() must happen under
RTNL and so can't be executed in RCU context. Calling synchronize_net()
while holding RTNL is not a good idea for performance and system
stability under load in general, so calling netdev_put() in RCU call
is the right solution here.
However,
when the device is deleted, rtnl_unlock() will call netdev_run_todo()
and block until all the references are gone. In the current code this
means that we never reach the call_rcu() and the vport is never freed
and the reference is never released, causing a self-deadlock on device
removal.
Fix that by moving the rcu_call() before the rtnl_unlock(), so the
scheduled RCU callback will be executed when synchronize_net() is
called from the rtnl_unlock()->netdev_run_todo() while the RTNL itself
is already released.
Fixes: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release")
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
net/openvswitch/vport-netdev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 12055af832dc0..a1df551e915bc 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -196,9 +196,13 @@ void ovs_netdev_tunnel_destroy(struct vport *vport)
*/
if (vport->dev->reg_state == NETREG_REGISTERED)
rtnl_delete_link(vport->dev, 0, NULL);
- rtnl_unlock();
+ /* We can't put the device reference yet, since it can still be in
+ * use, but rtnl_unlock()->netdev_run_todo() will block until all
+ * the references are released, so the RCU call must be before it.
+ */
call_rcu(&vport->rcu, vport_netdev_free);
+ rtnl_unlock();
}
EXPORT_SYMBOL_GPL(ovs_netdev_tunnel_destroy);
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH net 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-04-29 15:16 [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
2026-04-29 15:16 ` [PATCH net 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
@ 2026-04-29 15:16 ` Ilya Maximets
2026-04-29 16:08 ` Ilya Maximets
2026-04-29 17:46 ` [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports Eelco Chaudron
2 siblings, 1 reply; 5+ messages in thread
From: Ilya Maximets @ 2026-04-29 15:16 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, Ilya Maximets
There were a few issues found with the tunnel vport types around the
vport destruction code. Add some basic tests, so at least we know that
they can be properly added and removed without obvious issues.
The test creates OVS datapath, adds a non-LWT tunnel port, makes sure
they are created, and then removes the datapath and waits for all the
ports to be gone.
The dpctl script had a few bugs in the none-lwt tunnel creation code,
so fixing them as well to make the testing possible:
- The type of the --lwt option changed in order to properly disable it.
- Removed byte order conversion for the port numbers, as the value
supposed to be in the host order.
- Added missing 'gre' choice for the tunnel type.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
.../selftests/net/openvswitch/openvswitch.sh | 37 +++++++++++++++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 10 ++---
2 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index b327d3061ed53..3cdd953f68132 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -26,6 +26,7 @@ tests="
netlink_checks ovsnl: validate netlink attrs and settings
upcall_interfaces ovs: test the upcall interfaces
tunnel_metadata ovs: test extraction of tunnel metadata
+ tunnel_refcount ovs: test tunnel vport reference cleanup
drop_reason drop: test drop reasons are emitted
psample psample: Sampling packets with psample"
@@ -830,6 +831,42 @@ test_tunnel_metadata() {
return 0
}
+test_tunnel_refcount() {
+ sbxname="test_tunnel_refcount"
+ sbx_add "${sbxname}" || return 1
+
+ ovs_sbx "${sbxname}" ip netns add trefns || return 1
+ on_exit "ovs_sbx ${sbxname} ip netns del trefns"
+
+ for tun_type in gre vxlan geneve; do
+ info "testing ${tun_type} tunnel vport refcount"
+
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ add-dp dp-${tun_type} || return 1
+
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ add-if --no-lwt -t ${tun_type} \
+ dp-${tun_type} ovs-${tun_type}0 || return 1
+
+ ovs_wait ip -netns trefns link show \
+ ovs-${tun_type}0 >/dev/null 2>&1 || return 1
+
+ info "deleting dp - may hang if reference counting is broken"
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ del-dp dp-${tun_type} &
+
+ dev_removed() {
+ ! ip -netns trefns link show "$1" >/dev/null 2>&1
+ }
+ ovs_wait dev_removed dp-${tun_type} || return 1
+ ovs_wait dev_removed ovs-${tun_type}0 || return 1
+ done
+ return 0
+}
+
run_test() {
(
tname="$1"
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 848f61fdcee09..196813f476434 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -2069,7 +2069,7 @@ class OvsVport(GenericNetlinkSocket):
elif vport_type == "internal":
return OvsVport.OVS_VPORT_TYPE_INTERNAL
elif vport_type == "gre":
- return OvsVport.OVS_VPORT_TYPE_INTERNAL
+ return OvsVport.OVS_VPORT_TYPE_GRE
elif vport_type == "vxlan":
return OvsVport.OVS_VPORT_TYPE_VXLAN
elif vport_type == "geneve":
@@ -2131,7 +2131,7 @@ class OvsVport(GenericNetlinkSocket):
if not lwt:
vportopt = OvsVport.ovs_vport_msg.vportopts()
vportopt["attrs"].append(
- ["OVS_TUNNEL_ATTR_DST_PORT", socket.htons(dport)]
+ ["OVS_TUNNEL_ATTR_DST_PORT", dport]
)
msg["attrs"].append(
["OVS_VPORT_ATTR_OPTIONS", vportopt]
@@ -2563,7 +2563,7 @@ def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()):
if vpo:
dpo = vpo.get_attr("OVS_TUNNEL_ATTR_DST_PORT")
if dpo:
- opts += " tnl-dport:%s" % socket.ntohs(dpo)
+ opts += " tnl-dport:%s" % dpo
print(
" port %d: %s (%s%s)"
% (
@@ -2632,7 +2632,7 @@ def main(argv):
"--ptype",
type=str,
default="netdev",
- choices=["netdev", "internal", "geneve", "vxlan"],
+ choices=["netdev", "internal", "gre", "geneve", "vxlan"],
help="Interface type (default netdev)",
)
addifcmd.add_argument(
@@ -2645,7 +2645,7 @@ def main(argv):
addifcmd.add_argument(
"-l",
"--lwt",
- type=bool,
+ action=argparse.BooleanOptionalAction,
default=True,
help="Use LWT infrastructure instead of vport (default true)."
)
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports
2026-04-29 15:16 [PATCH net 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
2026-04-29 15:16 ` [PATCH net 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
2026-04-29 15:16 ` [PATCH net 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
@ 2026-04-29 17:46 ` Eelco Chaudron
2 siblings, 0 replies; 5+ messages in thread
From: Eelco Chaudron @ 2026-04-29 17:46 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Aaron Conole, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest
On 29 Apr 2026, at 17:16, Ilya Maximets wrote:
> Two patches - the fix for the actual bug and the selftest that reproduces it.
>
> I missed the self-deadlock in the original patch that introduced the issue,
> because testing required code modification in the ovs-vswitchd to force it to
> use legacy tunnel ports. I thought I made the change correctly, but apparently
> something went wrong and the tests were run with the standard LWT infra instead.
> The selftest added in this patch set will at least prevent this kind of mistakes
> in the future.
>
> I mentioned, however, that these tunnel vports are legacy and not actually used
> by ovs-vswitchd. RTM_NEWLINK + COLLECT_METADATA is used in conjunction with the
> standard OVS_VPORT_TYPE_NETDEV instead since 2017. The code to use the legacy
> tunnels still exists in ovs-vswitchd however, but only as a fallback for older
> kernels and we're planning to remove it in the next release. I'll be sending an
> RFC to remove support for these legacy tunnel types from the kernel, as they
> serve no real purpose today and only increase the uAPI surface for CVEs, but
> we need to fix the known bugs for stable versions.
Thanks, Ilya, for working on this patch! It looks good to me, assuming you
remove the unused socket library in v2.
For the series:
Acked-by: Eelco Chaudron <echaudro@redhat.com>
^ permalink raw reply [flat|nested] 5+ messages in thread