* [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports
@ 2026-04-30 23:38 Ilya Maximets
2026-04-30 23:38 ` [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Ilya Maximets @ 2026-04-30 23:38 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, Ilya Maximets
Two patches - the fix for the actual bug and the selftest that reproduces it.
I missed the self-deadlock in the original patch that introduced the issue,
because testing required code modification in the ovs-vswitchd to force it to
use legacy tunnel ports. I thought I made the change correctly, but apparently
something went wrong and the tests were run with the standard LWT infra instead.
The selftest added in this patch set will at least prevent this kind of mistake
in the future.
I mentioned, however, that these tunnel vports are legacy and not actually used
by ovs-vswitchd. RTM_NEWLINK + COLLECT_METADATA is used in conjunction with the
standard OVS_VPORT_TYPE_NETDEV instead since 2017. The code to use the legacy
tunnels still exists in ovs-vswitchd however, but only as a fallback for older
kernels and we're planning to remove it in the next release. I'll be sending an
RFC to remove support for these legacy tunnel types from the kernel, as they
serve no real purpose today and only increase the uAPI surface for CVEs, but
we need to fix the known bugs for stable versions.
Version 2:
- Added Ack from Eelco to the first patch (not to the second as it
changed a little).
- Removed now unused import socket in the dpctl.py [pylint/ruff].
- Regarding comments from both Sashiko instances on the selftest patch:
* The background process is not waited for / not killed.
If it hangs it will not be killable anyway, so it's not a problem.
* The 'gre' choice for dpctl.py --ptype is not fully handled for --lwt.
While this is not needed for this patch, I agree that it's not
fully consistent. Added the proper handling in the TUNNEL_DEFAULTS
loop in this version.
* Python version concern for argparse.BooleanOptionalAction.
Python 3.9 is the oldest supported version and it has it, so it's
not an issue. Creating extra detection will only complicate the
script with no real benefits.
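For illustration, the --lwt behavior discussed above can be reproduced outside of the script. This is a minimal sketch, not the dpctl code itself: build_parser() and its "fixed" parameter are hypothetical and exist only to put the old and new option definitions side by side. It assumes Python 3.9 or newer, where argparse.BooleanOptionalAction is available.

```python
import argparse


def build_parser(fixed: bool) -> argparse.ArgumentParser:
    """Hypothetical helper: build a parser with the old or the new --lwt."""
    parser = argparse.ArgumentParser()
    if fixed:
        # New definition: generates both --lwt and --no-lwt flags.
        parser.add_argument("-l", "--lwt",
                            action=argparse.BooleanOptionalAction,
                            default=True)
    else:
        # Old definition: the argument string is passed through bool(),
        # and any non-empty string (including "False") is truthy.
        parser.add_argument("-l", "--lwt", type=bool, default=True)
    return parser


old = build_parser(fixed=False)
new = build_parser(fixed=True)

# With type=bool, "--lwt False" still leaves LWT enabled.
print(old.parse_args(["--lwt", "False"]).lwt)
# With BooleanOptionalAction, --no-lwt disables it.
print(new.parse_args(["--no-lwt"]).lwt)
```

With the old definition, the only way to get a false value was to pass an empty string, which is why the option type had to change before the test could request legacy vports.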
Version 1:
https://lore.kernel.org/netdev/20260429151756.4157670-1-i.maximets@ovn.org/
Ilya Maximets (2):
openvswitch: vport: fix self-deadlock on release of tunnel ports
selftests: openvswitch: add tests for tunnel vport refcounting
net/openvswitch/vport-netdev.c | 6 ++-
.../selftests/net/openvswitch/openvswitch.sh | 37 +++++++++++++++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 19 +++++++---
3 files changed, 55 insertions(+), 7 deletions(-)
--
2.53.0
* [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports
2026-04-30 23:38 [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
@ 2026-04-30 23:38 ` Ilya Maximets
2026-05-04 15:57 ` Aaron Conole
2026-04-30 23:38 ` [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Ilya Maximets @ 2026-04-30 23:38 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, Ilya Maximets,
stable
vports are used concurrently and protected by RCU, so netdev_put()
must happen after the RCU grace period: either in an RCU callback or
after synchronize_net(). The rtnl_delete_link() must happen under
RTNL and so can't be executed in RCU context. Calling synchronize_net()
while holding RTNL is not a good idea for performance and system
stability under load in general, so calling netdev_put() in an RCU
callback is the right solution here.
However, when the device is deleted, rtnl_unlock() will call
netdev_run_todo() and block until all the references are gone. In the
current code this means that we never reach the call_rcu(), so the vport
is never freed and the reference is never released, causing a
self-deadlock on device removal.
Fix that by moving the call_rcu() before the rtnl_unlock(), so the
scheduled RCU callback will be executed when synchronize_net() is
called from rtnl_unlock()->netdev_run_todo() while the RTNL itself
is already released.
Fixes: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release")
Cc: stable@vger.kernel.org
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
net/openvswitch/vport-netdev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 12055af832dc0..a1df551e915bc 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -196,9 +196,13 @@ void ovs_netdev_tunnel_destroy(struct vport *vport)
*/
if (vport->dev->reg_state == NETREG_REGISTERED)
rtnl_delete_link(vport->dev, 0, NULL);
- rtnl_unlock();
+ /* We can't put the device reference yet, since it can still be in
+ * use, but rtnl_unlock()->netdev_run_todo() will block until all
+ * the references are released, so the RCU call must be before it.
+ */
call_rcu(&vport->rcu, vport_netdev_free);
+ rtnl_unlock();
}
EXPORT_SYMBOL_GPL(ovs_netdev_tunnel_destroy);
--
2.53.0
* [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-04-30 23:38 [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
2026-04-30 23:38 ` [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
@ 2026-04-30 23:38 ` Ilya Maximets
2026-05-01 8:56 ` Eelco Chaudron
` (2 more replies)
2026-05-04 11:43 ` [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
2026-05-05 13:30 ` patchwork-bot+netdevbpf
3 siblings, 3 replies; 11+ messages in thread
From: Ilya Maximets @ 2026-04-30 23:38 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, Ilya Maximets
There were a few issues found with the tunnel vport types around the
vport destruction code. Add some basic tests, so at least we know that
they can be properly added and removed without obvious issues.
The test creates an OVS datapath, adds a non-LWT tunnel port, makes sure
they are created, and then removes the datapath and waits for all the
ports to be gone.
The dpctl script had a few bugs in the non-LWT tunnel creation code,
so fix them as well to make the testing possible:
- Changed the type of the --lwt option so that it can actually be disabled.
- Removed byte order conversion for the port numbers, as the value
is supposed to be in host byte order.
- Added the missing 'gre' choice for the tunnel type.
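As a small sketch of why the byte order conversion was a problem: a 16-bit port that is swapped before being written in host order ends up as a different number on a little-endian machine. The helpers below are hypothetical and use the struct module as a stand-in for the netlink attribute encoding; swap16() mirrors what socket.htons() does on a little-endian host, written so that the result does not depend on the machine running it.

```python
import struct

VXLAN_PORT = 4789  # 0x12B5, the default VXLAN port used by the script


def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value (htons() on little-endian)."""
    return struct.unpack("<H", struct.pack(">H", value))[0]


def roundtrip_native(value: int) -> int:
    """Pack and unpack a u16 in native byte order, with no conversion."""
    return struct.unpack("=H", struct.pack("=H", value))[0]


# Passing the port as is: the reader sees the same number.
print(roundtrip_native(VXLAN_PORT))
# Swapping first: a host-order reader would see 0xB512 instead of 0x12B5.
print(swap16(VXLAN_PORT))
```

On a big-endian host socket.htons() is a no-op, so the old code would have happened to work there; passing the value unconverted behaves the same on both.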
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
.../selftests/net/openvswitch/openvswitch.sh | 37 +++++++++++++++++++
.../selftests/net/openvswitch/ovs-dpctl.py | 19 +++++++---
2 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index b327d3061ed53..3cdd953f68132 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -26,6 +26,7 @@ tests="
netlink_checks ovsnl: validate netlink attrs and settings
upcall_interfaces ovs: test the upcall interfaces
tunnel_metadata ovs: test extraction of tunnel metadata
+ tunnel_refcount ovs: test tunnel vport reference cleanup
drop_reason drop: test drop reasons are emitted
psample psample: Sampling packets with psample"
@@ -830,6 +831,42 @@ test_tunnel_metadata() {
return 0
}
+test_tunnel_refcount() {
+ sbxname="test_tunnel_refcount"
+ sbx_add "${sbxname}" || return 1
+
+ ovs_sbx "${sbxname}" ip netns add trefns || return 1
+ on_exit "ovs_sbx ${sbxname} ip netns del trefns"
+
+ for tun_type in gre vxlan geneve; do
+ info "testing ${tun_type} tunnel vport refcount"
+
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ add-dp dp-${tun_type} || return 1
+
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ add-if --no-lwt -t ${tun_type} \
+ dp-${tun_type} ovs-${tun_type}0 || return 1
+
+ ovs_wait ip -netns trefns link show \
+ ovs-${tun_type}0 >/dev/null 2>&1 || return 1
+
+ info "deleting dp - may hang if reference counting is broken"
+ ovs_sbx "${sbxname}" ip netns exec trefns \
+ python3 $ovs_base/ovs-dpctl.py \
+ del-dp dp-${tun_type} &
+
+ dev_removed() {
+ ! ip -netns trefns link show "$1" >/dev/null 2>&1
+ }
+ ovs_wait dev_removed dp-${tun_type} || return 1
+ ovs_wait dev_removed ovs-${tun_type}0 || return 1
+ done
+ return 0
+}
+
run_test() {
(
tname="$1"
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 848f61fdcee09..bbe35e2718d26 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -11,7 +11,6 @@ import logging
import math
import multiprocessing
import re
-import socket
import struct
import sys
import time
@@ -2069,7 +2068,7 @@ class OvsVport(GenericNetlinkSocket):
elif vport_type == "internal":
return OvsVport.OVS_VPORT_TYPE_INTERNAL
elif vport_type == "gre":
- return OvsVport.OVS_VPORT_TYPE_INTERNAL
+ return OvsVport.OVS_VPORT_TYPE_GRE
elif vport_type == "vxlan":
return OvsVport.OVS_VPORT_TYPE_VXLAN
elif vport_type == "geneve":
@@ -2121,6 +2120,7 @@ class OvsVport(GenericNetlinkSocket):
)
TUNNEL_DEFAULTS = [("geneve", 6081),
+ ("gre", 0),
("vxlan", 4789)]
for tnl in TUNNEL_DEFAULTS:
@@ -2129,9 +2129,13 @@ class OvsVport(GenericNetlinkSocket):
dport = tnl[1]
if not lwt:
+ if tnl[0] == "gre":
+ # GRE tunnels have no options.
+ break
+
vportopt = OvsVport.ovs_vport_msg.vportopts()
vportopt["attrs"].append(
- ["OVS_TUNNEL_ATTR_DST_PORT", socket.htons(dport)]
+ ["OVS_TUNNEL_ATTR_DST_PORT", dport]
)
msg["attrs"].append(
["OVS_VPORT_ATTR_OPTIONS", vportopt]
@@ -2145,6 +2149,9 @@ class OvsVport(GenericNetlinkSocket):
geneve_port=dport,
geneve_collect_metadata=True,
geneve_udp_zero_csum6_rx=1)
+ elif tnl[0] == "gre":
+ ipr.link("add", ifname=vport_ifname, kind="gretap",
+ gre_collect_metadata=True)
elif tnl[0] == "vxlan":
ipr.link("add", ifname=vport_ifname, kind=tnl[0],
vxlan_learning=0, vxlan_collect_metadata=1,
@@ -2563,7 +2570,7 @@ def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()):
if vpo:
dpo = vpo.get_attr("OVS_TUNNEL_ATTR_DST_PORT")
if dpo:
- opts += " tnl-dport:%s" % socket.ntohs(dpo)
+ opts += " tnl-dport:%s" % dpo
print(
" port %d: %s (%s%s)"
% (
@@ -2632,7 +2639,7 @@ def main(argv):
"--ptype",
type=str,
default="netdev",
- choices=["netdev", "internal", "geneve", "vxlan"],
+ choices=["netdev", "internal", "gre", "geneve", "vxlan"],
help="Interface type (default netdev)",
)
addifcmd.add_argument(
@@ -2645,7 +2652,7 @@ def main(argv):
addifcmd.add_argument(
"-l",
"--lwt",
- type=bool,
+ action=argparse.BooleanOptionalAction,
default=True,
help="Use LWT infrastructure instead of vport (default true)."
)
--
2.53.0
* Re: [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-04-30 23:38 ` [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
@ 2026-05-01 8:56 ` Eelco Chaudron
2026-05-04 15:57 ` Aaron Conole
2026-05-05 13:25 ` Paolo Abeni
2 siblings, 0 replies; 11+ messages in thread
From: Eelco Chaudron @ 2026-05-01 8:56 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Aaron Conole, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest
On 1 May 2026, at 1:38, Ilya Maximets wrote:
> There were a few issues found with the tunnel vport types around the
> vport destruction code. Add some basic tests, so at least we know that
> they can be properly added and removed without obvious issues.
>
> The test creates OVS datapath, adds a non-LWT tunnel port, makes sure
> they are created, and then removes the datapath and waits for all the
> ports to be gone.
>
> The dpctl script had a few bugs in the none-lwt tunnel creation code,
> so fixing them as well to make the testing possible:
> - The type of the --lwt option changed in order to properly disable it.
> - Removed byte order conversion for the port numbers, as the value
> supposed to be in the host order.
> - Added missing 'gre' choice for the tunnel type.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Changes on the v2 look good to me!
Acked-by: Eelco Chaudron <echaudro@redhat.com>
* Re: [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports
2026-04-30 23:38 [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
2026-04-30 23:38 ` [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
2026-04-30 23:38 ` [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
@ 2026-05-04 11:43 ` Ilya Maximets
2026-05-04 20:24 ` Aaron Conole
2026-05-05 13:30 ` patchwork-bot+netdevbpf
3 siblings, 1 reply; 11+ messages in thread
From: Ilya Maximets @ 2026-05-04 11:43 UTC (permalink / raw)
To: netdev
Cc: i.maximets, Aaron Conole, Eelco Chaudron, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Shuah Khan, Yuan Tan, Yang Yang, dev, linux-kernel,
linux-kselftest
On 5/1/26 1:38 AM, Ilya Maximets wrote:
> Two patches - the fix for the actual bug and the selftest that reproduces it.
>
> I missed the self-deadlock in the original patch that introduced the issue,
> because testing required code modification in the ovs-vswitchd to force it to
> use legacy tunnel ports. I thought I made the change correctly, but apparently
> something went wrong and the tests were run with the standard LWT infra instead.
> The selftest added in this patch set will at least prevent this kind of mistakes
> in the future.
>
> I mentioned, however, that these tunnel vports are legacy and not actually used
> by ovs-vswitchd. RTM_NEWLINK + COLLECT_METADATA is used in conjunction with the
> standard OVS_VPORT_TYPE_NETDEV instead since 2017. The code to use the legacy
> tunnels still exists in ovs-vswitchd however, but only as a fallback for older
> kernels and we're planning to remove it in the next release. I'll be sending an
> RFC to remove support for these legacy tunnel types from the kernel, as they
> serve no real purpose today and only increase the uAPI surface for CVEs, but
> we need to fix the known bugs for stable versions.
>
>
> Version 2:
> - Added Ack from Eelco to the first patch (not to the second as it
> changed a little).
> - Removed now unused import socket in the dpctl.py [pylint/ruff].
>
> - Regarding comments from both Sashiko instances on the selftest patch:
>
> * The background process is not waited for / not killed.
> If it hangs it will not be killable anyway, so it's not a problem.
Both sashiko instances still flag this. Looks like the cover letter is not
included in the prompt.
If someone thinks I should add the suggested kill on exit, I can, but it will
not be effective in case the process hangs.
Best regards, Ilya Maximets.
* Re: [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-04-30 23:38 ` [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
2026-05-01 8:56 ` Eelco Chaudron
@ 2026-05-04 15:57 ` Aaron Conole
2026-05-05 13:25 ` Paolo Abeni
2 siblings, 0 replies; 11+ messages in thread
From: Aaron Conole @ 2026-05-04 15:57 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest
Ilya Maximets <i.maximets@ovn.org> writes:
> There were a few issues found with the tunnel vport types around the
> vport destruction code. Add some basic tests, so at least we know that
> they can be properly added and removed without obvious issues.
>
> The test creates OVS datapath, adds a non-LWT tunnel port, makes sure
> they are created, and then removes the datapath and waits for all the
> ports to be gone.
>
> The dpctl script had a few bugs in the none-lwt tunnel creation code,
> so fixing them as well to make the testing possible:
> - The type of the --lwt option changed in order to properly disable it.
> - Removed byte order conversion for the port numbers, as the value
> supposed to be in the host order.
> - Added missing 'gre' choice for the tunnel type.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
Looks good to me. Thanks for the test.
Acked-by: Aaron Conole <aconole@redhat.com>
* Re: [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports
2026-04-30 23:38 ` [PATCH net v2 1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports Ilya Maximets
@ 2026-05-04 15:57 ` Aaron Conole
0 siblings, 0 replies; 11+ messages in thread
From: Aaron Conole @ 2026-05-04 15:57 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest, stable
Ilya Maximets <i.maximets@ovn.org> writes:
> vports are used concurrently and protected by RCU, so netdev_put()
> must happen after the RCU grace period. So, either in an RCU call or
> after the synchronize_net(). The rtnl_delete_link() must happen under
> RTNL and so can't be executed in RCU context. Calling synchronize_net()
> while holding RTNL is not a good idea for performance and system
> stability under load in general, so calling netdev_put() in RCU call
> is the right solution here.
>
> However,
> when the device is deleted, rtnl_unlock() will call netdev_run_todo()
> and block until all the references are gone. In the current code this
> means that we never reach the call_rcu() and the vport is never freed
> and the reference is never released, causing a self-deadlock on device
> removal.
>
> Fix that by moving the call_rcu() before the rtnl_unlock(), so the
> scheduled RCU callback will be executed when synchronize_net() is
> called from the rtnl_unlock()->netdev_run_todo() while the RTNL itself
> is already released.
>
> Fixes: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release")
> Cc: stable@vger.kernel.org
> Acked-by: Eelco Chaudron <echaudro@redhat.com>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
Acked-by: Aaron Conole <aconole@redhat.com>
* Re: [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports
2026-05-04 11:43 ` [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
@ 2026-05-04 20:24 ` Aaron Conole
0 siblings, 0 replies; 11+ messages in thread
From: Aaron Conole @ 2026-05-04 20:24 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest
Ilya Maximets <i.maximets@ovn.org> writes:
> On 5/1/26 1:38 AM, Ilya Maximets wrote:
>> Two patches - the fix for the actual bug and the selftest that reproduces it.
>>
>> I missed the self-deadlock in the original patch that introduced the issue,
>> because testing required code modification in the ovs-vswitchd to force it to
>> use legacy tunnel ports. I thought I made the change correctly, but apparently
>> something went wrong and the tests were run with the standard LWT infra instead.
>> The selftest added in this patch set will at least prevent this kind of mistakes
>> in the future.
>>
>> I mentioned, however, that these tunnel vports are legacy and not actually used
>> by ovs-vswitchd. RTM_NEWLINK + COLLECT_METADATA is used in conjunction with the
>> standard OVS_VPORT_TYPE_NETDEV instead since 2017. The code to use the legacy
>> tunnels still exists in ovs-vswitchd however, but only as a fallback for older
>> kernels and we're planning to remove it in the next release. I'll be sending an
>> RFC to remove support for these legacy tunnel types from the kernel, as they
>> serve no real purpose today and only increase the uAPI surface for CVEs, but
>> we need to fix the known bugs for stable versions.
>>
>>
>> Version 2:
>> - Added Ack from Eelco to the first patch (not to the second as it
>> changed a little).
>> - Removed now unused import socket in the dpctl.py [pylint/ruff].
>>
>> - Regarding comments from both Sashiko instances on the selftest patch:
>>
>> * The background process is not waited for / not killed.
>> If it hangs it will not be killable anyway, so it's not a problem.
>
> Both sashiko instances still flag this. Looks like the cover letter is not
> included in the prompt.
>
> If someone thinks I should add the suggested kill on exit, I can, but it will
> not be effective in case the process hangs.
One option is to put a comment in the test itself documenting this kind
of behavior. At least, then the model might not flag it. I don't feel
strongly about that, however.
> Best regards, Ilya Maximets.
* Re: [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-04-30 23:38 ` [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting Ilya Maximets
2026-05-01 8:56 ` Eelco Chaudron
2026-05-04 15:57 ` Aaron Conole
@ 2026-05-05 13:25 ` Paolo Abeni
2026-05-05 13:28 ` Ilya Maximets
2 siblings, 1 reply; 11+ messages in thread
From: Paolo Abeni @ 2026-05-05 13:25 UTC (permalink / raw)
To: Ilya Maximets, netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Simon Horman, Shuah Khan, Yuan Tan, Yang Yang,
dev, linux-kernel, linux-kselftest
On 5/1/26 1:38 AM, Ilya Maximets wrote:
> @@ -830,6 +831,42 @@ test_tunnel_metadata() {
> return 0
> }
>
> +test_tunnel_refcount() {
> + sbxname="test_tunnel_refcount"
> + sbx_add "${sbxname}" || return 1
> +
> + ovs_sbx "${sbxname}" ip netns add trefns || return 1
> + on_exit "ovs_sbx ${sbxname} ip netns del trefns"
> +
> + for tun_type in gre vxlan geneve; do
> + info "testing ${tun_type} tunnel vport refcount"
> +
> + ovs_sbx "${sbxname}" ip netns exec trefns \
> + python3 $ovs_base/ovs-dpctl.py \
> + add-dp dp-${tun_type} || return 1
> +
> + ovs_sbx "${sbxname}" ip netns exec trefns \
> + python3 $ovs_base/ovs-dpctl.py \
> + add-if --no-lwt -t ${tun_type} \
> + dp-${tun_type} ovs-${tun_type}0 || return 1
> +
> + ovs_wait ip -netns trefns link show \
> + ovs-${tun_type}0 >/dev/null 2>&1 || return 1
> +
> + info "deleting dp - may hang if reference counting is broken"
> + ovs_sbx "${sbxname}" ip netns exec trefns \
> + python3 $ovs_base/ovs-dpctl.py \
> + del-dp dp-${tun_type} &
> +
> + dev_removed() {
> + ! ip -netns trefns link show "$1" >/dev/null 2>&1
> + }
> + ovs_wait dev_removed dp-${tun_type} || return 1
> + ovs_wait dev_removed ovs-${tun_type}0 || return 1
FTR, here sashiko laments that if the reference counting is broken and
the del-dp process hangs, this could leave the background del-dp python
process running indefinitely.
I guess that if reference counting is broken inside the kernel, very
likely a host/VM reboot is needed, and the above does not matter.
/P
* Re: [PATCH net v2 2/2] selftests: openvswitch: add tests for tunnel vport refcounting
2026-05-05 13:25 ` Paolo Abeni
@ 2026-05-05 13:28 ` Ilya Maximets
0 siblings, 0 replies; 11+ messages in thread
From: Ilya Maximets @ 2026-05-05 13:28 UTC (permalink / raw)
To: Paolo Abeni, netdev
Cc: i.maximets, Aaron Conole, Eelco Chaudron, David S. Miller,
Eric Dumazet, Jakub Kicinski, Simon Horman, Shuah Khan, Yuan Tan,
Yang Yang, dev, linux-kernel, linux-kselftest
On 5/5/26 3:25 PM, Paolo Abeni wrote:
> On 5/1/26 1:38 AM, Ilya Maximets wrote:
>> @@ -830,6 +831,42 @@ test_tunnel_metadata() {
>> return 0
>> }
>>
>> +test_tunnel_refcount() {
>> + sbxname="test_tunnel_refcount"
>> + sbx_add "${sbxname}" || return 1
>> +
>> + ovs_sbx "${sbxname}" ip netns add trefns || return 1
>> + on_exit "ovs_sbx ${sbxname} ip netns del trefns"
>> +
>> + for tun_type in gre vxlan geneve; do
>> + info "testing ${tun_type} tunnel vport refcount"
>> +
>> + ovs_sbx "${sbxname}" ip netns exec trefns \
>> + python3 $ovs_base/ovs-dpctl.py \
>> + add-dp dp-${tun_type} || return 1
>> +
>> + ovs_sbx "${sbxname}" ip netns exec trefns \
>> + python3 $ovs_base/ovs-dpctl.py \
>> + add-if --no-lwt -t ${tun_type} \
>> + dp-${tun_type} ovs-${tun_type}0 || return 1
>> +
>> + ovs_wait ip -netns trefns link show \
>> + ovs-${tun_type}0 >/dev/null 2>&1 || return 1
>> +
>> + info "deleting dp - may hang if reference counting is broken"
>> + ovs_sbx "${sbxname}" ip netns exec trefns \
>> + python3 $ovs_base/ovs-dpctl.py \
>> + del-dp dp-${tun_type} &
>> +
>> + dev_removed() {
>> + ! ip -netns trefns link show "$1" >/dev/null 2>&1
>> + }
>> + ovs_wait dev_removed dp-${tun_type} || return 1
>> + ovs_wait dev_removed ovs-${tun_type}0 || return 1
>
> FTR, here sashiko laments that if the reference counting is broken and
> the del-dp process hangs, this could leave the background del-dp python
> process running indefinitely.
>
> I guess that if reference counting is broken inside the kernel, very
> likely an host/VM reboot is needed, and the above does not matter.
Yes, I have a note about that in the cover letter.
* Re: [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports
2026-04-30 23:38 [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
` (2 preceding siblings ...)
2026-05-04 11:43 ` [PATCH net v2 0/2] openvswitch: fix self-deadlock on release of tunnel vports Ilya Maximets
@ 2026-05-05 13:30 ` patchwork-bot+netdevbpf
3 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-05 13:30 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, aconole, echaudro, davem, edumazet, kuba, pabeni, horms,
shuah, tanyuan98, n05ec, dev, linux-kernel, linux-kselftest
Hello:
This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 1 May 2026 01:38:36 +0200 you wrote:
> Two patches - the fix for the actual bug and the selftest that reproduces it.
>
> I missed the self-deadlock in the original patch that introduced the issue,
> because testing required code modification in the ovs-vswitchd to force it to
> use legacy tunnel ports. I thought I made the change correctly, but apparently
> something went wrong and the tests were run with the standard LWT infra instead.
> The selftest added in this patch set will at least prevent this kind of mistakes
> in the future.
>
> [...]
Here is the summary with links:
- [net,v2,1/2] openvswitch: vport: fix self-deadlock on release of tunnel ports
https://git.kernel.org/netdev/net/c/aa69918bd418
- [net,v2,2/2] selftests: openvswitch: add tests for tunnel vport refcounting
https://git.kernel.org/netdev/net/c/05416ada37aa
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html