* [PATCH net] openvswitch: vport: fix race between tunnel creation and linking
@ 2026-04-30 21:32 Ilya Maximets
2026-05-01 8:53 ` Eelco Chaudron
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Ilya Maximets @ 2026-04-30 21:32 UTC (permalink / raw)
To: netdev
Cc: Aaron Conole, Eelco Chaudron, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, dev, linux-kernel,
Ilya Maximets, Yuan Tan, Yifan Wu, Juefei Pu, Xin Liu, Yang Yang
When a tunnel vport is created, it first creates the tunnel device, e.g.,
with geneve_dev_create_fb(), then calls ovs_netdev_link() to take a
reference and link it to the device that represents the openvswitch
datapath.

The device is created under RTNL, but RTNL is then released and
re-acquired to find the device by name. It is technically possible for
the tunnel device to be renamed or deleted within that window while RTNL
is not held, and for some other device to be created in its place. This
causes a non-tunnel device to be referenced in the vport and
tunnel-specific functions to be used on it, e.g. vxlan_get_options(),
which directly casts the private netdev data into a struct vxlan_dev,
causing an invalid memory access:

  BUG: KASAN: slab-use-after-free in vxlan_get_options+0x323/0x3a0
    vxlan_get_options+0x323/0x3a0
    ovs_vport_cmd_new+0x6e3/0xd30

Fix that by taking a reference to the just-created device before
releasing RTNL. This ensures that the device in the vport is always
the one that was just created. The search by name is only needed
for a standard vport-netdev that links pre-existing devices, so that
functionality and the device type checks are moved to netdev_create().

It is also awkward that ovs_netdev_link() takes ownership of the vport
and destroys it on failure. It doesn't know the type of the port it is
dealing with, so we need to pass down an indicator that it's a tunnel,
so that the link can be properly deleted on failure.

It's possible to refactor the logic so that ovs_netdev_link() does
only the linking part and the callers perform the proper destruction,
but that would be much more code for each legacy tunnel port type, so it
is not worth it for the bug fix.

Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
Reported-by: Yuan Tan <tanyuan98@outlook.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Reported-by: Yang Yang <n05ec@lzu.edu.cn>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
net/openvswitch/vport-geneve.c | 5 ++-
net/openvswitch/vport-gre.c | 5 ++-
net/openvswitch/vport-netdev.c | 58 ++++++++++++++++++++--------------
net/openvswitch/vport-netdev.h | 2 +-
net/openvswitch/vport-vxlan.c | 5 ++-
5 files changed, 48 insertions(+), 27 deletions(-)
diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c
index b10e1602c6b14..cb5ea4424ffc8 100644
--- a/net/openvswitch/vport-geneve.c
+++ b/net/openvswitch/vport-geneve.c
@@ -97,6 +97,9 @@ static struct vport *geneve_tnl_create(const struct vport_parms *parms)
goto error;
}
+ vport->dev = dev;
+ netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL);
+
rtnl_unlock();
return vport;
error:
@@ -111,7 +114,7 @@ static struct vport *geneve_create(const struct vport_parms *parms)
if (IS_ERR(vport))
return vport;
- return ovs_netdev_link(vport, parms->name);
+ return ovs_netdev_link(vport, true);
}
static struct vport_ops ovs_geneve_vport_ops = {
diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c
index 4014c9b5eb798..6cb5a697b396a 100644
--- a/net/openvswitch/vport-gre.c
+++ b/net/openvswitch/vport-gre.c
@@ -63,6 +63,9 @@ static struct vport *gre_tnl_create(const struct vport_parms *parms)
return ERR_PTR(err);
}
+ vport->dev = dev;
+ netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL);
+
rtnl_unlock();
return vport;
}
@@ -75,7 +78,7 @@ static struct vport *gre_create(const struct vport_parms *parms)
if (IS_ERR(vport))
return vport;
- return ovs_netdev_link(vport, parms->name);
+ return ovs_netdev_link(vport, true);
}
static struct vport_ops ovs_gre_vport_ops = {
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 12055af832dc0..a92ca8b37f96a 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -73,37 +73,21 @@ static struct net_device *get_dpdev(const struct datapath *dp)
return local->dev;
}
-struct vport *ovs_netdev_link(struct vport *vport, const char *name)
+struct vport *ovs_netdev_link(struct vport *vport, bool tunnel)
{
int err;
- vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), name);
- if (!vport->dev) {
+ if (WARN_ON_ONCE(!vport->dev)) {
err = -ENODEV;
goto error_free_vport;
}
- /* Ensure that the device exists and that the provided
- * name is not one of its aliases.
- */
- if (strcmp(name, ovs_vport_name(vport))) {
- err = -ENODEV;
- goto error_put;
- }
- netdev_tracker_alloc(vport->dev, &vport->dev_tracker, GFP_KERNEL);
- if (vport->dev->flags & IFF_LOOPBACK ||
- (vport->dev->type != ARPHRD_ETHER &&
- vport->dev->type != ARPHRD_NONE) ||
- ovs_is_internal_dev(vport->dev)) {
- err = -EINVAL;
- goto error_put;
- }
rtnl_lock();
err = netdev_master_upper_dev_link(vport->dev,
get_dpdev(vport->dp),
NULL, NULL, NULL);
if (err)
- goto error_unlock;
+ goto error_put_unlock;
err = netdev_rx_handler_register(vport->dev, netdev_frame_hook,
vport);
@@ -119,10 +103,11 @@ struct vport *ovs_netdev_link(struct vport *vport, const char *name)
error_master_upper_dev_unlink:
netdev_upper_dev_unlink(vport->dev, get_dpdev(vport->dp));
-error_unlock:
- rtnl_unlock();
-error_put:
+error_put_unlock:
+ if (tunnel && vport->dev->reg_state == NETREG_REGISTERED)
+ rtnl_delete_link(vport->dev, 0, NULL);
netdev_put(vport->dev, &vport->dev_tracker);
+ rtnl_unlock();
error_free_vport:
ovs_vport_free(vport);
return ERR_PTR(err);
@@ -132,12 +117,39 @@ EXPORT_SYMBOL_GPL(ovs_netdev_link);
static struct vport *netdev_create(const struct vport_parms *parms)
{
struct vport *vport;
+ int err;
vport = ovs_vport_alloc(0, &ovs_netdev_vport_ops, parms);
if (IS_ERR(vport))
return vport;
- return ovs_netdev_link(vport, parms->name);
+ vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), parms->name);
+ if (!vport->dev) {
+ err = -ENODEV;
+ goto error_free_vport;
+ }
+ netdev_tracker_alloc(vport->dev, &vport->dev_tracker, GFP_KERNEL);
+
+ /* Ensure that the provided name is not an alias. */
+ if (strcmp(parms->name, ovs_vport_name(vport))) {
+ err = -ENODEV;
+ goto error_put;
+ }
+
+ if (vport->dev->flags & IFF_LOOPBACK ||
+ (vport->dev->type != ARPHRD_ETHER &&
+ vport->dev->type != ARPHRD_NONE) ||
+ ovs_is_internal_dev(vport->dev)) {
+ err = -EINVAL;
+ goto error_put;
+ }
+
+ return ovs_netdev_link(vport, false);
+error_put:
+ netdev_put(vport->dev, &vport->dev_tracker);
+error_free_vport:
+ ovs_vport_free(vport);
+ return ERR_PTR(err);
}
static void vport_netdev_free(struct rcu_head *rcu)
diff --git a/net/openvswitch/vport-netdev.h b/net/openvswitch/vport-netdev.h
index c5d83a43bfc49..6c0d7366f9862 100644
--- a/net/openvswitch/vport-netdev.h
+++ b/net/openvswitch/vport-netdev.h
@@ -13,7 +13,7 @@
struct vport *ovs_netdev_get_vport(struct net_device *dev);
-struct vport *ovs_netdev_link(struct vport *vport, const char *name);
+struct vport *ovs_netdev_link(struct vport *vport, bool tunnel);
void ovs_netdev_detach_dev(struct vport *);
int __init ovs_netdev_init(void);
diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c
index 0b881b043bcf4..c1b37b50d29e1 100644
--- a/net/openvswitch/vport-vxlan.c
+++ b/net/openvswitch/vport-vxlan.c
@@ -126,6 +126,9 @@ static struct vport *vxlan_tnl_create(const struct vport_parms *parms)
goto error;
}
+ vport->dev = dev;
+ netdev_hold(vport->dev, &vport->dev_tracker, GFP_KERNEL);
+
rtnl_unlock();
return vport;
error:
@@ -140,7 +143,7 @@ static struct vport *vxlan_create(const struct vport_parms *parms)
if (IS_ERR(vport))
return vport;
- return ovs_netdev_link(vport, parms->name);
+ return ovs_netdev_link(vport, true);
}
static struct vport_ops ovs_vxlan_netdev_vport_ops = {
--
2.53.0
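
Editorial note, not part of the patch: the window described in the commit message can be modelled in plain user-space C. The sketch below is only an illustration under stated assumptions. Every name in it (model_dev, model_ns, model_get_by_name(), model_hold() and so on) is hypothetical and is not a kernel API, and there is no real locking; the "window" is simply the point between two calls. It shows that a lookup by name after the window can return a different device, while a pointer held before the window still refers to the device that was created.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_DEVS 4

struct model_dev {
    char name[16];
    int  is_tunnel;  /* 1 if the private data really is tunnel state */
    int  refcnt;     /* stands in for a device reference count */
};

/* A tiny "namespace": a table of registered devices. */
struct model_ns {
    struct model_dev *devs[MODEL_MAX_DEVS];
};

/* Register a device in the first free slot; returns 0 or -1 if full. */
static int model_register(struct model_ns *ns, struct model_dev *dev)
{
    for (size_t i = 0; i < MODEL_MAX_DEVS; i++) {
        if (!ns->devs[i]) {
            ns->devs[i] = dev;
            return 0;
        }
    }
    return -1;
}

/* Remove a device from the table, freeing its name for reuse. */
static void model_unregister(struct model_ns *ns, struct model_dev *dev)
{
    for (size_t i = 0; i < MODEL_MAX_DEVS; i++) {
        if (ns->devs[i] == dev)
            ns->devs[i] = NULL;
    }
}

/* Old flow: find whatever device currently owns the name. */
static struct model_dev *model_get_by_name(struct model_ns *ns,
                                           const char *name)
{
    for (size_t i = 0; i < MODEL_MAX_DEVS; i++) {
        if (ns->devs[i] && strcmp(ns->devs[i]->name, name) == 0) {
            ns->devs[i]->refcnt++;
            return ns->devs[i];
        }
    }
    return NULL;
}

/* New flow: take the reference on the device that was just created. */
static struct model_dev *model_hold(struct model_dev *created)
{
    assert(created != NULL);
    created->refcnt++;
    return created;
}
```

If a tunnel device named "vxlan0" is registered, held with model_hold(), then unregistered and replaced by a non-tunnel device with the same name, model_get_by_name() returns the replacement, whereas the held pointer still refers to the original tunnel device.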
* Re: [PATCH net] openvswitch: vport: fix race between tunnel creation and linking
From: Eelco Chaudron @ 2026-05-01 8:53 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, Aaron Conole, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, dev, linux-kernel,
Yuan Tan, Yifan Wu, Juefei Pu, Xin Liu, Yang Yang
On 30 Apr 2026, at 23:32, Ilya Maximets wrote:
> When a tunnel vport is created it first creates the tunnel device, e.g.,
> with geneve_dev_create_fb(), then it calls ovs_netdev_link() to take a
> reference and link it to the device that represents openvswitch datapath.
>
> The creation of the device is happening under RTNL, but then RTNL is
> released and re-acquired to find the device by name. It is technically
> possible for the tunnel device to be re-named or deleted within that
> window while RTNL is not held, and some other device created in its
> place. This will cause a non-tunnel device to be referenced in the
> vport and tunnel-specific functions used on it, e.g. vxlan_get_options()
> that directly casts the private netdev data into a struct vxlan_dev
> causing an invalid memory access:
>
> BUG: KASAN: slab-use-after-free in vxlan_get_options+0x323/0x3a0
> vxlan_get_options+0x323/0x3a0
> ovs_vport_cmd_new+0x6e3/0xd30
>
> Fix that by taking a reference to the just created device before
> releasing RTNL. This ensures that the device in the vport is always
> the one that was just created. The search by name is only needed
> for a standard vport-netdev that links pre-existing devices, so that
> functionality and device type checks are moved to netdev_create().
>
> It is also awkward that ovs_netdev_link() takes ownership of the vport
> and destroys it on failure. It doesn't know the type of the port it is
> dealing with, so we need to pass down the indicator that it's a tunnel,
> so the link can be properly deleted on failure.
>
> It's possible to refactor the logic to make the ovs_netdev_link() do
> only the linking part and let the callers perform a proper destruction,
> but it will be much more code for each legacy tunnel port type, so it
> is not worth it for the bug fix.
>
> Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
> Reported-by: Yuan Tan <tanyuan98@outlook.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Reported-by: Yang Yang <n05ec@lzu.edu.cn>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Thanks for working on this, Ilya! The changes look good to me.
Acked-by: Eelco Chaudron <echaudro@redhat.com>
* Re: [PATCH net] openvswitch: vport: fix race between tunnel creation and linking
From: Ilya Maximets @ 2026-05-04 11:38 UTC (permalink / raw)
To: netdev
Cc: i.maximets, Aaron Conole, Eelco Chaudron, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, dev,
linux-kernel, Yuan Tan, Yifan Wu, Juefei Pu, Xin Liu, Yang Yang
On 4/30/26 11:32 PM, Ilya Maximets wrote:
> When a tunnel vport is created it first creates the tunnel device, e.g.,
> with geneve_dev_create_fb(), then it calls ovs_netdev_link() to take a
> reference and link it to the device that represents openvswitch datapath.
>
> The creation of the device is happening under RTNL, but then RTNL is
> released and re-acquired to find the device by name. It is technically
> possible for the tunnel device to be re-named or deleted within that
> window while RTNL is not held, and some other device created in its
> place. This will cause a non-tunnel device to be referenced in the
> vport and tunnel-specific functions used on it, e.g. vxlan_get_options()
> that directly casts the private netdev data into a struct vxlan_dev
> causing an invalid memory access:
>
> BUG: KASAN: slab-use-after-free in vxlan_get_options+0x323/0x3a0
> vxlan_get_options+0x323/0x3a0
> ovs_vport_cmd_new+0x6e3/0xd30
>
> Fix that by taking a reference to the just created device before
> releasing RTNL. This ensures that the device in the vport is always
> the one that was just created. The search by name is only needed
> for a standard vport-netdev that links pre-existing devices, so that
> functionality and device type checks are moved to netdev_create().
>
> It is also awkward that ovs_netdev_link() takes ownership of the vport
> and destroys it on failure. It doesn't know the type of the port it is
> dealing with, so we need to pass down the indicator that it's a tunnel,
> so the link can be properly deleted on failure.
>
> It's possible to refactor the logic to make the ovs_netdev_link() do
> only the linking part and let the callers perform a proper destruction,
> but it will be much more code for each legacy tunnel port type, so it
> is not worth it for the bug fix.
>
> Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
> Reported-by: Yuan Tan <tanyuan98@outlook.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Reported-by: Yang Yang <n05ec@lzu.edu.cn>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
> net/openvswitch/vport-geneve.c | 5 ++-
> net/openvswitch/vport-gre.c | 5 ++-
> net/openvswitch/vport-netdev.c | 58 ++++++++++++++++++++--------------
> net/openvswitch/vport-netdev.h | 2 +-
> net/openvswitch/vport-vxlan.c | 5 ++-
> 5 files changed, 48 insertions(+), 27 deletions(-)
>
...
> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> index 12055af832dc0..a92ca8b37f96a 100644
> --- a/net/openvswitch/vport-netdev.c
> +++ b/net/openvswitch/vport-netdev.c
> @@ -73,37 +73,21 @@ static struct net_device *get_dpdev(const struct datapath *dp)
> return local->dev;
> }
>
> -struct vport *ovs_netdev_link(struct vport *vport, const char *name)
> +struct vport *ovs_netdev_link(struct vport *vport, bool tunnel)
> {
> int err;
>
> - vport->dev = dev_get_by_name(ovs_dp_get_net(vport->dp), name);
> - if (!vport->dev) {
> + if (WARN_ON_ONCE(!vport->dev)) {
> err = -ENODEV;
> goto error_free_vport;
> }
> - /* Ensure that the device exists and that the provided
> - * name is not one of its aliases.
> - */
> - if (strcmp(name, ovs_vport_name(vport))) {
> - err = -ENODEV;
> - goto error_put;
> - }
> - netdev_tracker_alloc(vport->dev, &vport->dev_tracker, GFP_KERNEL);
> - if (vport->dev->flags & IFF_LOOPBACK ||
> - (vport->dev->type != ARPHRD_ETHER &&
> - vport->dev->type != ARPHRD_NONE) ||
> - ovs_is_internal_dev(vport->dev)) {
> - err = -EINVAL;
> - goto error_put;
> - }
>
> rtnl_lock();
> err = netdev_master_upper_dev_link(vport->dev,
> get_dpdev(vport->dp),
> NULL, NULL, NULL);
> if (err)
> - goto error_unlock;
> + goto error_put_unlock;
>
> err = netdev_rx_handler_register(vport->dev, netdev_frame_hook,
> vport);
Sashiko-gemini reports that here we could be linking a device that is
already unregistering, since we're not checking the registration status
after re-acquiring the lock. That seems like a real issue; it is related
to, but fairly separate from, what this patch is trying to fix. It is also
not specific to the tunnel ports, so it should be addressed separately.
Best regards, Ilya Maximets.
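
Editorial note, not part of the thread: the kind of check mentioned above could look roughly like the user-space sketch below. It is an assumption about the general shape only. The thread does not say how a follow-up fix would be written, and the names here (model_port_dev, MODEL_REGISTERED, model_link_if_registered()) are hypothetical rather than kernel APIs. The caller is assumed to hold the lock that protects the registration state.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum model_reg_state {
    MODEL_REGISTERED,
    MODEL_UNREGISTERING,
};

struct model_port_dev {
    enum model_reg_state reg_state;
    int linked;  /* 1 once the device is attached to the datapath */
};

/*
 * Link the device only if it is still registered. The caller is
 * assumed to hold the lock protecting reg_state, so the state cannot
 * change between the check and the link.
 */
static int model_link_if_registered(struct model_port_dev *dev)
{
    assert(dev != NULL);

    if (dev->reg_state != MODEL_REGISTERED)
        return -ENODEV;

    dev->linked = 1;
    return 0;
}
```

In this model a registered device is linked and the function returns 0, while a device that is already unregistering is left unlinked and the caller gets -ENODEV.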
* Re: [PATCH net] openvswitch: vport: fix race between tunnel creation and linking
From: patchwork-bot+netdevbpf @ 2026-05-05 13:20 UTC (permalink / raw)
To: Ilya Maximets
Cc: netdev, aconole, echaudro, davem, edumazet, kuba, pabeni, horms,
dev, linux-kernel, tanyuan98, yifanwucs, tomapufckgml, bird,
n05ec
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Thu, 30 Apr 2026 23:32:50 +0200 you wrote:
> When a tunnel vport is created it first creates the tunnel device, e.g.,
> with geneve_dev_create_fb(), then it calls ovs_netdev_link() to take a
> reference and link it to the device that represents openvswitch datapath.
>
> The creation of the device is happening under RTNL, but then RTNL is
> released and re-acquired to find the device by name. It is technically
> possible for the tunnel device to be re-named or deleted within that
> window while RTNL is not held, and some other device created in its
> place. This will cause a non-tunnel device to be referenced in the
> vport and tunnel-specific functions used on it, e.g. vxlan_get_options()
> that directly casts the private netdev data into a struct vxlan_dev
> causing an invalid memory access:
>
> [...]
Here is the summary with links:
- [net] openvswitch: vport: fix race between tunnel creation and linking
https://git.kernel.org/netdev/net/c/83861c48ba12
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html