* [PATCH net 0/1] pull request: fixes for ovpn 2026-03-20
From: Antonio Quartulli @ 2026-03-20 10:03 UTC
To: netdev
Cc: Hyunwoo Kim, Antonio Quartulli, Sabrina Dubroca, Jakub Kicinski,
Paolo Abeni, Andrew Lunn, David S. Miller, Eric Dumazet
Hello there!
This PR includes just one fix sparked by Hyunwoo's report (and fix
attempt).
Hyunwoo found out that adding a new peer while the ovpn interface is
being destroyed may lead to an interesting race condition.
The proposed fix is based on Sabrina's suggestion.
Please pull or let me know of any issue!
Thank you,
Antonio
The following changes since commit e7577a06ae28287ca415aec5c12277e3a80ee372:
Merge tag 'nf-26-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf (2026-03-19 15:39:33 +0100)
are available in the Git repository at:
https://github.com/OpenVPN/ovpn-net-next.git tags/ovpn-net-20260320
for you to fetch changes up to 9d0bdffbb5d06fa2aaac59a479d9d288a4a1c7e7:
ovpn: fix race between deleting interface and adding new peer (2026-03-20 10:50:47 +0100)
----------------------------------------------------------------
Included change:
* fix race condition between interface teardown and new peer being
added via netlink
----------------------------------------------------------------
Antonio Quartulli (1):
ovpn: fix race between deleting interface and adding new peer
drivers/net/ovpn/main.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
* [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Antonio Quartulli @ 2026-03-20 10:03 UTC
To: netdev
Cc: Hyunwoo Kim, Antonio Quartulli, Sabrina Dubroca, Jakub Kicinski,
Paolo Abeni, Andrew Lunn, David S. Miller, Eric Dumazet
While deleting an existing ovpn interface, there is a very
narrow window where adding a new peer via netlink may cause
the netdevice to hang and prevent its unregistration.
It may happen during ovpn_dellink(), when all existing peers are
freed and the device is queued for deregistration, but a
CMD_PEER_NEW message comes in adding a new peer that takes again
a reference to the netdev.
At this point there is no way to release the device because we are
under the assumption that all peers were already released.
Fix the race condition by releasing all peers in ndo_uninit(),
when the netdevice has already been removed from the netdev
list and thus an incoming CMD_PEER_NEW cannot have any effect
anymore.
At this point ovpn_dellink() becomes empty and can just be
removed.
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Closes: https://lore.kernel.org/netdev/aaVgJ16edTfQkYbx@v4bel/
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 80747caef33d ("ovpn: introduce the ovpn_peer object")
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
drivers/net/ovpn/main.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
index 2e0420febda0..0eab305780c7 100644
--- a/drivers/net/ovpn/main.c
+++ b/drivers/net/ovpn/main.c
@@ -92,6 +92,8 @@ static void ovpn_net_uninit(struct net_device *dev)
 {
 	struct ovpn_priv *ovpn = netdev_priv(dev);
 
+	cancel_delayed_work_sync(&ovpn->keepalive_work);
+	ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
 	gro_cells_destroy(&ovpn->gro_cells);
 }
 
@@ -208,15 +210,6 @@ static int ovpn_newlink(struct net_device *dev,
 	return register_netdevice(dev);
 }
 
-static void ovpn_dellink(struct net_device *dev, struct list_head *head)
-{
-	struct ovpn_priv *ovpn = netdev_priv(dev);
-
-	cancel_delayed_work_sync(&ovpn->keepalive_work);
-	ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
-	unregister_netdevice_queue(dev, head);
-}
-
 static int ovpn_fill_info(struct sk_buff *skb, const struct net_device *dev)
 {
 	struct ovpn_priv *ovpn = netdev_priv(dev);
@@ -235,7 +228,6 @@ static struct rtnl_link_ops ovpn_link_ops = {
 	.policy = ovpn_policy,
 	.maxtype = IFLA_OVPN_MAX,
 	.newlink = ovpn_newlink,
-	.dellink = ovpn_dellink,
 	.fill_info = ovpn_fill_info,
 };
 
--
2.52.0
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Jakub Kicinski @ 2026-03-24 1:43 UTC
To: Antonio Quartulli
Cc: netdev, Hyunwoo Kim, Sabrina Dubroca, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
On Fri, 20 Mar 2026 11:03:51 +0100 Antonio Quartulli wrote:
> While deleting an existing ovpn interface, there is a very
> narrow window where adding a new peer via netlink may cause
> the netdevice to hang and prevent its unregistration.
>
> It may happen during ovpn_dellink(), when all existing peers are
> freed and the device is queued for deregistration, but a
> CMD_PEER_NEW message comes in adding a new peer that takes again
> a reference to the netdev.
>
> At this point there is no way to release the device because we are
> under the assumption that all peers were already released.
>
> Fix the race condition by releasing all peers in ndo_uninit(),
> when the netdevice has already been removed from the netdev
> list and thus an incoming CMD_PEER_NEW cannot have any effect
> anymore.
>
> At this point ovpn_dellink() becomes empty and can just be
> removed.
This looks like a step in the right direction but AI points out that
it's not enough:
Does this completely resolve the race condition?
If a CMD_PEER_NEW netlink message executes concurrently, could the
following sequence occur since ovpn_nl_family uses parallel_ops:
1. In ovpn_nl_pre_doit(), the netlink thread looks up the device via
dev_get_by_index_rcu(), increments its reference count via netdev_hold(),
and drops the RCU lock.
2. Concurrently, device unregistration unlists the device and calls
synchronize_net(). Since the RCU lock in ovpn_nl_pre_doit() was dropped,
unregistration proceeds without waiting for the netlink command.
3. Unregistration executes ndo_uninit() (ovpn_net_uninit()), which calls
cancel_delayed_work_sync() and ovpn_peers_free(), emptying the interface.
4. The preempted CMD_PEER_NEW thread resumes and adds the new peer via
ovpn_peer_add(). Because the device registration state isn't verified
(e.g., checking if dev->reg_state == NETREG_REGISTERED), the peer is
added to the cleared hash tables.
If this sequence happens, wouldn't it cause a permanent hang for UDP sockets?
ovpn_socket_new() acquires a permanent netdev reference. Since
ovpn_peers_free() already ran, this peer is never removed, causing
netdev_wait_allrefs() to hang the kernel indefinitely.
Additionally, for TCP sockets, the keepalive timer would be re-armed
after being canceled here, leading to a use-after-free when the timer
eventually fires on the freed device memory.
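Editorial sketch: the four-step interleaving above can be replayed in a small, single-threaded userspace model. It is not kernel code; every name in it (`model_dev`, `model_pre_doit()`, and so on) is hypothetical and only stands in for the device refcount and the peer list. It assumes, as the thread states, that each peer holds one reference on the netdev.

```c
#include <assert.h>

/* Toy stand-ins for the kernel objects; none of these names are real APIs. */
enum model_reg_state { MODEL_REGISTERED, MODEL_UNREGISTERING };

struct model_dev {
	enum model_reg_state reg_state;
	int refcnt;	/* references currently held on the device */
	int npeers;	/* peers currently linked to the device */
};

/* Step 1: the netlink handler looks the device up and pins it. */
static void model_pre_doit(struct model_dev *dev)
{
	dev->refcnt++;
}

/*
 * Steps 2-3: unregistration marks the device as going away, then the
 * uninit hook flushes all peers. Each peer held one device reference.
 */
static void model_unregister(struct model_dev *dev)
{
	dev->reg_state = MODEL_UNREGISTERING;
	dev->refcnt -= dev->npeers;
	dev->npeers = 0;
}

/* Step 4: the resumed handler adds a peer without checking reg_state. */
static void model_peer_add_unchecked(struct model_dev *dev)
{
	dev->npeers++;
	dev->refcnt++;
}

/* The handler drops the reference it took in step 1. */
static void model_post_doit(struct model_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/* Replays the interleaving and returns the number of leaked references. */
static int model_run_race(void)
{
	struct model_dev dev = { MODEL_REGISTERED, 0, 0 };

	model_pre_doit(&dev);
	model_unregister(&dev);
	model_peer_add_unchecked(&dev);
	model_post_doit(&dev);
	return dev.refcnt;
}
```

In this model one reference is left over after the handler returns, because the flush ran before the peer was linked and nothing runs afterwards to remove it.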
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Jakub Kicinski @ 2026-03-24 1:45 UTC
To: Antonio Quartulli
Cc: netdev, Hyunwoo Kim, Sabrina Dubroca, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
On Mon, 23 Mar 2026 18:43:04 -0700 Jakub Kicinski wrote:
> On Fri, 20 Mar 2026 11:03:51 +0100 Antonio Quartulli wrote:
> > While deleting an existing ovpn interface, there is a very
> > narrow window where adding a new peer via netlink may cause
> > the netdevice to hang and prevent its unregistration.
> >
> > It may happen during ovpn_dellink(), when all existing peers are
> > freed and the device is queued for deregistration, but a
> > CMD_PEER_NEW message comes in adding a new peer that takes again
> > a reference to the netdev.
> >
> > At this point there is no way to release the device because we are
> > under the assumption that all peers were already released.
> >
> > Fix the race condition by releasing all peers in ndo_uninit(),
> > when the netdevice has already been removed from the netdev
> > list and thus an incoming CMD_PEER_NEW cannot have any effect
> > anymore.
> >
> > At this point ovpn_dellink() becomes empty and can just be
> > removed.
>
> This looks like a step in the right direction but AI points out that
> it's not enough:
On second thought I wonder if the fix will not be to move the flush
even later. So please fix the AI-reported issue in the same submission.
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Sabrina Dubroca @ 2026-03-24 10:09 UTC
To: Jakub Kicinski
Cc: Antonio Quartulli, netdev, Hyunwoo Kim, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
2026-03-23, 18:45:43 -0700, Jakub Kicinski wrote:
> On Mon, 23 Mar 2026 18:43:04 -0700 Jakub Kicinski wrote:
> > On Fri, 20 Mar 2026 11:03:51 +0100 Antonio Quartulli wrote:
> > > While deleting an existing ovpn interface, there is a very
> > > narrow window where adding a new peer via netlink may cause
> > > the netdevice to hang and prevent its unregistration.
> > >
> > > It may happen during ovpn_dellink(), when all existing peers are
> > > freed and the device is queued for deregistration, but a
> > > CMD_PEER_NEW message comes in adding a new peer that takes again
> > > a reference to the netdev.
> > >
> > > At this point there is no way to release the device because we are
> > > under the assumption that all peers were already released.
> > >
> > > Fix the race condition by releasing all peers in ndo_uninit(),
> > > when the netdevice has already been removed from the netdev
> > > list and thus an incoming CMD_PEER_NEW cannot have any effect
> > > anymore.
> > >
> > > At this point ovpn_dellink() becomes empty and can just be
> > > removed.
> >
> > This looks like a step in the right direction but AI points out that
> > it's not enough:
>
> On second thought I wonder if the fix will not be to move the flush
> even later. So please fix the AI-reported issue in the same submission.
But if we move it later (priv_destructor? that's the only driver CB
left), netdev_wait_allrefs_any won't be happy.
An idea, on top of this patch:
-------- 8< --------
diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c
index c7f382437630..ebec8c2ff427 100644
--- a/drivers/net/ovpn/netlink.c
+++ b/drivers/net/ovpn/netlink.c
@@ -90,8 +90,11 @@ void ovpn_nl_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
 	netdevice_tracker *tracker = (netdevice_tracker *)&info->user_ptr[1];
 	struct ovpn_priv *ovpn = info->user_ptr[0];
 
-	if (ovpn)
+	if (ovpn) {
+		if (READ_ONCE(ovpn->dev->reg_state) >= NETREG_UNREGISTERING)
+			ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
 		netdev_put(ovpn->dev, tracker);
+	}
 }
 
 static bool ovpn_nl_attr_sockaddr_remote(struct nlattr **attrs,
-------- 8< --------
This would clean up a peer that may have been added while we were
starting device unregistration. We hold a reference on the device so
no UAF possible, netdev_wait_allrefs_any will wait for this. If we
don't have a racing peer creation, ndo_uninit takes care of the peers.
Or we can call ovpn_peers_free on every NETDEV_UNREGISTER notification
that netdev_wait_allrefs_any sends us (but then we don't need it in
ndo_uninit).
And s/cancel_delayed_work_sync/disable_delayed_work_sync/ for the
keepalive_work.
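Editorial sketch of why the substitution matters: a cancelled delayed work item can be queued again afterwards, while a disabled one rejects later queue attempts until it is re-enabled. The toy model below only illustrates that difference; `toy_dwork` and the `toy_*()` helpers are hypothetical names, not the kernel workqueue API, and the "sync" part (waiting for a running callback) is not modelled.

```c
#include <assert.h>

/* Toy delayed-work item; hypothetical, not the kernel's struct delayed_work. */
struct toy_dwork {
	int pending;		/* 1 while the work is queued */
	int disable_depth;	/* > 0 means queueing is refused */
};

/* Try to queue the work; returns 1 if it was queued, 0 otherwise. */
static int toy_queue(struct toy_dwork *w)
{
	if (w->disable_depth > 0 || w->pending)
		return 0;
	w->pending = 1;
	return 1;
}

/* Models "cancel": drops a pending instance but allows re-queueing. */
static void toy_cancel_sync(struct toy_dwork *w)
{
	w->pending = 0;
}

/* Models "disable": drops a pending instance and blocks re-queueing. */
static void toy_disable_sync(struct toy_dwork *w)
{
	w->disable_depth++;
	w->pending = 0;
}
```

With the cancel variant, a peer added after teardown could re-arm the keepalive work; with the disable variant the same attempt would fail in this model.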
LLM claims it's because of parallel_ops, I don't think this is
related? It also claims this issue is only for UDP sockets (and TCP
would see a UAF on the keepalive), but ovpn_peer_new always holds the
ovpn netdev, so I don't think there's a difference there.
@Antonio: btw, I've always been a bit unsure about the "if (schedule)
hold(peer)" in ovpn_peer_keepalive_work_single, all those races make
me worry again that we could schedule while the refcount drops to 0.
--
Sabrina
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Jakub Kicinski @ 2026-03-24 21:30 UTC
To: Sabrina Dubroca
Cc: Antonio Quartulli, netdev, Hyunwoo Kim, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
On Tue, 24 Mar 2026 11:09:11 +0100 Sabrina Dubroca wrote:
> -------- 8< --------
> diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c
> index c7f382437630..ebec8c2ff427 100644
> --- a/drivers/net/ovpn/netlink.c
> +++ b/drivers/net/ovpn/netlink.c
> @@ -90,8 +90,11 @@ void ovpn_nl_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
> netdevice_tracker *tracker = (netdevice_tracker *)&info->user_ptr[1];
> struct ovpn_priv *ovpn = info->user_ptr[0];
>
> - if (ovpn)
> + if (ovpn) {
> + if (READ_ONCE(ovpn->dev->reg_state) >= NETREG_UNREGISTERING)
> + ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
> netdev_put(ovpn->dev, tracker);
> + }
> }
>
> static bool ovpn_nl_attr_sockaddr_remote(struct nlattr **attrs,
> -------- 8< --------
>
> This would clean up a peer that may have been added while we were
> starting device unregistration. We hold a reference on the device so
> no UAF possible, netdev_wait_allrefs_any will wait for this. If we
> don't have a racing peer creation, ndo_uninit takes care of the peers.
LGTM. This or change all the write paths to check if the device is still
alive after taking the lock.
> Or we can call ovpn_peers_free on every NETDEV_UNREGISTER notification
> that netdev_wait_allrefs_any sends us (but then we don't need it in
> ndo_uninit).
Hm, wouldn't we need a notification _after_ netdev_wait_allrefs_any() ?
> And s/cancel_delayed_work_sync/disable_delayed_work_sync/ for the
> keepalive_work.
>
>
> LLM claims it's because of parallel_ops, I don't think this is
> related? It also claims this issue is only for UDP sockets (and TCP
> would see a UAF on the keepalive), but ovpn_peer_new always holds the
> ovpn netdev, so I don't think there's a difference there.
Yup, I think the LLMs are trying to be helpful and are looking for some
write lock earlier in the path. As much as they are annoying I can't
blame them here, I feel like we try to make things lockless too often.
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Sabrina Dubroca @ 2026-03-24 22:40 UTC
To: Jakub Kicinski
Cc: Antonio Quartulli, netdev, Hyunwoo Kim, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
2026-03-24, 14:30:06 -0700, Jakub Kicinski wrote:
> On Tue, 24 Mar 2026 11:09:11 +0100 Sabrina Dubroca wrote:
> > -------- 8< --------
> > diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c
> > index c7f382437630..ebec8c2ff427 100644
> > --- a/drivers/net/ovpn/netlink.c
> > +++ b/drivers/net/ovpn/netlink.c
> > @@ -90,8 +90,11 @@ void ovpn_nl_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
> > netdevice_tracker *tracker = (netdevice_tracker *)&info->user_ptr[1];
> > struct ovpn_priv *ovpn = info->user_ptr[0];
> >
> > - if (ovpn)
> > + if (ovpn) {
> > + if (READ_ONCE(ovpn->dev->reg_state) >= NETREG_UNREGISTERING)
> > + ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
> > netdev_put(ovpn->dev, tracker);
> > + }
> > }
> >
> > static bool ovpn_nl_attr_sockaddr_remote(struct nlattr **attrs,
> > -------- 8< --------
> >
> > This would clean up a peer that may have been added while we were
> > starting device unregistration. We hold a reference on the device so
> > no UAF possible, netdev_wait_allrefs_any will wait for this. If we
> > don't have a racing peer creation, ndo_uninit takes care of the peers.
>
> LGTM. This or change all the write paths to check if the device is still
> alive after taking the lock.
I think peer_new is the only relevant path here (other paths use an
existing peer and modify/delete the peer itself or its keys while
holding a reference), and we don't take a lock except to insert the
peer in the ovpn instance (ie netdev)'s hashtable (or peer slot, for
single-peer instances). I guess we could add the reg_state >=
NETREG_UNREGISTERING check to ovpn_peer_add_{p2p,mp} and reject adding
the peer. It seems cleaner than my ovpn_nl_post_doit() diff above.
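Editorial sketch of the idea in the paragraph above, as a userspace model rather than a patch: the add path checks the registration state while holding the lock that also protects peer insertion, and refuses to link the peer once unregistration has started. All names (`sketch_ovpn`, `sketch_peer_add()`, the `SKETCH_*` states) are hypothetical, the lock is reduced to a flag, and returning `-ENODEV` is an assumption, not something the thread specifies.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states, ordered so that ">= UNREGISTERING" means "going away". */
enum sketch_reg_state {
	SKETCH_REGISTERED,
	SKETCH_UNREGISTERING,
	SKETCH_UNREGISTERED,
};

struct sketch_ovpn {
	enum sketch_reg_state reg_state;
	int lock_held;	/* stands in for the lock guarding the peer table */
	int npeers;	/* number of peers linked to this instance */
};

/*
 * Add one peer, or fail if the device has started unregistering.
 * The check sits inside the locked section, so a flush that takes the
 * same lock later either sees the new peer or the add is rejected.
 */
static int sketch_peer_add(struct sketch_ovpn *ovpn)
{
	int ret = 0;

	ovpn->lock_held = 1;			/* lock */
	if (ovpn->reg_state >= SKETCH_UNREGISTERING)
		ret = -ENODEV;			/* assumed error code */
	else
		ovpn->npeers++;
	ovpn->lock_held = 0;			/* unlock */

	return ret;
}
```

The ordering argument is the assumption to verify in the real driver: the state has to be set to "unregistering" before the flush in ndo_uninit takes the same lock, otherwise a peer could still slip in between the check and the flush.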
> > Or we can call ovpn_peers_free on every NETDEV_UNREGISTER notification
> > that netdev_wait_allrefs_any sends us (but then we don't need it in
> > ndo_uninit).
>
> Hm, wouldn't we need a notification _after_ netdev_wait_allrefs_any() ?
netdev_wait_allrefs_any() will never complete if we don't clean up all
the peers, since they hold a ref on the netdev.
But calling ovpn_peers_free on netdev_wait_allrefs_any()'s
/* Rebroadcast unregister notification */
should clean up peers that got added while we were unregistering.
> > And s/cancel_delayed_work_sync/disable_delayed_work_sync/ for the
> > keepalive_work.
> >
> >
> > LLM claims it's because of parallel_ops, I don't think this is
> > related? It also claims this issue is only for UDP sockets (and TCP
> > would see a UAF on the keepalive), but ovpn_peer_new always holds the
> > ovpn netdev, so I don't think there's a difference there.
>
> Yup, I think the LLMs are trying to be helpful and are looking for some
> write lock earlier in the path. As much as they are annoying I can't
> blame them here, I feel like we try to make things lockless too often.
Well, if we want to protect against netdev removal, we have to use the
one lock we're trying hard to not depend on so much (rtnl)?
At the ovpn level operations are not lockless.
The LLM found a legit race that I missed, but I'm always puzzled by
inconsistencies like that.
--
Sabrina
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Antonio Quartulli @ 2026-03-25 13:37 UTC
To: Sabrina Dubroca, Jakub Kicinski
Cc: netdev, Hyunwoo Kim, Paolo Abeni, Andrew Lunn, David S. Miller,
Eric Dumazet
Hi!
On 24/03/2026 23:40, Sabrina Dubroca wrote:
> 2026-03-24, 14:30:06 -0700, Jakub Kicinski wrote:
>> On Tue, 24 Mar 2026 11:09:11 +0100 Sabrina Dubroca wrote:
>>> -------- 8< --------
>>> diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c
>>> index c7f382437630..ebec8c2ff427 100644
>>> --- a/drivers/net/ovpn/netlink.c
>>> +++ b/drivers/net/ovpn/netlink.c
>>> @@ -90,8 +90,11 @@ void ovpn_nl_post_doit(const struct genl_split_ops *ops, struct sk_buff *skb,
>>> netdevice_tracker *tracker = (netdevice_tracker *)&info->user_ptr[1];
>>> struct ovpn_priv *ovpn = info->user_ptr[0];
>>>
>>> - if (ovpn)
>>> + if (ovpn) {
>>> + if (READ_ONCE(ovpn->dev->reg_state) >= NETREG_UNREGISTERING)
>>> + ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN);
>>> netdev_put(ovpn->dev, tracker);
>>> + }
>>> }
>>>
>>> static bool ovpn_nl_attr_sockaddr_remote(struct nlattr **attrs,
>>> -------- 8< --------
>>>
>>> This would clean up a peer that may have been added while we were
>>> starting device unregistration. We hold a reference on the device so
>>> no UAF possible, netdev_wait_allrefs_any will wait for this. If we
>>> don't have a racing peer creation, ndo_uninit takes care of the peers.
>>
>> LGTM. This or change all the write paths to check if the device is still
>> alive after taking the lock.
>
> I think peer_new is the only relevant path here (other paths use an
> existing peer and modify/delete the peer itself or its keys while
> holding a reference), and we don't take a lock except to insert the
> peer in the ovpn instance (ie netdev)'s hashtable (or peer slot, for
> single-peer instances). I guess we could add the reg_state >=
> NETREG_UNREGISTERING check to ovpn_peer_add_{p2p,mp} and reject adding
> the peer. It seems cleaner than my ovpn_nl_post_doit() diff above.
Yeah, I like the check in ovpn_peer_add_* too.
>
>>> Or we can call ovpn_peers_free on every NETDEV_UNREGISTER notification
>>> that netdev_wait_allrefs_any sends us (but then we don't need it in
>>> ndo_uninit).
>>
>> Hm, wouldn't we need a notification _after_ netdev_wait_allrefs_any() ?
>
> netdev_wait_allrefs_any() will never complete if we don't clean up all
> the peers, since they hold a ref on the netdev.
>
> But calling ovpn_peers_free on netdev_wait_allrefs_any()'s
>
> /* Rebroadcast unregister notification */
>
> should clean up peers that got added while we were unregistering.
But with the check in ovpn_peer_add_*, we don't need this extra call to
ovpn_peers_free(), right?
>
>>> And s/cancel_delayed_work_sync/disable_delayed_work_sync/ for the
>>> keepalive_work.
ACK
Regards,
>>>
>>>
>>> LLM claims it's because of parallel_ops, I don't think this is
>>> related? It also claims this issue is only for UDP sockets (and TCP
>>> would see a UAF on the keepalive), but ovpn_peer_new always holds the
>>> ovpn netdev, so I don't think there's a difference there.
>>
>> Yup, I think the LLMs are trying to be helpful and are looking for some
>> write lock earlier in the path. As much as they are annoying I can't
>> blame them here, I feel like we try to make things lockless too often.
>
> Well, if we want to protect against netdev removal, we have to use the
> one lock we're trying hard to not depend on so much (rtnl)?
> At the ovpn level operations are not lockless.
>
> The LLM found a legit race that I missed, but I'm always puzzled by
> inconsistencies like that.
--
Antonio Quartulli
OpenVPN Inc.
* Re: [PATCH net 1/1] ovpn: fix race between deleting interface and adding new peer
From: Sabrina Dubroca @ 2026-03-26 9:13 UTC
To: Antonio Quartulli
Cc: Jakub Kicinski, netdev, Hyunwoo Kim, Paolo Abeni, Andrew Lunn,
David S. Miller, Eric Dumazet
2026-03-25, 14:37:36 +0100, Antonio Quartulli wrote:
> Hi!
>
> On 24/03/2026 23:40, Sabrina Dubroca wrote:
> > 2026-03-24, 14:30:06 -0700, Jakub Kicinski wrote:
> > > LGTM. This or change all the write paths to check if the device is still
> > > alive after taking the lock.
> >
> > I think peer_new is the only relevant path here (other paths use an
> > existing peer and modify/delete the peer itself or its keys while
> > holding a reference), and we don't take a lock except to insert the
> > peer in the ovpn instance (ie netdev)'s hashtable (or peer slot, for
> > single-peer instances). I guess we could add the reg_state >=
> > NETREG_UNREGISTERING check to ovpn_peer_add_{p2p,mp} and reject adding
> > the peer. It seems cleaner than my ovpn_nl_post_doit() diff above.
>
> Yeah, I like the check in ovpn_peer_add_* too.
>
> >
> > > > Or we can call ovpn_peers_free on every NETDEV_UNREGISTER notification
> > > > that netdev_wait_allrefs_any sends us (but then we don't need it in
> > > > ndo_uninit).
> > >
> > > Hm, wouldn't we need a notification _after_ netdev_wait_allrefs_any() ?
> >
> > netdev_wait_allrefs_any() will never complete if we don't clean up all
> > the peers, since they hold a ref on the netdev.
> >
> > But calling ovpn_peers_free on netdev_wait_allrefs_any()'s
> >
> > /* Rebroadcast unregister notification */
> >
> > should clean up peers that got added while we were unregistering.
>
> But with the check in ovpn_peer_add_*, we don't need this extra call to
> ovpn_peers_free(), right?
Yes. This was an alternative idea for the same thing.
--
Sabrina