public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net 0/2] pull request: fixes for ovpn 2026-04-29
@ 2026-04-29 12:01 Antonio Quartulli
  2026-04-29 12:01 ` [PATCH net 1/2] ovpn: reset MAC header before passing skb up Antonio Quartulli
  2026-04-29 12:01 ` [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled Antonio Quartulli
  0 siblings, 2 replies; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-29 12:01 UTC (permalink / raw)
  To: netdev
  Cc: ralf, Antonio Quartulli, Sabrina Dubroca, Jakub Kicinski,
	Paolo Abeni, Andrew Lunn, David S. Miller, Eric Dumazet

Hello netdev team,

here are two fixes for ovpn.

Patch 1 ensures that the MAC header offset is reset before
delivering the packet to the upper layer. Not doing so could trick
other parts of the networking stack into wrong calculations.

Patch 2 fixes the inconsistent call context for gro_cells_receive().
On a TCP connection, we can end up calling gro_cells_receive() from
process context, which is unexpected (and could potentially trigger
deadlocks). For this reason we wrap that call in
local_bh_disable()/local_bh_enable().


On a side note, the bug being fixed by patch 2 was also recently
reported by the netdev CI.
We are still seeing some spurious TCP failures, which we are tracking
down.


Please pull, or let me know if there are any issues.

Thanks,
	Antonio


The following changes since commit 3bc179bc7146c26c9dff75d2943d10528274e301:

  netpoll: fix IPv6 local-address corruption (2026-04-27 19:16:18 -0700)

are available in the Git repository at:

  https://github.com/OpenVPN/ovpn-net-next.git tags/ovpn-net-20260429

for you to fetch changes up to 39b17bb8b1ce3ba488d81db31916aacd5803b18f:

  ovpn: ensure gro_cells_receive() is invoked with BH disabled (2026-04-29 11:33:43 +0200)

----------------------------------------------------------------
Includes changes:
* ensure the MAC header offset is reset before delivering the packet
* ensure gro_cells_receive() is called with BH disabled

----------------------------------------------------------------
Qingfang Deng (1):
      ovpn: reset MAC header before passing skb up

Ralf Lici (1):
      ovpn: ensure gro_cells_receive() is invoked with BH disabled

 drivers/net/ovpn/io.c | 7 +++++++
 1 file changed, 7 insertions(+)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 1/2] ovpn: reset MAC header before passing skb up
  2026-04-29 12:01 [PATCH net 0/2] pull request: fixes for ovpn 2026-04-29 Antonio Quartulli
@ 2026-04-29 12:01 ` Antonio Quartulli
  2026-04-29 12:01 ` [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled Antonio Quartulli
  1 sibling, 0 replies; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-29 12:01 UTC (permalink / raw)
  To: netdev
  Cc: ralf, Qingfang Deng, Sabrina Dubroca, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller, Eric Dumazet, Minqiang Chen,
	Antonio Quartulli

From: Qingfang Deng <qingfang.deng@linux.dev>

After decapsulating a packet, skb->mac_header still points to the
outer transport header.

Fix this by calling skb_reset_mac_header() in ovpn_netdev_write() to
ensure the MAC header points to the beginning of the inner IP/network
packet, as expected by the rest of the stack.
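As a rough illustration (a simplified userspace model, NOT the real
struct sk_buff), skb_reset_mac_header() effectively records the current
skb->data position as the MAC header offset, discarding the stale
outer-header offset left over from decapsulation:

```c
/* Hypothetical, simplified model of the sk_buff header offsets -- not
 * the real kernel structure. It only illustrates the effect of
 * skb_reset_mac_header(): mac_header = data - head, i.e. the MAC
 * header offset is re-pointed at the current packet start. */
#include <assert.h>
#include <stddef.h>

struct fake_skb {
	unsigned char *head;  /* start of the buffer */
	unsigned char *data;  /* current packet start (inner IP header) */
	size_t mac_header;    /* MAC header offset from head */
};

/* mirrors the real helper: point mac_header at skb->data */
static void fake_skb_reset_mac_header(struct fake_skb *skb)
{
	skb->mac_header = (size_t)(skb->data - skb->head);
}

/* decapsulation advances data past the outer headers, but leaves
 * mac_header pointing at the now-meaningless outer position */
static void fake_decap(struct fake_skb *skb, size_t outer_len)
{
	skb->data += outer_len;
}
```

Without the reset, mac_header keeps referring to the stripped outer
headers, which is exactly the stale state the patch removes.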

Reported-by: Minqiang Chen <ptpt52@gmail.com>
Fixes: 8534731dbf2d ("ovpn: implement packet processing")
Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 drivers/net/ovpn/io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
index db43a1f8a07a..d92bb87be2b2 100644
--- a/drivers/net/ovpn/io.c
+++ b/drivers/net/ovpn/io.c
@@ -85,6 +85,7 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
 	skb_scrub_packet(skb, true);
 
 	/* network header reset in ovpn_decrypt_post() */
+	skb_reset_mac_header(skb);
 	skb_reset_transport_header(skb);
 	skb_reset_inner_headers(skb);
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-29 12:01 [PATCH net 0/2] pull request: fixes for ovpn 2026-04-29 Antonio Quartulli
  2026-04-29 12:01 ` [PATCH net 1/2] ovpn: reset MAC header before passing skb up Antonio Quartulli
@ 2026-04-29 12:01 ` Antonio Quartulli
  2026-04-30 13:28   ` Antonio Quartulli
  1 sibling, 1 reply; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-29 12:01 UTC (permalink / raw)
  To: netdev
  Cc: ralf, Sabrina Dubroca, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller, Eric Dumazet, Antonio Quartulli

From: Ralf Lici <ralf@mandelbit.com>

ovpn injects decrypted packets into the netdev RX path through
ovpn_netdev_write() -> gro_cells_receive().
In case of TCP connections, that injection path can be reached
from either softirq or process context.

However, gro_cells_receive() expects to be invoked in softirq
context, to avoid racing on its lock acquisition.

When invoked from process context, we must therefore ensure BHs are
disabled to preserve gro_cells_receive()'s assumption.

As of now this is not the case, and the following WARNING is
triggered:

  [  230.183747][   T12] WARNING: net/core/gro_cells.c:30 at gro_cells_receive+0x708/0xaa0, CPU#1: kworker/u16:0/12

Moreover, not disabling BHs also triggered lockdep, which reported a
potential deadlock in case of concurrent softirq and process context
gro_cells->bh_lock acquisition (i.e. in case of concurrent UDP and
TCP connections on the same ovpn interface):

  WARNING: inconsistent lock state
  7.0.0-rc4+ #246 Tainted: G        W
  --------------------------------
  inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.

Fix all this by wrapping the gro_cells_receive() call in
local_bh_disable()/local_bh_enable().
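The deadlock scenario can be sketched with a toy model (purely
illustrative, not kernel code): bh_lock is not re-entrant, so a
softirq interrupting a process-context holder on the same CPU would
spin forever, while disabling BHs around the critical section removes
that interruption window:

```c
/* Hypothetical, simplified model of the bh_lock deadlock -- not the
 * real gro_cells code. A softirq can only run if BHs are enabled; if
 * it fires while the interrupted process context already holds the
 * (non-re-entrant) lock on this CPU, it spins forever. */
#include <assert.h>

struct fake_cpu {
	int bh_disabled;  /* local_bh_disable() in effect */
	int lock_held;    /* bh_lock held by the current context */
};

/* returns 1 if a softirq arriving now would deadlock on bh_lock */
static int softirq_would_deadlock(const struct fake_cpu *cpu)
{
	return !cpu->bh_disabled && cpu->lock_held;
}
```

With BHs disabled around the lock-holding section, the softirq is
deferred until the lock is released, so the hazard disappears.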

Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 drivers/net/ovpn/io.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
index d92bb87be2b2..c0fdb9504241 100644
--- a/drivers/net/ovpn/io.c
+++ b/drivers/net/ovpn/io.c
@@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
 
 	/* cause packet to be "received" by the interface */
 	pkt_len = skb->len;
+	/* we may get here in process context in case of TCP connections,
+	 * therefore we have to disable BHs to ensure gro_cells_receive()
+	 * doesn't enter deadlock
+	 */
+	local_bh_disable();
 	ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
+	local_bh_enable();
 	if (likely(ret == NET_RX_SUCCESS)) {
 		/* update RX stats with the size of decrypted packet */
 		ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-29 12:01 ` [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled Antonio Quartulli
@ 2026-04-30 13:28   ` Antonio Quartulli
  2026-04-30 13:37     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-30 13:28 UTC (permalink / raw)
  To: netdev, Jakub Kicinski
  Cc: ralf, Sabrina Dubroca, Paolo Abeni, Andrew Lunn, David S. Miller,
	Eric Dumazet

Hi Jakub,

sashiko came back with an interesting review of the per-cpu stats update 
in the surrounding code.

As far as I can tell its explanation makes sense, but I am no per-cpu 
expert.

IIUC it basically says that if gro_cells_receive() is invoked with 
bottom halves disabled, the following dev_dstats_rx_add() should be too 
to avoid deadlocks and corruptions.

See below:

On 29/04/2026 14:01, Antonio Quartulli wrote:
> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
> index d92bb87be2b2..c0fdb9504241 100644
> --- a/drivers/net/ovpn/io.c
> +++ b/drivers/net/ovpn/io.c
> @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
>   
>   	/* cause packet to be "received" by the interface */
>   	pkt_len = skb->len;
> +	/* we may get here in process context in case of TCP connections,
> +	 * therefore we have to disable BHs to ensure gro_cells_receive()
> +	 * doesn't enter deadlock
> +	 */
> +	local_bh_disable();
>   	ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
> +	local_bh_enable();
>   	if (likely(ret == NET_RX_SUCCESS)) {
>   		/* update RX stats with the size of decrypted packet */
>   		ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
 >>		dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff, 
but it's here

from sashiko:

Can this lead to seqcount deadlocks and stat corruption since
dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
updates with a sequence counter (u64_stats_update_begin()). When called
from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
is evaluated with preemption enabled. If the task is migrated mid-update,
it risks cross-CPU stat corruption.
On 32-bit systems, u64_stats_update_begin() disables preemption but not
bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
interrupts the process context and calls dev_dstats_rx_add() for the same
interface, it will re-enter the seqcount writer lock on the exact same CPU.
This corrupts the sequence counter, causing readers to see an unlocked
sequence during active writes, leading to torn reads and corrupted stats.
Should local_bh_enable() be moved after the statistics updates to ensure
the entire per-CPU update is atomic with respect to softirqs?
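The 32-bit hazard described above can be sketched with a toy seqcount
model (again purely illustrative, not the kernel's u64_stats
implementation): a writer makes the sequence odd while updating,
readers treat an even sequence as stable, and a nested writer on the
same CPU makes the count even again mid-update:

```c
/* Hypothetical, simplified seqcount writer/reader model -- not the
 * real u64_stats code. A writer bumps the sequence before and after
 * the update (odd = update in flight). If a softirq nests a second
 * write_begin() on the same CPU, the sequence becomes even while the
 * first update is still in progress, so a reader can mistake torn
 * data for a stable snapshot. */
#include <assert.h>

struct fake_seqcount {
	unsigned int seq;
};

static void fake_write_begin(struct fake_seqcount *s)
{
	s->seq++;  /* odd: update in progress */
}

static void fake_write_end(struct fake_seqcount *s)
{
	s->seq++;  /* even: update complete */
}

/* readers retry while the sequence is odd */
static int fake_reader_sees_stable(const struct fake_seqcount *s)
{
	return (s->seq & 1) == 0;
}
```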


Do you have an opinion?

Thanks a lot.

Regards,

-- 
Antonio Quartulli
OpenVPN Inc.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-30 13:28   ` Antonio Quartulli
@ 2026-04-30 13:37     ` Eric Dumazet
  2026-04-30 13:40       ` Antonio Quartulli
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-04-30 13:37 UTC (permalink / raw)
  To: Antonio Quartulli
  Cc: netdev, Jakub Kicinski, ralf, Sabrina Dubroca, Paolo Abeni,
	Andrew Lunn, David S. Miller

On Thu, Apr 30, 2026 at 6:28 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>
> Hi Jakub,
>
> sashiko came back with an interesting review of the per-cpu stats update
> in the surrounding code.
>
> As far as I can tell its explanation makes sense, but I am no per-cpu
> expert.
>
> IIUC it basically says that if gro_cells_receive() is invoked with
> bottom halves disabled, the following dev_dstats_rx_add() should be too
> to avoid deadlocks and corruptions.
>
> See below:
>
> On 29/04/2026 14:01, Antonio Quartulli wrote:
> > diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
> > index d92bb87be2b2..c0fdb9504241 100644
> > --- a/drivers/net/ovpn/io.c
> > +++ b/drivers/net/ovpn/io.c
> > @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
> >
> >       /* cause packet to be "received" by the interface */
> >       pkt_len = skb->len;
> > +     /* we may get here in process context in case of TCP connections,
> > +      * therefore we have to disable BHs to ensure gro_cells_receive()
> > +      * doesn't enter deadlock
> > +      */
> > +     local_bh_disable();
> >       ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
> > +     local_bh_enable();
> >       if (likely(ret == NET_RX_SUCCESS)) {
> >               /* update RX stats with the size of decrypted packet */
> >               ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
>  >>             dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff,
> but it's here
>
> from sashiko:
>
> Can this lead to seqcount deadlocks and stat corruption since
> dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
> dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
> updates with a sequence counter (u64_stats_update_begin()). When called
> from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
> is evaluated with preemption enabled. If the task is migrated mid-update,
> it risks cross-CPU stat corruption.
> On 32-bit systems, u64_stats_update_begin() disables preemption but not
> bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
> interrupts the process context and calls dev_dstats_rx_add() for the same
> interface, it will re-enter the seqcount writer lock on the exact same CPU.
> This corrupts the sequence counter, causing readers to see an unlocked
> sequence during active writes, leading to torn reads and corrupted stats.
> Should local_bh_enable() be moved after the statistics updates to ensure
> the entire per-CPU update is atomic with respect to softirqs?
>
>
> Do you have an opinion?

Sashiko suggestion seems good to me.

Proper Fixes: tag would be:

Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-30 13:37     ` Eric Dumazet
@ 2026-04-30 13:40       ` Antonio Quartulli
  2026-04-30 13:43         ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-30 13:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Jakub Kicinski, ralf, Sabrina Dubroca, Paolo Abeni,
	Andrew Lunn, David S. Miller

Hi Eric,

On 30/04/2026 15:37, Eric Dumazet wrote:
> On Thu, Apr 30, 2026 at 6:28 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>>
>> Hi Jakub,
>>
>> sashiko came back with an interesting review of the per-cpu stats update
>> in the surrounding code.
>>
>> As far as I can tell its explanation makes sense, but I am no per-cpu
>> expert.
>>
>> IIUC it basically says that if gro_cells_receive() is invoked with
>> bottom halves disabled, the following dev_dstats_rx_add() should be too
>> to avoid deadlocks and corruptions.
>>
>> See below:
>>
>> On 29/04/2026 14:01, Antonio Quartulli wrote:
>>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>>> index d92bb87be2b2..c0fdb9504241 100644
>>> --- a/drivers/net/ovpn/io.c
>>> +++ b/drivers/net/ovpn/io.c
>>> @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
>>>
>>>        /* cause packet to be "received" by the interface */
>>>        pkt_len = skb->len;
>>> +     /* we may get here in process context in case of TCP connections,
>>> +      * therefore we have to disable BHs to ensure gro_cells_receive()
>>> +      * doesn't enter deadlock
>>> +      */
>>> +     local_bh_disable();
>>>        ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
>>> +     local_bh_enable();
>>>        if (likely(ret == NET_RX_SUCCESS)) {
>>>                /* update RX stats with the size of decrypted packet */
>>>                ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
>>   >>             dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff,
>> but it's here
>>
>> from sashiko:
>>
>> Can this lead to seqcount deadlocks and stat corruption since
>> dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
>> dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
>> updates with a sequence counter (u64_stats_update_begin()). When called
>> from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
>> is evaluated with preemption enabled. If the task is migrated mid-update,
>> it risks cross-CPU stat corruption.
>> On 32-bit systems, u64_stats_update_begin() disables preemption but not
>> bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
>> interrupts the process context and calls dev_dstats_rx_add() for the same
>> interface, it will re-enter the seqcount writer lock on the exact same CPU.
>> This corrupts the sequence counter, causing readers to see an unlocked
>> sequence during active writes, leading to torn reads and corrupted stats.
>> Should local_bh_enable() be moved after the statistics updates to ensure
>> the entire per-CPU update is atomic with respect to softirqs?
>>
>>
>> Do you have an opinion?
> 
> Sashiko suggestion seems good to me.

But am I right saying that this bug existed before and it is not 
introduced by this patch?

A concurrent softirq (UDP RX pkt) could already trigger this problem 
before we introduced the local_bh_disable/enable() calls, right?


Regards,


> 
> Proper Fixes: tag would be:
> 
> Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")

-- 
Antonio Quartulli
OpenVPN Inc.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-30 13:40       ` Antonio Quartulli
@ 2026-04-30 13:43         ` Eric Dumazet
  2026-04-30 14:00           ` Antonio Quartulli
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-04-30 13:43 UTC (permalink / raw)
  To: Antonio Quartulli
  Cc: netdev, Jakub Kicinski, ralf, Sabrina Dubroca, Paolo Abeni,
	Andrew Lunn, David S. Miller

On Thu, Apr 30, 2026 at 6:40 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>
> Hi Eric,
>
> On 30/04/2026 15:37, Eric Dumazet wrote:
> > On Thu, Apr 30, 2026 at 6:28 AM Antonio Quartulli <antonio@openvpn.net> wrote:
> >>
> >> Hi Jakub,
> >>
> >> sashiko came back with an interesting review of the per-cpu stats update
> >> in the surrounding code.
> >>
> >> As far as I can tell its explanation makes sense, but I am no per-cpu
> >> expert.
> >>
> >> IIUC it basically says that if gro_cells_receive() is invoked with
> >> bottom halves disabled, the following dev_dstats_rx_add() should be too
> >> to avoid deadlocks and corruptions.
> >>
> >> See below:
> >>
> >> On 29/04/2026 14:01, Antonio Quartulli wrote:
> >>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
> >>> index d92bb87be2b2..c0fdb9504241 100644
> >>> --- a/drivers/net/ovpn/io.c
> >>> +++ b/drivers/net/ovpn/io.c
> >>> @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
> >>>
> >>>        /* cause packet to be "received" by the interface */
> >>>        pkt_len = skb->len;
> >>> +     /* we may get here in process context in case of TCP connections,
> >>> +      * therefore we have to disable BHs to ensure gro_cells_receive()
> >>> +      * doesn't enter deadlock
> >>> +      */
> >>> +     local_bh_disable();
> >>>        ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
> >>> +     local_bh_enable();
> >>>        if (likely(ret == NET_RX_SUCCESS)) {
> >>>                /* update RX stats with the size of decrypted packet */
> >>>                ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
> >>   >>             dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff,
> >> but it's here
> >>
> >> from sashiko:
> >>
> >> Can this lead to seqcount deadlocks and stat corruption since
> >> dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
> >> dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
> >> updates with a sequence counter (u64_stats_update_begin()). When called
> >> from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
> >> is evaluated with preemption enabled. If the task is migrated mid-update,
> >> it risks cross-CPU stat corruption.
> >> On 32-bit systems, u64_stats_update_begin() disables preemption but not
> >> bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
> >> interrupts the process context and calls dev_dstats_rx_add() for the same
> >> interface, it will re-enter the seqcount writer lock on the exact same CPU.
> >> This corrupts the sequence counter, causing readers to see an unlocked
> >> sequence during active writes, leading to torn reads and corrupted stats.
> >> Should local_bh_enable() be moved after the statistics updates to ensure
> >> the entire per-CPU update is atomic with respect to softirqs?
> >>
> >>
> >> Do you have an opinion?
> >
> > Sashiko suggestion seems good to me.
>
> But am I right saying that this bug existed before and it is not
> introduced by this patch?
>
> A concurrent softirq (UDP RX pkt) could already trigger this problem
> before we introduced the local_bh_disable/enable() calls, right?

I think we are saying the same thing.

Let me rephrase: The bug was introduced in:

Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")

Please respin a V2.

>
>
> Regards,
>
>
> >
> > Proper Fixes: tag would be:
> >
> > Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")
>
> --
> Antonio Quartulli
> OpenVPN Inc.
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-30 13:43         ` Eric Dumazet
@ 2026-04-30 14:00           ` Antonio Quartulli
  2026-04-30 14:10             ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Antonio Quartulli @ 2026-04-30 14:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Jakub Kicinski, ralf, Sabrina Dubroca, Paolo Abeni,
	Andrew Lunn, David S. Miller

On 30/04/2026 15:43, Eric Dumazet wrote:
> On Thu, Apr 30, 2026 at 6:40 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>>
>> Hi Eric,
>>
>> On 30/04/2026 15:37, Eric Dumazet wrote:
>>> On Thu, Apr 30, 2026 at 6:28 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>>>>
>>>> Hi Jakub,
>>>>
>>>> sashiko came back with an interesting review of the per-cpu stats update
>>>> in the surrounding code.
>>>>
>>>> As far as I can tell its explanation makes sense, but I am no per-cpu
>>>> expert.
>>>>
>>>> IIUC it basically says that if gro_cells_receive() is invoked with
>>>> bottom halves disabled, the following dev_dstats_rx_add() should be too
>>>> to avoid deadlocks and corruptions.
>>>>
>>>> See below:
>>>>
>>>> On 29/04/2026 14:01, Antonio Quartulli wrote:
>>>>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>>>>> index d92bb87be2b2..c0fdb9504241 100644
>>>>> --- a/drivers/net/ovpn/io.c
>>>>> +++ b/drivers/net/ovpn/io.c
>>>>> @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
>>>>>
>>>>>         /* cause packet to be "received" by the interface */
>>>>>         pkt_len = skb->len;
>>>>> +     /* we may get here in process context in case of TCP connections,
>>>>> +      * therefore we have to disable BHs to ensure gro_cells_receive()
>>>>> +      * doesn't enter deadlock
>>>>> +      */
>>>>> +     local_bh_disable();
>>>>>         ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
>>>>> +     local_bh_enable();
>>>>>         if (likely(ret == NET_RX_SUCCESS)) {
>>>>>                 /* update RX stats with the size of decrypted packet */
>>>>>                 ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
>>>>    >>             dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff,
>>>> but it's here
>>>>
>>>> from sashiko:
>>>>
>>>> Can this lead to seqcount deadlocks and stat corruption since
>>>> dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
>>>> dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
>>>> updates with a sequence counter (u64_stats_update_begin()). When called
>>>> from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
>>>> is evaluated with preemption enabled. If the task is migrated mid-update,
>>>> it risks cross-CPU stat corruption.
>>>> On 32-bit systems, u64_stats_update_begin() disables preemption but not
>>>> bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
>>>> interrupts the process context and calls dev_dstats_rx_add() for the same
>>>> interface, it will re-enter the seqcount writer lock on the exact same CPU.
>>>> This corrupts the sequence counter, causing readers to see an unlocked
>>>> sequence during active writes, leading to torn reads and corrupted stats.
>>>> Should local_bh_enable() be moved after the statistics updates to ensure
>>>> the entire per-CPU update is atomic with respect to softirqs?
>>>>
>>>>
>>>> Do you have an opinion?
>>>
>>> Sashiko suggestion seems good to me.
>>
>> But am I right saying that this bug existed before and it is not
>> introduced by this patch?
>>
>> A concurrent softirq (UDP RX pkt) could already trigger this problem
>> before we introduced the local_bh_disable/enable() calls, right?
> 
> I think we are saying the same thing.

Ok.

> 
> Let me rephrase: The bug was introduced in:
> 
> Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")

Well, the process context was introduced by

     11851cbd60ea ("ovpn: implement TCP transport")

because the TCP code relies on strparser.

Before 11851cbd60ea we had UDP only, therefore everything was happening 
in softirq, which means no bug existed (if I have understood this 
correctly).

Makes sense?

[in any case, both commits were basically merged side by side]


Regards,

-- 
Antonio Quartulli
OpenVPN Inc.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
  2026-04-30 14:00           ` Antonio Quartulli
@ 2026-04-30 14:10             ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-04-30 14:10 UTC (permalink / raw)
  To: Antonio Quartulli
  Cc: netdev, Jakub Kicinski, ralf, Sabrina Dubroca, Paolo Abeni,
	Andrew Lunn, David S. Miller

On Thu, Apr 30, 2026 at 7:00 AM Antonio Quartulli <antonio@openvpn.net> wrote:

> Well, the process context was introduced by
>
>      11851cbd60ea ("ovpn: implement TCP transport")
>
> because the TCP code relies on strparser.
>
> Before 11851cbd60ea we had UDP only, therefore everything was happening
> in softirq, which means no bug existed (if I have understood this
> correctly).
>
> Makes sense?
>
> [in any case, both commits were basically merged side by side]
>

OK then, seems good.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-04-30 14:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-29 12:01 [PATCH net 0/2] pull request: fixes for ovpn 2026-04-29 Antonio Quartulli
2026-04-29 12:01 ` [PATCH net 1/2] ovpn: reset MAC header before passing skb up Antonio Quartulli
2026-04-29 12:01 ` [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled Antonio Quartulli
2026-04-30 13:28   ` Antonio Quartulli
2026-04-30 13:37     ` Eric Dumazet
2026-04-30 13:40       ` Antonio Quartulli
2026-04-30 13:43         ` Eric Dumazet
2026-04-30 14:00           ` Antonio Quartulli
2026-04-30 14:10             ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox