From: Antonio Quartulli <antonio@openvpn.net>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
ralf@mandelbit.com, Sabrina Dubroca <sd@queasysnail.net>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled
Date: Thu, 30 Apr 2026 16:00:39 +0200 [thread overview]
Message-ID: <a643b413-9997-4a28-a0de-919b83b327e9@openvpn.net> (raw)
In-Reply-To: <CANn89i+5q8D50sEKUwk1OzGt8tZD2ddSK-kHGq6dFm5QRRpaDQ@mail.gmail.com>
On 30/04/2026 15:43, Eric Dumazet wrote:
> On Thu, Apr 30, 2026 at 6:40 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>>
>> Hi Eric,
>>
>> On 30/04/2026 15:37, Eric Dumazet wrote:
>>> On Thu, Apr 30, 2026 at 6:28 AM Antonio Quartulli <antonio@openvpn.net> wrote:
>>>>
>>>> Hi Jakub,
>>>>
>>>> sashiko came back with an interesting review of the per-cpu stats update
>>>> in the surrounding code.
>>>>
>>>> As far as I can tell its explanation makes sense, but I am no per-cpu
>>>> expert.
>>>>
>>>> IIUC it basically says that if gro_cells_receive() is invoked with
>>>> bottom halves disabled, the following dev_dstats_rx_add() should be too
>>>> to avoid deadlocks and corruptions.
>>>>
>>>> See below:
>>>>
>>>> On 29/04/2026 14:01, Antonio Quartulli wrote:
>>>>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>>>>> index d92bb87be2b2..c0fdb9504241 100644
>>>>> --- a/drivers/net/ovpn/io.c
>>>>> +++ b/drivers/net/ovpn/io.c
>>>>> @@ -91,7 +91,13 @@ static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
>>>>>
>>>>> /* cause packet to be "received" by the interface */
>>>>> pkt_len = skb->len;
>>>>> + /* we may get here in process context in case of TCP connections,
>>>>> + * therefore we have to disable BHs to ensure gro_cells_receive()
>>>>> + * doesn't enter deadlock
>>>>> + */
>>>>> + local_bh_disable();
>>>>> ret = gro_cells_receive(&peer->ovpn->gro_cells, skb);
>>>>> + local_bh_enable();
>>>>> if (likely(ret == NET_RX_SUCCESS)) {
>>>>> /* update RX stats with the size of decrypted packet */
>>>>> ovpn_peer_stats_increment_rx(&peer->vpn_stats, pkt_len);
>>>> >> dev_dstats_rx_add(peer->ovpn->dev...) << not visible in the diff,
>>>> but it's here
>>>>
>>>> from sashiko:
>>>>
>>>> Can this lead to seqcount deadlocks and stat corruption since
>>>> dev_dstats_rx_add() is executed with preemption and bottom-halves enabled?
>>>> dev_dstats_rx_add() operates on per-CPU statistics and protects 64-bit
>>>> updates with a sequence counter (u64_stats_update_begin()). When called
>>>> from process context with BH enabled, on 64-bit systems, this_cpu_ptr()
>>>> is evaluated with preemption enabled. If the task is migrated mid-update,
>>>> it risks cross-CPU stat corruption.
>>>> On 32-bit systems, u64_stats_update_begin() disables preemption but not
>>>> bottom-halves. If a softirq (e.g., a concurrent UDP packet reception)
>>>> interrupts the process context and calls dev_dstats_rx_add() for the same
>>>> interface, it will re-enter the seqcount writer lock on the exact same CPU.
>>>> This corrupts the sequence counter, causing readers to see an unlocked
>>>> sequence during active writes, leading to torn reads and corrupted stats.
>>>> Should local_bh_enable() be moved after the statistics updates to ensure
>>>> the entire per-CPU update is atomic with respect to softirqs?
>>>>
>>>>
>>>> Do you have an opinion?
>>>
>>> Sashiko suggestion seems good to me.
>>
>> But am I right saying that this bug existed before and it is not
>> introduced by this patch?
>>
>> A concurrent softirq (UDP RX pkt) could already trigger this problem
>> before we introduced the local_bh_disable/enable() calls, right?
>
> I think we are saying the same thing.
Ok.
>
> Let me rephrase: The bug was introduced in:
>
> Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")
Well, the process context was introduced by
11851cbd60ea ("ovpn: implement TCP transport")
because the TCP code relies on strparser.
Before 11851cbd60ea we had UDP only, therefore everything was happening
in softirq, which means no bug existed (if I have understood this
correctly).
Makes sense?
[in any case, both commits were basically merged side by side]
Regards,
--
Antonio Quartulli
OpenVPN Inc.
next prev parent reply other threads:[~2026-04-30 14:00 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-29 12:01 [PATCH net 0/2] pull request: fixes for ovpn 2026-04-29 Antonio Quartulli
2026-04-29 12:01 ` [PATCH net 1/2] ovpn: reset MAC header before passing skb up Antonio Quartulli
2026-04-29 12:01 ` [PATCH net 2/2] ovpn: ensure gro_cells_receive() is invoked with BH disabled Antonio Quartulli
2026-04-30 13:28 ` Antonio Quartulli
2026-04-30 13:37 ` Eric Dumazet
2026-04-30 13:40 ` Antonio Quartulli
2026-04-30 13:43 ` Eric Dumazet
2026-04-30 14:00 ` Antonio Quartulli [this message]
2026-04-30 14:10 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a643b413-9997-4a28-a0de-919b83b327e9@openvpn.net \
--to=antonio@openvpn.net \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=ralf@mandelbit.com \
--cc=sd@queasysnail.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox