From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
To: Antonio Quartulli <antonio@openvpn.net>
Cc: netdev@vger.kernel.org, Sabrina Dubroca <sd@queasysnail.net>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Oleksandr Natalenko <oleksandr@natalenko.name>,
Qingfang Deng <dqfext@gmail.com>,
Gert Doering <gert@greenie.muc.de>
Subject: Re: [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup
Date: Tue, 3 Jun 2025 08:40:45 +0200 [thread overview]
Message-ID: <aD6Y7b8xnObUjeJn@mev-dev.igk.intel.com> (raw)
In-Reply-To: <20250530101254.24044-3-antonio@openvpn.net>
On Fri, May 30, 2025 at 12:12:51PM +0200, Antonio Quartulli wrote:
> Removing a peer while userspace attempts to close its transport
> socket triggers a race condition resulting in the following
> crash:
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000077: 0000 [#1] SMP KASAN
> KASAN: null-ptr-deref in range [0x00000000000003b8-0x00000000000003bf]
> CPU: 12 UID: 0 PID: 162 Comm: kworker/12:1 Tainted: G O 6.15.0-rc2-00635-g521139ac3840 #272 PREEMPT(full)
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
> Workqueue: events ovpn_peer_keepalive_work [ovpn]
> RIP: 0010:ovpn_socket_release+0x23c/0x500 [ovpn]
> Code: ea 03 80 3c 02 00 0f 85 71 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 64 24 18 49 8d bc 24 be 03 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 30
> RSP: 0018:ffffc90000c9fb18 EFLAGS: 00010217
> RAX: dffffc0000000000 RBX: ffff8881148d7940 RCX: ffffffff817787bb
> RDX: 0000000000000077 RSI: 0000000000000008 RDI: 00000000000003be
> RBP: ffffc90000c9fb30 R08: 0000000000000000 R09: fffffbfff0d3e840
> R10: ffffffff869f4207 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff888115eb9300 R14: ffffc90000c9fbc8 R15: 000000000000000c
> FS: 0000000000000000(0000) GS:ffff8882b0151000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f37266b6114 CR3: 00000000054a8000 CR4: 0000000000750ef0
> PKRU: 55555554
> Call Trace:
> <TASK>
> unlock_ovpn+0x8b/0xe0 [ovpn]
> ovpn_peer_keepalive_work+0xe3/0x540 [ovpn]
> ? ovpn_peers_free+0x780/0x780 [ovpn]
> ? lock_acquire+0x56/0x70
> ? process_one_work+0x888/0x1740
> process_one_work+0x933/0x1740
> ? pwq_dec_nr_in_flight+0x10b0/0x10b0
> ? move_linked_works+0x12d/0x2c0
> ? assign_work+0x163/0x270
> worker_thread+0x4d6/0xd90
> ? preempt_count_sub+0x4c/0x70
> ? process_one_work+0x1740/0x1740
> kthread+0x36c/0x710
> ? trace_preempt_on+0x8c/0x1e0
> ? kthread_is_per_cpu+0xc0/0xc0
> ? preempt_count_sub+0x4c/0x70
> ? _raw_spin_unlock_irq+0x36/0x60
> ? calculate_sigpending+0x7b/0xa0
> ? kthread_is_per_cpu+0xc0/0xc0
> ret_from_fork+0x3a/0x80
> ? kthread_is_per_cpu+0xc0/0xc0
> ret_from_fork_asm+0x11/0x20
> </TASK>
> Modules linked in: ovpn(O)
>
> This happens because the peer deletion operation reaches
> ovpn_socket_release() while ovpn_sock->sock (struct socket *)
> and its sk member (struct sock *) are still both valid.
> Here synchronize_rcu() is invoked, after which ovpn_sock->sock->sk
> becomes NULL, due to the concurrent socket closing triggered
> from userspace.
>
> After having invoked synchronize_rcu(), ovpn_socket_release() will
> attempt dereferencing ovpn_sock->sock->sk, triggering the crash
> reported above.
>
> The reason for accessing sk is that we need to retrieve its
> protocol and continue the cleanup routine accordingly.
>
> This crash can be easily produced by running openvpn userspace in
> client mode with `--keepalive 10 20`, while entirely omitting this
> option on the server side.
> After 20 seconds ovpn will assume the peer (server) to be dead,
> will start removing it and will notify userspace. The latter will
> receive the notification and close the transport socket, thus
> triggering the crash.
>
> To fix the race condition for good, we need to refactor struct ovpn_socket.
> Since ovpn is always only interested in the sock->sk member (struct sock *)
> we can directly hold a reference to it, raher than accessing it via
> its struct socket container.
>
> This means changing "struct socket *ovpn_socket->sock" to
> "struct sock *ovpn_socket->sk".
>
> While acquiring a reference to sk, we can increase its refcounter
> without affecting the socket close()/destroy() notification
> (which we rely on when userspace closes a socket we are using).
>
> By increasing sk's refcounter we know we can dereference it
> in ovpn_socket_release() without incurring in any race condition
> anymore.
>
> ovpn_socket_release() will ultimately decrease the reference
> counter.
>
> Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
> Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
> Reported-by: Qingfang Deng <dqfext@gmail.com>
> Closes: https://github.com/OpenVPN/ovpn-net-next/issues/1
> Tested-by: Gert Doering <gert@greenie.muc.de>
> Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31575.html
> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> ---
> drivers/net/ovpn/io.c | 8 ++---
> drivers/net/ovpn/netlink.c | 16 ++++-----
> drivers/net/ovpn/peer.c | 4 +--
> drivers/net/ovpn/socket.c | 68 +++++++++++++++++++++-----------------
> drivers/net/ovpn/socket.h | 4 +--
> drivers/net/ovpn/tcp.c | 65 ++++++++++++++++++------------------
> drivers/net/ovpn/tcp.h | 3 +-
> drivers/net/ovpn/udp.c | 34 +++++++------------
> drivers/net/ovpn/udp.h | 4 +--
> 9 files changed, 102 insertions(+), 104 deletions(-)
>
Thanks for wide description in commit message. Changes looks fine for
me.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> --
> 2.49.0
next prev parent reply other threads:[~2025-06-03 6:41 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
2025-06-03 6:30 ` Michal Swiatkowski
2025-06-03 9:02 ` Paolo Abeni
2025-06-03 9:08 ` Antonio Quartulli
2025-06-03 9:58 ` Paolo Abeni
2025-05-30 10:12 ` [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup Antonio Quartulli
2025-06-03 6:40 ` Michal Swiatkowski [this message]
2025-05-30 10:12 ` [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path Antonio Quartulli
2025-06-03 6:42 ` Michal Swiatkowski
2025-05-30 10:12 ` [PATCH net 4/5] selftest/net/ovpn: fix TCP socket creation Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 5/5] selftest/net/ovpn: fix missing file Antonio Quartulli
-- strict thread matches above, loose matches on Subject: below --
2025-06-03 11:11 [PATCH net 0/5] pull request: fixes for ovpn 2025-06-03 Antonio Quartulli
2025-06-03 11:11 ` [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup Antonio Quartulli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aD6Y7b8xnObUjeJn@mev-dev.igk.intel.com \
--to=michal.swiatkowski@linux.intel.com \
--cc=antonio@openvpn.net \
--cc=davem@davemloft.net \
--cc=dqfext@gmail.com \
--cc=edumazet@google.com \
--cc=gert@greenie.muc.de \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=oleksandr@natalenko.name \
--cc=pabeni@redhat.com \
--cc=sd@queasysnail.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.