* [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30
@ 2025-05-30 10:12 Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Hi netdev-team,
I am targeting net this time as I see that ovpn has landed there.
In this batch you can find the following bug fixes:
Patch 1: when releasing a UDP socket we were wrongly invoking
setup_udp_tunnel_sock() with an empty config. This was not
properly shutting down the UDP encap state.
With this patch we simply undo what was done during setup.
Patch 2: ovpn was holding a reference to a 'struct socket'
without increasing its reference counter. This was intended
and worked as expected until we hit a race condition where
user space tries to close the socket while kernel space is
also releasing it. In this case the (struct socket *)->sk
member would disappear under our feet leading to a null-ptr-deref.
This patch fixes this issue by having struct ovpn_socket hold
a reference directly to the sk member while also increasing
its reference counter.
Patch 3: in case of errors along the TCP RX path (softirq)
we want to immediately delete the peer, but this operation may
sleep. With this patch we move the peer deletion to a scheduled
worker.
Patches 4 and 5 instead fix minor issues in the ovpn
kselftests.
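To illustrate the deferral described for patch 3, here is a minimal
userspace sketch, not the driver code. It assumes a toy single-threaded
work queue; every model_* name is hypothetical and only mimics the idea
that an atomic-context error path must queue the peer deletion instead
of running it directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a work item; not the kernel's work_struct. */
struct model_work {
	void (*fn)(struct model_work *w);
	struct model_work *next;
	bool pending;
};

struct model_peer {
	struct model_work del_work;
	bool deleted;
};

/* Set while the model pretends to run in softirq (atomic) context. */
static bool model_in_atomic;
static struct model_work *model_queue_head;
static struct model_work *model_queue_tail;

/* Queuing never sleeps, so it is allowed in atomic context. */
void model_schedule_work(struct model_work *w)
{
	if (w->pending)
		return;
	w->pending = true;
	w->next = NULL;
	if (model_queue_tail)
		model_queue_tail->next = w;
	else
		model_queue_head = w;
	model_queue_tail = w;
}

/* Stand-in for the worker: runs queued items in process context. */
void model_run_pending(void)
{
	assert(!model_in_atomic);
	while (model_queue_head) {
		struct model_work *w = model_queue_head;

		model_queue_head = w->next;
		if (!model_queue_head)
			model_queue_tail = NULL;
		w->pending = false;
		w->fn(w);
	}
}

/* The deletion may sleep in the real driver; the model asserts instead. */
static void model_peer_del(struct model_peer *peer)
{
	assert(!model_in_atomic);
	peer->deleted = true;
}

static void model_peer_del_work(struct model_work *w)
{
	struct model_peer *peer = (struct model_peer *)
		((char *)w - offsetof(struct model_peer, del_work));

	model_peer_del(peer);
}

void model_peer_init(struct model_peer *peer)
{
	peer->del_work.fn = model_peer_del_work;
	peer->del_work.next = NULL;
	peer->del_work.pending = false;
	peer->deleted = false;
}

/* RX error path: only queues the deletion while "atomic". */
void model_tcp_rx_error(struct model_peer *peer)
{
	model_in_atomic = true;
	model_schedule_work(&peer->del_work);
	model_in_atomic = false;
}
```

Calling model_tcp_rx_error() leaves the peer in place until
model_run_pending() executes outside the atomic section. Calling
model_peer_del() directly from the error path would trip the assertion,
which is the model's analogue of the "sleeping function called from
invalid context" splat.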
Please pull or let me know of any issues.
Thanks a lot,
Antonio
The following changes since commit f65dca1752b70ec4f678ae4dbdd5892335bcbbd8:
Merge tag 'linux-can-fixes-for-6.16-20250529' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can (2025-05-29 12:55:34 +0200)
are available in the Git repository at:
https://github.com/OpenVPN/ovpn-net-next tags/ovpn-net-next-20250530
for you to fetch changes up to 64a63e888318cf3259a549662411fa1bd8babb44:
selftest/net/ovpn: fix missing file (2025-05-30 11:45:27 +0200)
----------------------------------------------------------------
This bugfix batch includes the following changes:
* dropped bogus call to setup_udp_tunnel_sock() during
cleanup, substituted by proper state unwind
* fixed race condition between peer removal (by kernel
space) and socket closing (by user space)
* fixed sleep in atomic context along TCP RX error path
* fixes for ovpn kselftests
----------------------------------------------------------------
Antonio Quartulli (5):
ovpn: properly deconfigure UDP-tunnel
ovpn: ensure sk is still valid during cleanup
ovpn: avoid sleep in atomic context in TCP RX error path
selftest/net/ovpn: fix TCP socket creation
selftest/net/ovpn: fix missing file
drivers/net/ovpn/io.c | 8 +--
drivers/net/ovpn/netlink.c | 16 ++---
drivers/net/ovpn/peer.c | 4 +-
drivers/net/ovpn/socket.c | 68 +++++++++++---------
drivers/net/ovpn/socket.h | 4 +-
drivers/net/ovpn/tcp.c | 73 +++++++++++-----------
drivers/net/ovpn/tcp.h | 3 +-
drivers/net/ovpn/udp.c | 46 +++++++-------
drivers/net/ovpn/udp.h | 4 +-
tools/testing/selftests/net/ovpn/ovpn-cli.c | 1 +
tools/testing/selftests/net/ovpn/test-large-mtu.sh | 9 +++
11 files changed, 128 insertions(+), 108 deletions(-)
create mode 100755 tools/testing/selftests/net/ovpn/test-large-mtu.sh
* [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
@ 2025-05-30 10:12 ` Antonio Quartulli
2025-06-03 6:30 ` Michal Swiatkowski
2025-06-03 9:02 ` Paolo Abeni
2025-05-30 10:12 ` [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup Antonio Quartulli
` (3 subsequent siblings)
4 siblings, 2 replies; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Oleksandr Natalenko
When deconfiguring a UDP-tunnel from a socket, we cannot
call setup_udp_tunnel_sock() with an empty config, because
this helper is expected to be invoked only during setup.
Get rid of the call to setup_udp_tunnel_sock() and just
revert what it did during socket initialization.
Note that the global udp_encap_needed_key and the GRO state
are left untouched: udp_destroy_sock() will eventually
take care of them.
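The "revert what setup did" approach can be sketched in userspace as
follows. This is only an illustrative model under stated assumptions:
struct model_udp_sock and the model_* helpers are hypothetical and merely
mirror the fields the patch resets (multicast loopback, checksum
conversion, encap type and callbacks, user data).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-socket state changed at setup time. */
struct model_udp_sock {
	bool mc_loop;		/* multicast loopback enabled */
	int convert_csum;	/* users of checksum conversion */
	int encap_type;		/* 0 means "no encapsulation" */
	int (*encap_rcv)(void *pkt);
	void (*encap_destroy)(struct model_udp_sock *sk);
	void *user_data;
};

#define MODEL_ENCAP_OVPN 1	/* arbitrary non-zero value for the model */

static int model_encap_rcv(void *pkt)
{
	(void)pkt;
	return 0;
}

static void model_encap_destroy(struct model_udp_sock *sk)
{
	(void)sk;
}

/* Setup: every field touched here must be undone by model_detach(). */
void model_attach(struct model_udp_sock *sk, void *owner)
{
	sk->mc_loop = false;
	sk->convert_csum++;
	sk->encap_type = MODEL_ENCAP_OVPN;
	sk->encap_rcv = model_encap_rcv;
	sk->encap_destroy = model_encap_destroy;
	sk->user_data = owner;
}

/* Teardown: explicit field-by-field unwind, no second "setup" call. */
void model_detach(struct model_udp_sock *sk)
{
	sk->mc_loop = true;
	assert(sk->convert_csum > 0);
	sk->convert_csum--;
	sk->encap_type = 0;
	sk->encap_rcv = NULL;
	sk->encap_destroy = NULL;
	sk->user_data = NULL;
}
```

After model_attach() followed by model_detach(), the model socket is back
in its initial state. Calling the setup routine again with an empty
config would instead repeat the setup-side effects (for example, bumping
convert_csum a second time in this model) rather than undoing them.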
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Closes: https://lore.kernel.org/netdev/1a47ce02-fd42-4761-8697-f3f315011cc6@redhat.com
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
drivers/net/ovpn/udp.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
index aef8c0406ec9..89bb50f94ddb 100644
--- a/drivers/net/ovpn/udp.c
+++ b/drivers/net/ovpn/udp.c
@@ -442,8 +442,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
*/
void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
{
- struct udp_tunnel_sock_cfg cfg = { };
+ struct sock *sk = ovpn_sock->sock->sk;
- setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk), ovpn_sock->sock,
- &cfg);
+ /* Re-enable multicast loopback */
+ inet_set_bit(MC_LOOP, sk);
+ /* Disable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
+ inet_dec_convert_csum(sk);
+
+ udp_sk(sk)->encap_type = 0;
+ udp_sk(sk)->encap_rcv = NULL;
+ udp_sk(sk)->encap_destroy = NULL;
+
+ rcu_assign_sk_user_data(sk, NULL);
}
--
2.49.0
* [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
@ 2025-05-30 10:12 ` Antonio Quartulli
2025-06-03 6:40 ` Michal Swiatkowski
2025-05-30 10:12 ` [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path Antonio Quartulli
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Oleksandr Natalenko,
Qingfang Deng, Gert Doering
Removing a peer while userspace attempts to close its transport
socket triggers a race condition resulting in the following
crash:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000077: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000003b8-0x00000000000003bf]
CPU: 12 UID: 0 PID: 162 Comm: kworker/12:1 Tainted: G O 6.15.0-rc2-00635-g521139ac3840 #272 PREEMPT(full)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
Workqueue: events ovpn_peer_keepalive_work [ovpn]
RIP: 0010:ovpn_socket_release+0x23c/0x500 [ovpn]
Code: ea 03 80 3c 02 00 0f 85 71 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 64 24 18 49 8d bc 24 be 03 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 30
RSP: 0018:ffffc90000c9fb18 EFLAGS: 00010217
RAX: dffffc0000000000 RBX: ffff8881148d7940 RCX: ffffffff817787bb
RDX: 0000000000000077 RSI: 0000000000000008 RDI: 00000000000003be
RBP: ffffc90000c9fb30 R08: 0000000000000000 R09: fffffbfff0d3e840
R10: ffffffff869f4207 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888115eb9300 R14: ffffc90000c9fbc8 R15: 000000000000000c
FS: 0000000000000000(0000) GS:ffff8882b0151000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37266b6114 CR3: 00000000054a8000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
unlock_ovpn+0x8b/0xe0 [ovpn]
ovpn_peer_keepalive_work+0xe3/0x540 [ovpn]
? ovpn_peers_free+0x780/0x780 [ovpn]
? lock_acquire+0x56/0x70
? process_one_work+0x888/0x1740
process_one_work+0x933/0x1740
? pwq_dec_nr_in_flight+0x10b0/0x10b0
? move_linked_works+0x12d/0x2c0
? assign_work+0x163/0x270
worker_thread+0x4d6/0xd90
? preempt_count_sub+0x4c/0x70
? process_one_work+0x1740/0x1740
kthread+0x36c/0x710
? trace_preempt_on+0x8c/0x1e0
? kthread_is_per_cpu+0xc0/0xc0
? preempt_count_sub+0x4c/0x70
? _raw_spin_unlock_irq+0x36/0x60
? calculate_sigpending+0x7b/0xa0
? kthread_is_per_cpu+0xc0/0xc0
ret_from_fork+0x3a/0x80
? kthread_is_per_cpu+0xc0/0xc0
ret_from_fork_asm+0x11/0x20
</TASK>
Modules linked in: ovpn(O)
This happens because the peer deletion operation reaches
ovpn_socket_release() while ovpn_sock->sock (struct socket *)
and its sk member (struct sock *) are still both valid.
Here synchronize_rcu() is invoked, after which ovpn_sock->sock->sk
becomes NULL, due to the concurrent socket closing triggered
from userspace.
After having invoked synchronize_rcu(), ovpn_socket_release() will
attempt dereferencing ovpn_sock->sock->sk, triggering the crash
reported above.
The reason for accessing sk is that we need to retrieve its
protocol and continue the cleanup routine accordingly.
This crash can be easily reproduced by running openvpn userspace in
client mode with `--keepalive 10 20`, while entirely omitting this
option on the server side.
After 20 seconds ovpn will assume the peer (server) to be dead,
will start removing it and will notify userspace. The latter will
receive the notification and close the transport socket, thus
triggering the crash.
To fix the race condition for good, we need to refactor struct ovpn_socket.
Since ovpn is always only interested in the sock->sk member (struct sock *)
we can directly hold a reference to it, rather than accessing it via
its struct socket container.
This means changing "struct socket *ovpn_socket->sock" to
"struct sock *ovpn_socket->sk".
While acquiring a reference to sk, we can increase its refcounter
without affecting the socket close()/destroy() notification
(which we rely on when userspace closes a socket we are using).
By increasing sk's refcounter we know we can dereference it
in ovpn_socket_release() without incurring any race condition.
ovpn_socket_release() will ultimately decrease the reference
counter.
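The hold/put pairing described above can be sketched with a simplified,
single-threaded userspace model. All model_* names are hypothetical and
the counter is a plain int rather than an atomic refcount; the sketch
only shows that the extra reference keeps the object readable after the
other owner has dropped its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted sock. */
struct model_sk {
	int refcnt;
	int protocol;
	int *destroyed;	/* incremented when the last reference is dropped */
};

struct model_ovpn_socket {
	struct model_sk *sk;
};

/* Returns an object owned by the caller (refcnt == 1), or NULL. */
struct model_sk *model_sk_alloc(int protocol, int *destroyed)
{
	struct model_sk *sk = malloc(sizeof(*sk));

	if (!sk)
		return NULL;
	sk->refcnt = 1;
	sk->protocol = protocol;
	sk->destroyed = destroyed;
	return sk;
}

void model_sk_hold(struct model_sk *sk)
{
	sk->refcnt++;
}

void model_sk_put(struct model_sk *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0) {
		(*sk->destroyed)++;
		free(sk);
	}
}

/* Takes its own reference, so sk outlives the caller's put. */
void model_ovpn_socket_new(struct model_ovpn_socket *os, struct model_sk *sk)
{
	model_sk_hold(sk);
	os->sk = sk;
}

/* Reads the protocol safely, then drops the reference taken at creation. */
int model_ovpn_socket_release(struct model_ovpn_socket *os)
{
	int protocol = os->sk->protocol;

	model_sk_put(os->sk);
	os->sk = NULL;
	return protocol;
}
```

In this model, the "userspace close" is a model_sk_put() by the original
owner. It no longer frees the object while a model_ovpn_socket still
points at it, so the later read of the protocol in
model_ovpn_socket_release() is safe.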
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Qingfang Deng <dqfext@gmail.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/1
Tested-by: Gert Doering <gert@greenie.muc.de>
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31575.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
drivers/net/ovpn/io.c | 8 ++---
drivers/net/ovpn/netlink.c | 16 ++++-----
drivers/net/ovpn/peer.c | 4 +--
drivers/net/ovpn/socket.c | 68 +++++++++++++++++++++-----------------
drivers/net/ovpn/socket.h | 4 +--
drivers/net/ovpn/tcp.c | 65 ++++++++++++++++++------------------
drivers/net/ovpn/tcp.h | 3 +-
drivers/net/ovpn/udp.c | 34 +++++++------------
drivers/net/ovpn/udp.h | 4 +--
9 files changed, 102 insertions(+), 104 deletions(-)
diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
index 10d8afecec55..ebf1e849506b 100644
--- a/drivers/net/ovpn/io.c
+++ b/drivers/net/ovpn/io.c
@@ -134,7 +134,7 @@ void ovpn_decrypt_post(void *data, int ret)
rcu_read_lock();
sock = rcu_dereference(peer->sock);
- if (sock && sock->sock->sk->sk_protocol == IPPROTO_UDP)
+ if (sock && sock->sk->sk_protocol == IPPROTO_UDP)
/* check if this peer changed local or remote endpoint */
ovpn_peer_endpoints_update(peer, skb);
rcu_read_unlock();
@@ -270,12 +270,12 @@ void ovpn_encrypt_post(void *data, int ret)
if (unlikely(!sock))
goto err_unlock;
- switch (sock->sock->sk->sk_protocol) {
+ switch (sock->sk->sk_protocol) {
case IPPROTO_UDP:
- ovpn_udp_send_skb(peer, sock->sock, skb);
+ ovpn_udp_send_skb(peer, sock->sk, skb);
break;
case IPPROTO_TCP:
- ovpn_tcp_send_skb(peer, sock->sock, skb);
+ ovpn_tcp_send_skb(peer, sock->sk, skb);
break;
default:
/* no transport configured yet */
diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c
index bea03913bfb1..a4ec53def46e 100644
--- a/drivers/net/ovpn/netlink.c
+++ b/drivers/net/ovpn/netlink.c
@@ -501,7 +501,7 @@ int ovpn_nl_peer_set_doit(struct sk_buff *skb, struct genl_info *info)
/* when using a TCP socket the remote IP is not expected */
rcu_read_lock();
sock = rcu_dereference(peer->sock);
- if (sock && sock->sock->sk->sk_protocol == IPPROTO_TCP &&
+ if (sock && sock->sk->sk_protocol == IPPROTO_TCP &&
(attrs[OVPN_A_PEER_REMOTE_IPV4] ||
attrs[OVPN_A_PEER_REMOTE_IPV6])) {
rcu_read_unlock();
@@ -559,14 +559,14 @@ static int ovpn_nl_send_peer(struct sk_buff *skb, const struct genl_info *info,
goto err_unlock;
}
- if (!net_eq(genl_info_net(info), sock_net(sock->sock->sk))) {
+ if (!net_eq(genl_info_net(info), sock_net(sock->sk))) {
id = peernet2id_alloc(genl_info_net(info),
- sock_net(sock->sock->sk),
+ sock_net(sock->sk),
GFP_ATOMIC);
if (nla_put_s32(skb, OVPN_A_PEER_SOCKET_NETNSID, id))
goto err_unlock;
}
- local_port = inet_sk(sock->sock->sk)->inet_sport;
+ local_port = inet_sk(sock->sk)->inet_sport;
rcu_read_unlock();
if (nla_put_u32(skb, OVPN_A_PEER_ID, peer->id))
@@ -1153,8 +1153,8 @@ int ovpn_nl_peer_del_notify(struct ovpn_peer *peer)
ret = -EINVAL;
goto err_unlock;
}
- genlmsg_multicast_netns(&ovpn_nl_family, sock_net(sock->sock->sk),
- msg, 0, OVPN_NLGRP_PEERS, GFP_ATOMIC);
+ genlmsg_multicast_netns(&ovpn_nl_family, sock_net(sock->sk), msg, 0,
+ OVPN_NLGRP_PEERS, GFP_ATOMIC);
rcu_read_unlock();
return 0;
@@ -1218,8 +1218,8 @@ int ovpn_nl_key_swap_notify(struct ovpn_peer *peer, u8 key_id)
ret = -EINVAL;
goto err_unlock;
}
- genlmsg_multicast_netns(&ovpn_nl_family, sock_net(sock->sock->sk),
- msg, 0, OVPN_NLGRP_PEERS, GFP_ATOMIC);
+ genlmsg_multicast_netns(&ovpn_nl_family, sock_net(sock->sk), msg, 0,
+ OVPN_NLGRP_PEERS, GFP_ATOMIC);
rcu_read_unlock();
return 0;
diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
index a1fd27b9c038..4bfcab0c8652 100644
--- a/drivers/net/ovpn/peer.c
+++ b/drivers/net/ovpn/peer.c
@@ -1145,7 +1145,7 @@ static void ovpn_peer_release_p2p(struct ovpn_priv *ovpn, struct sock *sk,
if (sk) {
ovpn_sock = rcu_access_pointer(peer->sock);
- if (!ovpn_sock || ovpn_sock->sock->sk != sk) {
+ if (!ovpn_sock || ovpn_sock->sk != sk) {
spin_unlock_bh(&ovpn->lock);
ovpn_peer_put(peer);
return;
@@ -1175,7 +1175,7 @@ static void ovpn_peers_release_mp(struct ovpn_priv *ovpn, struct sock *sk,
if (sk) {
rcu_read_lock();
ovpn_sock = rcu_dereference(peer->sock);
- remove = ovpn_sock && ovpn_sock->sock->sk == sk;
+ remove = ovpn_sock && ovpn_sock->sk == sk;
rcu_read_unlock();
}
diff --git a/drivers/net/ovpn/socket.c b/drivers/net/ovpn/socket.c
index a83cbab72591..9750871ab65c 100644
--- a/drivers/net/ovpn/socket.c
+++ b/drivers/net/ovpn/socket.c
@@ -24,9 +24,9 @@ static void ovpn_socket_release_kref(struct kref *kref)
struct ovpn_socket *sock = container_of(kref, struct ovpn_socket,
refcount);
- if (sock->sock->sk->sk_protocol == IPPROTO_UDP)
+ if (sock->sk->sk_protocol == IPPROTO_UDP)
ovpn_udp_socket_detach(sock);
- else if (sock->sock->sk->sk_protocol == IPPROTO_TCP)
+ else if (sock->sk->sk_protocol == IPPROTO_TCP)
ovpn_tcp_socket_detach(sock);
}
@@ -75,14 +75,6 @@ void ovpn_socket_release(struct ovpn_peer *peer)
if (!sock)
return;
- /* sanity check: we should not end up here if the socket
- * was already closed
- */
- if (!sock->sock->sk) {
- DEBUG_NET_WARN_ON_ONCE(1);
- return;
- }
-
/* Drop the reference while holding the sock lock to avoid
* concurrent ovpn_socket_new call to mess up with a partially
* detached socket.
@@ -90,22 +82,24 @@ void ovpn_socket_release(struct ovpn_peer *peer)
* Holding the lock ensures that a socket with refcnt 0 is fully
* detached before it can be picked by a concurrent reader.
*/
- lock_sock(sock->sock->sk);
+ lock_sock(sock->sk);
released = ovpn_socket_put(peer, sock);
- release_sock(sock->sock->sk);
+ release_sock(sock->sk);
/* align all readers with sk_user_data being NULL */
synchronize_rcu();
/* following cleanup should happen with lock released */
if (released) {
- if (sock->sock->sk->sk_protocol == IPPROTO_UDP) {
+ if (sock->sk->sk_protocol == IPPROTO_UDP) {
netdev_put(sock->ovpn->dev, &sock->dev_tracker);
- } else if (sock->sock->sk->sk_protocol == IPPROTO_TCP) {
+ } else if (sock->sk->sk_protocol == IPPROTO_TCP) {
/* wait for TCP jobs to terminate */
ovpn_tcp_socket_wait_finish(sock);
ovpn_peer_put(sock->peer);
}
+ /* drop reference acquired in ovpn_socket_new() */
+ sock_put(sock->sk);
/* we can call plain kfree() because we already waited one RCU
* period due to synchronize_rcu()
*/
@@ -118,12 +112,14 @@ static bool ovpn_socket_hold(struct ovpn_socket *sock)
return kref_get_unless_zero(&sock->refcount);
}
-static int ovpn_socket_attach(struct ovpn_socket *sock, struct ovpn_peer *peer)
+static int ovpn_socket_attach(struct ovpn_socket *ovpn_sock,
+ struct socket *sock,
+ struct ovpn_peer *peer)
{
- if (sock->sock->sk->sk_protocol == IPPROTO_UDP)
- return ovpn_udp_socket_attach(sock, peer->ovpn);
- else if (sock->sock->sk->sk_protocol == IPPROTO_TCP)
- return ovpn_tcp_socket_attach(sock, peer);
+ if (sock->sk->sk_protocol == IPPROTO_UDP)
+ return ovpn_udp_socket_attach(ovpn_sock, sock, peer->ovpn);
+ else if (sock->sk->sk_protocol == IPPROTO_TCP)
+ return ovpn_tcp_socket_attach(ovpn_sock, peer);
return -EOPNOTSUPP;
}
@@ -138,14 +134,15 @@ static int ovpn_socket_attach(struct ovpn_socket *sock, struct ovpn_peer *peer)
struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
{
struct ovpn_socket *ovpn_sock;
+ struct sock *sk = sock->sk;
int ret;
- lock_sock(sock->sk);
+ lock_sock(sk);
/* a TCP socket can only be owned by a single peer, therefore there
* can't be any other user
*/
- if (sock->sk->sk_protocol == IPPROTO_TCP && sock->sk->sk_user_data) {
+ if (sk->sk_protocol == IPPROTO_TCP && sk->sk_user_data) {
ovpn_sock = ERR_PTR(-EBUSY);
goto sock_release;
}
@@ -153,8 +150,8 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
/* a UDP socket can be shared across multiple peers, but we must make
* sure it is not owned by something else
*/
- if (sock->sk->sk_protocol == IPPROTO_UDP) {
- u8 type = READ_ONCE(udp_sk(sock->sk)->encap_type);
+ if (sk->sk_protocol == IPPROTO_UDP) {
+ u8 type = READ_ONCE(udp_sk(sk)->encap_type);
/* socket owned by other encapsulation module */
if (type && type != UDP_ENCAP_OVPNINUDP) {
@@ -163,7 +160,7 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
}
rcu_read_lock();
- ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
+ ovpn_sock = rcu_dereference_sk_user_data(sk);
if (ovpn_sock) {
/* socket owned by another ovpn instance, we can't use it */
if (ovpn_sock->ovpn != peer->ovpn) {
@@ -200,11 +197,22 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
goto sock_release;
}
- ovpn_sock->sock = sock;
+ ovpn_sock->sk = sk;
kref_init(&ovpn_sock->refcount);
- ret = ovpn_socket_attach(ovpn_sock, peer);
+ /* the newly created ovpn_socket is holding reference to sk,
+ * therefore we increase its refcounter.
+ *
+ * This ovpn_socket instance is referenced by all peers
+ * using the same socket.
+ *
+ * ovpn_socket_release() will take care of dropping the reference.
+ */
+ sock_hold(sk);
+
+ ret = ovpn_socket_attach(ovpn_sock, sock, peer);
if (ret < 0) {
+ sock_put(sk);
kfree(ovpn_sock);
ovpn_sock = ERR_PTR(ret);
goto sock_release;
@@ -213,11 +221,11 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
/* TCP sockets are per-peer, therefore they are linked to their unique
* peer
*/
- if (sock->sk->sk_protocol == IPPROTO_TCP) {
+ if (sk->sk_protocol == IPPROTO_TCP) {
INIT_WORK(&ovpn_sock->tcp_tx_work, ovpn_tcp_tx_work);
ovpn_sock->peer = peer;
ovpn_peer_hold(peer);
- } else if (sock->sk->sk_protocol == IPPROTO_UDP) {
+ } else if (sk->sk_protocol == IPPROTO_UDP) {
/* in UDP we only link the ovpn instance since the socket is
* shared among multiple peers
*/
@@ -226,8 +234,8 @@ struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
GFP_KERNEL);
}
- rcu_assign_sk_user_data(sock->sk, ovpn_sock);
+ rcu_assign_sk_user_data(sk, ovpn_sock);
sock_release:
- release_sock(sock->sk);
+ release_sock(sk);
return ovpn_sock;
}
diff --git a/drivers/net/ovpn/socket.h b/drivers/net/ovpn/socket.h
index 00d856b1a5d8..4afcec71040d 100644
--- a/drivers/net/ovpn/socket.h
+++ b/drivers/net/ovpn/socket.h
@@ -22,7 +22,7 @@ struct ovpn_peer;
* @ovpn: ovpn instance owning this socket (UDP only)
* @dev_tracker: reference tracker for associated dev (UDP only)
* @peer: unique peer transmitting over this socket (TCP only)
- * @sock: the low level sock object
+ * @sk: the low level sock object
* @refcount: amount of contexts currently referencing this object
* @work: member used to schedule release routine (it may block)
* @tcp_tx_work: work for deferring outgoing packet processing (TCP only)
@@ -36,7 +36,7 @@ struct ovpn_socket {
struct ovpn_peer *peer;
};
- struct socket *sock;
+ struct sock *sk;
struct kref refcount;
struct work_struct work;
struct work_struct tcp_tx_work;
diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
index 7c42d84987ad..7e79aad0b043 100644
--- a/drivers/net/ovpn/tcp.c
+++ b/drivers/net/ovpn/tcp.c
@@ -186,18 +186,18 @@ static int ovpn_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
void ovpn_tcp_socket_detach(struct ovpn_socket *ovpn_sock)
{
struct ovpn_peer *peer = ovpn_sock->peer;
- struct socket *sock = ovpn_sock->sock;
+ struct sock *sk = ovpn_sock->sk;
strp_stop(&peer->tcp.strp);
skb_queue_purge(&peer->tcp.user_queue);
/* restore CBs that were saved in ovpn_sock_set_tcp_cb() */
- sock->sk->sk_data_ready = peer->tcp.sk_cb.sk_data_ready;
- sock->sk->sk_write_space = peer->tcp.sk_cb.sk_write_space;
- sock->sk->sk_prot = peer->tcp.sk_cb.prot;
- sock->sk->sk_socket->ops = peer->tcp.sk_cb.ops;
+ sk->sk_data_ready = peer->tcp.sk_cb.sk_data_ready;
+ sk->sk_write_space = peer->tcp.sk_cb.sk_write_space;
+ sk->sk_prot = peer->tcp.sk_cb.prot;
+ sk->sk_socket->ops = peer->tcp.sk_cb.ops;
- rcu_assign_sk_user_data(sock->sk, NULL);
+ rcu_assign_sk_user_data(sk, NULL);
}
void ovpn_tcp_socket_wait_finish(struct ovpn_socket *sock)
@@ -283,10 +283,10 @@ void ovpn_tcp_tx_work(struct work_struct *work)
sock = container_of(work, struct ovpn_socket, tcp_tx_work);
- lock_sock(sock->sock->sk);
+ lock_sock(sock->sk);
if (sock->peer)
- ovpn_tcp_send_sock(sock->peer, sock->sock->sk);
- release_sock(sock->sock->sk);
+ ovpn_tcp_send_sock(sock->peer, sock->sk);
+ release_sock(sock->sk);
}
static void ovpn_tcp_send_sock_skb(struct ovpn_peer *peer, struct sock *sk,
@@ -307,15 +307,15 @@ static void ovpn_tcp_send_sock_skb(struct ovpn_peer *peer, struct sock *sk,
ovpn_tcp_send_sock(peer, sk);
}
-void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct socket *sock,
+void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct sock *sk,
struct sk_buff *skb)
{
u16 len = skb->len;
*(__be16 *)__skb_push(skb, sizeof(u16)) = htons(len);
- spin_lock_nested(&sock->sk->sk_lock.slock, OVPN_TCP_DEPTH_NESTING);
- if (sock_owned_by_user(sock->sk)) {
+ spin_lock_nested(&sk->sk_lock.slock, OVPN_TCP_DEPTH_NESTING);
+ if (sock_owned_by_user(sk)) {
if (skb_queue_len(&peer->tcp.out_queue) >=
READ_ONCE(net_hotdata.max_backlog)) {
dev_dstats_tx_dropped(peer->ovpn->dev);
@@ -324,10 +324,10 @@ void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct socket *sock,
}
__skb_queue_tail(&peer->tcp.out_queue, skb);
} else {
- ovpn_tcp_send_sock_skb(peer, sock->sk, skb);
+ ovpn_tcp_send_sock_skb(peer, sk, skb);
}
unlock:
- spin_unlock(&sock->sk->sk_lock.slock);
+ spin_unlock(&sk->sk_lock.slock);
}
static void ovpn_tcp_release(struct sock *sk)
@@ -474,7 +474,6 @@ static void ovpn_tcp_peer_del_work(struct work_struct *work)
int ovpn_tcp_socket_attach(struct ovpn_socket *ovpn_sock,
struct ovpn_peer *peer)
{
- struct socket *sock = ovpn_sock->sock;
struct strp_callbacks cb = {
.rcv_msg = ovpn_tcp_rcv,
.parse_msg = ovpn_tcp_parse,
@@ -482,20 +481,20 @@ int ovpn_tcp_socket_attach(struct ovpn_socket *ovpn_sock,
int ret;
/* make sure no pre-existing encapsulation handler exists */
- if (sock->sk->sk_user_data)
+ if (ovpn_sock->sk->sk_user_data)
return -EBUSY;
/* only a fully connected socket is expected. Connection should be
* handled in userspace
*/
- if (sock->sk->sk_state != TCP_ESTABLISHED) {
+ if (ovpn_sock->sk->sk_state != TCP_ESTABLISHED) {
net_err_ratelimited("%s: provided TCP socket is not in ESTABLISHED state: %d\n",
netdev_name(peer->ovpn->dev),
- sock->sk->sk_state);
+ ovpn_sock->sk->sk_state);
return -EINVAL;
}
- ret = strp_init(&peer->tcp.strp, sock->sk, &cb);
+ ret = strp_init(&peer->tcp.strp, ovpn_sock->sk, &cb);
if (ret < 0) {
DEBUG_NET_WARN_ON_ONCE(1);
return ret;
@@ -503,31 +502,31 @@ int ovpn_tcp_socket_attach(struct ovpn_socket *ovpn_sock,
INIT_WORK(&peer->tcp.defer_del_work, ovpn_tcp_peer_del_work);
- __sk_dst_reset(sock->sk);
+ __sk_dst_reset(ovpn_sock->sk);
skb_queue_head_init(&peer->tcp.user_queue);
skb_queue_head_init(&peer->tcp.out_queue);
/* save current CBs so that they can be restored upon socket release */
- peer->tcp.sk_cb.sk_data_ready = sock->sk->sk_data_ready;
- peer->tcp.sk_cb.sk_write_space = sock->sk->sk_write_space;
- peer->tcp.sk_cb.prot = sock->sk->sk_prot;
- peer->tcp.sk_cb.ops = sock->sk->sk_socket->ops;
+ peer->tcp.sk_cb.sk_data_ready = ovpn_sock->sk->sk_data_ready;
+ peer->tcp.sk_cb.sk_write_space = ovpn_sock->sk->sk_write_space;
+ peer->tcp.sk_cb.prot = ovpn_sock->sk->sk_prot;
+ peer->tcp.sk_cb.ops = ovpn_sock->sk->sk_socket->ops;
/* assign our static CBs and prot/ops */
- sock->sk->sk_data_ready = ovpn_tcp_data_ready;
- sock->sk->sk_write_space = ovpn_tcp_write_space;
+ ovpn_sock->sk->sk_data_ready = ovpn_tcp_data_ready;
+ ovpn_sock->sk->sk_write_space = ovpn_tcp_write_space;
- if (sock->sk->sk_family == AF_INET) {
- sock->sk->sk_prot = &ovpn_tcp_prot;
- sock->sk->sk_socket->ops = &ovpn_tcp_ops;
+ if (ovpn_sock->sk->sk_family == AF_INET) {
+ ovpn_sock->sk->sk_prot = &ovpn_tcp_prot;
+ ovpn_sock->sk->sk_socket->ops = &ovpn_tcp_ops;
} else {
- sock->sk->sk_prot = &ovpn_tcp6_prot;
- sock->sk->sk_socket->ops = &ovpn_tcp6_ops;
+ ovpn_sock->sk->sk_prot = &ovpn_tcp6_prot;
+ ovpn_sock->sk->sk_socket->ops = &ovpn_tcp6_ops;
}
/* avoid using task_frag */
- sock->sk->sk_allocation = GFP_ATOMIC;
- sock->sk->sk_use_task_frag = false;
+ ovpn_sock->sk->sk_allocation = GFP_ATOMIC;
+ ovpn_sock->sk->sk_use_task_frag = false;
/* enqueue the RX worker */
strp_check_rcv(&peer->tcp.strp);
diff --git a/drivers/net/ovpn/tcp.h b/drivers/net/ovpn/tcp.h
index 10aefa834cf3..a3aa3570ae5e 100644
--- a/drivers/net/ovpn/tcp.h
+++ b/drivers/net/ovpn/tcp.h
@@ -30,7 +30,8 @@ void ovpn_tcp_socket_wait_finish(struct ovpn_socket *sock);
* Required by the OpenVPN protocol in order to extract packets from
* the TCP stream on the receiver side.
*/
-void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct socket *sock, struct sk_buff *skb);
+void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct sock *sk,
+ struct sk_buff *skb);
void ovpn_tcp_tx_work(struct work_struct *work);
#endif /* _NET_OVPN_TCP_H_ */
diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
index 89bb50f94ddb..c99e8d72042d 100644
--- a/drivers/net/ovpn/udp.c
+++ b/drivers/net/ovpn/udp.c
@@ -43,7 +43,7 @@ static struct ovpn_socket *ovpn_socket_from_udp_sock(struct sock *sk)
return NULL;
/* make sure that sk matches our stored transport socket */
- if (unlikely(!ovpn_sock->sock || sk != ovpn_sock->sock->sk))
+ if (unlikely(!ovpn_sock->sk || sk != ovpn_sock->sk))
return NULL;
return ovpn_sock;
@@ -335,32 +335,22 @@ static int ovpn_udp_output(struct ovpn_peer *peer, struct dst_cache *cache,
/**
* ovpn_udp_send_skb - prepare skb and send it over via UDP
* @peer: the destination peer
- * @sock: the RCU protected peer socket
+ * @sk: peer socket
* @skb: the packet to send
*/
-void ovpn_udp_send_skb(struct ovpn_peer *peer, struct socket *sock,
+void ovpn_udp_send_skb(struct ovpn_peer *peer, struct sock *sk,
struct sk_buff *skb)
{
- int ret = -1;
+ int ret;
skb->dev = peer->ovpn->dev;
/* no checksum performed at this layer */
skb->ip_summed = CHECKSUM_NONE;
- /* get socket info */
- if (unlikely(!sock)) {
- net_warn_ratelimited("%s: no sock for remote peer %u\n",
- netdev_name(peer->ovpn->dev), peer->id);
- goto out;
- }
-
/* crypto layer -> transport (UDP) */
- ret = ovpn_udp_output(peer, &peer->dst_cache, sock->sk, skb);
-out:
- if (unlikely(ret < 0)) {
+ ret = ovpn_udp_output(peer, &peer->dst_cache, sk, skb);
+ if (unlikely(ret < 0))
kfree_skb(skb);
- return;
- }
}
static void ovpn_udp_encap_destroy(struct sock *sk)
@@ -383,6 +373,7 @@ static void ovpn_udp_encap_destroy(struct sock *sk)
/**
* ovpn_udp_socket_attach - set udp-tunnel CBs on socket and link it to ovpn
* @ovpn_sock: socket to configure
+ * @sock: the socket container to be passed to setup_udp_tunnel_sock()
* @ovpn: the openvp instance to link
*
* After invoking this function, the sock will be controlled by ovpn so that
@@ -390,7 +381,7 @@ static void ovpn_udp_encap_destroy(struct sock *sk)
*
* Return: 0 on success or a negative error code otherwise
*/
-int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
+int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock, struct socket *sock,
struct ovpn_priv *ovpn)
{
struct udp_tunnel_sock_cfg cfg = {
@@ -398,17 +389,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
.encap_rcv = ovpn_udp_encap_recv,
.encap_destroy = ovpn_udp_encap_destroy,
};
- struct socket *sock = ovpn_sock->sock;
struct ovpn_socket *old_data;
int ret;
/* make sure no pre-existing encapsulation handler exists */
rcu_read_lock();
- old_data = rcu_dereference_sk_user_data(sock->sk);
+ old_data = rcu_dereference_sk_user_data(ovpn_sock->sk);
if (!old_data) {
/* socket is currently unused - we can take it */
rcu_read_unlock();
- setup_udp_tunnel_sock(sock_net(sock->sk), sock, &cfg);
+ setup_udp_tunnel_sock(sock_net(ovpn_sock->sk), sock, &cfg);
return 0;
}
@@ -421,7 +411,7 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
* Unlikely TCP, a single UDP socket can be used to talk to many remote
* hosts and therefore openvpn instantiates one only for all its peers
*/
- if ((READ_ONCE(udp_sk(sock->sk)->encap_type) == UDP_ENCAP_OVPNINUDP) &&
+ if ((READ_ONCE(udp_sk(ovpn_sock->sk)->encap_type) == UDP_ENCAP_OVPNINUDP) &&
old_data->ovpn == ovpn) {
netdev_dbg(ovpn->dev,
"provided socket already owned by this interface\n");
@@ -442,7 +432,7 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
*/
void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
{
- struct sock *sk = ovpn_sock->sock->sk;
+ struct sock *sk = ovpn_sock->sk;
/* Re-enable multicast loopback */
inet_set_bit(MC_LOOP, sk);
diff --git a/drivers/net/ovpn/udp.h b/drivers/net/ovpn/udp.h
index 9994eb6e0428..fe26fbe25c5a 100644
--- a/drivers/net/ovpn/udp.h
+++ b/drivers/net/ovpn/udp.h
@@ -15,11 +15,11 @@ struct ovpn_peer;
struct ovpn_priv;
struct socket;
-int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
+int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock, struct socket *sock,
struct ovpn_priv *ovpn);
void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock);
-void ovpn_udp_send_skb(struct ovpn_peer *peer, struct socket *sock,
+void ovpn_udp_send_skb(struct ovpn_peer *peer, struct sock *sk,
struct sk_buff *skb);
#endif /* _NET_OVPN_UDP_H_ */
--
2.49.0
* [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup Antonio Quartulli
@ 2025-05-30 10:12 ` Antonio Quartulli
2025-06-03 6:42 ` Michal Swiatkowski
2025-05-30 10:12 ` [PATCH net 4/5] selftest/net/ovpn: fix TCP socket creation Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 5/5] selftest/net/ovpn: fix missing file Antonio Quartulli
4 siblings, 1 reply; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Qingfang Deng
Upon error along the TCP data_ready event path, we have
the following chain of calls:
strp_data_ready()
ovpn_tcp_rcv()
ovpn_peer_del()
ovpn_socket_release()
Since strp_data_ready() may be invoked from softirq context, and
ovpn_socket_release() may sleep, the above sequence may cause a
sleep in atomic context like the following:
BUG: sleeping function called from invalid context at ./ovpn-backports-ovpn-net-next-main-6.15.0-rc5-20250522/drivers/net/ovpn/socket.c:71
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: ksoftirqd/3
5 locks held by ksoftirqd/3/25:
#0: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
#1: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
#2: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x66/0x1e0
#3: ffffffe003ce9818 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x156e/0x17a0
#4: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ovpn_tcp_data_ready+0x0/0x1b0 [ovpn]
CPU: 3 PID: 25 Comm: ksoftirqd/3 Not tainted 5.10.104+ #0
Call Trace:
walk_stackframe+0x0/0x1d0
show_stack+0x2e/0x44
dump_stack+0xc2/0x102
___might_sleep+0x29c/0x2b0
__might_sleep+0x62/0xa0
ovpn_socket_release+0x24/0x2d0 [ovpn]
unlock_ovpn+0x6e/0x190 [ovpn]
ovpn_peer_del+0x13c/0x390 [ovpn]
ovpn_tcp_rcv+0x280/0x560 [ovpn]
__strp_recv+0x262/0x940
strp_recv+0x66/0x80
tcp_read_sock+0x122/0x410
strp_data_ready+0x156/0x1f0
ovpn_tcp_data_ready+0x92/0x1b0 [ovpn]
tcp_data_ready+0x6c/0x150
tcp_rcv_established+0xb36/0xc50
tcp_v4_do_rcv+0x25e/0x380
tcp_v4_rcv+0x166a/0x17a0
ip_protocol_deliver_rcu+0x8c/0x250
ip_local_deliver_finish+0xf8/0x1e0
ip_local_deliver+0xc2/0x2d0
ip_rcv+0x1f2/0x330
__netif_receive_skb+0xfc/0x290
netif_receive_skb+0x104/0x5b0
br_pass_frame_up+0x190/0x3f0
br_handle_frame_finish+0x3e2/0x7a0
br_handle_frame+0x750/0xab0
__netif_receive_skb_core.constprop.0+0x4c0/0x17f0
__netif_receive_skb+0xc6/0x290
netif_receive_skb+0x104/0x5b0
xgmac_dma_rx+0x962/0xb40
__napi_poll.constprop.0+0x5a/0x350
net_rx_action+0x1fe/0x4b0
__do_softirq+0x1f8/0x85c
run_ksoftirqd+0x80/0xd0
smpboot_thread_fn+0x1f0/0x3e0
kthread+0x1e6/0x210
ret_from_kernel_thread+0x8/0xc
Fix this issue by postponing the ovpn_peer_del() call to
a scheduled worker, as we already do in ovpn_tcp_send_sock()
for the very same reason.
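
For illustration only, the deferral pattern described above can be modelled in plain userspace C. This is a minimal single-threaded sketch, not the ovpn code: every toy_* name, the in_atomic_ctx flag and the hand-rolled queue are assumptions made up for the example. The atomic path only takes a reference and queues the work item; the function that may sleep runs later from the worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's own types. */
struct work_item {
	void (*fn)(struct work_item *w);
	struct work_item *next;
	bool pending;
};

static struct work_item *queue_head;
static struct work_item **queue_tail = &queue_head;
static bool in_atomic_ctx;	/* models "running in softirq" */

/* Safe from atomic context: only links the item, never blocks. */
static bool toy_schedule_work(struct work_item *w)
{
	if (w->pending)
		return false;	/* already queued, nothing to do */
	w->pending = true;
	w->next = NULL;
	*queue_tail = w;
	queue_tail = &w->next;
	return true;
}

/* Runs later, in a context that is allowed to sleep. */
static void toy_run_worker(void)
{
	while (queue_head) {
		struct work_item *w = queue_head;

		queue_head = w->next;
		if (!queue_head)
			queue_tail = &queue_head;
		w->pending = false;
		w->fn(w);
	}
}

struct toy_peer {
	struct work_item defer_del_work;
	int refcount;
	bool deleted;
};

/* May "sleep", so it must never run while in_atomic_ctx is set. */
static void toy_peer_del_work(struct work_item *w)
{
	struct toy_peer *peer = (struct toy_peer *)
		((char *)w - offsetof(struct toy_peer, defer_del_work));

	assert(!in_atomic_ctx);
	peer->deleted = true;
	peer->refcount--;	/* drop the reference taken by the RX path */
}

/* RX error path: runs "in softirq", so it only schedules the deletion. */
static void toy_tcp_rcv_error(struct toy_peer *peer)
{
	in_atomic_ctx = true;
	peer->refcount++;	/* keep the peer alive until the worker runs */
	if (!toy_schedule_work(&peer->defer_del_work))
		peer->refcount--;	/* already queued: drop the extra ref */
	in_atomic_ctx = false;
}
```

Dropping the extra reference when the item is already queued is a choice made for this sketch, so that repeated errors on the same peer do not leak references.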
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Qingfang Deng <dqfext@gmail.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/13
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
drivers/net/ovpn/tcp.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
index 7e79aad0b043..289f62c5d2c7 100644
--- a/drivers/net/ovpn/tcp.c
+++ b/drivers/net/ovpn/tcp.c
@@ -124,14 +124,18 @@ static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
* this peer, therefore ovpn_peer_hold() is not expected to fail
*/
if (WARN_ON(!ovpn_peer_hold(peer)))
- goto err;
+ goto err_nopeer;
ovpn_recv(peer, skb);
return;
err:
+ /* take reference for deferred peer deletion. should never fail */
+ if (WARN_ON(!ovpn_peer_hold(peer)))
+ goto err_nopeer;
+ schedule_work(&peer->tcp.defer_del_work);
dev_dstats_rx_dropped(peer->ovpn->dev);
+err_nopeer:
kfree_skb(skb);
- ovpn_peer_del(peer, OVPN_DEL_PEER_REASON_TRANSPORT_ERROR);
}
static int ovpn_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
--
2.49.0
* [PATCH net 4/5] selftest/net/ovpn: fix TCP socket creation
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
` (2 preceding siblings ...)
2025-05-30 10:12 ` [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path Antonio Quartulli
@ 2025-05-30 10:12 ` Antonio Quartulli
2025-05-30 10:12 ` [PATCH net 5/5] selftest/net/ovpn: fix missing file Antonio Quartulli
4 siblings, 0 replies; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
TCP sockets cannot be created with AF_UNSPEC: one of the
supported address families must be used instead.
Since commit 944f8b6abab6 ("selftest/net/ovpn: extend
coverage with more test cases") the default address
family for all tests was changed from AF_INET to AF_UNSPEC,
thus breaking all TCP cases.
Restore AF_INET as default address family for TCP listeners.
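
As a rough userspace illustration of the problem (not taken from ovpn-cli.c; both helper names below are hypothetical), socket(2) refuses to create a TCP socket for AF_UNSPEC, so a listener needs a concrete family to fall back on:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fall back to AF_INET when no family was requested. */
static int listener_family(int requested)
{
	assert(requested == AF_UNSPEC || requested == AF_INET ||
	       requested == AF_INET6);
	return requested == AF_UNSPEC ? AF_INET : requested;
}

/* Returns 0 if a TCP socket can be created for @family, -1 otherwise. */
static int can_create_tcp_socket(int family)
{
	int fd = socket(family, SOCK_STREAM, IPPROTO_TCP);

	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}
```

With these helpers, can_create_tcp_socket(AF_UNSPEC) fails, while listener_family(AF_UNSPEC) yields AF_INET, which is the default this patch restores for TCP listeners.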
Fixes: 944f8b6abab6 ("selftest/net/ovpn: extend coverage with more test cases")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
tools/testing/selftests/net/ovpn/ovpn-cli.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/ovpn/ovpn-cli.c b/tools/testing/selftests/net/ovpn/ovpn-cli.c
index de9c26f98b2e..9201f2905f2c 100644
--- a/tools/testing/selftests/net/ovpn/ovpn-cli.c
+++ b/tools/testing/selftests/net/ovpn/ovpn-cli.c
@@ -2166,6 +2166,7 @@ static int ovpn_parse_cmd_args(struct ovpn_ctx *ovpn, int argc, char *argv[])
ovpn->peers_file = argv[4];
+ ovpn->sa_family = AF_INET;
if (argc > 5 && !strcmp(argv[5], "ipv6"))
ovpn->sa_family = AF_INET6;
break;
--
2.49.0
* [PATCH net 5/5] selftest/net/ovpn: fix missing file
2025-05-30 10:12 [PATCH net 0/5] pull request: fixes for ovpn 2025-05-30 Antonio Quartulli
` (3 preceding siblings ...)
2025-05-30 10:12 ` [PATCH net 4/5] selftest/net/ovpn: fix TCP socket creation Antonio Quartulli
@ 2025-05-30 10:12 ` Antonio Quartulli
4 siblings, 0 replies; 13+ messages in thread
From: Antonio Quartulli @ 2025-05-30 10:12 UTC (permalink / raw)
To: netdev
Cc: Antonio Quartulli, Sabrina Dubroca, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
test-large-mtu.sh is referenced by the Makefile
but does not exist.
Add it alongside the other scripts.
Fixes: 944f8b6abab6 ("selftest/net/ovpn: extend coverage with more test cases")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
tools/testing/selftests/net/ovpn/test-large-mtu.sh | 9 +++++++++
1 file changed, 9 insertions(+)
create mode 100755 tools/testing/selftests/net/ovpn/test-large-mtu.sh
diff --git a/tools/testing/selftests/net/ovpn/test-large-mtu.sh b/tools/testing/selftests/net/ovpn/test-large-mtu.sh
new file mode 100755
index 000000000000..ce2a2cb64f72
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/test-large-mtu.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2025 OpenVPN, Inc.
+#
+# Author: Antonio Quartulli <antonio@openvpn.net>
+
+MTU="1500"
+
+source test.sh
--
2.49.0
* Re: [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
@ 2025-06-03 6:30 ` Michal Swiatkowski
2025-06-03 9:02 ` Paolo Abeni
1 sibling, 0 replies; 13+ messages in thread
From: Michal Swiatkowski @ 2025-06-03 6:30 UTC (permalink / raw)
To: Antonio Quartulli
Cc: netdev, Sabrina Dubroca, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Oleksandr Natalenko
On Fri, May 30, 2025 at 12:12:50PM +0200, Antonio Quartulli wrote:
> When deconfiguring a UDP-tunnel from a socket, we cannot
> call setup_udp_tunnel_sock() with an empty config, because
> this helper is expected to be invoked only during setup.
>
> Get rid of the call to setup_udp_tunnel_sock() and just
> revert what it did during socket initialization.
>
> Note that the global udp_encap_needed_key and the GRO state
> are left untouched: udp_destroy_socket() will eventually
> take care of them.
>
> Cc: Sabrina Dubroca <sd@queasysnail.net>
> Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
> Fixes: ab66abbc769b ("ovpn: implement basic RX path (UDP)")
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> Closes: https://lore.kernel.org/netdev/1a47ce02-fd42-4761-8697-f3f315011cc6@redhat.com
> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> ---
> drivers/net/ovpn/udp.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
> index aef8c0406ec9..89bb50f94ddb 100644
> --- a/drivers/net/ovpn/udp.c
> +++ b/drivers/net/ovpn/udp.c
> @@ -442,8 +442,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
> */
> void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
> {
> - struct udp_tunnel_sock_cfg cfg = { };
> + struct sock *sk = ovpn_sock->sock->sk;
>
> - setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk), ovpn_sock->sock,
> - &cfg);
> + /* Re-enable multicast loopback */
> + inet_set_bit(MC_LOOP, sk);
> + /* Disable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
> + inet_dec_convert_csum(sk);
> +
> + udp_sk(sk)->encap_type = 0;
> + udp_sk(sk)->encap_rcv = NULL;
> + udp_sk(sk)->encap_destroy = NULL;
> +
> + rcu_assign_sk_user_data(sk, NULL);
> }
> --
> 2.49.0
LGTM, thanks
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
* Re: [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup
2025-05-30 10:12 ` [PATCH net 2/5] ovpn: ensure sk is still valid during cleanup Antonio Quartulli
@ 2025-06-03 6:40 ` Michal Swiatkowski
0 siblings, 0 replies; 13+ messages in thread
From: Michal Swiatkowski @ 2025-06-03 6:40 UTC (permalink / raw)
To: Antonio Quartulli
Cc: netdev, Sabrina Dubroca, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Oleksandr Natalenko, Qingfang Deng,
Gert Doering
On Fri, May 30, 2025 at 12:12:51PM +0200, Antonio Quartulli wrote:
> Removing a peer while userspace attempts to close its transport
> socket triggers a race condition resulting in the following
> crash:
>
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000077: 0000 [#1] SMP KASAN
> KASAN: null-ptr-deref in range [0x00000000000003b8-0x00000000000003bf]
> CPU: 12 UID: 0 PID: 162 Comm: kworker/12:1 Tainted: G O 6.15.0-rc2-00635-g521139ac3840 #272 PREEMPT(full)
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
> Workqueue: events ovpn_peer_keepalive_work [ovpn]
> RIP: 0010:ovpn_socket_release+0x23c/0x500 [ovpn]
> Code: ea 03 80 3c 02 00 0f 85 71 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 64 24 18 49 8d bc 24 be 03 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 30
> RSP: 0018:ffffc90000c9fb18 EFLAGS: 00010217
> RAX: dffffc0000000000 RBX: ffff8881148d7940 RCX: ffffffff817787bb
> RDX: 0000000000000077 RSI: 0000000000000008 RDI: 00000000000003be
> RBP: ffffc90000c9fb30 R08: 0000000000000000 R09: fffffbfff0d3e840
> R10: ffffffff869f4207 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff888115eb9300 R14: ffffc90000c9fbc8 R15: 000000000000000c
> FS: 0000000000000000(0000) GS:ffff8882b0151000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f37266b6114 CR3: 00000000054a8000 CR4: 0000000000750ef0
> PKRU: 55555554
> Call Trace:
> <TASK>
> unlock_ovpn+0x8b/0xe0 [ovpn]
> ovpn_peer_keepalive_work+0xe3/0x540 [ovpn]
> ? ovpn_peers_free+0x780/0x780 [ovpn]
> ? lock_acquire+0x56/0x70
> ? process_one_work+0x888/0x1740
> process_one_work+0x933/0x1740
> ? pwq_dec_nr_in_flight+0x10b0/0x10b0
> ? move_linked_works+0x12d/0x2c0
> ? assign_work+0x163/0x270
> worker_thread+0x4d6/0xd90
> ? preempt_count_sub+0x4c/0x70
> ? process_one_work+0x1740/0x1740
> kthread+0x36c/0x710
> ? trace_preempt_on+0x8c/0x1e0
> ? kthread_is_per_cpu+0xc0/0xc0
> ? preempt_count_sub+0x4c/0x70
> ? _raw_spin_unlock_irq+0x36/0x60
> ? calculate_sigpending+0x7b/0xa0
> ? kthread_is_per_cpu+0xc0/0xc0
> ret_from_fork+0x3a/0x80
> ? kthread_is_per_cpu+0xc0/0xc0
> ret_from_fork_asm+0x11/0x20
> </TASK>
> Modules linked in: ovpn(O)
>
> This happens because the peer deletion operation reaches
> ovpn_socket_release() while ovpn_sock->sock (struct socket *)
> and its sk member (struct sock *) are still both valid.
> Here synchronize_rcu() is invoked, after which ovpn_sock->sock->sk
> becomes NULL, due to the concurrent socket closing triggered
> from userspace.
>
> After having invoked synchronize_rcu(), ovpn_socket_release() will
> attempt dereferencing ovpn_sock->sock->sk, triggering the crash
> reported above.
>
> The reason for accessing sk is that we need to retrieve its
> protocol and continue the cleanup routine accordingly.
>
> This crash can be easily produced by running openvpn userspace in
> client mode with `--keepalive 10 20`, while entirely omitting this
> option on the server side.
> After 20 seconds ovpn will assume the peer (server) to be dead,
> will start removing it and will notify userspace. The latter will
> receive the notification and close the transport socket, thus
> triggering the crash.
>
> To fix the race condition for good, we need to refactor struct ovpn_socket.
> Since ovpn is always only interested in the sock->sk member (struct sock *)
> we can directly hold a reference to it, rather than accessing it via
> its struct socket container.
>
> This means changing "struct socket *ovpn_socket->sock" to
> "struct sock *ovpn_socket->sk".
>
> While acquiring a reference to sk, we can increase its refcounter
> without affecting the socket close()/destroy() notification
> (which we rely on when userspace closes a socket we are using).
>
> By increasing sk's refcounter we know we can dereference it
> in ovpn_socket_release() without running into this race
> condition anymore.
>
> ovpn_socket_release() will ultimately decrease the reference
> counter.
>
> Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
> Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
> Reported-by: Qingfang Deng <dqfext@gmail.com>
> Closes: https://github.com/OpenVPN/ovpn-net-next/issues/1
> Tested-by: Gert Doering <gert@greenie.muc.de>
> Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31575.html
> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> ---
> drivers/net/ovpn/io.c | 8 ++---
> drivers/net/ovpn/netlink.c | 16 ++++-----
> drivers/net/ovpn/peer.c | 4 +--
> drivers/net/ovpn/socket.c | 68 +++++++++++++++++++++-----------------
> drivers/net/ovpn/socket.h | 4 +--
> drivers/net/ovpn/tcp.c | 65 ++++++++++++++++++------------------
> drivers/net/ovpn/tcp.h | 3 +-
> drivers/net/ovpn/udp.c | 34 +++++++------------
> drivers/net/ovpn/udp.h | 4 +--
> 9 files changed, 102 insertions(+), 104 deletions(-)
>
Thanks for the detailed description in the commit message. The changes
look fine to me.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> --
> 2.49.0
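
To make the refcounting idea in the quoted commit message concrete, here is a minimal single-threaded userspace sketch. It is only a model: the toy_* types and helpers are hypothetical stand-ins (not struct sock or struct ovpn_socket), the counter is a plain int rather than an atomic refcount, and "freeing" is recorded in a flag instead of releasing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: a refcounted object. */
struct toy_sk {
	int refcnt;
	int protocol;
	bool freed;
};

static void toy_sk_hold(struct toy_sk *sk)
{
	sk->refcnt++;
}

static void toy_sk_put(struct toy_sk *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0)
		sk->freed = true;	/* real code would free the object here */
}

/* Stand-in for the refactored ovpn_socket: it pins sk directly. */
struct toy_ovpn_socket {
	struct toy_sk *sk;
};

static void toy_attach(struct toy_ovpn_socket *os, struct toy_sk *sk)
{
	toy_sk_hold(sk);	/* our own reference, independent of userspace */
	os->sk = sk;
}

/* Reads sk->protocol safely because our own reference is still held. */
static int toy_release(struct toy_ovpn_socket *os)
{
	int proto = os->sk->protocol;

	toy_sk_put(os->sk);
	os->sk = NULL;
	return proto;
}
```

In this model, a userspace close (one toy_sk_put()) that happens before toy_release() leaves the object alive, so the protocol lookup during cleanup still sees valid data.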
* Re: [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path
2025-05-30 10:12 ` [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path Antonio Quartulli
@ 2025-06-03 6:42 ` Michal Swiatkowski
0 siblings, 0 replies; 13+ messages in thread
From: Michal Swiatkowski @ 2025-06-03 6:42 UTC (permalink / raw)
To: Antonio Quartulli
Cc: netdev, Sabrina Dubroca, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Qingfang Deng
On Fri, May 30, 2025 at 12:12:52PM +0200, Antonio Quartulli wrote:
> Upon error along the TCP data_ready event path, we have
> the following chain of calls:
>
> strp_data_ready()
> ovpn_tcp_rcv()
> ovpn_peer_del()
> ovpn_socket_release()
>
> Since strp_data_ready() may be invoked from softirq context, and
> ovpn_socket_release() may sleep, the above sequence may cause a
> sleep in atomic context like the following:
>
> BUG: sleeping function called from invalid context at ./ovpn-backports-ovpn-net-next-main-6.15.0-rc5-20250522/drivers/net/ovpn/socket.c:71
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: ksoftirqd/3
> 5 locks held by ksoftirqd/3/25:
> #0: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
> #1: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
> #2: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x66/0x1e0
> #3: ffffffe003ce9818 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x156e/0x17a0
> #4: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ovpn_tcp_data_ready+0x0/0x1b0 [ovpn]
> CPU: 3 PID: 25 Comm: ksoftirqd/3 Not tainted 5.10.104+ #0
> Call Trace:
> walk_stackframe+0x0/0x1d0
> show_stack+0x2e/0x44
> dump_stack+0xc2/0x102
> ___might_sleep+0x29c/0x2b0
> __might_sleep+0x62/0xa0
> ovpn_socket_release+0x24/0x2d0 [ovpn]
> unlock_ovpn+0x6e/0x190 [ovpn]
> ovpn_peer_del+0x13c/0x390 [ovpn]
> ovpn_tcp_rcv+0x280/0x560 [ovpn]
> __strp_recv+0x262/0x940
> strp_recv+0x66/0x80
> tcp_read_sock+0x122/0x410
> strp_data_ready+0x156/0x1f0
> ovpn_tcp_data_ready+0x92/0x1b0 [ovpn]
> tcp_data_ready+0x6c/0x150
> tcp_rcv_established+0xb36/0xc50
> tcp_v4_do_rcv+0x25e/0x380
> tcp_v4_rcv+0x166a/0x17a0
> ip_protocol_deliver_rcu+0x8c/0x250
> ip_local_deliver_finish+0xf8/0x1e0
> ip_local_deliver+0xc2/0x2d0
> ip_rcv+0x1f2/0x330
> __netif_receive_skb+0xfc/0x290
> netif_receive_skb+0x104/0x5b0
> br_pass_frame_up+0x190/0x3f0
> br_handle_frame_finish+0x3e2/0x7a0
> br_handle_frame+0x750/0xab0
> __netif_receive_skb_core.constprop.0+0x4c0/0x17f0
> __netif_receive_skb+0xc6/0x290
> netif_receive_skb+0x104/0x5b0
> xgmac_dma_rx+0x962/0xb40
> __napi_poll.constprop.0+0x5a/0x350
> net_rx_action+0x1fe/0x4b0
> __do_softirq+0x1f8/0x85c
> run_ksoftirqd+0x80/0xd0
> smpboot_thread_fn+0x1f0/0x3e0
> kthread+0x1e6/0x210
> ret_from_kernel_thread+0x8/0xc
>
> Fix this issue by postponing the ovpn_peer_del() call to
> a scheduled worker, as we already do in ovpn_tcp_send_sock()
> for the very same reason.
>
> Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
> Reported-by: Qingfang Deng <dqfext@gmail.com>
> Closes: https://github.com/OpenVPN/ovpn-net-next/issues/13
> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> ---
> drivers/net/ovpn/tcp.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
> index 7e79aad0b043..289f62c5d2c7 100644
> --- a/drivers/net/ovpn/tcp.c
> +++ b/drivers/net/ovpn/tcp.c
> @@ -124,14 +124,18 @@ static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
> * this peer, therefore ovpn_peer_hold() is not expected to fail
> */
> if (WARN_ON(!ovpn_peer_hold(peer)))
> - goto err;
> + goto err_nopeer;
>
> ovpn_recv(peer, skb);
> return;
> err:
> + /* take reference for deferred peer deletion. should never fail */
> + if (WARN_ON(!ovpn_peer_hold(peer)))
> + goto err_nopeer;
> + schedule_work(&peer->tcp.defer_del_work);
> dev_dstats_rx_dropped(peer->ovpn->dev);
> +err_nopeer:
> kfree_skb(skb);
> - ovpn_peer_del(peer, OVPN_DEL_PEER_REASON_TRANSPORT_ERROR);
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> }
>
> static int ovpn_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> --
> 2.49.0
* Re: [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel
2025-05-30 10:12 ` [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel Antonio Quartulli
2025-06-03 6:30 ` Michal Swiatkowski
@ 2025-06-03 9:02 ` Paolo Abeni
2025-06-03 9:08 ` Antonio Quartulli
1 sibling, 1 reply; 13+ messages in thread
From: Paolo Abeni @ 2025-06-03 9:02 UTC (permalink / raw)
To: Antonio Quartulli, netdev
Cc: Sabrina Dubroca, David S . Miller, Eric Dumazet, Jakub Kicinski,
Oleksandr Natalenko
On 5/30/25 12:12 PM, Antonio Quartulli wrote:
> diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
> index aef8c0406ec9..89bb50f94ddb 100644
> --- a/drivers/net/ovpn/udp.c
> +++ b/drivers/net/ovpn/udp.c
> @@ -442,8 +442,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
> */
> void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
> {
> - struct udp_tunnel_sock_cfg cfg = { };
> + struct sock *sk = ovpn_sock->sock->sk;
>
> - setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk), ovpn_sock->sock,
> - &cfg);
> + /* Re-enable multicast loopback */
> + inet_set_bit(MC_LOOP, sk);
> + /* Disable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
> + inet_dec_convert_csum(sk);
> +
> + udp_sk(sk)->encap_type = 0;
> + udp_sk(sk)->encap_rcv = NULL;
> + udp_sk(sk)->encap_destroy = NULL;
I'm sorry for not noticing this earlier, but you need to add
WRITE_ONCE() annotations to the above statements, because readers access
these fields locklessly.
/P
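
A small userspace sketch of what the suggestion above amounts to. The TOY_WRITE_ONCE()/TOY_READ_ONCE() macros are simplified volatile-cast approximations written for this example (an assumption, not the kernel's definitions), and struct toy_udp_sock is a hypothetical stand-in for the UDP socket fields being reset:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations of the kernel macros (assumption). */
#define TOY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define TOY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the encap fields of a UDP socket. */
struct toy_udp_sock {
	int encap_type;
	int (*encap_rcv)(void *skb);
	void (*encap_destroy)(void *sk);
};

static int toy_encap_rcv(void *skb)
{
	(void)skb;
	return 0;	/* packet consumed by the tunnel */
}

/* Writer side: each field is cleared with a single annotated store. */
static void toy_detach(struct toy_udp_sock *up)
{
	TOY_WRITE_ONCE(up->encap_type, 0);
	TOY_WRITE_ONCE(up->encap_rcv, NULL);
	TOY_WRITE_ONCE(up->encap_destroy, NULL);
}

/* Lockless reader: loads the callback once, then uses the local copy. */
static int toy_rx(struct toy_udp_sock *up, void *skb)
{
	int (*rcv)(void *skb) = TOY_READ_ONCE(up->encap_rcv);

	if (!rcv)
		return 1;	/* no encapsulation handler installed */
	return rcv(skb);
}
```

The point of pairing the annotations is that the reader tests and calls the same loaded value, and the compiler is not free to tear or repeat the accesses on either side.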
* Re: [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel
2025-06-03 9:02 ` Paolo Abeni
@ 2025-06-03 9:08 ` Antonio Quartulli
2025-06-03 9:58 ` Paolo Abeni
0 siblings, 1 reply; 13+ messages in thread
From: Antonio Quartulli @ 2025-06-03 9:08 UTC (permalink / raw)
To: Paolo Abeni, netdev
Cc: Sabrina Dubroca, David S . Miller, Eric Dumazet, Jakub Kicinski,
Oleksandr Natalenko
On 03/06/2025 11:02, Paolo Abeni wrote:
> On 5/30/25 12:12 PM, Antonio Quartulli wrote:
>> diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
>> index aef8c0406ec9..89bb50f94ddb 100644
>> --- a/drivers/net/ovpn/udp.c
>> +++ b/drivers/net/ovpn/udp.c
>> @@ -442,8 +442,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
>> */
>> void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
>> {
>> - struct udp_tunnel_sock_cfg cfg = { };
>> + struct sock *sk = ovpn_sock->sock->sk;
>>
>> - setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk), ovpn_sock->sock,
>> - &cfg);
>> + /* Re-enable multicast loopback */
>> + inet_set_bit(MC_LOOP, sk);
>> + /* Disable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
>> + inet_dec_convert_csum(sk);
>> +
>> + udp_sk(sk)->encap_type = 0;
>> + udp_sk(sk)->encap_rcv = NULL;
>> + udp_sk(sk)->encap_destroy = NULL;
>
> I'm sorry for not noticing this earlier, but you need to add
> WRITE_ONCE() annotations to the above statements, because readers access
> these fields locklessly.
I should have noticed the READ_ONCE on the reader side..
Any specific reason why setup_udp_tunnel_sock() does not use WRITE_ONCE
though?
Regards,
--
Antonio Quartulli
OpenVPN Inc.
* Re: [PATCH net 1/5] ovpn: properly deconfigure UDP-tunnel
2025-06-03 9:08 ` Antonio Quartulli
@ 2025-06-03 9:58 ` Paolo Abeni
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Abeni @ 2025-06-03 9:58 UTC (permalink / raw)
To: Antonio Quartulli, netdev
Cc: Sabrina Dubroca, David S . Miller, Eric Dumazet, Jakub Kicinski,
Oleksandr Natalenko
On 6/3/25 11:08 AM, Antonio Quartulli wrote:
> On 03/06/2025 11:02, Paolo Abeni wrote:
>> On 5/30/25 12:12 PM, Antonio Quartulli wrote:
>>> diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
>>> index aef8c0406ec9..89bb50f94ddb 100644
>>> --- a/drivers/net/ovpn/udp.c
>>> +++ b/drivers/net/ovpn/udp.c
>>> @@ -442,8 +442,16 @@ int ovpn_udp_socket_attach(struct ovpn_socket *ovpn_sock,
>>> */
>>> void ovpn_udp_socket_detach(struct ovpn_socket *ovpn_sock)
>>> {
>>> - struct udp_tunnel_sock_cfg cfg = { };
>>> + struct sock *sk = ovpn_sock->sock->sk;
>>>
>>> - setup_udp_tunnel_sock(sock_net(ovpn_sock->sock->sk), ovpn_sock->sock,
>>> - &cfg);
>>> + /* Re-enable multicast loopback */
>>> + inet_set_bit(MC_LOOP, sk);
>>> + /* Disable CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion */
>>> + inet_dec_convert_csum(sk);
>>> +
>>> + udp_sk(sk)->encap_type = 0;
>>> + udp_sk(sk)->encap_rcv = NULL;
>>> + udp_sk(sk)->encap_destroy = NULL;
>>
>> I'm sorry for not noticing this earlier, but you need to add
>> WRITE_ONCE() annotations to the above statements, because readers access
>> these fields locklessly.
>
> I should have noticed the READ_ONCE on the reader side..
>
> Any specific reason why setup_udp_tunnel_sock() does not use WRITE_ONCE
> though?
AFAICS, it's a bug that has been there since the beginning. The _ONCE
cleanups started well after UDP tunnel support was introduced. I guess it
was left over because syzkaller is less likely to stumble upon it. But it
would be better to handle such accesses correctly in new code.
Thanks,
Paolo
* [PATCH net 3/5] ovpn: avoid sleep in atomic context in TCP RX error path
2025-06-03 11:11 [PATCH net 0/5] pull request: fixes for ovpn 2025-06-03 Antonio Quartulli
@ 2025-06-03 11:11 ` Antonio Quartulli
0 siblings, 0 replies; 13+ messages in thread
From: Antonio Quartulli @ 2025-06-03 11:11 UTC (permalink / raw)
To: netdev
Cc: Michal Swiatkowski, Antonio Quartulli, Sabrina Dubroca,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Qingfang Deng
Upon error along the TCP data_ready event path, we have
the following chain of calls:
strp_data_ready()
ovpn_tcp_rcv()
ovpn_peer_del()
ovpn_socket_release()
Since strp_data_ready() may be invoked from softirq context, and
ovpn_socket_release() may sleep, the above sequence may cause a
sleep in atomic context like the following:
BUG: sleeping function called from invalid context at ./ovpn-backports-ovpn-net-next-main-6.15.0-rc5-20250522/drivers/net/ovpn/socket.c:71
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 25, name: ksoftirqd/3
5 locks held by ksoftirqd/3/25:
#0: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
#1: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0xb8/0x5b0
#2: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x66/0x1e0
#3: ffffffe003ce9818 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x156e/0x17a0
#4: ffffffe000cd0580 (rcu_read_lock){....}-{1:2}, at: ovpn_tcp_data_ready+0x0/0x1b0 [ovpn]
CPU: 3 PID: 25 Comm: ksoftirqd/3 Not tainted 5.10.104+ #0
Call Trace:
walk_stackframe+0x0/0x1d0
show_stack+0x2e/0x44
dump_stack+0xc2/0x102
___might_sleep+0x29c/0x2b0
__might_sleep+0x62/0xa0
ovpn_socket_release+0x24/0x2d0 [ovpn]
unlock_ovpn+0x6e/0x190 [ovpn]
ovpn_peer_del+0x13c/0x390 [ovpn]
ovpn_tcp_rcv+0x280/0x560 [ovpn]
__strp_recv+0x262/0x940
strp_recv+0x66/0x80
tcp_read_sock+0x122/0x410
strp_data_ready+0x156/0x1f0
ovpn_tcp_data_ready+0x92/0x1b0 [ovpn]
tcp_data_ready+0x6c/0x150
tcp_rcv_established+0xb36/0xc50
tcp_v4_do_rcv+0x25e/0x380
tcp_v4_rcv+0x166a/0x17a0
ip_protocol_deliver_rcu+0x8c/0x250
ip_local_deliver_finish+0xf8/0x1e0
ip_local_deliver+0xc2/0x2d0
ip_rcv+0x1f2/0x330
__netif_receive_skb+0xfc/0x290
netif_receive_skb+0x104/0x5b0
br_pass_frame_up+0x190/0x3f0
br_handle_frame_finish+0x3e2/0x7a0
br_handle_frame+0x750/0xab0
__netif_receive_skb_core.constprop.0+0x4c0/0x17f0
__netif_receive_skb+0xc6/0x290
netif_receive_skb+0x104/0x5b0
xgmac_dma_rx+0x962/0xb40
__napi_poll.constprop.0+0x5a/0x350
net_rx_action+0x1fe/0x4b0
__do_softirq+0x1f8/0x85c
run_ksoftirqd+0x80/0xd0
smpboot_thread_fn+0x1f0/0x3e0
kthread+0x1e6/0x210
ret_from_kernel_thread+0x8/0xc
Fix this issue by postponing the ovpn_peer_del() call to
a scheduled worker, as we already do in ovpn_tcp_send_sock()
for the very same reason.
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Qingfang Deng <dqfext@gmail.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/13
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
drivers/net/ovpn/tcp.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
index 7e79aad0b043..289f62c5d2c7 100644
--- a/drivers/net/ovpn/tcp.c
+++ b/drivers/net/ovpn/tcp.c
@@ -124,14 +124,18 @@ static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
* this peer, therefore ovpn_peer_hold() is not expected to fail
*/
if (WARN_ON(!ovpn_peer_hold(peer)))
- goto err;
+ goto err_nopeer;
ovpn_recv(peer, skb);
return;
err:
+ /* take reference for deferred peer deletion. should never fail */
+ if (WARN_ON(!ovpn_peer_hold(peer)))
+ goto err_nopeer;
+ schedule_work(&peer->tcp.defer_del_work);
dev_dstats_rx_dropped(peer->ovpn->dev);
+err_nopeer:
kfree_skb(skb);
- ovpn_peer_del(peer, OVPN_DEL_PEER_REASON_TRANSPORT_ERROR);
}
static int ovpn_tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
--
2.49.0