* [PATCH net v2 0/3] fix poll behaviour for TCP-based tunnel protocols
@ 2025-10-20 7:37 Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues Ralf Lici
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 7:37 UTC (permalink / raw)
To: netdev
Cc: Ralf Lici, Sabrina Dubroca, Antonio Quartulli, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Hi all,
This patch series introduces a polling function for datagram-style
sockets that operates on custom skb queues, and updates ovpn (the
OpenVPN data-channel offload module) and espintcp (the TCP Encapsulation
of IKE and IPsec Packets implementation) to use it accordingly.
Protocols like the aforementioned ones decapsulate packets received over
TCP and deliver userspace-bound data through a separate skb queue, not
the standard sk_receive_queue. Previously, both relied on
datagram_poll(), which would signal readiness based on non-userspace
packets, leading to misleading poll results and unnecessary recv
attempts in userspace.
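
To make the userspace cost concrete, here is a minimal sketch (not part
of the series) of a caller that polls a socket and then attempts a
non-blocking read. The helper name recv_if_ready() is hypothetical; only
the standard poll(2) and recv(2) calls are used. When poll reports
readiness for data that is not meant for userspace, the recv() call
fails with EAGAIN, which is the wasted round trip described above.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: poll fd once without blocking and, if the kernel
 * reports it readable, attempt a non-blocking recv().
 *
 * Returns the number of bytes read, 0 if poll() reported nothing to
 * read, -EAGAIN if poll() claimed readiness but no data was available
 * (the "misleading poll result" case), or another negative errno value.
 * A zero-length datagram is not distinguished from "not ready" in this
 * sketch.
 */
static ssize_t recv_if_ready(int fd, void *buf, size_t len)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t n;
	int ret;

	ret = poll(&pfd, 1, 0);		/* timeout 0: just sample readiness */
	if (ret < 0)
		return -errno;
	if (ret == 0 || !(pfd.revents & POLLIN))
		return 0;

	n = recv(fd, buf, len, MSG_DONTWAIT);
	if (n < 0)
		return -errno;		/* -EAGAIN: the wakeup was spurious */
	return n;
}
```

An event loop built on such a helper would keep spinning through the
-EAGAIN branch for as long as poll reports readiness that recv() cannot
satisfy.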
Patch 1 introduces datagram_poll_queue(), a variant of datagram_poll()
that accepts an explicit receive queue. Patches 2 and 3 update
espintcp_poll() and ovpn_tcp_poll() respectively to use this helper,
ensuring readiness is only signaled when userspace data is available.
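
The shape of the change can be modelled in plain userspace C. This is
only a sketch under simplifying assumptions: every sketch_* name and
SKETCH_* constant is hypothetical, the structs stand in for struct sock
and struct sk_buff_head, and the real helper also handles errors, write
space and connection state. It shows the core logic taking the queue as
a parameter, with the old entry point reduced to a wrapper that passes
the default queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's EPOLL* bits; the values are illustrative. */
#define SKETCH_POLLIN   0x1u
#define SKETCH_POLLHUP  0x2u

/* Hypothetical, much simplified stand-in for struct sk_buff_head. */
struct sketch_queue {
	size_t qlen;
};

/* Hypothetical, much simplified stand-in for struct sock. */
struct sketch_sock {
	bool shutdown;
	struct sketch_queue receive_queue;	/* default receive queue */
};

/* Core logic: socket-wide state plus readiness of the queue passed in. */
static unsigned int sketch_poll_queue(const struct sketch_sock *sk,
				      const struct sketch_queue *rcv_queue)
{
	unsigned int mask = 0;

	if (sk->shutdown)
		mask |= SKETCH_POLLHUP;
	if (rcv_queue->qlen > 0)
		mask |= SKETCH_POLLIN;
	return mask;
}

/* Wrapper that keeps the old behaviour by passing the default queue. */
static unsigned int sketch_poll(const struct sketch_sock *sk)
{
	return sketch_poll_queue(sk, &sk->receive_queue);
}
```

With two packets on the default queue and an empty custom queue,
sketch_poll() reports readable while sketch_poll_queue() on the custom
queue does not, which is the distinction the series relies on.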
Each patch is self-contained and the ovpn one includes rationale and
lifecycle enforcement where appropriate.
Thanks for your time and feedback.
Best Regards,
Ralf Lici
Mandelbit Srl
Changes since v1:
- Documented return value in datagram_poll_queue() kernel-doc
- Added missing CCs
---
Ralf Lici (3):
net: datagram: introduce datagram_poll_queue for custom receive queues
espintcp: use datagram_poll_queue for socket readiness
ovpn: use datagram_poll_queue for socket readiness in TCP
drivers/net/ovpn/tcp.c | 26 ++++++++++++++++++++----
include/linux/skbuff.h | 3 +++
net/core/datagram.c | 46 ++++++++++++++++++++++++++++++------------
net/xfrm/espintcp.c | 6 +-----
4 files changed, 59 insertions(+), 22 deletions(-)
--
2.51.0
* [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues
2025-10-20 7:37 [PATCH net v2 0/3] fix poll behaviour for TCP-based tunnel protocols Ralf Lici
@ 2025-10-20 7:37 ` Ralf Lici
2025-10-20 10:17 ` Sabrina Dubroca
2025-10-20 7:37 ` [PATCH net v2 2/3] espintcp: use datagram_poll_queue for socket readiness Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP Ralf Lici
2 siblings, 1 reply; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 7:37 UTC (permalink / raw)
To: netdev
Cc: Ralf Lici, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Mina Almasry, Eric Biggers,
Sabrina Dubroca, Antonio Quartulli
Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.
Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
---
include/linux/skbuff.h | 3 +++
net/core/datagram.c | 46 ++++++++++++++++++++++++++++++------------
2 files changed, 36 insertions(+), 13 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fb3fec9affaa..a7cc3d1f4fd1 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4204,6 +4204,9 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk,
struct sk_buff_head *sk_queue,
unsigned int flags, int *off, int *err);
struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned int flags, int *err);
+__poll_t datagram_poll_queue(struct file *file, struct socket *sock,
+ struct poll_table_struct *wait,
+ struct sk_buff_head *rcv_queue);
__poll_t datagram_poll(struct file *file, struct socket *sock,
struct poll_table_struct *wait);
int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index cb4b9ef2e4e3..11ff1f9b0b61 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -920,21 +920,20 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
EXPORT_SYMBOL(skb_copy_and_csum_datagram_msg);
/**
- * datagram_poll - generic datagram poll
- * @file: file struct
- * @sock: socket
- * @wait: poll table
+ * datagram_poll_queue - same as datagram_poll, but on a specific receive queue
+ * @file: file struct
+ * @sock: socket
+ * @wait: poll table
+ * @rcv_queue: receive queue to poll
*
- * Datagram poll: Again totally generic. This also handles
- * sequenced packet sockets providing the socket receive queue
- * is only ever holding data ready to receive.
+ * Performs polling on the given receive queue, handling shutdown, error, and
+ * connection state. This is useful for protocols that deliver userspace-bound
+ * packets through a custom queue instead of sk->sk_receive_queue.
*
- * Note: when you *don't* use this routine for this protocol,
- * and you use a different write policy from sock_writeable()
- * then please supply your own write_space callback.
+ * Return: poll bitmask indicating the socket's current state
*/
-__poll_t datagram_poll(struct file *file, struct socket *sock,
- poll_table *wait)
+__poll_t datagram_poll_queue(struct file *file, struct socket *sock,
+ poll_table *wait, struct sk_buff_head *rcv_queue)
{
struct sock *sk = sock->sk;
__poll_t mask;
@@ -956,7 +955,7 @@ __poll_t datagram_poll(struct file *file, struct socket *sock,
mask |= EPOLLHUP;
/* readable? */
- if (!skb_queue_empty_lockless(&sk->sk_receive_queue))
+ if (!skb_queue_empty_lockless(rcv_queue))
mask |= EPOLLIN | EPOLLRDNORM;
/* Connection-based need to check for termination and startup */
@@ -978,4 +977,25 @@ __poll_t datagram_poll(struct file *file, struct socket *sock,
return mask;
}
+EXPORT_SYMBOL(datagram_poll_queue);
+
+/**
+ * datagram_poll - generic datagram poll
+ * @file: file struct
+ * @sock: socket
+ * @wait: poll table
+ *
+ * Datagram poll: Again totally generic. This also handles
+ * sequenced packet sockets providing the socket receive queue
+ * is only ever holding data ready to receive.
+ *
+ * Note: when you *don't* use this routine for this protocol,
+ * and you use a different write policy from sock_writeable()
+ * then please supply your own write_space callback.
+ */
+__poll_t datagram_poll(struct file *file, struct socket *sock, poll_table *wait)
+{
+ return datagram_poll_queue(file, sock, wait,
+ &sock->sk->sk_receive_queue);
+}
EXPORT_SYMBOL(datagram_poll);
--
2.51.0
* [PATCH net v2 2/3] espintcp: use datagram_poll_queue for socket readiness
2025-10-20 7:37 [PATCH net v2 0/3] fix poll behaviour for TCP-based tunnel protocols Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues Ralf Lici
@ 2025-10-20 7:37 ` Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP Ralf Lici
2 siblings, 0 replies; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 7:37 UTC (permalink / raw)
To: netdev
Cc: Ralf Lici, Steffen Klassert, Herbert Xu, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Sabrina Dubroca, Antonio Quartulli
espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue and can therefore signal false readiness when that
queue contains non-userspace packets.
Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.
Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
---
net/xfrm/espintcp.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index fc7a603b04f1..bf744ac9d5a7 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -555,14 +555,10 @@ static void espintcp_close(struct sock *sk, long timeout)
static __poll_t espintcp_poll(struct file *file, struct socket *sock,
poll_table *wait)
{
- __poll_t mask = datagram_poll(file, sock, wait);
struct sock *sk = sock->sk;
struct espintcp_ctx *ctx = espintcp_getctx(sk);
- if (!skb_queue_empty(&ctx->ike_queue))
- mask |= EPOLLIN | EPOLLRDNORM;
-
- return mask;
+ return datagram_poll_queue(file, sock, wait, &ctx->ike_queue);
}
static void build_protos(struct proto *espintcp_prot,
--
2.51.0
* [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP
2025-10-20 7:37 [PATCH net v2 0/3] fix poll behaviour for TCP-based tunnel protocols Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues Ralf Lici
2025-10-20 7:37 ` [PATCH net v2 2/3] espintcp: use datagram_poll_queue for socket readiness Ralf Lici
@ 2025-10-20 7:37 ` Ralf Lici
2025-10-20 10:17 ` Sabrina Dubroca
2 siblings, 1 reply; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 7:37 UTC (permalink / raw)
To: netdev
Cc: Ralf Lici, Antonio Quartulli, Sabrina Dubroca, Andrew Lunn,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.
Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
---
drivers/net/ovpn/tcp.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
index 289f62c5d2c7..308fdbb75cea 100644
--- a/drivers/net/ovpn/tcp.c
+++ b/drivers/net/ovpn/tcp.c
@@ -560,16 +560,34 @@ static void ovpn_tcp_close(struct sock *sk, long timeout)
static __poll_t ovpn_tcp_poll(struct file *file, struct socket *sock,
poll_table *wait)
{
- __poll_t mask = datagram_poll(file, sock, wait);
+ struct sk_buff_head *queue = &sock->sk->sk_receive_queue;
struct ovpn_socket *ovpn_sock;
+ struct ovpn_peer *peer = NULL;
+ __poll_t mask;
rcu_read_lock();
ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
- if (ovpn_sock && ovpn_sock->peer &&
- !skb_queue_empty(&ovpn_sock->peer->tcp.user_queue))
- mask |= EPOLLIN | EPOLLRDNORM;
+ /* if we landed in this callback, we expect to have a
+ * meaningful state. The ovpn_socket lifecycle would
+ * prevent it otherwise.
+ */
+ if (WARN_ON(!ovpn_sock || !ovpn_sock->peer)) {
+ rcu_read_unlock();
+ pr_err_ratelimited("ovpn: null state in ovpn_tcp_poll!\n");
+ return 0;
+ }
+
+ if (ovpn_peer_hold(ovpn_sock->peer)) {
+ peer = ovpn_sock->peer;
+ queue = &peer->tcp.user_queue;
+ }
rcu_read_unlock();
+ mask = datagram_poll_queue(file, sock, wait, queue);
+
+ if (peer)
+ ovpn_peer_put(peer);
+
return mask;
}
--
2.51.0
* Re: [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP
2025-10-20 7:37 ` [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP Ralf Lici
@ 2025-10-20 10:17 ` Sabrina Dubroca
2025-10-20 12:22 ` Ralf Lici
0 siblings, 1 reply; 8+ messages in thread
From: Sabrina Dubroca @ 2025-10-20 10:17 UTC (permalink / raw)
To: Ralf Lici
Cc: netdev, Antonio Quartulli, Andrew Lunn, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
2025-10-20, 09:37:31 +0200, Ralf Lici wrote:
> diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
> index 289f62c5d2c7..308fdbb75cea 100644
> --- a/drivers/net/ovpn/tcp.c
> +++ b/drivers/net/ovpn/tcp.c
> @@ -560,16 +560,34 @@ static void ovpn_tcp_close(struct sock *sk, long timeout)
> static __poll_t ovpn_tcp_poll(struct file *file, struct socket *sock,
> poll_table *wait)
> {
> - __poll_t mask = datagram_poll(file, sock, wait);
> + struct sk_buff_head *queue = &sock->sk->sk_receive_queue;
> struct ovpn_socket *ovpn_sock;
> + struct ovpn_peer *peer = NULL;
> + __poll_t mask;
>
> rcu_read_lock();
> ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
> - if (ovpn_sock && ovpn_sock->peer &&
> - !skb_queue_empty(&ovpn_sock->peer->tcp.user_queue))
> - mask |= EPOLLIN | EPOLLRDNORM;
> + /* if we landed in this callback, we expect to have a
> + * meaningful state. The ovpn_socket lifecycle would
> + * prevent it otherwise.
> + */
> + if (WARN_ON(!ovpn_sock || !ovpn_sock->peer)) {
> + rcu_read_unlock();
> + pr_err_ratelimited("ovpn: null state in ovpn_tcp_poll!\n");
nit: the extra print is not really necessary once we've done a full
WARN. But if you want the custom message alongside the WARN, maybe:
if (WARN(!ovpn_sock || !ovpn_sock->peer, "ovpn: null state in ovpn_tcp_poll!")) {
...
}
(you can find examples of the "if (WARN(cond, msg))" pattern in
net/core/skbuff.c:
drop_reasons_register_subsys/drop_reasons_unregister_subsys
and other places)
Other than that, the patch looks good, thanks.
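
For readers unfamiliar with the idiom, the "if (WARN(cond, msg))"
pattern works because the macro evaluates to the condition it was given.
A rough userspace sketch follows; sketch_warn() and sketch_poll_state()
are hypothetical names, and the kernel macro additionally dumps a
backtrace, which this stand-in does not attempt.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's WARN(cond, fmt...):
 * report the message when cond is true and hand cond back to the
 * caller, so the check and the report fit in a single if ().
 */
static bool sketch_warn(bool cond, const char *msg)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", msg);
	return cond;
}

/* Returns 0 when state is missing, like the early return in the patch. */
static int sketch_poll_state(const void *sock_state, const void *peer)
{
	if (sketch_warn(!sock_state || !peer, "null state in poll"))
		return 0;
	return 1;
}
```

Folding the message into the check removes the need for a separate
print call in the error branch.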
> + return 0;
> + }
--
Sabrina
* Re: [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues
2025-10-20 7:37 ` [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues Ralf Lici
@ 2025-10-20 10:17 ` Sabrina Dubroca
2025-10-20 12:22 ` Ralf Lici
0 siblings, 1 reply; 8+ messages in thread
From: Sabrina Dubroca @ 2025-10-20 10:17 UTC (permalink / raw)
To: Ralf Lici
Cc: netdev, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Mina Almasry, Eric Biggers,
Antonio Quartulli
LGTM, just a small nit:
2025-10-20, 09:37:29 +0200, Ralf Lici wrote:
> +/**
> + * datagram_poll - generic datagram poll
> + * @file: file struct
> + * @sock: socket
> + * @wait: poll table
> + *
> + * Datagram poll: Again totally generic. This also handles
> + * sequenced packet sockets providing the socket receive queue
> + * is only ever holding data ready to receive.
> + *
> + * Note: when you *don't* use this routine for this protocol,
> + * and you use a different write policy from sock_writeable()
> + * then please supply your own write_space callback.
Maybe you could document the return value here as well, since you're
touching this code.
> + */
> +__poll_t datagram_poll(struct file *file, struct socket *sock, poll_table *wait)
--
Sabrina
* Re: [PATCH net v2 3/3] ovpn: use datagram_poll_queue for socket readiness in TCP
2025-10-20 10:17 ` Sabrina Dubroca
@ 2025-10-20 12:22 ` Ralf Lici
0 siblings, 0 replies; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 12:22 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: netdev, Antonio Quartulli, Andrew Lunn, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
On Mon, 2025-10-20 at 12:17 +0200, Sabrina Dubroca wrote:
> 2025-10-20, 09:37:31 +0200, Ralf Lici wrote:
> > diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c
> > index 289f62c5d2c7..308fdbb75cea 100644
> > --- a/drivers/net/ovpn/tcp.c
> > +++ b/drivers/net/ovpn/tcp.c
> > @@ -560,16 +560,34 @@ static void ovpn_tcp_close(struct sock *sk,
> > long timeout)
> > static __poll_t ovpn_tcp_poll(struct file *file, struct socket
> > *sock,
> > poll_table *wait)
> > {
> > - __poll_t mask = datagram_poll(file, sock, wait);
> > + struct sk_buff_head *queue = &sock->sk->sk_receive_queue;
> > struct ovpn_socket *ovpn_sock;
> > + struct ovpn_peer *peer = NULL;
> > + __poll_t mask;
> >
> > rcu_read_lock();
> > ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
> > - if (ovpn_sock && ovpn_sock->peer &&
> > - !skb_queue_empty(&ovpn_sock->peer->tcp.user_queue))
> > - mask |= EPOLLIN | EPOLLRDNORM;
> > + /* if we landed in this callback, we expect to have a
> > + * meaningful state. The ovpn_socket lifecycle would
> > + * prevent it otherwise.
> > + */
> > + if (WARN_ON(!ovpn_sock || !ovpn_sock->peer)) {
> > + rcu_read_unlock();
> > + pr_err_ratelimited("ovpn: null state in
> > ovpn_tcp_poll!\n");
>
> nit: the extra print is not really necessary once we've done a full
> WARN. But if you want the custom message alongside the WARN, maybe:
>
> if (WARN(!ovpn_sock || !ovpn_sock->peer, "ovpn: null state in
> ovpn_tcp_poll!")) {
> ...
> }
>
> (you can find examples of the "if (WARN(cond, msg))" pattern in
> net/core/skbuff.c:
> drop_reasons_register_subsys/drop_reasons_unregister_subsys
> and other places)
I wasn't aware of this macro, thanks for pointing it out. I'll use it on
the next version.
--
Ralf Lici
Mandelbit Srl
* Re: [PATCH net v2 1/3] net: datagram: introduce datagram_poll_queue for custom receive queues
2025-10-20 10:17 ` Sabrina Dubroca
@ 2025-10-20 12:22 ` Ralf Lici
0 siblings, 0 replies; 8+ messages in thread
From: Ralf Lici @ 2025-10-20 12:22 UTC (permalink / raw)
To: Sabrina Dubroca
Cc: netdev, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Mina Almasry, Eric Biggers,
Antonio Quartulli
On Mon, 2025-10-20 at 12:17 +0200, Sabrina Dubroca wrote:
> LGTM, just a small nit:
>
> 2025-10-20, 09:37:29 +0200, Ralf Lici wrote:
> > +/**
> > + * datagram_poll - generic datagram poll
> > + * @file: file struct
> > + * @sock: socket
> > + * @wait: poll table
> > + *
> > + * Datagram poll: Again totally generic. This also handles
> > + * sequenced packet sockets providing the socket receive queue
> > + * is only ever holding data ready to receive.
> > + *
> > + * Note: when you *don't* use this routine for this protocol,
> > + * and you use a different write policy from sock_writeable()
> > + * then please supply your own write_space callback.
>
> Maybe you could document the return value here as well, since you're
> touching this code.
Will do.
--
Ralf Lici
Mandelbit Srl