* [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket
@ 2026-04-21 22:33 David Carlier
2026-04-21 22:33 ` [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows David Carlier
` (5 more replies)
0 siblings, 6 replies; 31+ messages in thread
From: David Carlier @ 2026-04-21 22:33 UTC
To: mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, David Carlier
MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
parent socket does not currently provide usable MSG_ERRQUEUE handling.
This series wires the MPTCP socket up to the IPv4/IPv6 error queue
paths. It propagates RECVERR-related sockopts to existing and future
subflows, makes poll() report pending errqueue activity through the
parent socket, and allows recvmsg(MSG_ERRQUEUE) on the MPTCP socket to
consume queued errors with the parent socket ABI.
The series also handles mixed-family subflows by applying the matching
sockopt according to each subflow family, and avoids silently losing an
error skb if requeueing to the parent socket fails under rmem pressure.
v2 -> v3:
- Only consume ssk->sk_err in the fallback / MPC-connect branch of
__mptcp_subflow_error_report(). Steady-state MPTCP now leaves
TCP's one-shot sk_err to TCP's own consumer instead of silently
draining it via sock_error().
- In mptcp_recv_error(), also route to inet_recv_error() when
sk->sk_err is set, so a fallback-propagated error reaches userspace
even when the parent errqueue is empty.
- Scope the new selftest to IP_RECVERR sockopt propagation only.
End-to-end errqueue delivery (TX timestamps, ICMP, zerocopy)
depends on subflow-side producers that are out of scope for this
series and will be covered by follow-up work. Fixes the
mptcp_sockopt selftest timeout reported by the MPTCP CI on v2.
v1 -> v2:
- Retargeted to mptcp-next per Matthieu Baerts' feedback (net-next
closed during the merge window; iterate on the MPTCP tree).
- Guard mptcp_setsockopt_v6_recverr() and its dispatch cases in
mptcp_setsockopt_v6() with #if IS_ENABLED(CONFIG_IPV6) to fix
the MPTCP CI link break on without_ipv6/with_mptcp configs
(undefined reference to ipv6_setsockopt).
v1: https://lore.kernel.org/mptcp/20260421152216.38127-1-devnexen@gmail.com/
v2: https://lore.kernel.org/mptcp/20260421191337.58341-1-devnexen@gmail.com/
David Carlier (3):
mptcp: propagate RECVERR sockopts to subflows
mptcp: support MSG_ERRQUEUE on the parent socket
selftests: mptcp: cover IP_RECVERR sockopt propagation
net/mptcp/protocol.c | 123 ++++++++++++++---
net/mptcp/sockopt.c | 129 ++++++++++++++++++
.../selftests/net/mptcp/mptcp_sockopt.c | 55 ++++++++
3 files changed, 287 insertions(+), 20 deletions(-)
base-commit: 4464afe97dc56e817a23b730979cbc6fc48f1912
--
2.53.0
* [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
@ 2026-04-21 22:33 ` David Carlier
2026-04-22 8:05 ` Paolo Abeni
2026-04-21 22:33 ` [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
` (4 subsequent siblings)
5 siblings, 1 reply; 31+ messages in thread
From: David Carlier @ 2026-04-21 22:33 UTC
To: mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, David Carlier
Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
existing and future subflows.
Apply the matching sockopt according to the subflow family so mixed-
family subflows stay aligned with the parent socket configuration,
including disable-time errqueue purge semantics.
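As a rough illustration of the per-family selection described above, the following standalone userspace sketch mirrors the idea outside the kernel. The names recverr_opt_for_family() and parent_recverr_enabled() are hypothetical and exist only for this example; the fallback defines are there in case older libc headers lack the RFC4884 option numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef SOL_IP
#define SOL_IP IPPROTO_IP
#endif
#ifndef SOL_IPV6
#define SOL_IPV6 IPPROTO_IPV6
#endif
#ifndef IP_RECVERR_RFC4884
#define IP_RECVERR_RFC4884 26
#endif
#ifndef IPV6_RECVERR_RFC4884
#define IPV6_RECVERR_RFC4884 31
#endif

/* The option numbers differ per family, hence the explicit selection. */
static_assert(IP_RECVERR != IPV6_RECVERR,
	      "IPv4 and IPv6 RECVERR use different option numbers");

struct recverr_opt {
	int level;
	int optname;
};

/* Pick the sockopt that matches the subflow's address family. */
static struct recverr_opt recverr_opt_for_family(int family, bool rfc4884)
{
	struct recverr_opt opt;

	if (family == AF_INET6) {
		opt.level = SOL_IPV6;
		opt.optname = rfc4884 ? IPV6_RECVERR_RFC4884 : IPV6_RECVERR;
	} else {
		opt.level = SOL_IP;
		opt.optname = rfc4884 ? IP_RECVERR_RFC4884 : IP_RECVERR;
	}
	return opt;
}

/*
 * Parent state as seen by the subflows: the IPv4 bit always counts, and
 * the IPv6 bit counts too when the parent is an AF_INET6 socket.
 */
static bool parent_recverr_enabled(int parent_family, bool v4_bit, bool v6_bit)
{
	bool enabled = v4_bit;

	if (parent_family == AF_INET6)
		enabled |= v6_bit;
	return enabled;
}
```

With this shape, an AF_INET6 parent that has only the IPv6 option set still yields "enabled" for an IPv4 subflow, which then receives the IPv4 option.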
Signed-off-by: David Carlier <devnexen@gmail.com>
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/sockopt.c | 129 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 129 insertions(+)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 79db15903e7a..acb0ca330e44 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -8,6 +8,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
#include <net/sock.h>
#include <net/protocol.h>
#include <net/tcp.h>
@@ -384,6 +386,72 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
return -EOPNOTSUPP;
}
+static bool mptcp_recverr_enabled(const struct sock *sk, bool rfc4884)
+{
+ bool enabled;
+
+ enabled = rfc4884 ? inet_test_bit(RECVERR_RFC4884, sk) :
+ inet_test_bit(RECVERR, sk);
+
+#if IS_ENABLED(CONFIG_IPV6)
+ if (sk->sk_family == AF_INET6)
+ enabled |= rfc4884 ? inet6_test_bit(RECVERR6_RFC4884, sk) :
+ inet6_test_bit(RECVERR6, sk);
+#endif
+
+ return enabled;
+}
+
+static int mptcp_subflow_set_recverr(struct sock *sk, struct sock *ssk,
+ bool rfc4884)
+{
+ int level, optname, val;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ if (ssk->sk_family == AF_INET6) {
+ level = SOL_IPV6;
+ optname = rfc4884 ? IPV6_RECVERR_RFC4884 : IPV6_RECVERR;
+ } else
+#endif
+ {
+ level = SOL_IP;
+ optname = rfc4884 ? IP_RECVERR_RFC4884 : IP_RECVERR;
+ }
+
+ val = mptcp_recverr_enabled(sk, rfc4884);
+ return tcp_setsockopt(ssk, level, optname, KERNEL_SOCKPTR(&val),
+ sizeof(val));
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
+ sockptr_t optval, unsigned int optlen)
+{
+ struct mptcp_subflow_context *subflow;
+ struct sock *sk = (struct sock *)msk;
+ int ret;
+
+ ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
+ if (ret)
+ return ret;
+
+ lock_sock(sk);
+ sockopt_seq_inc(msk);
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
+
+ ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
+ if (ret)
+ break;
+ subflow->setsockopt_seq = msk->setsockopt_seq;
+ }
+ release_sock(sk);
+
+ return ret;
+}
+#endif
+
static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -426,6 +494,12 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
release_sock(sk);
break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case IPV6_RECVERR:
+ case IPV6_RECVERR_RFC4884:
+ ret = mptcp_setsockopt_v6_recverr(msk, optname, optval, optlen);
+ break;
+#endif
}
return ret;
@@ -760,6 +834,33 @@ static int mptcp_setsockopt_v4_set_tos(struct mptcp_sock *msk, int optname,
return 0;
}
+static int mptcp_setsockopt_v4_recverr(struct mptcp_sock *msk, int optname,
+ sockptr_t optval, unsigned int optlen)
+{
+ struct mptcp_subflow_context *subflow;
+ struct sock *sk = (struct sock *)msk;
+ int err;
+
+ err = ip_setsockopt(sk, SOL_IP, optname, optval, optlen);
+ if (err)
+ return err;
+
+ lock_sock(sk);
+ sockopt_seq_inc(msk);
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool rfc4884 = optname == IP_RECVERR_RFC4884;
+
+ err = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
+ if (err)
+ break;
+ subflow->setsockopt_seq = msk->setsockopt_seq;
+ }
+ release_sock(sk);
+
+ return err;
+}
+
static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -771,6 +872,9 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
return mptcp_setsockopt_sol_ip_set(msk, optname, optval, optlen);
case IP_TOS:
return mptcp_setsockopt_v4_set_tos(msk, optname, optval, optlen);
+ case IP_RECVERR:
+ case IP_RECVERR_RFC4884:
+ return mptcp_setsockopt_v4_recverr(msk, optname, optval, optlen);
}
return -EOPNOTSUPP;
@@ -1459,6 +1563,12 @@ static int mptcp_getsockopt_v4(struct mptcp_sock *msk, int optname,
case IP_LOCAL_PORT_RANGE:
return mptcp_put_int_option(msk, optval, optlen,
READ_ONCE(inet_sk(sk)->local_port_range));
+ case IP_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR, sk));
+ case IP_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR_RFC4884, sk));
}
return -EOPNOTSUPP;
@@ -1479,6 +1589,12 @@ static int mptcp_getsockopt_v6(struct mptcp_sock *msk, int optname,
case IPV6_FREEBIND:
return mptcp_put_int_option(msk, optval, optlen,
inet_test_bit(FREEBIND, sk));
+ case IPV6_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6, sk));
+ case IPV6_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6_RFC4884, sk));
}
return -EOPNOTSUPP;
@@ -1536,6 +1652,7 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
{
static const unsigned int tx_rx_locks = SOCK_RCVBUF_LOCK | SOCK_SNDBUF_LOCK;
struct sock *sk = (struct sock *)msk;
+ bool recverr, recverr_rfc4884;
bool keep_open;
keep_open = sock_flag(sk, SOCK_KEEPOPEN);
@@ -1586,6 +1703,18 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
inet_assign_bit(FREEBIND, ssk, inet_test_bit(FREEBIND, sk));
inet_assign_bit(BIND_ADDRESS_NO_PORT, ssk, inet_test_bit(BIND_ADDRESS_NO_PORT, sk));
WRITE_ONCE(inet_sk(ssk)->local_port_range, READ_ONCE(inet_sk(sk)->local_port_range));
+ recverr = mptcp_recverr_enabled(sk, false);
+ recverr_rfc4884 = mptcp_recverr_enabled(sk, true);
+#if IS_ENABLED(CONFIG_IPV6)
+ if (ssk->sk_family == AF_INET6) {
+ inet6_assign_bit(RECVERR6, ssk, recverr);
+ inet6_assign_bit(RECVERR6_RFC4884, ssk, recverr_rfc4884);
+ } else
+#endif
+ {
+ inet_assign_bit(RECVERR, ssk, recverr);
+ inet_assign_bit(RECVERR_RFC4884, ssk, recverr_rfc4884);
+ }
}
void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk)
--
2.53.0
* [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
2026-04-21 22:33 ` [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-04-21 22:33 ` David Carlier
2026-04-22 8:28 ` Paolo Abeni
2026-04-21 22:33 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
` (3 subsequent siblings)
5 siblings, 1 reply; 31+ messages in thread
From: David Carlier @ 2026-04-21 22:33 UTC
To: mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, David Carlier
Handle MSG_ERRQUEUE on the MPTCP socket by selecting a subflow with
pending errqueue data, moving one error skb to the parent socket, and
consuming it through the parent socket ABI.
This surfaces subflow errqueue activity through poll(), keeps the
userspace ABI tied to the socket being used, and restores the skb to
the subflow errqueue if requeueing to the parent fails under rmem
pressure.
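The move-with-rollback behaviour described above can be modelled in plain C. This is a toy sketch only, not kernel code: struct toy_errqueue and the toy_*() helpers are hypothetical, integers stand in for error skbs, and the "limit" field stands in for the receive-memory budget.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_QUEUE_CAP 4

/* Toy stand-in for a socket error queue with a memory budget. */
struct toy_errqueue {
	int items[TOY_QUEUE_CAP];
	size_t len;
	size_t limit;	/* entries allowed before "rmem pressure" hits */
};

static int toy_enqueue(struct toy_errqueue *q, int item)
{
	assert(q->len <= TOY_QUEUE_CAP);
	if (q->len >= q->limit || q->len >= TOY_QUEUE_CAP)
		return -ENOMEM;
	q->items[q->len++] = item;
	return 0;
}

static int toy_dequeue(struct toy_errqueue *q, int *item)
{
	size_t i;

	if (!q->len)
		return -EAGAIN;
	*item = q->items[0];
	for (i = 1; i < q->len; i++)
		q->items[i - 1] = q->items[i];
	q->len--;
	return 0;
}

/*
 * Move one entry from the subflow queue to the parent queue. If the
 * parent refuses it, put the entry back on the subflow queue so it is
 * not lost. In this model the entry is re-added at the tail, so it may
 * end up behind entries that were originally queued after it.
 */
static int toy_move_one(struct toy_errqueue *subflow,
			struct toy_errqueue *parent)
{
	int item, ret;

	ret = toy_dequeue(subflow, &item);
	if (ret)
		return ret;

	ret = toy_enqueue(parent, item);
	if (ret)
		(void)toy_enqueue(subflow, item);	/* rollback */

	return ret;
}
```

In the toy model the rollback cannot fail, because the dequeue has just freed a slot. The patch has to cope with the second requeue failing as well, and frees the skb in that case.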
Signed-off-by: David Carlier <devnexen@gmail.com>
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/protocol.c | 123 ++++++++++++++++++++++++++++++++++++-------
1 file changed, 103 insertions(+), 20 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 6b486fc94c16..87871216bab2 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -818,28 +818,29 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
{
int ssk_state;
- int err;
+ int err = 0;
+ bool has_errqueue;
+
+ has_errqueue = !skb_queue_empty_lockless(&ssk->sk_error_queue);
- /* only propagate errors on fallen-back sockets or
- * on MPC connect
+ /* Only fallback sockets and the MPC connect path inherit TCP's sk_err
+ * semantics; consume ssk->sk_err only on those paths so steady-state
+ * MPTCP doesn't silently drop TCP's one-shot errors.
*/
- if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
- return false;
+ if (sk->sk_state == TCP_SYN_SENT ||
+ __mptcp_check_fallback(mptcp_sk(sk))) {
+ err = sock_error(ssk);
+ if (err) {
+ ssk_state = inet_sk_state_load(ssk);
+ if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
+ mptcp_set_state(sk, ssk_state);
+ WRITE_ONCE(sk->sk_err, -err);
+ }
+ }
- err = sock_error(ssk);
- if (!err)
+ if (!err && !has_errqueue)
return false;
- /* We need to propagate only transition to CLOSE state.
- * Orphaned socket will see such state change via
- * subflow_sched_work_if_closed() and that path will properly
- * destroy the msk as needed.
- */
- ssk_state = inet_sk_state_load(ssk);
- if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
- mptcp_set_state(sk, ssk_state);
- WRITE_ONCE(sk->sk_err, -err);
-
/* This barrier is coupled with smp_rmb() in mptcp_poll() */
smp_wmb();
sk_error_report(sk);
@@ -2286,6 +2287,68 @@ static unsigned int mptcp_inq_hint(const struct sock *sk)
return 0;
}
+static struct sock *mptcp_pick_errqueue_subflow(struct sock *sk)
+{
+ struct mptcp_subflow_context *subflow;
+ struct sock *ssk = NULL;
+
+ lock_sock(sk);
+ mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
+ struct sock *subflow_sk = mptcp_subflow_tcp_sock(subflow);
+
+ if (skb_queue_empty_lockless(&subflow_sk->sk_error_queue))
+ continue;
+
+ if (!refcount_inc_not_zero(&subflow_sk->sk_refcnt))
+ continue;
+
+ ssk = subflow_sk;
+ break;
+ }
+ release_sock(sk);
+
+ return ssk;
+}
+
+static bool mptcp_has_error_queue(const struct sock *sk)
+{
+ return !skb_queue_empty_lockless(&sk->sk_error_queue);
+}
+
+static int mptcp_recv_error(struct sock *sk, struct msghdr *msg, int len)
+{
+ struct sk_buff *skb;
+ struct sock *ssk;
+ int ret, ret2;
+
+ if (READ_ONCE(sk->sk_err) || mptcp_has_error_queue(sk))
+ return inet_recv_error(sk, msg, len);
+
+ ssk = mptcp_pick_errqueue_subflow(sk);
+ if (!ssk)
+ return -EAGAIN;
+
+ skb = sock_dequeue_err_skb(ssk);
+ if (!skb)
+ goto put_ssk;
+
+ ret = sock_queue_err_skb(sk, skb);
+ if (ret) {
+ ret2 = sock_queue_err_skb(ssk, skb);
+ sock_put(ssk);
+ if (ret2)
+ kfree_skb(skb);
+ return ret;
+ }
+
+ sock_put(ssk);
+ return inet_recv_error(sk, msg, len);
+
+put_ssk:
+ sock_put(ssk);
+ return -EAGAIN;
+}
+
static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int flags)
{
@@ -2295,9 +2358,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int target;
long timeo;
- /* MSG_ERRQUEUE is really a no-op till we support IP_RECVERR */
if (unlikely(flags & MSG_ERRQUEUE))
- return inet_recv_error(sk, msg, len);
+ return mptcp_recv_error(sk, msg, len);
lock_sock(sk);
if (unlikely(sk->sk_state == TCP_LISTEN)) {
@@ -4296,6 +4358,26 @@ static __poll_t mptcp_check_writeable(struct mptcp_sock *msk)
return 0;
}
+static bool mptcp_subflow_has_error(struct sock *sk)
+{
+ struct mptcp_subflow_context *subflow;
+ bool has_error = false;
+
+ mptcp_data_lock(sk);
+ mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ if (READ_ONCE(ssk->sk_err) ||
+ !skb_queue_empty_lockless(&ssk->sk_error_queue)) {
+ has_error = true;
+ break;
+ }
+ }
+ mptcp_data_unlock(sk);
+
+ return has_error;
+}
+
static __poll_t mptcp_poll(struct file *file, struct socket *sock,
struct poll_table_struct *wait)
{
@@ -4339,7 +4421,8 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
/* This barrier is coupled with smp_wmb() in __mptcp_error_report() */
smp_rmb();
- if (READ_ONCE(sk->sk_err))
+ if (READ_ONCE(sk->sk_err) || mptcp_has_error_queue(sk) ||
+ mptcp_subflow_has_error(sk))
mask |= EPOLLERR;
return mask;
--
2.53.0
* [PATCH mptcp-next v3 3/3] selftests: mptcp: cover IP_RECVERR sockopt propagation
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
2026-04-21 22:33 ` [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows David Carlier
2026-04-21 22:33 ` [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
@ 2026-04-21 22:33 ` David Carlier
2026-04-21 23:38 ` [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket MPTCP CI
` (2 subsequent siblings)
5 siblings, 0 replies; 31+ messages in thread
From: David Carlier @ 2026-04-21 22:33 UTC
To: mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, David Carlier
Exercise setsockopt/getsockopt of IP_RECVERR and IPV6_RECVERR on the
MPTCP parent socket, including the empty-errqueue EAGAIN contract on
MSG_ERRQUEUE|MSG_DONTWAIT.
End-to-end errqueue delivery (ICMP, TX timestamps, zerocopy) depends on
subflow-side producers that are out of scope for this series and will be
covered by follow-up work.
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
.../selftests/net/mptcp/mptcp_sockopt.c | 55 +++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
index b6e58d936ebe..95bb2cc8e2ff 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
@@ -769,6 +769,60 @@ static void test_ip_tos_sockopt(int fd)
xerror("expect socklen_t == -1");
}
+static void test_ip_recverr_sockopt(int fd)
+{
+ struct iovec iov = {
+ .iov_base = &(char){ 0 },
+ .iov_len = 1,
+ };
+ struct msghdr msg = {
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int one = 1, zero = 0, val = -1;
+ socklen_t s = sizeof(val);
+ int level, optname, r;
+
+ switch (pf) {
+ case AF_INET:
+ level = SOL_IP;
+ optname = IP_RECVERR;
+ break;
+ case AF_INET6:
+ level = SOL_IPV6;
+ optname = IPV6_RECVERR;
+ break;
+ default:
+ xerror("Unknown pf %d\n", pf);
+ }
+
+ r = setsockopt(fd, level, optname, &one, sizeof(one));
+ if (r)
+ die_perror("setsockopt recverr on");
+
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr on");
+ if (s != sizeof(val) || val != one)
+ xerror("recverr on mismatch val=%d len=%u", val, s);
+
+ r = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
+ if (r != -1 || errno != EAGAIN)
+ xerror("expected empty errqueue to return EAGAIN, ret=%d errno=%d", r, errno);
+
+ r = setsockopt(fd, level, optname, &zero, sizeof(zero));
+ if (r)
+ die_perror("setsockopt recverr off");
+
+ val = -1;
+ s = sizeof(val);
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr off");
+ if (s != sizeof(val) || val != zero)
+ xerror("recverr off mismatch val=%d len=%u", val, s);
+}
+
static int client(int pipefd)
{
int fd = -1;
@@ -787,6 +841,7 @@ static int client(int pipefd)
}
test_ip_tos_sockopt(fd);
+ test_ip_recverr_sockopt(fd);
connect_one_server(fd, pipefd);
--
2.53.0
* Re: [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (2 preceding siblings ...)
2026-04-21 22:33 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
@ 2026-04-21 23:38 ` MPTCP CI
2026-04-22 8:22 ` Matthieu Baerts
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
5 siblings, 0 replies; 31+ messages in thread
From: MPTCP CI @ 2026-04-21 23:38 UTC
To: David Carlier; +Cc: mptcp
Hi David,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_fastopen ⚠️
- KVM Validation: debug (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/24750414123
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3d39e1ac876f
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1084059
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have already been made to have a
stable test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-21 22:33 ` [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-04-22 8:05 ` Paolo Abeni
2026-04-22 8:32 ` Matthieu Baerts
2026-04-22 21:51 ` David CARLIER
0 siblings, 2 replies; 31+ messages in thread
From: Paolo Abeni @ 2026-04-22 8:05 UTC
To: David Carlier, mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang
On 4/22/26 12:33 AM, David Carlier wrote:
> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
> existing and future subflows.
>
> Apply the matching sockopt according to the subflow family so mixed-
> family subflows stay aligned with the parent socket configuration,
> including disable-time errqueue purge semantics.
>
> Signed-off-by: David Carlier <devnexen@gmail.com>
You should drop this line; only a single SoB tag is needed.
[...]
> +#if IS_ENABLED(CONFIG_IPV6)
Is this compiler guard strictly needed?
> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
> + sockptr_t optval, unsigned int optlen)
> +{
> + struct mptcp_subflow_context *subflow;
> + struct sock *sk = (struct sock *)msk;
> + int ret;
> +
> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
> + if (ret)
> + return ret;
> +
> + lock_sock(sk);
> + sockopt_seq_inc(msk);
> + mptcp_for_each_subflow(msk, subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
> +
> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
The above looks a bit overcomplicated?!? It looks like you could
leverage mptcp_setsockopt_all_sf() and drop the
mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
@Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
unrelated bug?
[...]
> @@ -760,6 +834,33 @@ static int mptcp_setsockopt_v4_set_tos(struct mptcp_sock *msk, int optname,
> return 0;
> }
>
> +static int mptcp_setsockopt_v4_recverr(struct mptcp_sock *msk, int optname,
> + sockptr_t optval, unsigned int optlen)
> +{
> + struct mptcp_subflow_context *subflow;
> + struct sock *sk = (struct sock *)msk;
> + int err;
> +
> + err = ip_setsockopt(sk, SOL_IP, optname, optval, optlen);
> + if (err)
> + return err;
> +
> + lock_sock(sk);
> + sockopt_seq_inc(msk);
> + mptcp_for_each_subflow(msk, subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> + bool rfc4884 = optname == IP_RECVERR_RFC4884;
> +
> + err = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
> + if (err)
> + break;
Same note here WRT possible mptcp_setsockopt_all_sf() usage.
> + subflow->setsockopt_seq = msk->setsockopt_seq;
> + }
> + release_sock(sk);
> +
> + return err;
> +}
> +
> static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
> sockptr_t optval, unsigned int optlen)
> {
> @@ -771,6 +872,9 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
> return mptcp_setsockopt_sol_ip_set(msk, optname, optval, optlen);
> case IP_TOS:
> return mptcp_setsockopt_v4_set_tos(msk, optname, optval, optlen);
> + case IP_RECVERR:
> + case IP_RECVERR_RFC4884:
> + return mptcp_setsockopt_v4_recverr(msk, optname, optval, optlen);
> }
>
> return -EOPNOTSUPP;
> @@ -1459,6 +1563,12 @@ static int mptcp_getsockopt_v4(struct mptcp_sock *msk, int optname,
> case IP_LOCAL_PORT_RANGE:
> return mptcp_put_int_option(msk, optval, optlen,
> READ_ONCE(inet_sk(sk)->local_port_range));
> + case IP_RECVERR:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet_test_bit(RECVERR, sk));
> + case IP_RECVERR_RFC4884:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet_test_bit(RECVERR_RFC4884, sk));
> }
>
> return -EOPNOTSUPP;
> @@ -1479,6 +1589,12 @@ static int mptcp_getsockopt_v6(struct mptcp_sock *msk, int optname,
> case IPV6_FREEBIND:
> return mptcp_put_int_option(msk, optval, optlen,
> inet_test_bit(FREEBIND, sk));
> + case IPV6_RECVERR:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet6_test_bit(RECVERR6, sk));
> + case IPV6_RECVERR_RFC4884:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet6_test_bit(RECVERR6_RFC4884, sk));
> }
>
> return -EOPNOTSUPP;
> @@ -1536,6 +1652,7 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
> {
> static const unsigned int tx_rx_locks = SOCK_RCVBUF_LOCK | SOCK_SNDBUF_LOCK;
> struct sock *sk = (struct sock *)msk;
> + bool recverr, recverr_rfc4884;
> bool keep_open;
>
> keep_open = sock_flag(sk, SOCK_KEEPOPEN);
> @@ -1586,6 +1703,18 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
> inet_assign_bit(FREEBIND, ssk, inet_test_bit(FREEBIND, sk));
> inet_assign_bit(BIND_ADDRESS_NO_PORT, ssk, inet_test_bit(BIND_ADDRESS_NO_PORT, sk));
> WRITE_ONCE(inet_sk(ssk)->local_port_range, READ_ONCE(inet_sk(sk)->local_port_range));
> + recverr = mptcp_recverr_enabled(sk, false);
> + recverr_rfc4884 = mptcp_recverr_enabled(sk, true);
> +#if IS_ENABLED(CONFIG_IPV6)
> + if (ssk->sk_family == AF_INET6) {
> + inet6_assign_bit(RECVERR6, ssk, recverr);
> + inet6_assign_bit(RECVERR6_RFC4884, ssk, recverr_rfc4884);
> + } else
> +#endif
> + {
> + inet_assign_bit(RECVERR, ssk, recverr);
> + inet_assign_bit(RECVERR_RFC4884, ssk, recverr_rfc4884);
> + }
Instead of the above you could add a prerequisite patch converting the
existing inet_assign_bit() calls to something like:
#define MPTCP_INET_FLAGS_MASK (INET_FLAGS_TRANSPARENT | \
INET_FLAGS_FREEBIND /* ... */ )
inet_sk(ssk)->inet_flags = inet_sk(sk)->inet_flags & MPTCP_INET_FLAGS_MASK;
and just expand the MPTCP_INET_FLAGS_MASK.
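To make the mask idea concrete, here is a small standalone sketch with toy flag values. None of these names exist in the kernel; TOY_FLAG_* and both helpers are hypothetical. The first helper follows the assignment as written in the suggestion. The second is a variation, offered here only as an assumption, that leaves subflow bits outside the mask untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Toy flag bits, standing in for the inet flags shared with subflows. */
enum {
	TOY_FLAG_TRANSPARENT	= 1u << 0,
	TOY_FLAG_FREEBIND	= 1u << 1,
	TOY_FLAG_RECVERR	= 1u << 2,
	TOY_FLAG_PRIVATE	= 1u << 3,	/* subflow-only, never synced */
};

#define TOY_SYNC_MASK (TOY_FLAG_TRANSPARENT | \
		       TOY_FLAG_FREEBIND | \
		       TOY_FLAG_RECVERR)

static_assert((TOY_SYNC_MASK & TOY_FLAG_PRIVATE) == 0,
	      "the private flag must stay outside the sync mask");

/* As written in the suggestion: the subflow gets exactly the masked bits. */
static uint32_t toy_copy_masked(uint32_t parent)
{
	return parent & TOY_SYNC_MASK;
}

/* Variation: replace only the masked bits, keep the others as they were. */
static uint32_t toy_sync_flags(uint32_t parent, uint32_t subflow)
{
	return (subflow & ~(uint32_t)TOY_SYNC_MASK) | (parent & TOY_SYNC_MASK);
}
```

Either way, supporting a new option then comes down to adding one bit to the mask.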
/P
* Re: [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (3 preceding siblings ...)
2026-04-21 23:38 ` [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket MPTCP CI
@ 2026-04-22 8:22 ` Matthieu Baerts
2026-04-22 8:56 ` David CARLIER
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
5 siblings, 1 reply; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-22 8:22 UTC
To: David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang
Hi David,
On 22/04/2026 00:33, David Carlier wrote:
> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>
> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> paths. It propagates RECVERR-related sockopts to existing and future
> subflows, makes poll() report pending errqueue activity through the
> parent socket, and allows recvmsg(MSG_ERRQUEUE) on the MPTCP socket to
> consume queued errors with the parent socket ABI.
>
> The series also handles mixed-family subflows by applying the matching
> sockopt according to each subflow family, and avoids silently losing an
> error skb if requeueing to the parent socket fails under rmem pressure.
>
> v2 -> v3:
Thank you for the v3.
Do you mind sending at most one series per day, please? Each version
generates a lot of emails that need to be triaged, which makes the thread
harder for us to follow, and it also uses a lot of shared resources.
If you need CI support, either execute the tests locally with the docker
image (preferred), or push your patches to a public fork on GitHub
after enabling "Actions" support there ;)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket
2026-04-21 22:33 ` [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
@ 2026-04-22 8:28 ` Paolo Abeni
2026-04-22 21:54 ` David CARLIER
0 siblings, 1 reply; 31+ messages in thread
From: Paolo Abeni @ 2026-04-22 8:28 UTC
To: David Carlier, mptcp; +Cc: Matthieu Baerts, Mat Martineau, Geliang Tang
On 4/22/26 12:33 AM, David Carlier wrote:
> Handle MSG_ERRQUEUE on the MPTCP socket by selecting a subflow with
> pending errqueue data, moving one error skb to the parent socket, and
> consuming it through the parent socket ABI.
>
> This surfaces subflow errqueue activity through poll(), keeps the
> userspace ABI tied to the socket being used, and restores the skb to
> the subflow errqueue if requeueing to the parent fails under rmem
> pressure.
>
> Signed-off-by: David Carlier <devnexen@gmail.com>
> Assisted-by: Codex:gpt-5
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
> net/mptcp/protocol.c | 123 ++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 103 insertions(+), 20 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 6b486fc94c16..87871216bab2 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -818,28 +818,29 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
> static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
> {
> int ssk_state;
> - int err;
> + int err = 0;
> + bool has_errqueue;
Reverse christmas tree above.
> +
> + has_errqueue = !skb_queue_empty_lockless(&ssk->sk_error_queue);
>
> - /* only propagate errors on fallen-back sockets or
> - * on MPC connect
> + /* Only fallback sockets and the MPC connect path inherit TCP's sk_err
> + * semantics; consume ssk->sk_err only on those paths so steady-state
> + * MPTCP doesn't silently drop TCP's one-shot errors.
> */
> - if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
> - return false;
> + if (sk->sk_state == TCP_SYN_SENT ||
> + __mptcp_check_fallback(mptcp_sk(sk))) {
> + err = sock_error(ssk);
> + if (err) {
> + ssk_state = inet_sk_state_load(ssk);
> + if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
> + mptcp_set_state(sk, ssk_state);
> + WRITE_ONCE(sk->sk_err, -err);
> + }
> + }
>
> - err = sock_error(ssk);
> - if (!err)
> + if (!err && !has_errqueue)
> return false;
>
> - /* We need to propagate only transition to CLOSE state.
> - * Orphaned socket will see such state change via
> - * subflow_sched_work_if_closed() and that path will properly
> - * destroy the msk as needed.
> - */
> - ssk_state = inet_sk_state_load(ssk);
> - if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
> - mptcp_set_state(sk, ssk_state);
> - WRITE_ONCE(sk->sk_err, -err);
Avoid reordering the code: it makes the patch more complex and hard to
read. Keep the existing path as-is and add the needed handling under the
existing:
if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
condition. If needed use goto/label to consolidate the return path.
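One possible reading of the restructuring suggested above, as a toy userspace model rather than kernel code: the early guard stays where it is, the errqueue case is handled under it, and a label consolidates the reporting path. struct toy_sock, toy_take_error() and toy_subflow_error_report() are hypothetical names, and the sketch assumes the same outcomes as the v3 hunk.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_ESTABLISHED, TOY_SYN_SENT, TOY_CLOSE };

/* Toy stand-in for the parent (sk) and subflow (ssk) sockets. */
struct toy_sock {
	enum toy_state state;
	bool fallback;		/* parent only */
	bool dead;		/* parent only */
	int err;		/* pending one-shot error, 0 if none */
	int errqueue_len;	/* subflow only */
	int reports;		/* parent only: error reports raised */
};

/* Read and clear the one-shot error, like a one-shot error consumer. */
static int toy_take_error(struct toy_sock *s)
{
	int err = s->err;

	s->err = 0;
	return err;
}

static bool toy_subflow_error_report(struct toy_sock *sk,
				     struct toy_sock *ssk)
{
	bool has_errqueue = ssk->errqueue_len > 0;
	int err;

	assert(sk != ssk);

	/* Existing guard kept in place; only the errqueue case is new. */
	if (sk->state != TOY_SYN_SENT && !sk->fallback) {
		if (has_errqueue)
			goto report;
		return false;
	}

	err = toy_take_error(ssk);
	if (!err) {
		if (has_errqueue)
			goto report;
		return false;
	}

	/* Only the transition to CLOSE is propagated to the parent. */
	if (ssk->state == TOY_CLOSE && !sk->dead)
		sk->state = TOY_CLOSE;
	sk->err = err;

report:
	sk->reports++;
	return true;
}
```

This only illustrates the control flow. It does not settle whether subflow errors should reach the parent socket at all.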
More important: I am not sure propagating subflow errors to the main
socket is the right thing. A fatal error on a subflow does not affect
the MPTCP connection unless it is the only active subflow. Applications
will start receiving unexpected errors when the transfer is working just fine.
-
> /* This barrier is coupled with smp_rmb() in mptcp_poll() */
> smp_wmb();
> sk_error_report(sk);
> @@ -2286,6 +2287,68 @@ static unsigned int mptcp_inq_hint(const struct sock *sk)
> return 0;
> }
>
> +static struct sock *mptcp_pick_errqueue_subflow(struct sock *sk)
> +{
> + struct mptcp_subflow_context *subflow;
> + struct sock *ssk = NULL;
> +
> + lock_sock(sk);
> + mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
> + struct sock *subflow_sk = mptcp_subflow_tcp_sock(subflow);
> +
> + if (skb_queue_empty_lockless(&subflow_sk->sk_error_queue))
> + continue;
> +
> + if (!refcount_inc_not_zero(&subflow_sk->sk_refcnt))
> + continue;
> +
> + ssk = subflow_sk;
> + break;
> + }
> + release_sock(sk);
> +
> + return ssk;
> +}
> +
> +static bool mptcp_has_error_queue(const struct sock *sk)
> +{
> + return !skb_queue_empty_lockless(&sk->sk_error_queue);
> +}
> +
> +static int mptcp_recv_error(struct sock *sk, struct msghdr *msg, int len)
> +{
> + struct sk_buff *skb;
> + struct sock *ssk;
> + int ret, ret2;
> +
> + if (READ_ONCE(sk->sk_err) || mptcp_has_error_queue(sk))
> + return inet_recv_error(sk, msg, len);
> +
> + ssk = mptcp_pick_errqueue_subflow(sk);
> + if (!ssk)
> + return -EAGAIN;
> +
> + skb = sock_dequeue_err_skb(ssk);
> + if (!skb)
> + goto put_ssk;
> +
> + ret = sock_queue_err_skb(sk, skb);
> + if (ret) {
> + ret2 = sock_queue_err_skb(ssk, skb);
> + sock_put(ssk);
> + if (ret2)
> + kfree_skb(skb);
> + return ret;
> + }
> +
> + sock_put(ssk);
> + return inet_recv_error(sk, msg, len);
> +
> +put_ssk:
> + sock_put(ssk);
> + return -EAGAIN;
> +}
> +
> static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> int flags)
> {
> @@ -2295,9 +2358,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> int target;
> long timeo;
>
> - /* MSG_ERRQUEUE is really a no-op till we support IP_RECVERR */
> if (unlikely(flags & MSG_ERRQUEUE))
> - return inet_recv_error(sk, msg, len);
> + return mptcp_recv_error(sk, msg, len);
>
> lock_sock(sk);
> if (unlikely(sk->sk_state == TCP_LISTEN)) {
> @@ -4296,6 +4358,26 @@ static __poll_t mptcp_check_writeable(struct mptcp_sock *msk)
> return 0;
> }
>
> +static bool mptcp_subflow_has_error(struct sock *sk)
> +{
> + struct mptcp_subflow_context *subflow;
> + bool has_error = false;
> +
> + mptcp_data_lock(sk);
> + mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> + if (READ_ONCE(ssk->sk_err) ||
> + !skb_queue_empty_lockless(&ssk->sk_error_queue)) {
> + has_error = true;
> + break;
> + }
> + }
> + mptcp_data_unlock(sk);
The data lock is not enough to protect the subflows list, and
unconditionally acquiring it in the poll callback is a no-go (it will
hurt performance).

_If_ error propagation makes sense, you should probably try to move the
err skb from the subflow error queue into the msk one at
sk_error_queue() time.
/P
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:05 ` Paolo Abeni
@ 2026-04-22 8:32 ` Matthieu Baerts
2026-04-22 8:35 ` Matthieu Baerts
2026-04-22 21:51 ` David CARLIER
1 sibling, 1 reply; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-22 8:32 UTC (permalink / raw)
To: Paolo Abeni, David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang
Hi Paolo,
Thank you for the review!
On 22/04/2026 10:05, Paolo Abeni wrote:
> On 4/22/26 12:33 AM, David Carlier wrote:
>> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
>> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
>> existing and future subflows.
>>
>> Apply the matching sockopt according to the subflow family so mixed-
>> family subflows stay aligned with the parent socket configuration,
>> including disable-time errqueue purge semantics.
(...)
>> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
>> + sockptr_t optval, unsigned int optlen)
>> +{
>> + struct mptcp_subflow_context *subflow;
>> + struct sock *sk = (struct sock *)msk;
>> + int ret;
>> +
>> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
>> + if (ret)
>> + return ret;
>> +
>> + lock_sock(sk);
>> + sockopt_seq_inc(msk);
>> + mptcp_for_each_subflow(msk, subflow) {
>> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
>> +
>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>
> The above looks a bit overcomplicated?!? It looks like you could
> leverage mptcp_setsockopt_all_sf() in and drop the
> mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
>
> @Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
> call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
> unrelated bug?
Oh yes, good catch! And a lock of the sk (msk) before iterating the
subflows, no? Same in __mptcp_setsockopt_set_val()?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:32 ` Matthieu Baerts
@ 2026-04-22 8:35 ` Matthieu Baerts
2026-04-22 8:36 ` Matthieu Baerts
2026-04-22 8:48 ` Paolo Abeni
0 siblings, 2 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-22 8:35 UTC (permalink / raw)
To: Paolo Abeni; +Cc: Mat Martineau, Geliang Tang, David Carlier, mptcp
On 22/04/2026 10:32, Matthieu Baerts wrote:
> Hi Paolo,
>
> Thank you for the review!
>
> On 22/04/2026 10:05, Paolo Abeni wrote:
>> On 4/22/26 12:33 AM, David Carlier wrote:
>>> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
>>> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
>>> existing and future subflows.
>>>
>>> Apply the matching sockopt according to the subflow family so mixed-
>>> family subflows stay aligned with the parent socket configuration,
>>> including disable-time errqueue purge semantics.
>
> (...)
>
>>> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
>>> + sockptr_t optval, unsigned int optlen)
>>> +{
>>> + struct mptcp_subflow_context *subflow;
>>> + struct sock *sk = (struct sock *)msk;
>>> + int ret;
>>> +
>>> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + lock_sock(sk);
>>> + sockopt_seq_inc(msk);
>>> + mptcp_for_each_subflow(msk, subflow) {
>>> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>>> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
>>> +
>>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>>
>> The above looks a bit overcomplicated?!? It looks like you could
>> leverage mptcp_setsockopt_all_sf() in and drop the
>> mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
>>
>> @Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
>> call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
>> unrelated bug?
>
> Oh yes, good catch! And a lock of the sk (msk) before iterating the
> subflows, no? Same in __mptcp_setsockopt_set_val()?
(Not the same in __mptcp_setsockopt_set_val(): the sk is already locked)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:35 ` Matthieu Baerts
@ 2026-04-22 8:36 ` Matthieu Baerts
2026-04-22 8:48 ` Paolo Abeni
1 sibling, 0 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-22 8:36 UTC (permalink / raw)
To: Paolo Abeni; +Cc: Mat Martineau, Geliang Tang, David Carlier, mptcp
On 22/04/2026 10:35, Matthieu Baerts wrote:
> On 22/04/2026 10:32, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> Thank you for the review!
>>
>> On 22/04/2026 10:05, Paolo Abeni wrote:
>>> On 4/22/26 12:33 AM, David Carlier wrote:
>>>> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
>>>> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
>>>> existing and future subflows.
>>>>
>>>> Apply the matching sockopt according to the subflow family so mixed-
>>>> family subflows stay aligned with the parent socket configuration,
>>>> including disable-time errqueue purge semantics.
>>
>> (...)
>>
>>>> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
>>>> + sockptr_t optval, unsigned int optlen)
>>>> +{
>>>> + struct mptcp_subflow_context *subflow;
>>>> + struct sock *sk = (struct sock *)msk;
>>>> + int ret;
>>>> +
>>>> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + lock_sock(sk);
>>>> + sockopt_seq_inc(msk);
>>>> + mptcp_for_each_subflow(msk, subflow) {
>>>> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>>>> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
>>>> +
>>>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>>>
>>> The above looks a bit overcomplicated?!? It looks like you could
>>> leverage mptcp_setsockopt_all_sf() in and drop the
>>> mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
>>>
>>> @Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
>>> call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
>>> unrelated bug?
>>
>> Oh yes, good catch! And a lock of the sk (msk) before iterating the
>> subflows, no? Same in __mptcp_setsockopt_set_val()?
>
> (Not the same in __mptcp_setsockopt_set_val(): the sk is already locked)
... same when calling mptcp_setsockopt_all_sf() for the moment. Maybe I
should have prefixed with "__".
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:35 ` Matthieu Baerts
2026-04-22 8:36 ` Matthieu Baerts
@ 2026-04-22 8:48 ` Paolo Abeni
2026-04-22 8:50 ` Matthieu Baerts
1 sibling, 1 reply; 31+ messages in thread
From: Paolo Abeni @ 2026-04-22 8:48 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: Mat Martineau, Geliang Tang, David Carlier, mptcp
On 4/22/26 10:35 AM, Matthieu Baerts wrote:
> On 22/04/2026 10:32, Matthieu Baerts wrote:
>> On 22/04/2026 10:05, Paolo Abeni wrote:
>>> On 4/22/26 12:33 AM, David Carlier wrote:
>>>> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
>>>> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
>>>> existing and future subflows.
>>>>
>>>> Apply the matching sockopt according to the subflow family so mixed-
>>>> family subflows stay aligned with the parent socket configuration,
>>>> including disable-time errqueue purge semantics.
>>
>> (...)
>>
>>>> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
>>>> + sockptr_t optval, unsigned int optlen)
>>>> +{
>>>> + struct mptcp_subflow_context *subflow;
>>>> + struct sock *sk = (struct sock *)msk;
>>>> + int ret;
>>>> +
>>>> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + lock_sock(sk);
>>>> + sockopt_seq_inc(msk);
>>>> + mptcp_for_each_subflow(msk, subflow) {
>>>> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>>>> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
>>>> +
>>>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>>>
>>> The above looks a bit overcomplicated?!? It looks like you could
>>> leverage mptcp_setsockopt_all_sf() in and drop the
>>> mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
>>>
>>> @Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
>>> call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
>>> unrelated bug?
>>
>> Oh yes, good catch! And a lock of the sk (msk) before iterating the
>> subflows, no? Same in __mptcp_setsockopt_set_val()?
>
> (Not the same in __mptcp_setsockopt_set_val(): the sk is already locked)
AFAICS for mptcp_setsockopt_all_sf() the msk lock is acquired by the
caller. Same for __mptcp_setsockopt_set_val().

Side note: AFAICS a few other paths apparently lack the setsockopt_seq
sync; that is a minor problem, not a real issue: worst case it will
cause a redundant mptcp_sockopt_sync_locked() call at finish_join time.

The missing sockopt_seq_inc() instead looks really bogus, as it will
cause missing synchronization for later subflows.
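
The difference between the two omissions can be shown with a small
userspace model. This is only a sketch: the demo_* structs and helpers
are hypothetical stand-ins, not the kernel's mptcp_sock or
mptcp_subflow_context, and the join-time check is assumed to be a plain
sequence comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the msk and one subflow (not kernel structs). */
struct demo_msk {
	unsigned int setsockopt_seq;
	int opt_value;
};

struct demo_subflow {
	unsigned int setsockopt_seq;
	int opt_value;
};

/* Setter that bumps the sequence, so later subflows notice the change. */
static void demo_set_with_seq(struct demo_msk *msk, int value)
{
	assert(msk != NULL);
	msk->opt_value = value;
	msk->setsockopt_seq++;
}

/* Setter that forgets the bump: the "really bogus" case. */
static void demo_set_without_seq(struct demo_msk *msk, int value)
{
	assert(msk != NULL);
	msk->opt_value = value;
}

/*
 * Join-time check: copy the options only when the sequence numbers
 * differ. Returns true if a sync was performed.
 */
static bool demo_sync_at_join(const struct demo_msk *msk,
			      struct demo_subflow *sf)
{
	if (sf->setsockopt_seq == msk->setsockopt_seq)
		return false;

	sf->opt_value = msk->opt_value;
	sf->setsockopt_seq = msk->setsockopt_seq;
	return true;
}
```

In this model a stale subflow sequence only costs one extra copy, while
a setter that skips the bump leaves the subflow with the old value and
nothing to trigger a later sync.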
/P
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:48 ` Paolo Abeni
@ 2026-04-22 8:50 ` Matthieu Baerts
2026-04-22 13:53 ` Paolo Abeni
0 siblings, 1 reply; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-22 8:50 UTC (permalink / raw)
To: Paolo Abeni; +Cc: Mat Martineau, Geliang Tang, David Carlier, mptcp
On 22/04/2026 10:48, Paolo Abeni wrote:
> On 4/22/26 10:35 AM, Matthieu Baerts wrote:
>> On 22/04/2026 10:32, Matthieu Baerts wrote:
>>> On 22/04/2026 10:05, Paolo Abeni wrote:
>>>> On 4/22/26 12:33 AM, David Carlier wrote:
>>>>> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
>>>>> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to
>>>>> existing and future subflows.
>>>>>
>>>>> Apply the matching sockopt according to the subflow family so mixed-
>>>>> family subflows stay aligned with the parent socket configuration,
>>>>> including disable-time errqueue purge semantics.
>>>
>>> (...)
>>>
>>>>> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
>>>>> + sockptr_t optval, unsigned int optlen)
>>>>> +{
>>>>> + struct mptcp_subflow_context *subflow;
>>>>> + struct sock *sk = (struct sock *)msk;
>>>>> + int ret;
>>>>> +
>>>>> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
>>>>> + if (ret)
>>>>> + return ret;
>>>>> +
>>>>> + lock_sock(sk);
>>>>> + sockopt_seq_inc(msk);
>>>>> + mptcp_for_each_subflow(msk, subflow) {
>>>>> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>>>>> + bool rfc4884 = optname == IPV6_RECVERR_RFC4884;
>>>>> +
>>>>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>>>>
>>>> The above looks a bit overcomplicated?!? It looks like you could
>>>> leverage mptcp_setsockopt_all_sf() in and drop the
>>>> mptcp_subflow_set_recverr() and mptcp_recverr_enabled() helpers.
>>>>
>>>> @Mattbe: it's not clear to me why mptcp_setsockopt_all_sf() does not
>>>> call sockopt_seq_inc(msk) or sync the setsockopt_seq, possibly an
>>>> unrelated bug?
>>>
>>> Oh yes, good catch! And a lock of the sk (msk) before iterating the
>>> subflows, no? Same in __mptcp_setsockopt_set_val()?
>>
>> (Not the same in __mptcp_setsockopt_set_val(): the sk is already locked)
>
> AFAICS for mptcp_setsockopt_all_sf() the msk lock is acquired by the
> caller. Same for __mptcp_setsockopt_set_val().
>
> Side note; AFAICS a few other paths apparently lack the setsockopt_seq;
> that is minor problem/not a real issue: worst case it will cause a
> redundant mptcp_sockopt_sync_locked() call at finish_join time.
>
> The missing sockopt_seq_inc() instead looks really bogus, as it will
> caused missing synchronization for later subflows.
Good point! Do you want me to send a patch, or do you already have one?
(or do you plan to look at it)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-22 8:22 ` Matthieu Baerts
@ 2026-04-22 8:56 ` David CARLIER
0 siblings, 0 replies; 31+ messages in thread
From: David CARLIER @ 2026-04-22 8:56 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: mptcp, Mat Martineau, Geliang Tang
Hi,
On Wed, 22 Apr 2026 at 09:22, Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi David,
>
> On 22/04/2026 00:33, David Carlier wrote:
> > MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> > parent socket does not currently provide usable MSG_ERRQUEUE handling.
> >
> > This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> > paths. It propagates RECVERR-related sockopts to existing and future
> > subflows, makes poll() report pending errqueue activity through the
> > parent socket, and allows recvmsg(MSG_ERRQUEUE) on the MPTCP socket to
> > consume queued errors with the parent socket ABI.
> >
> > The series also handles mixed-family subflows by applying the matching
> > sockopt according to each subflow family, and avoids silently losing an
> > error skb if requeueing to the parent socket fails under rmem pressure.
> >
> > v2 -> v3:
>
> Thank you for the v3.
>
> Do you mind sending max 1 series per day, please? Each version generates
> a lot of emails that are sent and need to be triaged, it is then harder
> for us to follow, plus a lot of shared resources are used.
>
Duly noted.
> If you need CI support, either execute the tests locally with the docker
> image (preferred), or send your patches on a public fork on GitHub,
> after having enabled "Actions" support there ;)
Yes, I realised that only when I did the v3 :) OK, I'll go through all
the remarks later. Cheers.
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:50 ` Matthieu Baerts
@ 2026-04-22 13:53 ` Paolo Abeni
0 siblings, 0 replies; 31+ messages in thread
From: Paolo Abeni @ 2026-04-22 13:53 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: Mat Martineau, Geliang Tang, David Carlier, mptcp
On 4/22/26 10:50 AM, Matthieu Baerts wrote:
> On 22/04/2026 10:48, Paolo Abeni wrote:
>> AFAICS for mptcp_setsockopt_all_sf() the msk lock is acquired by the
>> caller. Same for __mptcp_setsockopt_set_val().
>>
>> Side note; AFAICS a few other paths apparently lack the setsockopt_seq;
>> that is minor problem/not a real issue: worst case it will cause a
>> redundant mptcp_sockopt_sync_locked() call at finish_join time.
>>
>> The missing sockopt_seq_inc() instead looks really bogus, as it will
>> caused missing synchronization for later subflows.
>
> Good point! Do you want me to send a patch, or do you already have one?
> (or do you plan to look at it)
Please go ahead, I'm lagging behind by far, sorry.
/P
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 8:05 ` Paolo Abeni
2026-04-22 8:32 ` Matthieu Baerts
@ 2026-04-22 21:51 ` David CARLIER
2026-04-27 17:07 ` Matthieu Baerts
1 sibling, 1 reply; 31+ messages in thread
From: David CARLIER @ 2026-04-22 21:51 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, Matthieu Baerts, Mat Martineau, Geliang Tang
On 4/22/26 10:05 AM, Paolo Abeni wrote:
> You should drop this line, only a single SoB tag is needed.
Ack.
>> +#if IS_ENABLED(CONFIG_IPV6)
>
> Is this compiler guard strictly needed?
For this one yes — it wraps mptcp_setsockopt_v6_recverr(), which
calls ipv6_setsockopt() (no !CONFIG_IPV6 stub, link error
otherwise). But the helper goes away once I move to
mptcp_setsockopt_all_sf() + MPTCP_INET_FLAGS_MASK as you suggest
below, so the guard drops with it.
>> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
>
> The above looks a bit overcomplicated?!? It looks like you could
> leverage mptcp_setsockopt_all_sf() [...]
Will do, rebased on Matt's sockopt_seq_inc() fix.
> Instead of the above you could add a pre req patch converting the
> existing inet_assign_bit() to something alike [...]
Ack, will add as prereq (and mirror for inet6_sk()->flags).
Cheers,
* Re: [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket
2026-04-22 8:28 ` Paolo Abeni
@ 2026-04-22 21:54 ` David CARLIER
0 siblings, 0 replies; 31+ messages in thread
From: David CARLIER @ 2026-04-22 21:54 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, Matthieu Baerts, Mat Martineau, Geliang Tang
On 4/22/26 10:28 AM, Paolo Abeni wrote:
> Reverse christmas tree above.
Ack.
> Avoid reordering the code [...] use goto/label to consolidate the
> return path.
Ack.
> More important: am not sure propagating subflow errors to the main
> socket is the right thing. [...]
The sk->sk_err propagation stays gated on the existing
SYN_SENT/fallback condition: the patch only adds errqueue skb
handling (ICMP, timestamps, zerocopy completions), whose cmsg payload
identifies the originating peer. So fatal-error semantics are
unchanged.
> _If_ error propagation makes sense, you should probably try to move
> the err skb from the subflow error queue into the msk one at
> sk_error_queue() time.
Yes, I'll splice from subflow_error_report() onto
msk->sk_error_queue. mptcp_poll() then just adds a
!skb_queue_empty_lockless() check, recvmsg(MSG_ERRQUEUE) stays a plain
inet_recv_error(), and the subflow walk + data-lock go away. Does that
match what you had in mind?
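
To make the intended shape concrete, here is a rough userspace model of
the splice. It is a sketch only: the demo_* queue is a hypothetical
singly linked list standing in for sk_error_queue, and it ignores
locking and rmem accounting, which the real code would have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error entry and queue, standing in for an skb queue. */
struct demo_err {
	int code;
	struct demo_err *next;
};

struct demo_queue {
	struct demo_err *head;
	struct demo_err *tail;
};

/* Append one entry at the tail. */
static void demo_enqueue(struct demo_queue *q, struct demo_err *e)
{
	assert(q != NULL && e != NULL);
	e->next = NULL;
	if (q->tail)
		q->tail->next = e;
	else
		q->head = e;
	q->tail = e;
}

/* Remove and return the head entry, or NULL if the queue is empty. */
static struct demo_err *demo_dequeue(struct demo_queue *q)
{
	struct demo_err *e = q->head;

	if (!e)
		return NULL;
	q->head = e->next;
	if (!q->head)
		q->tail = NULL;
	e->next = NULL;
	return e;
}

/* Move every entry from src to the tail of dst, keeping the order. */
static int demo_splice(struct demo_queue *src, struct demo_queue *dst)
{
	struct demo_err *e;
	int moved = 0;

	while ((e = demo_dequeue(src)) != NULL) {
		demo_enqueue(dst, e);
		moved++;
	}
	return moved;
}

/* What the poll side would check: only the parent queue. */
static int demo_has_pending(const struct demo_queue *q)
{
	return q->head != NULL;
}
```

With the entries moved at report time, the poll and recv paths only
ever look at the parent queue, which is the point of the rework.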
* Re: [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows
2026-04-22 21:51 ` David CARLIER
@ 2026-04-27 17:07 ` Matthieu Baerts
0 siblings, 0 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-27 17:07 UTC (permalink / raw)
To: David CARLIER, Paolo Abeni; +Cc: mptcp, Mat Martineau, Geliang Tang
Hi David,
On 22/04/2026 23:51, David CARLIER wrote:
> On 4/22/26 10:05 AM, Paolo Abeni wrote:
> > You should drop this line, only a single SoB tag is needed.
>
> Ack.
(I don't know why there is an indentation in your replies, but I guess
most email readers will not interpret this properly, even lore [1]:
check your messages vs. the other ones)
https://lore.kernel.org/mptcp/CA+XhMqw4xRGdJbtX02++9eS8u_bWOzbPimM1Mk_dDNed5SX94w@mail.gmail.com/T/#mf03244a0e5af6ca0fdd80dfb1bff9f2f2d769e0b
https://docs.kernel.org/process/email-clients.html
> >> + ret = mptcp_subflow_set_recverr(sk, ssk, rfc4884);
> >
> > The above looks a bit overcomplicated?!? It looks like you could
> > leverage mptcp_setsockopt_all_sf() [...]
>
> Will do, rebased on Matt's sockopt_seq_inc() fix.
FYI, I sent my patch there:
https://lore.kernel.org/mptcp/20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org/
I don't think you need to rebase your series on it, except if not having
this patch breaks some tests. If that's the case, feel free to add this
line in the cover-letter:
  Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
See this for more details:
https://github.com/multipath-tcp/mptcp_net-next/wiki/CI
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (4 preceding siblings ...)
2026-04-22 8:22 ` Matthieu Baerts
@ 2026-04-27 21:10 ` David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
` (5 more replies)
5 siblings, 6 replies; 31+ messages in thread
From: David Carlier @ 2026-04-27 21:10 UTC (permalink / raw)
To: mptcp
Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, Paolo Abeni,
David Carlier
MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
parent socket does not currently provide usable MSG_ERRQUEUE handling.
This series wires the MPTCP socket up to the IPv4/IPv6 error queue
paths. It propagates RECVERR-related sockopts to existing and future
subflows, makes poll() report pending errqueue activity through the
parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
consume queued errors with the parent socket ABI.
A new prerequisite patch factors the per-flag inet_flags propagation
in sync_socket_options() into a single masked word copy, so further
inet_flags propagated by MPTCP can be added by extending the mask
rather than touching the call site.
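
As a standalone illustration of the masked copy, here is a small sketch
with hypothetical DEMO_* bits in place of the kernel's INET_FLAGS_*
values; it only shows the bit arithmetic, not the READ_ONCE/WRITE_ONCE
handling in the real code.

```c
#include <assert.h>

/* Hypothetical bit positions, standing in for the INET_FLAGS_* enum. */
enum demo_flag_bit {
	DEMO_TRANSPARENT,
	DEMO_FREEBIND,
	DEMO_BIND_ADDRESS_NO_PORT,
	DEMO_UNRELATED,	/* a bit the parent must not push to the child */
};

#define DEMO_BIT(nr)	(1UL << (nr))

/* Bits that are propagated from the parent to each child. */
#define DEMO_FLAGS_MASK				\
	(DEMO_BIT(DEMO_TRANSPARENT) |		\
	 DEMO_BIT(DEMO_FREEBIND) |		\
	 DEMO_BIT(DEMO_BIND_ADDRESS_NO_PORT))

/*
 * Return the child's flags with the masked bits replaced by the
 * parent's, leaving every bit outside the mask untouched.
 */
static unsigned long demo_sync_flags(unsigned long parent,
				     unsigned long child)
{
	unsigned long result = child;

	result &= ~DEMO_FLAGS_MASK;
	result |= parent & DEMO_FLAGS_MASK;

	/* Bits outside the mask must be exactly the child's own. */
	assert((result & ~DEMO_FLAGS_MASK) == (child & ~DEMO_FLAGS_MASK));
	return result;
}
```

Propagating one more flag then means adding one more DEMO_BIT() term to
the mask, with no change to demo_sync_flags().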
Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
four RECVERR bits, dropping the family-specific helpers from v3.
Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
v3 -> v4:
- New patch 1/4: factor inet_flags propagation in
sync_socket_options() through MPTCP_INET_FLAGS_MASK, per Paolo's
review.
- Patch 2/4 (was 1/3): drop the mptcp_recverr_enabled() and
mptcp_subflow_set_recverr() helpers; route the setsockopt path
through mptcp_setsockopt_all_sf(). Inherit the four RECVERR bits
via MPTCP_INET_FLAGS_MASK in sync_socket_options() instead of
explicit inet[6]_assign_bit() calls.
- Patch 3/4 (was 2/3): rework the MSG_ERRQUEUE plumbing per Paolo's
review. Subflow err skbs are now spliced onto the parent msk's
sk_error_queue from __mptcp_subflow_error_report() via the new
__mptcp_subflow_splice_errqueue() helper. recvmsg(MSG_ERRQUEUE)
on the parent reverts to plain inet_recv_error(), and mptcp_poll()
only inspects the parent's sk_error_queue -- no more on-demand
subflow walks, no extra lock_sock() / data_lock() in the poll or
recv paths. Keep the original early-return structure of
__mptcp_subflow_error_report() and fix the reverse christmas-tree
variable order Paolo flagged.
v2 -> v3:
- Only consume ssk->sk_err in the fallback / MPC-connect branch of
__mptcp_subflow_error_report(). Steady-state MPTCP now leaves
TCP's one-shot sk_err to TCP's own consumer instead of silently
draining it via sock_error().
- In mptcp_recv_error(), also route to inet_recv_error() when
sk->sk_err is set, so a fallback-propagated error reaches userspace
even when the parent errqueue is empty.
- Scope the new selftest to IP_RECVERR sockopt propagation only.
End-to-end errqueue delivery (TX timestamps, ICMP, zerocopy)
depends on subflow-side producers that are out of scope for this
series and will be covered by follow-up work. Fixes the
mptcp_sockopt selftest timeout reported by the MPTCP CI on v2.
v1 -> v2:
- Retargeted to mptcp-next per Matthieu Baerts' feedback (net-next
closed during the merge window; iterate on the MPTCP tree).
- Guard mptcp_setsockopt_v6_recverr() and its dispatch cases in
mptcp_setsockopt_v6() with #if IS_ENABLED(CONFIG_IPV6) to fix
the MPTCP CI link break on without_ipv6/with_mptcp configs
(undefined reference to ipv6_setsockopt).
v1: https://lore.kernel.org/mptcp/20260421152216.38127-1-devnexen@gmail.com/
v2: https://lore.kernel.org/mptcp/20260421191337.58341-1-devnexen@gmail.com/
v3: https://lore.kernel.org/mptcp/20260421223338.52743-1-devnexen@gmail.com/
David Carlier (4):
mptcp: sockopt: factor inet_flags propagation into a mask
mptcp: propagate RECVERR sockopts to subflows
mptcp: support MSG_ERRQUEUE on the parent socket
selftests: mptcp: cover IP_RECVERR sockopt propagation
net/mptcp/protocol.c | 33 +++++-
net/mptcp/sockopt.c | 107 ++++++++++++++----
.../selftests/net/mptcp/mptcp_sockopt.c | 55 +++++++++
3 files changed, 170 insertions(+), 25 deletions(-)
--
2.53.0
* [PATCH mptcp-next v4 1/4] mptcp: sockopt: factor inet_flags propagation into a mask
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
@ 2026-04-27 21:10 ` David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
` (4 subsequent siblings)
5 siblings, 0 replies; 31+ messages in thread
From: David Carlier @ 2026-04-27 21:10 UTC (permalink / raw)
To: mptcp
Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, Paolo Abeni,
David Carlier
Replace the per-flag inet_assign_bit() calls in sync_socket_options()
with a masked word-level copy of inet_sk()->inet_flags. Introduce
MPTCP_INET_FLAGS_MASK so further flags propagated by MPTCP can be
added by extending the mask rather than touching the call site.
No functional change.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/sockopt.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 0efe40be2fde..41c9dc9cf95e 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -16,6 +16,10 @@
#define MIN_INFO_OPTLEN_SIZE 16
#define MIN_FULL_INFO_OPTLEN_SIZE 40
+#define MPTCP_INET_FLAGS_MASK \
+ (BIT(INET_FLAGS_TRANSPARENT) | \
+ BIT(INET_FLAGS_FREEBIND) | \
+ BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT))
static struct sock *__mptcp_tcp_fallback(struct mptcp_sock *msk)
{
@@ -1536,6 +1540,7 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
{
static const unsigned int tx_rx_locks = SOCK_RCVBUF_LOCK | SOCK_SNDBUF_LOCK;
struct sock *sk = (struct sock *)msk;
+ unsigned long flags;
bool keep_open;
keep_open = sock_flag(sk, SOCK_KEEPOPEN);
@@ -1582,9 +1587,10 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
tcp_sock_set_keepcnt(ssk, msk->keepalive_cnt);
tcp_sock_set_maxseg(ssk, msk->maxseg);
- inet_assign_bit(TRANSPARENT, ssk, inet_test_bit(TRANSPARENT, sk));
- inet_assign_bit(FREEBIND, ssk, inet_test_bit(FREEBIND, sk));
- inet_assign_bit(BIND_ADDRESS_NO_PORT, ssk, inet_test_bit(BIND_ADDRESS_NO_PORT, sk));
+ flags = inet_sk(ssk)->inet_flags;
+ flags &= ~MPTCP_INET_FLAGS_MASK;
+ flags |= inet_sk(sk)->inet_flags & MPTCP_INET_FLAGS_MASK;
+ WRITE_ONCE(inet_sk(ssk)->inet_flags, flags);
WRITE_ONCE(inet_sk(ssk)->local_port_range, READ_ONCE(inet_sk(sk)->local_port_range));
}
--
2.53.0
* [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
@ 2026-04-27 21:10 ` David Carlier
2026-05-01 15:56 ` Matthieu Baerts
2026-04-27 21:10 ` [PATCH mptcp-next v4 3/4] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
` (3 subsequent siblings)
5 siblings, 1 reply; 31+ messages in thread
From: David Carlier @ 2026-04-27 21:10 UTC (permalink / raw)
To: mptcp
Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, Paolo Abeni,
David Carlier
Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to existing
and future subflows. The setsockopt path forwards each option to
every subflow via mptcp_setsockopt_all_sf(); newly-joining subflows
inherit the four RECVERR bits through sync_socket_options() now that
MPTCP_INET_FLAGS_MASK covers them.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/sockopt.c | 97 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 79 insertions(+), 18 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 41c9dc9cf95e..171e83e66a97 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -8,6 +8,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
#include <net/sock.h>
#include <net/protocol.h>
#include <net/tcp.h>
@@ -19,7 +21,11 @@
#define MPTCP_INET_FLAGS_MASK \
(BIT(INET_FLAGS_TRANSPARENT) | \
BIT(INET_FLAGS_FREEBIND) | \
- BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT))
+ BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT) | \
+ BIT(INET_FLAGS_RECVERR) | \
+ BIT(INET_FLAGS_RECVERR_RFC4884) | \
+ BIT(INET_FLAGS_RECVERR6) | \
+ BIT(INET_FLAGS_RECVERR6_RFC4884))
static struct sock *__mptcp_tcp_fallback(struct mptcp_sock *msk)
{
@@ -388,6 +394,41 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
return -EOPNOTSUPP;
}
+static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
+ int optname, sockptr_t optval,
+ unsigned int optlen)
+{
+ struct mptcp_subflow_context *subflow;
+ int ret = 0;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
+ if (ret)
+ break;
+ }
+ return ret;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
+ sockptr_t optval, unsigned int optlen)
+{
+ struct sock *sk = (struct sock *)msk;
+ int ret;
+
+ ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
+ if (ret)
+ return ret;
+
+ lock_sock(sk);
+ ret = mptcp_setsockopt_all_sf(msk, SOL_IPV6, optname, optval, optlen);
+ release_sock(sk);
+ return ret;
+}
+#endif
+
static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -430,6 +471,12 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
release_sock(sk);
break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case IPV6_RECVERR:
+ case IPV6_RECVERR_RFC4884:
+ ret = mptcp_setsockopt_v6_recverr(msk, optname, optval, optlen);
+ break;
+#endif
}
return ret;
@@ -764,6 +811,22 @@ static int mptcp_setsockopt_v4_set_tos(struct mptcp_sock *msk, int optname,
return 0;
}
+static int mptcp_setsockopt_v4_recverr(struct mptcp_sock *msk, int optname,
+ sockptr_t optval, unsigned int optlen)
+{
+ struct sock *sk = (struct sock *)msk;
+ int ret;
+
+ ret = ip_setsockopt(sk, SOL_IP, optname, optval, optlen);
+ if (ret)
+ return ret;
+
+ lock_sock(sk);
+ ret = mptcp_setsockopt_all_sf(msk, SOL_IP, optname, optval, optlen);
+ release_sock(sk);
+ return ret;
+}
+
static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -775,6 +838,9 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
return mptcp_setsockopt_sol_ip_set(msk, optname, optval, optlen);
case IP_TOS:
return mptcp_setsockopt_v4_set_tos(msk, optname, optval, optlen);
+ case IP_RECVERR:
+ case IP_RECVERR_RFC4884:
+ return mptcp_setsockopt_v4_recverr(msk, optname, optval, optlen);
}
return -EOPNOTSUPP;
@@ -802,23 +868,6 @@ static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int level, int
return ret;
}
-static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
- int optname, sockptr_t optval,
- unsigned int optlen)
-{
- struct mptcp_subflow_context *subflow;
- int ret = 0;
-
- mptcp_for_each_subflow(msk, subflow) {
- struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-
- ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
- if (ret)
- break;
- }
- return ret;
-}
-
static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -1463,6 +1512,12 @@ static int mptcp_getsockopt_v4(struct mptcp_sock *msk, int optname,
case IP_LOCAL_PORT_RANGE:
return mptcp_put_int_option(msk, optval, optlen,
READ_ONCE(inet_sk(sk)->local_port_range));
+ case IP_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR, sk));
+ case IP_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR_RFC4884, sk));
}
return -EOPNOTSUPP;
@@ -1483,6 +1538,12 @@ static int mptcp_getsockopt_v6(struct mptcp_sock *msk, int optname,
case IPV6_FREEBIND:
return mptcp_put_int_option(msk, optval, optlen,
inet_test_bit(FREEBIND, sk));
+ case IPV6_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6, sk));
+ case IPV6_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6_RFC4884, sk));
}
return -EOPNOTSUPP;
--
2.53.0
* [PATCH mptcp-next v4 3/4] mptcp: support MSG_ERRQUEUE on the parent socket
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-04-27 21:10 ` David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
` (2 subsequent siblings)
5 siblings, 0 replies; 31+ messages in thread
From: David Carlier @ 2026-04-27 21:10 UTC (permalink / raw)
To: mptcp
Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, Paolo Abeni,
David Carlier
Splice pending err skbs from each subflow's error queue onto the
parent msk's error queue at error-report time, so poll() and
recvmsg(MSG_ERRQUEUE) on the parent socket observe ICMP, tx
timestamp, and zerocopy completion notifications through the
standard inet ABI.
If sock_queue_err_skb() on the parent fails (rmem-limited), the
skb is left on the subflow queue and retried on the next error
report, avoiding silent loss.
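For illustration only, the splice-and-requeue behaviour described above can
be modelled in userspace C. This is a sketch under stated assumptions, not
the kernel code: struct fifo and its helpers are hypothetical stand-ins for
the skb queues, and the limit field merely imitates the parent's receive
memory budget that makes sock_queue_err_skb() fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 8

/* Hypothetical stand-in for an skb queue: a small FIFO of integer ids. */
struct fifo {
	int items[QCAP];
	size_t head;	/* index of the oldest entry */
	size_t len;	/* number of queued entries */
	size_t limit;	/* models the rmem budget, must be <= QCAP */
};

/* Append at the tail; fails when the budget is exhausted. */
static bool fifo_push_tail(struct fifo *q, int id)
{
	if (q->len >= q->limit)
		return false;
	q->items[(q->head + q->len) % QCAP] = id;
	q->len++;
	return true;
}

/* Remove the oldest entry, if any. */
static bool fifo_pop_head(struct fifo *q, int *id)
{
	if (!q->len)
		return false;
	*id = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->len--;
	return true;
}

/* Put an entry back at the front so the original ordering is preserved. */
static void fifo_push_head(struct fifo *q, int id)
{
	q->head = (q->head + QCAP - 1) % QCAP;
	q->items[q->head] = id;
	q->len++;
}

/* Move entries from sub to parent; stop and requeue on the first failure. */
static bool splice_errqueue(struct fifo *parent, struct fifo *sub)
{
	bool moved = false;
	int id;

	while (fifo_pop_head(sub, &id)) {
		if (!fifo_push_tail(parent, id)) {
			fifo_push_head(sub, id);
			break;
		}
		moved = true;
	}
	return moved;
}
```

The entry that could not be queued goes back to the head rather than the
tail, so a later retry sees the same order the subflow produced.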
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/protocol.c | 33 ++++++++++++++++++++++++++++-----
1 file changed, 28 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 0db50e3715c3..131fb6ddfcd9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -815,21 +815,39 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
return moved;
}
+static bool __mptcp_subflow_splice_errqueue(struct sock *sk, struct sock *ssk)
+{
+ struct sk_buff *skb;
+ bool moved = false;
+
+ while ((skb = skb_dequeue(&ssk->sk_error_queue))) {
+ if (sock_queue_err_skb(sk, skb)) {
+ skb_queue_head(&ssk->sk_error_queue, skb);
+ break;
+ }
+ moved = true;
+ }
+
+ return moved;
+}
+
static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
{
int ssk_state;
+ bool report;
int err;
+ report = __mptcp_subflow_splice_errqueue(sk, ssk);
+
/* only propagate errors on fallen-back sockets or
* on MPC connect
*/
if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
- return false;
+ goto out;
err = sock_error(ssk);
if (!err)
- return false;
-
+ goto out;
/* We need to propagate only transition to CLOSE state.
* Orphaned socket will see such state change via
* subflow_sched_work_if_closed() and that path will properly
@@ -839,6 +857,11 @@ static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
mptcp_set_state(sk, ssk_state);
WRITE_ONCE(sk->sk_err, -err);
+ report = true;
+
+out:
+ if (!report)
+ return false;
/* This barrier is coupled with smp_rmb() in mptcp_poll() */
smp_wmb();
@@ -2295,7 +2318,6 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int target;
long timeo;
- /* MSG_ERRQUEUE is really a no-op till we support IP_RECVERR */
if (unlikely(flags & MSG_ERRQUEUE))
return inet_recv_error(sk, msg, len);
@@ -4340,7 +4362,8 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
/* This barrier is coupled with smp_wmb() in __mptcp_error_report() */
smp_rmb();
- if (READ_ONCE(sk->sk_err))
+ if (READ_ONCE(sk->sk_err) ||
+ !skb_queue_empty_lockless(&sk->sk_error_queue))
mask |= EPOLLERR;
return mask;
--
2.53.0
* [PATCH mptcp-next v4 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
` (2 preceding siblings ...)
2026-04-27 21:10 ` [PATCH mptcp-next v4 3/4] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
@ 2026-04-27 21:10 ` David Carlier
2026-04-28 18:48 ` [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket Matthieu Baerts
2026-04-28 19:48 ` MPTCP CI
5 siblings, 0 replies; 31+ messages in thread
From: David Carlier @ 2026-04-27 21:10 UTC (permalink / raw)
To: mptcp
Cc: Matthieu Baerts, Mat Martineau, Geliang Tang, Paolo Abeni,
David Carlier
Exercise setsockopt/getsockopt of IP_RECVERR and IPV6_RECVERR on the
MPTCP parent socket, including the empty-errqueue EAGAIN contract on
MSG_ERRQUEUE|MSG_DONTWAIT.
End-to-end errqueue delivery (ICMP, TX timestamps, zerocopy) depends on
subflow-side producers that are out of scope for this series and will be
covered by follow-up work.
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
.../selftests/net/mptcp/mptcp_sockopt.c | 55 +++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
index b6e58d936ebe..95bb2cc8e2ff 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
@@ -769,6 +769,60 @@ static void test_ip_tos_sockopt(int fd)
xerror("expect socklen_t == -1");
}
+static void test_ip_recverr_sockopt(int fd)
+{
+ struct iovec iov = {
+ .iov_base = &(char){ 0 },
+ .iov_len = 1,
+ };
+ struct msghdr msg = {
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int one = 1, zero = 0, val = -1;
+ socklen_t s = sizeof(val);
+ int level, optname, r;
+
+ switch (pf) {
+ case AF_INET:
+ level = SOL_IP;
+ optname = IP_RECVERR;
+ break;
+ case AF_INET6:
+ level = SOL_IPV6;
+ optname = IPV6_RECVERR;
+ break;
+ default:
+ xerror("Unknown pf %d\n", pf);
+ }
+
+ r = setsockopt(fd, level, optname, &one, sizeof(one));
+ if (r)
+ die_perror("setsockopt recverr on");
+
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr on");
+ if (s != sizeof(val) || val != one)
+ xerror("recverr on mismatch val=%d len=%u", val, s);
+
+ r = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
+ if (r != -1 || errno != EAGAIN)
+ xerror("expected empty errqueue to return EAGAIN, ret=%d errno=%d", r, errno);
+
+ r = setsockopt(fd, level, optname, &zero, sizeof(zero));
+ if (r)
+ die_perror("setsockopt recverr off");
+
+ val = -1;
+ s = sizeof(val);
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr off");
+ if (s != sizeof(val) || val != zero)
+ xerror("recverr off mismatch val=%d len=%u", val, s);
+}
+
static int client(int pipefd)
{
int fd = -1;
@@ -787,6 +841,7 @@ static int client(int pipefd)
}
test_ip_tos_sockopt(fd);
+ test_ip_recverr_sockopt(fd);
connect_one_server(fd, pipefd);
--
2.53.0
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
` (3 preceding siblings ...)
2026-04-27 21:10 ` [PATCH mptcp-next v4 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
@ 2026-04-28 18:48 ` Matthieu Baerts
2026-04-28 18:56 ` Matthieu Baerts
2026-04-28 19:48 ` MPTCP CI
5 siblings, 1 reply; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-28 18:48 UTC (permalink / raw)
To: David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang, Paolo Abeni
Hi David,
Thank you for the new version.
On 27/04/2026 23:10, David Carlier wrote:
> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>
> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> paths. It propagates RECVERR-related sockopts to existing and future
> subflows, makes poll() report pending errqueue activity through the
> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
> consume queued errors with the parent socket ABI.
>
> A new prerequisite patch factors the per-flag inet_flags propagation
> in sync_socket_options() into a single masked word copy, so further
> inet_flags propagated by MPTCP can be added by extending the mask
> rather than touching the call site.
>
> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
> four RECVERR bits, dropping the family-specific helpers from v3.
>
> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
I didn't review it, but I notice that the CI cannot apply your series,
because it looks like it is not based on the one you mentioned here.
Can you either remove this line, or rebase your series on top of this
other patch?
Also, please don't send your series as a reply to a previous posting,
please use a new thread. That's what is usually done, clearer, plus some
tools don't support replies.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-28 18:48 ` [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket Matthieu Baerts
@ 2026-04-28 18:56 ` Matthieu Baerts
2026-04-28 19:15 ` David CARLIER
2026-05-01 14:49 ` Matthieu Baerts
0 siblings, 2 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-04-28 18:56 UTC (permalink / raw)
To: David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang, Paolo Abeni
On 28/04/2026 20:48, Matthieu Baerts wrote:
> Hi David,
>
> Thank you for the new version.
>
> On 27/04/2026 23:10, David Carlier wrote:
>> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
>> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>>
>> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
>> paths. It propagates RECVERR-related sockopts to existing and future
>> subflows, makes poll() report pending errqueue activity through the
>> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
>> consume queued errors with the parent socket ABI.
>>
>> A new prerequisite patch factors the per-flag inet_flags propagation
>> in sync_socket_options() into a single masked word copy, so further
>> inet_flags propagated by MPTCP can be added by extending the mask
>> rather than touching the call site.
>>
>> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
>> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
>> four RECVERR bits, dropping the family-specific helpers from v3.
>>
>> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
>
> I didn't review it, but I notice that the CI cannot apply your series,
> because it looks like it is not based on the one you mentioned here.
>
> Can you either remove this line, or rebase your series on top of this
> other patch?
>
> Also, please don't send your series as a reply to a previous posting,
> please use a new thread. That's what is usually done, clearer, plus some
> tools don't support replies.
Note: I just manually resolved the conflicts and sent the series to the
CI, not to have to resend a series just to retrigger the CI.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-28 18:56 ` Matthieu Baerts
@ 2026-04-28 19:15 ` David CARLIER
2026-05-01 14:49 ` Matthieu Baerts
1 sibling, 0 replies; 31+ messages in thread
From: David CARLIER @ 2026-04-28 19:15 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: mptcp, Mat Martineau, Geliang Tang, Paolo Abeni
Hi Matthieu,
On Tue, 28 Apr 2026 at 19:56, Matthieu Baerts <matttbe@kernel.org> wrote:
>
> On 28/04/2026 20:48, Matthieu Baerts wrote:
> > Hi David,
> >
> > Thank you for the new version.
> >
> > On 27/04/2026 23:10, David Carlier wrote:
> >> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
> >> parent socket does not currently provide usable MSG_ERRQUEUE handling.
> >>
> >> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
> >> paths. It propagates RECVERR-related sockopts to existing and future
> >> subflows, makes poll() report pending errqueue activity through the
> >> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
> >> consume queued errors with the parent socket ABI.
> >>
> >> A new prerequisite patch factors the per-flag inet_flags propagation
> >> in sync_socket_options() into a single masked word copy, so further
> >> inet_flags propagated by MPTCP can be added by extending the mask
> >> rather than touching the call site.
> >>
> >> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
> >> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
> >> four RECVERR bits, dropping the family-specific helpers from v3.
> >>
> >> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
> >
> > I didn't review it, but I notice that the CI cannot apply your series,
> > because it looks like it is not based on the one you mentioned here.
> >
> > Can you either remove this line, or rebase your series on top of this
> > other patch?
> >
> > Also, please don't send your series as a reply to a previous posting,
> > please use a new thread. That's what is usually done, clearer, plus some
> > tools don't support replies.
>
> Note: I just manually resolved the conflicts and sent the series to the
> CI, not to have to resend a series just to retrigger the CI.
Appreciated. Cheers.
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
` (4 preceding siblings ...)
2026-04-28 18:48 ` [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket Matthieu Baerts
@ 2026-04-28 19:48 ` MPTCP CI
5 siblings, 0 replies; 31+ messages in thread
From: MPTCP CI @ 2026-04-28 19:48 UTC (permalink / raw)
To: David Carlier; +Cc: mptcp
Hi David,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 2 failed test(s): packetdrill_fastopen packetdrill_sockopts ⚠️
- KVM Validation: debug (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join ⚠️
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25071789731
Initiator: Matthieu Baerts (NGI0)
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/7688d292b14a
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1086438
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the effort that has already gone into keeping the
test suite stable when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-04-28 18:56 ` Matthieu Baerts
2026-04-28 19:15 ` David CARLIER
@ 2026-05-01 14:49 ` Matthieu Baerts
2026-05-01 15:28 ` David CARLIER
1 sibling, 1 reply; 31+ messages in thread
From: Matthieu Baerts @ 2026-05-01 14:49 UTC (permalink / raw)
To: David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang, Paolo Abeni
Hi David,
On 28/04/2026 20:56, Matthieu Baerts wrote:
> On 28/04/2026 20:48, Matthieu Baerts wrote:
>> Hi David,
>>
>> Thank you for the new version.
>>
>> On 27/04/2026 23:10, David Carlier wrote:
>>> MPTCP already advertises IP_RECVERR/IPV6_RECVERR as supported, but the
>>> parent socket does not currently provide usable MSG_ERRQUEUE handling.
>>>
>>> This series wires the MPTCP socket up to the IPv4/IPv6 error queue
>>> paths. It propagates RECVERR-related sockopts to existing and future
>>> subflows, makes poll() report pending errqueue activity through the
>>> parent socket, and lets recvmsg(MSG_ERRQUEUE) on the MPTCP socket
>>> consume queued errors with the parent socket ABI.
>>>
>>> A new prerequisite patch factors the per-flag inet_flags propagation
>>> in sync_socket_options() into a single masked word copy, so further
>>> inet_flags propagated by MPTCP can be added by extending the mask
>>> rather than touching the call site.
>>>
>>> Patch 2 then leverages the existing mptcp_setsockopt_all_sf() helper
>>> for the setsockopt path and extends MPTCP_INET_FLAGS_MASK with the
>>> four RECVERR bits, dropping the family-specific helpers from v3.
>>>
>>> Based-on: <20260424-mptcp-pm-sockopt-set-all-sf-v1-1-38e7023822f8@kernel.org>
>>
>> I didn't review it, but I notice that the CI cannot apply your series,
>> because it looks like it is not based on the one you mentioned here.
>>
>> Can you either remove this line, or rebase your series on top of this
>> other patch?
>>
>> Also, please don't send your series as a reply to a previous posting,
>> please use a new thread. That's what is usually done, clearer, plus some
>> tools don't support replies.
>
> Note: I just manually resolved the conflicts and sent the series to the
> CI, not to have to resend a series just to retrigger the CI.
It looks like the CI (and sashiko) found some issues with this series.
But globally, I'm a bit puzzled: with MPTCP, there might be multiple
paths being used, and reporting errors about all of them when the
"legacy" RECVERR socket options are used will confuse the userspace that
doesn't (have to) know multiple subflows are being used. In this case,
either messages should be filtered (might be hard to handle all
use-cases and maintain that?), or this should be limited to cases where
only one subflow is being used. Which leads me to this question: what's
your use-case exactly? What are you trying to solve?
It might be easier to have a dedicated MPTCP_RECERR, and eventually
propagate more MPTCP-specific messages. Something that could be linked to:
https://github.com/multipath-tcp/mptcp_net-next/issues/78
WDYT?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-05-01 14:49 ` Matthieu Baerts
@ 2026-05-01 15:28 ` David CARLIER
2026-05-01 15:56 ` Matthieu Baerts
0 siblings, 1 reply; 31+ messages in thread
From: David CARLIER @ 2026-05-01 15:28 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: mptcp, Mat Martineau, Geliang Tang, Paolo Abeni
Hi Matthieu,
On 01/05/2026 16:49, Matthieu Baerts wrote:
> It looks like the CI (and sashiko) found some issues with this series.
For v5:
- 1/4: per-bit inet_assign_bit() loop instead of WRITE_ONCE(), keeps
atomicity.
- 2/4: add missing sockopt_seq_inc(msk).
- 2/4: skip family-mismatched subflows in the v4/v6 helpers.
- 2/4: snapshot optval to a local int, pass KERNEL_SOCKPTR(&val) into
the loop.
- 3/4: pull-on-drain from mptcp_recv_error() so a parent-ENOMEM does
not strand subflow skbs.
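To make the first item concrete, here is a small userspace model of the two
ways of propagating a set of flag bits: one masked word copy versus a
per-bit loop. It is only a sketch. The bit numbers and names are made up and
merely stand in for the INET_FLAGS_* values, and this model is not atomic;
it only shows that both forms compute the same result. In the kernel, the
point of the per-bit form is that each assignment is an atomic bit
operation, so a concurrent update to an unrelated bit in the same word is
not overwritten by a plain store of the whole word.

```c
#include <assert.h>
#include <limits.h>

#define FLAG_BIT(nr)	(1UL << (nr))

/* Hypothetical bit numbers, standing in for INET_FLAGS_* values. */
enum { F_FREEBIND = 1, F_RECVERR = 3, F_RECVERR6 = 5, F_OTHER = 7 };

#define PROPAGATED_MASK \
	(FLAG_BIT(F_FREEBIND) | FLAG_BIT(F_RECVERR) | FLAG_BIT(F_RECVERR6))

/* Masked word copy: replace the masked bits of dst with those of src. */
static unsigned long copy_flags_word(unsigned long dst, unsigned long src)
{
	return (dst & ~PROPAGATED_MASK) | (src & PROPAGATED_MASK);
}

/* Per-bit variant: walk the mask and assign one bit at a time. */
static unsigned long copy_flags_per_bit(unsigned long dst, unsigned long src)
{
	unsigned int nr;

	for (nr = 0; nr < sizeof(dst) * CHAR_BIT; nr++) {
		if (!(PROPAGATED_MASK & FLAG_BIT(nr)))
			continue;	/* bit is not propagated, leave dst alone */
		if (src & FLAG_BIT(nr))
			dst |= FLAG_BIT(nr);
		else
			dst &= ~FLAG_BIT(nr);
	}
	return dst;
}
```

Bits outside the mask are left untouched in the destination in both
variants, which is what lets new flags be added by extending the mask only.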
Will also re-run the docker repro to check the selftest_mptcp_join /
packetdrill rows are pre-existing.
> But globally, I'm a bit puzzled: with MPTCP, there might be multiple
> paths being used, and reporting errors about all of them when the
> "legacy" RECVERR socket options are used will confuse the userspace
> that doesn't (have to) know multiple subflows are being used.
Fair, and Paolo raised it on v3. The use-case is tx timestamping and
MSG_ZEROCOPY completions - both are tied to user data, not the
subflow that carried it, so no subflow identity leaks into the cmsg.
ICMP/ICMPv6 is the part that does. v5 will filter the splice by
SO_EE_ORIGIN: forward TIMESTAMPING / ZEROCOPY / LOCAL, drop ICMP.
> It might be easier to have a dedicated MPTCP_RECERR, and eventually
> propagate more MPTCP-specific messages. Something that could be
> linked to:
> https://github.com/multipath-tcp/mptcp_net-next/issues/78
Agreed - subflow ICMP and #78's lifecycle events belong there. As a
follow-up once v5 lands.
Cheers
* Re: [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows
2026-04-27 21:10 ` [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-05-01 15:56 ` Matthieu Baerts
0 siblings, 0 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-05-01 15:56 UTC (permalink / raw)
To: David Carlier, mptcp; +Cc: Mat Martineau, Geliang Tang, Paolo Abeni
On 27/04/2026 23:10, David Carlier wrote:
> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to existing
> and future subflows. The setsockopt path forwards each option to
> every subflow via mptcp_setsockopt_all_sf(); newly-joining subflows
> inherit the four RECVERR bits through sync_socket_options() now that
> MPTCP_INET_FLAGS_MASK covers them.
>
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Assisted-by: Codex:gpt-5
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
> net/mptcp/sockopt.c | 97 ++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 79 insertions(+), 18 deletions(-)
>
> diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
> index 41c9dc9cf95e..171e83e66a97 100644
> --- a/net/mptcp/sockopt.c
> +++ b/net/mptcp/sockopt.c
(...)
> @@ -388,6 +394,41 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
> return -EOPNOTSUPP;
> }
>
> +static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
> + int optname, sockptr_t optval,
> + unsigned int optlen)
> +{
> + struct mptcp_subflow_context *subflow;
> + int ret = 0;
> +
> + mptcp_for_each_subflow(msk, subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> + ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
> + if (ret)
> + break;
> + }
> + return ret;
> +}
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +static int mptcp_setsockopt_v6_recverr(struct mptcp_sock *msk, int optname,
> + sockptr_t optval, unsigned int optlen)
> +{
> + struct sock *sk = (struct sock *)msk;
> + int ret;
> +
> + ret = ipv6_setsockopt(sk, SOL_IPV6, optname, optval, optlen);
> + if (ret)
> + return ret;
> +
> + lock_sock(sk);
> + ret = mptcp_setsockopt_all_sf(msk, SOL_IPV6, optname, optval, optlen);
> + release_sock(sk);
> + return ret;
> +}
> +#endif
Maybe you could have one generic helper to call xxx_setsockopt() on the
MPTCP socket, and then call mptcp_setsockopt_all_sf(). You can pass the
level, and call the right function.
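A userspace model of that shape, for illustration only: every name below is
hypothetical, and the two stub functions merely play the role of
ip_setsockopt() and ipv6_setsockopt(). Locking and the sockptr_t plumbing of
the real code are left out. IPPROTO_IP and IPPROTO_IPV6 are used as the
level values; on Linux they have the same values as SOL_IP and SOL_IPV6.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>	/* IPPROTO_IP, IPPROTO_IPV6 */

/* Hypothetical stand-ins for the parent socket and its subflows. */
struct model_sock {
	int recverr4;
	int recverr6;
};

struct model_msk {
	struct model_sock parent;
	struct model_sock subflows[4];
	size_t nr_subflows;
};

typedef int (*setopt_fn)(struct model_sock *sk, int val);

/* Stub playing the role of ip_setsockopt(). */
static int model_ip_setopt(struct model_sock *sk, int val)
{
	sk->recverr4 = !!val;
	return 0;
}

/* Stub playing the role of ipv6_setsockopt(). */
static int model_ipv6_setopt(struct model_sock *sk, int val)
{
	sk->recverr6 = !!val;
	return 0;
}

/* One generic helper: pick the setter by level, apply to parent, then all. */
static int model_setsockopt_all(struct model_msk *msk, int level, int val)
{
	setopt_fn fn;
	size_t i;
	int ret;

	switch (level) {
	case IPPROTO_IP:
		fn = model_ip_setopt;
		break;
	case IPPROTO_IPV6:
		fn = model_ipv6_setopt;
		break;
	default:
		return -EOPNOTSUPP;
	}

	ret = fn(&msk->parent, val);
	if (ret)
		return ret;

	for (i = 0; i < msk->nr_subflows; i++) {
		ret = fn(&msk->subflows[i], val);
		if (ret)
			break;
	}
	return ret;
}
```

With the level passed in, the v4 and v6 wrappers collapse into one function
and only the dispatch differs.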
(...)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-05-01 15:28 ` David CARLIER
@ 2026-05-01 15:56 ` Matthieu Baerts
0 siblings, 0 replies; 31+ messages in thread
From: Matthieu Baerts @ 2026-05-01 15:56 UTC (permalink / raw)
To: David CARLIER; +Cc: mptcp, Mat Martineau, Geliang Tang, Paolo Abeni
On 01/05/2026 17:28, David CARLIER wrote:
> Hi Matthieu,
>
> On 01/05/2026 16:49, Matthieu Baerts wrote:
> > It looks like the CI (and sashiko) found some issues with this series.
(Please do fix your email client to avoid this formatting: some of your
emails are OK, but not all of them.)
> For v5:
>
> - 1/4: per-bit inet_assign_bit() loop instead of WRITE_ONCE(), keeps
> atomicity.
> - 2/4: add missing sockopt_seq_inc(msk).
> - 2/4: skip family-mismatched subflows in the v4/v6 helpers.
> - 2/4: snapshot optval to a local int, pass KERNEL_SOCKPTR(&val) into
> the loop.
(While at it, your new helpers mptcp_setsockopt_v[46]_recverr could have
a generic name)
> - 3/4: pull-on-drain from mptcp_recv_error() so a parent-ENOMEM does
> not strand subflow skbs.
>
> Will also re-run the docker repro to check the selftest_mptcp_join /
> packetdrill rows are pre-existing.
The packetdrill errors might be pre-existing, someone should look at
improving the situation there:
https://ci-results.mptcp.dev/flakes.html
> > But globally, I'm a bit puzzled: with MPTCP, there might be multiple
> > paths being used, and reporting errors about all of them when the
> > "legacy" RECVERR socket options are used will confuse the userspace
> > that doesn't (have to) know multiple subflows are being used.
>
> Fair, and Paolo raised it on v3. The use-case is tx timestamping and
> MSG_ZEROCOPY completions - both are tied to user data, not the
> subflow that carried it, so no subflow identity leaks into the cmsg.
> ICMP/ICMPv6 is the part that does. v5 will filter the splice by
> SO_EE_ORIGIN: forward TIMESTAMPING / ZEROCOPY / LOCAL, drop ICMP.
Maybe OK with this filter indeed.
> > It might be easier to have a dedicated MPTCP_RECERR, and eventually
> > propagate more MPTCP-specific messages. Something that could be
> > linked to:
> > https://github.com/multipath-tcp/mptcp_net-next/issues/78
>
> Agreed - subflow ICMP and #78's lifecycle events belong there. As a
> follow-up once v5 lands.
Indeed, better to split them.
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
Thread overview: 31+ messages
2026-04-21 22:33 [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
2026-04-21 22:33 ` [PATCH mptcp-next v3 1/3] mptcp: propagate RECVERR sockopts to subflows David Carlier
2026-04-22 8:05 ` Paolo Abeni
2026-04-22 8:32 ` Matthieu Baerts
2026-04-22 8:35 ` Matthieu Baerts
2026-04-22 8:36 ` Matthieu Baerts
2026-04-22 8:48 ` Paolo Abeni
2026-04-22 8:50 ` Matthieu Baerts
2026-04-22 13:53 ` Paolo Abeni
2026-04-22 21:51 ` David CARLIER
2026-04-27 17:07 ` Matthieu Baerts
2026-04-21 22:33 ` [PATCH mptcp-next v3 2/3] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
2026-04-22 8:28 ` Paolo Abeni
2026-04-22 21:54 ` David CARLIER
2026-04-21 22:33 ` [PATCH mptcp-next v3 3/3] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
2026-04-21 23:38 ` [PATCH mptcp-next v3 0/3] mptcp: MSG_ERRQUEUE support on the parent socket MPTCP CI
2026-04-22 8:22 ` Matthieu Baerts
2026-04-22 8:56 ` David CARLIER
2026-04-27 21:10 ` [PATCH mptcp-next v4 0/4] " David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
2026-05-01 15:56 ` Matthieu Baerts
2026-04-27 21:10 ` [PATCH mptcp-next v4 3/4] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
2026-04-27 21:10 ` [PATCH mptcp-next v4 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
2026-04-28 18:48 ` [PATCH mptcp-next v4 0/4] mptcp: MSG_ERRQUEUE support on the parent socket Matthieu Baerts
2026-04-28 18:56 ` Matthieu Baerts
2026-04-28 19:15 ` David CARLIER
2026-05-01 14:49 ` Matthieu Baerts
2026-05-01 15:28 ` David CARLIER
2026-05-01 15:56 ` Matthieu Baerts
2026-04-28 19:48 ` MPTCP CI