* [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
@ 2026-05-09 21:16 David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: David Carlier @ 2026-05-09 21:16 UTC
To: mptcp; +Cc: matttbe, martineau, geliang, pabeni, David Carlier
This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
completion notifications through the standard inet ABI. IP_RECVERR /
IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
and future subflows.
Patch 1 factors per-flag inet_assign_bit() calls in
sync_socket_options() into a mask-driven loop so future propagated
flags only need to extend MPTCP_INET_FLAGS_MASK.
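As a rough userspace illustration of the mask-driven approach in patch 1
(not the kernel code: the flag indices, DEMO_BIT(), DEMO_MASK and
propagate_flags() are hypothetical stand-ins for INET_FLAGS_*, BIT(),
MPTCP_INET_FLAGS_MASK and the loop in sync_socket_options()), copying only
the masked bits could look like this:

```c
#include <limits.h>

/* Hypothetical flag indices standing in for INET_FLAGS_*; the values are
 * illustrative only and do not match the kernel's enum.
 */
enum { FLAG_TRANSPARENT = 3, FLAG_FREEBIND = 5, FLAG_BIND_NO_PORT = 9 };

#define DEMO_BIT(n)	(1UL << (n))
#define DEMO_MASK	(DEMO_BIT(FLAG_TRANSPARENT) | \
			 DEMO_BIT(FLAG_FREEBIND) | \
			 DEMO_BIT(FLAG_BIND_NO_PORT))

/* Copy only the bits selected by mask from src into dst, one bit at a
 * time, leaving every other bit of dst unchanged.
 */
static unsigned long propagate_flags(unsigned long dst, unsigned long src,
				     unsigned long mask)
{
	unsigned int b;

	for (b = 0; b < sizeof(unsigned long) * CHAR_BIT; b++) {
		if (!(mask & DEMO_BIT(b)))
			continue;	/* bit is not propagated */
		if (src & DEMO_BIT(b))
			dst |= DEMO_BIT(b);
		else
			dst &= ~DEMO_BIT(b);
	}
	return dst;
}
```

With this shape, propagating one more flag only means OR-ing one more bit
into the mask; the loop itself stays untouched.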
Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
apply it on the parent, and forward to every subflow under lock_sock()
so concurrent setsockopt callers cannot leave parent and subflows
desynchronized. Newly-joining subflows pick up the four RECVERR bits
through sync_socket_options().
Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
from each subflow's error queue onto the parent's, so pollers see
EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
ICMP errors are dropped — they will be carried by a future
MPTCP_RECERR channel.
Patch 4 covers IP_RECVERR / IPV6_RECVERR propagation and the empty-
errqueue EAGAIN contract on MSG_ERRQUEUE | MSG_DONTWAIT in selftest.
v6 -> v7:
- patch 2: gate SOL_IPV6 setsockopt/getsockopt dispatch on
sk_family == AF_INET6, returning -ENOPROTOOPT otherwise, mirroring
plain TCP. Addresses the sashiko Medium finding on v6 where
IPV6_RECVERR silently succeeded on AF_INET MPTCP sockets.
- patch 3: track moved skbs in mptcp_recv_error() and retry
inet_recv_error() when ret == -EAGAIN && moved, so a successful
subflow splice is not masked by the initial drain returning EAGAIN
(sashiko High #2 on v6).
- patch 3: add mptcp_subflow_errqueue_pending() and OR it into the
EPOLLERR check in mptcp_poll(), so events stranded on a subflow
when the parent is under rmem pressure still wake userspace
(sashiko High #1 on v6).
- rebased on current export.
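The plain-TCP behaviour that the first item mirrors can be observed from
userspace with a sketch like the following. try_ipv6_recverr() is a
hypothetical helper; that an MPTCP socket now behaves the same way is the
claim of the changelog above and is not verified by this snippet.

```c
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Try to enable IPV6_RECVERR on fd. Returns 0 on success, or the errno
 * value reported by setsockopt() on failure.
 */
static int try_ipv6_recverr(int fd)
{
	int one = 1;

	if (setsockopt(fd, SOL_IPV6, IPV6_RECVERR, &one, sizeof(one)) < 0)
		return errno;
	return 0;
}
```

Called on an AF_INET TCP socket, the helper is expected to return
ENOPROTOOPT, because the IPv4 setsockopt path rejects any level other than
SOL_IP.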
Tested with KVM-validation auto-normal: 25/25 pass.
David Carlier (4):
mptcp: sockopt: factor inet_flags propagation into a mask
mptcp: propagate RECVERR sockopts to subflows
mptcp: support MSG_ERRQUEUE on the parent socket
selftests: mptcp: cover IP_RECVERR sockopt propagation
net/mptcp/protocol.c | 92 ++++++++++-
net/mptcp/sockopt.c | 146 ++++++++++++++----
.../selftests/net/mptcp/mptcp_sockopt.c | 55 +++++++
3 files changed, 261 insertions(+), 32 deletions(-)
base-commit: 63b133728231ebba5167bd1e53dda9bcf0bee7c7
--
2.53.0
* [PATCH mptcp-next v7 1/4] mptcp: sockopt: factor inet_flags propagation into a mask
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
@ 2026-05-09 21:16 ` David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: David Carlier @ 2026-05-09 21:16 UTC
To: mptcp; +Cc: matttbe, martineau, geliang, pabeni, David Carlier
Introduce MPTCP_INET_FLAGS_MASK and replace the per-flag
inet_assign_bit() calls in sync_socket_options() with a loop driven
by the mask that calls assign_bit() per set bit, preserving the
per-bit atomicity of the original. Further flags propagated by MPTCP
can be added by extending the mask rather than touching the call
site.
No functional change.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/sockopt.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 1544e3563852..114436d2e401 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -16,6 +16,10 @@
#define MIN_INFO_OPTLEN_SIZE 16
#define MIN_FULL_INFO_OPTLEN_SIZE 40
+#define MPTCP_INET_FLAGS_MASK \
+ (BIT(INET_FLAGS_TRANSPARENT) | \
+ BIT(INET_FLAGS_FREEBIND) | \
+ BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT))
static struct sock *__mptcp_tcp_fallback(struct mptcp_sock *msk)
{
@@ -1546,6 +1550,9 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
{
static const unsigned int tx_rx_locks = SOCK_RCVBUF_LOCK | SOCK_SNDBUF_LOCK;
struct sock *sk = (struct sock *)msk;
+ unsigned long mask = MPTCP_INET_FLAGS_MASK;
+ unsigned long src;
+ int b;
bool keep_open;
keep_open = sock_flag(sk, SOCK_KEEPOPEN);
@@ -1592,9 +1599,11 @@ static void sync_socket_options(struct mptcp_sock *msk, struct sock *ssk)
tcp_sock_set_keepcnt(ssk, msk->keepalive_cnt);
tcp_sock_set_maxseg(ssk, msk->maxseg);
- inet_assign_bit(TRANSPARENT, ssk, inet_test_bit(TRANSPARENT, sk));
- inet_assign_bit(FREEBIND, ssk, inet_test_bit(FREEBIND, sk));
- inet_assign_bit(BIND_ADDRESS_NO_PORT, ssk, inet_test_bit(BIND_ADDRESS_NO_PORT, sk));
+ src = READ_ONCE(inet_sk(sk)->inet_flags);
+
+ for_each_set_bit(b, &mask, BITS_PER_LONG)
+ assign_bit(b, &inet_sk(ssk)->inet_flags, src & BIT(b));
+
WRITE_ONCE(inet_sk(ssk)->local_port_range, READ_ONCE(inet_sk(sk)->local_port_range));
}
--
2.53.0
* [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
@ 2026-05-09 21:16 ` David Carlier
2026-05-27 5:37 ` Matthieu Baerts
2026-05-09 21:16 ` [PATCH mptcp-next v7 3/4] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
` (3 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: David Carlier @ 2026-05-09 21:16 UTC
To: mptcp; +Cc: matttbe, martineau, geliang, pabeni, David Carlier
Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to existing
and future subflows.
mptcp_setsockopt_recverr() first applies the option to the parent socket
via ip_setsockopt() / ipv6_setsockopt(), which also validates optval and
optlen. It then re-reads the resulting bit into a local int under
lock_sock() and forwards it to every subflow via
mptcp_setsockopt_all_sf(), which bumps msk->setsockopt_seq on success.
Newly-joining subflows pick up the four RECVERR bits through
sync_socket_options() now that MPTCP_INET_FLAGS_MASK covers them.
mptcp_setsockopt_all_sf() skips IPv4 subflows when called with
SOL_IPV6 to avoid the -ENOPROTOOPT that ip_setsockopt() returns on
level mismatch in AF_INET6 msks carrying IPv4 subflows.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/sockopt.c | 133 ++++++++++++++++++++++++++++++++++++--------
1 file changed, 109 insertions(+), 24 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 114436d2e401..fbbd1692af7e 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -8,6 +8,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
#include <net/sock.h>
#include <net/protocol.h>
#include <net/tcp.h>
@@ -19,7 +21,11 @@
#define MPTCP_INET_FLAGS_MASK \
(BIT(INET_FLAGS_TRANSPARENT) | \
BIT(INET_FLAGS_FREEBIND) | \
- BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT))
+ BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT) | \
+ BIT(INET_FLAGS_RECVERR) | \
+ BIT(INET_FLAGS_RECVERR_RFC4884) | \
+ BIT(INET_FLAGS_RECVERR6) | \
+ BIT(INET_FLAGS_RECVERR6_RFC4884))
static struct sock *__mptcp_tcp_fallback(struct mptcp_sock *msk)
{
@@ -394,6 +400,81 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
return -EOPNOTSUPP;
}
+static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
+ int optname, sockptr_t optval,
+ unsigned int optlen)
+{
+ struct mptcp_subflow_context *subflow;
+ int ret = 0;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ if (level == SOL_IPV6 && ssk->sk_family != AF_INET6)
+ continue;
+
+ ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
+ if (ret)
+ break;
+ }
+
+ if (!ret)
+ sockopt_seq_inc(msk);
+
+ return ret;
+}
+
+static int mptcp_setsockopt_recverr(struct mptcp_sock *msk, int level,
+ int optname, sockptr_t optval,
+ unsigned int optlen)
+{
+ struct sock *sk = (struct sock *)msk;
+ int val, ret;
+
+ /* Let ip_setsockopt() / ipv6_setsockopt() validate optval and optlen
+ * (so 1-byte boolean writes keep the same ABI as plain TCP) and update
+ * the parent's RECVERR bit. Re-read that bit under lock_sock() and
+ * push it to the subflows: concurrent setsockopt callers cannot leave
+ * parent and subflows desynchronized this way.
+ */
+ if (level == SOL_IP)
+ ret = ip_setsockopt(sk, level, optname, optval, optlen);
+#if IS_ENABLED(CONFIG_IPV6)
+ else if (level == SOL_IPV6)
+ ret = ipv6_setsockopt(sk, level, optname, optval, optlen);
+#endif
+ else
+ return -EOPNOTSUPP;
+ if (ret)
+ return ret;
+
+ lock_sock(sk);
+ switch (optname) {
+ case IP_RECVERR:
+ val = inet_test_bit(RECVERR, sk);
+ break;
+ case IP_RECVERR_RFC4884:
+ val = inet_test_bit(RECVERR_RFC4884, sk);
+ break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case IPV6_RECVERR:
+ val = inet6_test_bit(RECVERR6, sk);
+ break;
+ case IPV6_RECVERR_RFC4884:
+ val = inet6_test_bit(RECVERR6_RFC4884, sk);
+ break;
+#endif
+ default:
+ release_sock(sk);
+ return -EOPNOTSUPP;
+ }
+
+ ret = mptcp_setsockopt_all_sf(msk, level, optname,
+ KERNEL_SOCKPTR(&val), sizeof(val));
+ release_sock(sk);
+ return ret;
+}
+
static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -436,6 +517,10 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
release_sock(sk);
break;
+ case IPV6_RECVERR:
+ case IPV6_RECVERR_RFC4884:
+ ret = mptcp_setsockopt_recverr(msk, SOL_IPV6, optname, optval, optlen);
+ break;
}
return ret;
@@ -781,6 +866,9 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
return mptcp_setsockopt_sol_ip_set(msk, optname, optval, optlen);
case IP_TOS:
return mptcp_setsockopt_v4_set_tos(msk, optname, optval, optlen);
+ case IP_RECVERR:
+ case IP_RECVERR_RFC4884:
+ return mptcp_setsockopt_recverr(msk, SOL_IP, optname, optval, optlen);
}
return -EOPNOTSUPP;
@@ -808,27 +896,6 @@ static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int level, int
return ret;
}
-static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
- int optname, sockptr_t optval,
- unsigned int optlen)
-{
- struct mptcp_subflow_context *subflow;
- int ret = 0;
-
- mptcp_for_each_subflow(msk, subflow) {
- struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-
- ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
- if (ret)
- break;
- }
-
- if (!ret)
- sockopt_seq_inc(msk);
-
- return ret;
-}
-
static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
sockptr_t optval, unsigned int optlen)
{
@@ -932,8 +999,11 @@ int mptcp_setsockopt(struct sock *sk, int level, int optname,
if (level == SOL_IP)
return mptcp_setsockopt_v4(msk, optname, optval, optlen);
- if (level == SOL_IPV6)
+ if (level == SOL_IPV6) {
+ if (sk->sk_family != AF_INET6)
+ return -ENOPROTOOPT;
return mptcp_setsockopt_v6(msk, optname, optval, optlen);
+ }
if (level == SOL_TCP)
return mptcp_setsockopt_sol_tcp(msk, optname, optval, optlen);
@@ -1473,6 +1543,12 @@ static int mptcp_getsockopt_v4(struct mptcp_sock *msk, int optname,
case IP_LOCAL_PORT_RANGE:
return mptcp_put_int_option(msk, optval, optlen,
READ_ONCE(inet_sk(sk)->local_port_range));
+ case IP_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR, sk));
+ case IP_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet_test_bit(RECVERR_RFC4884, sk));
}
return -EOPNOTSUPP;
@@ -1493,6 +1569,12 @@ static int mptcp_getsockopt_v6(struct mptcp_sock *msk, int optname,
case IPV6_FREEBIND:
return mptcp_put_int_option(msk, optval, optlen,
inet_test_bit(FREEBIND, sk));
+ case IPV6_RECVERR:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6, sk));
+ case IPV6_RECVERR_RFC4884:
+ return mptcp_put_int_option(msk, optval, optlen,
+ inet6_test_bit(RECVERR6_RFC4884, sk));
}
return -EOPNOTSUPP;
@@ -1537,8 +1619,11 @@ int mptcp_getsockopt(struct sock *sk, int level, int optname,
if (level == SOL_IP)
return mptcp_getsockopt_v4(msk, optname, optval, option);
- if (level == SOL_IPV6)
+ if (level == SOL_IPV6) {
+ if (sk->sk_family != AF_INET6)
+ return -ENOPROTOOPT;
return mptcp_getsockopt_v6(msk, optname, optval, option);
+ }
if (level == SOL_TCP)
return mptcp_getsockopt_sol_tcp(msk, optname, optval, option);
if (level == SOL_MPTCP)
--
2.53.0
* [PATCH mptcp-next v7 3/4] mptcp: support MSG_ERRQUEUE on the parent socket
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 1/4] mptcp: sockopt: factor inet_flags propagation into a mask David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-05-09 21:16 ` David Carlier
2026-05-09 21:16 ` [PATCH mptcp-next v7 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: David Carlier @ 2026-05-09 21:16 UTC
To: mptcp; +Cc: matttbe, martineau, geliang, pabeni, David Carlier
Splice pending err skbs from each subflow's error queue onto the parent
msk's error queue at error-report time, so poll() and recvmsg(MSG_ERRQUEUE)
on the parent socket observe TX timestamps and MSG_ZEROCOPY completion
notifications through the standard inet ABI.
The splice filters by SO_EE_ORIGIN: TIMESTAMPING / ZEROCOPY / LOCAL
events forward to the parent because they are tied to user-handed data,
not to a specific path; subflow-level ICMP errors are dropped because
the legacy RECVERR ABI cannot meaningfully convey their per-subflow peer
identity to single-path-aware userspace. Such events will be carried by
a future MPTCP_RECERR channel.
mptcp_recv_error() retries the splice on the pull side: if
sock_queue_err_skb() previously failed under rmem pressure, the skb
stays on the subflow queue, and the next recvmsg(MSG_ERRQUEUE) splices
it once the parent's queue has been drained.
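The splice-and-retry behaviour described above can be modelled in
userspace as follows. This is only a sketch under simplifying assumptions:
struct errq, its fixed capacity (standing in for the rmem limit),
splice_errq() and pop_errq() are hypothetical names, and events are reduced
to their origin.

```c
#include <stdbool.h>
#include <stddef.h>

enum origin { ORIGIN_LOCAL, ORIGIN_ICMP, ORIGIN_TIMESTAMPING, ORIGIN_ZEROCOPY };

struct errq {
	enum origin ev[8];
	size_t len;
	size_t cap;	/* models the rmem limit; must not exceed 8 */
};

static bool forwardable(enum origin o)
{
	return o != ORIGIN_ICMP;	/* path-specific events are dropped */
}

/* Move forwardable events from sub to parent in order. Non-forwardable
 * events are discarded. When parent is full, stop and keep the remaining
 * events on sub. Returns true if at least one event was moved.
 */
static bool splice_errq(struct errq *parent, struct errq *sub)
{
	size_t i = 0, kept = 0;
	bool moved = false;

	while (i < sub->len) {
		enum origin o = sub->ev[i];

		if (!forwardable(o)) {
			i++;
			continue;
		}
		if (parent->len == parent->cap)
			break;	/* parent full: leave the rest on sub */
		parent->ev[parent->len++] = o;
		moved = true;
		i++;
	}

	/* Compact the unconsumed tail to the front of sub. */
	while (i < sub->len)
		sub->ev[kept++] = sub->ev[i++];
	sub->len = kept;

	return moved;
}

/* Models one recvmsg(MSG_ERRQUEUE) on the parent: pop the oldest event. */
static bool pop_errq(struct errq *q, enum origin *out)
{
	size_t i;

	if (!q->len)
		return false;
	*out = q->ev[0];
	for (i = 1; i < q->len; i++)
		q->ev[i - 1] = q->ev[i];
	q->len--;
	return true;
}
```

In this model, an event that does not fit stays on the subflow queue until
a pop on the parent frees a slot, after which the next splice moves it.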
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/mptcp/protocol.c | 92 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 86 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 93e7a42fc65c..53abb8dc2c0f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -11,6 +11,7 @@
#include <linux/netdevice.h>
#include <linux/sched/signal.h>
#include <linux/atomic.h>
+#include <linux/errqueue.h>
#include <net/aligned_data.h>
#include <net/rps.h>
#include <net/sock.h>
@@ -815,21 +816,52 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk)
return moved;
}
+static bool mptcp_errqueue_skb_forwardable(const struct sk_buff *skb)
+{
+ u8 origin = SKB_EXT_ERR(skb)->ee.ee_origin;
+
+ return origin == SO_EE_ORIGIN_TIMESTAMPING ||
+ origin == SO_EE_ORIGIN_ZEROCOPY ||
+ origin == SO_EE_ORIGIN_LOCAL;
+}
+
+static bool __mptcp_subflow_splice_errqueue(struct sock *sk, struct sock *ssk)
+{
+ struct sk_buff *skb;
+ bool moved = false;
+
+ while ((skb = skb_dequeue(&ssk->sk_error_queue))) {
+ if (!mptcp_errqueue_skb_forwardable(skb)) {
+ kfree_skb(skb); /* path-specific (ICMP) — belongs in MPTCP_RECERR */
+ continue;
+ }
+ if (sock_queue_err_skb(sk, skb)) {
+ skb_queue_head(&ssk->sk_error_queue, skb);
+ break;
+ }
+ moved = true;
+ }
+
+ return moved;
+}
+
static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
{
int ssk_state;
+ bool report;
int err;
+ report = __mptcp_subflow_splice_errqueue(sk, ssk);
+
/* only propagate errors on fallen-back sockets or
* on MPC connect
*/
if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk)))
- return false;
+ goto out;
err = sock_error(ssk);
if (!err)
- return false;
-
+ goto out;
/* We need to propagate only transition to CLOSE state.
* Orphaned socket will see such state change via
* subflow_sched_work_if_closed() and that path will properly
@@ -839,6 +871,11 @@ static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk)
if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD))
mptcp_set_state(sk, ssk_state);
WRITE_ONCE(sk->sk_err, -err);
+ report = true;
+
+out:
+ if (!report)
+ return false;
/* This barrier is coupled with smp_rmb() in mptcp_poll() */
smp_wmb();
@@ -2286,6 +2323,35 @@ static unsigned int mptcp_inq_hint(const struct sock *sk)
return 0;
}
+static int mptcp_recv_error(struct sock *sk, struct msghdr *msg, int len)
+{
+ struct mptcp_sock *msk = mptcp_sk(sk);
+ struct mptcp_subflow_context *subflow;
+ bool moved = false;
+ int ret;
+
+ /* Drain the parent first: a previous splice may have failed under
+ * rmem pressure and the skb stayed on a subflow. Freeing space here
+ * lets the splice below succeed; sock_queue_err_skb() then re-asserts
+ * EPOLLERR so userspace knows to drain again on the next poll.
+ */
+ ret = inet_recv_error(sk, msg, len);
+
+ lock_sock(sk);
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ if (!skb_queue_empty_lockless(&ssk->sk_error_queue))
+ moved |= __mptcp_subflow_splice_errqueue(sk, ssk);
+ }
+ release_sock(sk);
+
+ if (ret == -EAGAIN && moved)
+ ret = inet_recv_error(sk, msg, len);
+
+ return ret;
+}
+
static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int flags)
{
@@ -2295,9 +2361,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int target;
long timeo;
- /* MSG_ERRQUEUE is really a no-op till we support IP_RECVERR */
if (unlikely(flags & MSG_ERRQUEUE))
- return inet_recv_error(sk, msg, len);
+ return mptcp_recv_error(sk, msg, len);
lock_sock(sk);
if (unlikely(sk->sk_state == TCP_LISTEN)) {
@@ -4298,6 +4363,19 @@ static __poll_t mptcp_check_writeable(struct mptcp_sock *msk)
return 0;
}
+static bool mptcp_subflow_errqueue_pending(const struct mptcp_sock *msk)
+{
+ struct mptcp_subflow_context *subflow;
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ if (!skb_queue_empty_lockless(&ssk->sk_error_queue))
+ return true;
+ }
+ return false;
+}
+
static __poll_t mptcp_poll(struct file *file, struct socket *sock,
struct poll_table_struct *wait)
{
@@ -4341,7 +4419,9 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
/* This barrier is coupled with smp_wmb() in __mptcp_error_report() */
smp_rmb();
- if (READ_ONCE(sk->sk_err))
+ if (READ_ONCE(sk->sk_err) ||
+ !skb_queue_empty_lockless(&sk->sk_error_queue) ||
+ mptcp_subflow_errqueue_pending(msk))
mask |= EPOLLERR;
return mask;
--
2.53.0
* [PATCH mptcp-next v7 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (2 preceding siblings ...)
2026-05-09 21:16 ` [PATCH mptcp-next v7 3/4] mptcp: support MSG_ERRQUEUE on the parent socket David Carlier
@ 2026-05-09 21:16 ` David Carlier
2026-05-09 22:24 ` [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket MPTCP CI
2026-05-27 5:08 ` Matthieu Baerts
5 siblings, 0 replies; 10+ messages in thread
From: David Carlier @ 2026-05-09 21:16 UTC
To: mptcp; +Cc: matttbe, martineau, geliang, pabeni, David Carlier
Exercise setsockopt/getsockopt of IP_RECVERR and IPV6_RECVERR on the
MPTCP parent socket, including the empty-errqueue EAGAIN contract on
MSG_ERRQUEUE|MSG_DONTWAIT.
End-to-end errqueue delivery (ICMP, TX timestamps, zerocopy) depends on
subflow-side producers that are out of scope for this series and will be
covered by follow-up work.
Assisted-by: Codex:gpt-5
Signed-off-by: David Carlier <devnexen@gmail.com>
---
.../selftests/net/mptcp/mptcp_sockopt.c | 55 +++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
index b6e58d936ebe..95bb2cc8e2ff 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
@@ -769,6 +769,60 @@ static void test_ip_tos_sockopt(int fd)
xerror("expect socklen_t == -1");
}
+static void test_ip_recverr_sockopt(int fd)
+{
+ struct iovec iov = {
+ .iov_base = &(char){ 0 },
+ .iov_len = 1,
+ };
+ struct msghdr msg = {
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ int one = 1, zero = 0, val = -1;
+ socklen_t s = sizeof(val);
+ int level, optname, r;
+
+ switch (pf) {
+ case AF_INET:
+ level = SOL_IP;
+ optname = IP_RECVERR;
+ break;
+ case AF_INET6:
+ level = SOL_IPV6;
+ optname = IPV6_RECVERR;
+ break;
+ default:
+ xerror("Unknown pf %d\n", pf);
+ }
+
+ r = setsockopt(fd, level, optname, &one, sizeof(one));
+ if (r)
+ die_perror("setsockopt recverr on");
+
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr on");
+ if (s != sizeof(val) || val != one)
+ xerror("recverr on mismatch val=%d len=%u", val, s);
+
+ r = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
+ if (r != -1 || errno != EAGAIN)
+ xerror("expected empty errqueue to return EAGAIN, ret=%d errno=%d", r, errno);
+
+ r = setsockopt(fd, level, optname, &zero, sizeof(zero));
+ if (r)
+ die_perror("setsockopt recverr off");
+
+ val = -1;
+ s = sizeof(val);
+ r = getsockopt(fd, level, optname, &val, &s);
+ if (r)
+ die_perror("getsockopt recverr off");
+ if (s != sizeof(val) || val != zero)
+ xerror("recverr off mismatch val=%d len=%u", val, s);
+}
+
static int client(int pipefd)
{
int fd = -1;
@@ -787,6 +841,7 @@ static int client(int pipefd)
}
test_ip_tos_sockopt(fd);
+ test_ip_recverr_sockopt(fd);
connect_one_server(fd, pipefd);
--
2.53.0
* Re: [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (3 preceding siblings ...)
2026-05-09 21:16 ` [PATCH mptcp-next v7 4/4] selftests: mptcp: cover IP_RECVERR sockopt propagation David Carlier
@ 2026-05-09 22:24 ` MPTCP CI
2026-05-27 5:08 ` Matthieu Baerts
5 siblings, 0 replies; 10+ messages in thread
From: MPTCP CI @ 2026-05-09 22:24 UTC
To: David Carlier; +Cc: mptcp
Hi David,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25612442092
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0f646cd55809
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1092123
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts already made to keep the test
suite stable when it is executed on a public CI like this one, some
reported issues might not be caused by your modifications. Still, do not
hesitate to help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
* Re: [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-05-09 21:16 [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket David Carlier
` (4 preceding siblings ...)
2026-05-09 22:24 ` [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket MPTCP CI
@ 2026-05-27 5:08 ` Matthieu Baerts
2026-05-27 5:37 ` David CARLIER
5 siblings, 1 reply; 10+ messages in thread
From: Matthieu Baerts @ 2026-05-27 5:08 UTC
To: David Carlier, mptcp; +Cc: martineau, geliang, pabeni
Hi David,
On 10/05/2026 07:16, David Carlier wrote:
> This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
> poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
> completion notifications through the standard inet ABI. IP_RECVERR /
> IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
> and future subflows.
>
> Patch 1 factors per-flag inet_assign_bit() calls in
> sync_socket_options() into a mask-driven loop so future propagated
> flags only need to extend MPTCP_INET_FLAGS_MASK.
>
> Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
> apply it on the parent, and forward to every subflow under lock_sock()
> so concurrent setsockopt callers cannot leave parent and subflows
> desynchronized. Newly-joining subflows pick up the four RECVERR bits
> through sync_socket_options().
>
> Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
> from each subflow's error queue onto the parent's, so pollers see
> EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
> ICMP errors are dropped — they will be carried by a future
> MPTCP_RECERR channel.
Sorry for the delay: I saw Sashiko had some comments [1], and because I
noticed you checked it before, I thought you were going to send a reply
or a new version, and I forgot to ask here. So here it is: is the review
correct?
[1]
https://sashiko.dev/#/patchset/20260509211651.104934-1-devnexen@gmail.com
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v7 0/4] mptcp: MSG_ERRQUEUE support on the parent socket
2026-05-27 5:08 ` Matthieu Baerts
@ 2026-05-27 5:37 ` David CARLIER
0 siblings, 0 replies; 10+ messages in thread
From: David CARLIER @ 2026-05-27 5:37 UTC
To: Matthieu Baerts; +Cc: mptcp, martineau, geliang, pabeni
Hi,
On Wed, 27 May 2026 at 06:08, Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi David,
>
> On 10/05/2026 07:16, David Carlier wrote:
> > This series adds MSG_ERRQUEUE support on the MPTCP parent socket, so
> > poll() and recvmsg(MSG_ERRQUEUE) observe TX timestamps and MSG_ZEROCOPY
> > completion notifications through the standard inet ABI. IP_RECVERR /
> > IPV6_RECVERR (and their RFC4884 variants) are propagated to existing
> > and future subflows.
> >
> > Patch 1 factors per-flag inet_assign_bit() calls in
> > sync_socket_options() into a mask-driven loop so future propagated
> > flags only need to extend MPTCP_INET_FLAGS_MASK.
> >
> > Patch 2 wires up RECVERR setsockopt/getsockopt: snapshot the value,
> > apply it on the parent, and forward to every subflow under lock_sock()
> > so concurrent setsockopt callers cannot leave parent and subflows
> > desynchronized. Newly-joining subflows pick up the four RECVERR bits
> > through sync_socket_options().
> >
> > Patch 3 splices forwardable err skbs (TIMESTAMPING / ZEROCOPY / LOCAL)
> > from each subflow's error queue onto the parent's, so pollers see
> > EPOLLERR and recvmsg(MSG_ERRQUEUE) on the parent drains them. Subflow
> > ICMP errors are dropped — they will be carried by a future
> > MPTCP_RECERR channel.
>
> Sorry for the delay: I saw Sashiko had some comments [1], and because I
> noticed you checked it before, I thought you were going to send a reply
> or a new version, and I forgot to ask here. So here it is: is the review
> correct?
>
> [1]
> https://sashiko.dev/#/patchset/20260509211651.104934-1-devnexen@gmail.com
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
Yes, both findings are real.
For v8 I'll drop the skb on splice failure. This matches
sock_queue_err_skb()'s own behaviour under rmem pressure: it returns
-ENOMEM and bumps sk_drops, and the caller frees the skb. With nothing
retained on the subflow error queues, mptcp_subflow_errqueue_pending() can
be removed from mptcp_poll(), which also fixes the lockless conn_list
walk, and the recvmsg retry in mptcp_recv_error() can be removed with it.
Cheers
* Re: [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows
2026-05-09 21:16 ` [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows David Carlier
@ 2026-05-27 5:37 ` Matthieu Baerts
2026-05-27 5:48 ` David CARLIER
0 siblings, 1 reply; 10+ messages in thread
From: Matthieu Baerts @ 2026-05-27 5:37 UTC
To: David Carlier, mptcp; +Cc: martineau, geliang, pabeni
Hi David,
On 10/05/2026 07:16, David Carlier wrote:
> Propagate IP_RECVERR/IP_RECVERR_RFC4884 and
> IPV6_RECVERR/IPV6_RECVERR_RFC4884 from the MPTCP socket to existing
> and future subflows.
>
> mptcp_setsockopt_recverr() snapshots optval into a local int, applies
> it to the parent socket via ip_setsockopt() / ipv6_setsockopt(), bumps
> msk->setsockopt_seq, and forwards to every subflow via
> mptcp_setsockopt_all_sf(). Newly-joining subflows pick up the four
> RECVERR bits through sync_socket_options() now that
> MPTCP_INET_FLAGS_MASK covers them.
>
> mptcp_setsockopt_all_sf() skips IPv4 subflows when called with
> SOL_IPV6 to avoid the -ENOPROTOOPT that ip_setsockopt() returns on
> level mismatch in AF_INET6 msks carrying IPv4 subflows.
>
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Assisted-by: Codex:gpt-5
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
> net/mptcp/sockopt.c | 133 ++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 109 insertions(+), 24 deletions(-)
>
> diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
> index 114436d2e401..fbbd1692af7e 100644
> --- a/net/mptcp/sockopt.c
> +++ b/net/mptcp/sockopt.c
> @@ -8,6 +8,8 @@
>
> #include <linux/kernel.h>
> #include <linux/module.h>
> +#include <net/ip.h>
> +#include <net/ipv6.h>
Are these new includes really needed?
> #include <net/sock.h>
> #include <net/protocol.h>
> #include <net/tcp.h>
> @@ -19,7 +21,11 @@
> #define MPTCP_INET_FLAGS_MASK \
> (BIT(INET_FLAGS_TRANSPARENT) | \
> BIT(INET_FLAGS_FREEBIND) | \
> - BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT))
> + BIT(INET_FLAGS_BIND_ADDRESS_NO_PORT) | \
> + BIT(INET_FLAGS_RECVERR) | \
> + BIT(INET_FLAGS_RECVERR_RFC4884) | \
> + BIT(INET_FLAGS_RECVERR6) | \
> + BIT(INET_FLAGS_RECVERR6_RFC4884))
>
> static struct sock *__mptcp_tcp_fallback(struct mptcp_sock *msk)
> {
> @@ -394,6 +400,81 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_sock *msk, int optname,
> return -EOPNOTSUPP;
> }
>
> +static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
> + int optname, sockptr_t optval,
> + unsigned int optlen)
> +{
> + struct mptcp_subflow_context *subflow;
> + int ret = 0;
> +
> + mptcp_for_each_subflow(msk, subflow) {
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> + if (level == SOL_IPV6 && ssk->sk_family != AF_INET6)
> + continue;
This new check looks strange:
- Was it required before this patch? If yes, please create a dedicated
patch. If no, explain how it can happen in the commit message (+ here in
a short comment?).
- If the msk is v6, all subflows are v6 as well, no? Or did I miss
a case (easy to miss with v4-mapped-in-v6 sockets...)?
> + ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
> + if (ret)
> + break;
> + }
> +
> + if (!ret)
> + sockopt_seq_inc(msk);
> +
> + return ret;
> +}
> +
> +static int mptcp_setsockopt_recverr(struct mptcp_sock *msk, int level,
> + int optname, sockptr_t optval,
> + unsigned int optlen)
> +{
> + struct sock *sk = (struct sock *)msk;
> + int val, ret;
> +
> + /* Let ip_setsockopt() / ipv6_setsockopt() validate optval and optlen
> + * (so 1-byte boolean writes keep the same ABI as plain TCP) and update
> + * the parent's RECVERR bit. Re-read that bit under lock_sock() and
> + * push it to the subflows: concurrent setsockopt callers cannot leave
> + * parent and subflows desynchronized this way.
> + */
> + if (level == SOL_IP)
> + ret = ip_setsockopt(sk, level, optname, optval, optlen);
> +#if IS_ENABLED(CONFIG_IPV6)
> + else if (level == SOL_IPV6)
> + ret = ipv6_setsockopt(sk, level, optname, optval, optlen);
> +#endif
> + else
> + return -EOPNOTSUPP;
> + if (ret)
> + return ret;
> +
> + lock_sock(sk);
> + switch (optname) {
> + case IP_RECVERR:
> + val = inet_test_bit(RECVERR, sk);
> + break;
> + case IP_RECVERR_RFC4884:
> + val = inet_test_bit(RECVERR_RFC4884, sk);
> + break;
> +#if IS_ENABLED(CONFIG_IPV6)
> + case IPV6_RECVERR:
> + val = inet6_test_bit(RECVERR6, sk);
> + break;
> + case IPV6_RECVERR_RFC4884:
> + val = inet6_test_bit(RECVERR6_RFC4884, sk);
> + break;
> +#endif
> + default:
> + release_sock(sk);
> + return -EOPNOTSUPP;
When locks are used, we usually try to have a single exit path: could you
set 'ret' and use a "goto" here?
Also, this "default" case should never be reached, right? Do we need it? Or
should it use a WARN?
> + }
> +
> + ret = mptcp_setsockopt_all_sf(msk, level, optname,
> + KERNEL_SOCKPTR(&val), sizeof(val));
> + release_sock(sk);
> + return ret;
> +}
> +
> static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
> sockptr_t optval, unsigned int optlen)
> {
> @@ -436,6 +517,10 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, int optname,
>
> release_sock(sk);
> break;
> + case IPV6_RECVERR:
> + case IPV6_RECVERR_RFC4884:
> + ret = mptcp_setsockopt_recverr(msk, SOL_IPV6, optname, optval, optlen);
> + break;
> }
>
> return ret;
> @@ -781,6 +866,9 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk, int optname,
> return mptcp_setsockopt_sol_ip_set(msk, optname, optval, optlen);
> case IP_TOS:
> return mptcp_setsockopt_v4_set_tos(msk, optname, optval, optlen);
> + case IP_RECVERR:
> + case IP_RECVERR_RFC4884:
> + return mptcp_setsockopt_recverr(msk, SOL_IP, optname, optval, optlen);
> }
>
> return -EOPNOTSUPP;
> @@ -808,27 +896,6 @@ static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int level, int
> return ret;
> }
>
> -static int mptcp_setsockopt_all_sf(struct mptcp_sock *msk, int level,
> - int optname, sockptr_t optval,
> - unsigned int optlen)
> -{
> - struct mptcp_subflow_context *subflow;
> - int ret = 0;
> -
> - mptcp_for_each_subflow(msk, subflow) {
> - struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> -
> - ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
> - if (ret)
> - break;
> - }
> -
> - if (!ret)
> - sockopt_seq_inc(msk);
> -
> - return ret;
> -}
> -
> static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
> sockptr_t optval, unsigned int optlen)
> {
> @@ -932,8 +999,11 @@ int mptcp_setsockopt(struct sock *sk, int level, int optname,
> if (level == SOL_IP)
> return mptcp_setsockopt_v4(msk, optname, optval, optlen);
>
> - if (level == SOL_IPV6)
> + if (level == SOL_IPV6) {
> + if (sk->sk_family != AF_INET6)
> + return -ENOPROTOOPT;
Same here: Was it required before this patch? If yes, please create a
dedicated patch. If no, explain how it can happen in the commit message
(+ here in a short comment?).
> return mptcp_setsockopt_v6(msk, optname, optval, optlen);
> + }
>
> if (level == SOL_TCP)
> return mptcp_setsockopt_sol_tcp(msk, optname, optval, optlen);
> @@ -1473,6 +1543,12 @@ static int mptcp_getsockopt_v4(struct mptcp_sock *msk, int optname,
> case IP_LOCAL_PORT_RANGE:
> return mptcp_put_int_option(msk, optval, optlen,
> READ_ONCE(inet_sk(sk)->local_port_range));
> + case IP_RECVERR:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet_test_bit(RECVERR, sk));
> + case IP_RECVERR_RFC4884:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet_test_bit(RECVERR_RFC4884, sk));
> }
>
> return -EOPNOTSUPP;
> @@ -1493,6 +1569,12 @@ static int mptcp_getsockopt_v6(struct mptcp_sock *msk, int optname,
> case IPV6_FREEBIND:
> return mptcp_put_int_option(msk, optval, optlen,
> inet_test_bit(FREEBIND, sk));
> + case IPV6_RECVERR:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet6_test_bit(RECVERR6, sk));
> + case IPV6_RECVERR_RFC4884:
> + return mptcp_put_int_option(msk, optval, optlen,
> + inet6_test_bit(RECVERR6_RFC4884, sk));
> }
>
> return -EOPNOTSUPP;
> @@ -1537,8 +1619,11 @@ int mptcp_getsockopt(struct sock *sk, int level, int optname,
>
> if (level == SOL_IP)
> return mptcp_getsockopt_v4(msk, optname, optval, option);
> - if (level == SOL_IPV6)
> + if (level == SOL_IPV6) {
> + if (sk->sk_family != AF_INET6)
> + return -ENOPROTOOPT;
Same here.
> return mptcp_getsockopt_v6(msk, optname, optval, option);
> + }
> if (level == SOL_TCP)
> return mptcp_getsockopt_sol_tcp(msk, optname, optval, option);
> if (level == SOL_MPTCP)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH mptcp-next v7 2/4] mptcp: propagate RECVERR sockopts to subflows
2026-05-27 5:37 ` Matthieu Baerts
@ 2026-05-27 5:48 ` David CARLIER
0 siblings, 0 replies; 10+ messages in thread
From: David CARLIER @ 2026-05-27 5:48 UTC (permalink / raw)
To: Matthieu Baerts; +Cc: mptcp, martineau, geliang, pabeni
On 27/05/2026 06:37, Matthieu Baerts wrote:
>> +#include <net/ip.h>
>> +#include <net/ipv6.h>
>
> Are these new includes really needed?
ip_setsockopt() was already used pre-patch (transitively pulled in), so
<net/ip.h> can go. <net/ipv6.h> I need for ipv6_setsockopt() — will
double-check at build time and drop if transitive too.
>> + if (level == SOL_IPV6 && ssk->sk_family != AF_INET6)
>> + continue;
>
> This new check looks strange: [...]
Not a fix for an existing bug — pre-patch, mptcp_setsockopt_all_sf()
was only called with SOL_TCP (TCP_MAXSEG), where family is irrelevant.
RECVERR is the first SOL_IPV6 caller, so the v4-subflow-on-v6-msk
case never gets exercised on this path without this patch.
For that case: an AF_INET6 listener accepts MP_JOINs from v4 peers
(subflow created with sk_family == AF_INET), and the userspace PM can
graft v4 subflows onto a v6 msk. Calling tcp_setsockopt(SOL_IPV6, ...)
on such a subflow is dispatched to ip_setsockopt(), which returns
-ENOPROTOOPT on the level mismatch, aborting the loop and leaving the
remaining subflows desynchronised.
Since it's only needed by the new caller, I'll keep the skip in this
patch with an inline comment + a note in the commit message.
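To make the failure mode concrete, here is a minimal userspace sketch of
the loop, not the kernel code. Every mock_* name is hypothetical, and the
per-subflow helper only models the one behaviour the commit message
describes: a v4 socket rejecting a SOL_IPV6 option with -ENOPROTOOPT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */
#include <netinet/in.h>   /* IPPROTO_IPV6 (same value as SOL_IPV6) */

/* Hypothetical stand-in for a subflow socket; not a kernel type. */
struct mock_subflow {
    int family;   /* AF_INET or AF_INET6 */
    int recverr;  /* last value applied, -1 if never set */
};

/* Assumed model of the per-subflow setsockopt: a v4 socket rejects
 * an IPv6-level option with -ENOPROTOOPT, anything else is stored.
 */
static int mock_sf_setsockopt(struct mock_subflow *sf, int level, int val)
{
    assert(sf != NULL);

    if (level == IPPROTO_IPV6 && sf->family != AF_INET6)
        return -ENOPROTOOPT;

    sf->recverr = val;
    return 0;
}

/* Apply the option to every subflow, stopping at the first error.
 * With skip_mismatch set, v4 subflows are skipped for IPv6-level
 * options instead of aborting the walk.
 */
static int mock_setsockopt_all_sf(struct mock_subflow *sfs, size_t n,
                                  int level, int val, int skip_mismatch)
{
    int ret = 0;

    for (size_t i = 0; i < n; i++) {
        if (skip_mismatch && level == IPPROTO_IPV6 &&
            sfs[i].family != AF_INET6)
            continue;

        ret = mock_sf_setsockopt(&sfs[i], level, val);
        if (ret)
            break;
    }

    return ret;
}
```

With a { v6, v4, v6 } subflow list, the skipping variant updates both v6
entries and returns 0, while the non-skipping variant stops at the v4
entry and never reaches the last v6 one.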
>> + default:
>> + release_sock(sk);
>> + return -EOPNOTSUPP;
>
> When locks are used, we usually try to have one exit path: could you set
> 'ret' and use a "goto" here?
>
> Also, this "default" case should never be used, right? Do we need it? Or
> use a WARN?
Right, unreachable — the caller filters to the four RECVERR optnames.
I'll drop the default and use a single 'ret' + goto out for the success
path.
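For reference, a minimal userspace sketch of the single-exit shape under
discussion. It is not the v8 code: the mock_* names and the MOCK_* option
values are hypothetical, the lock is a plain counter, and a default arm is
kept here only to show how an error would reach the common unlock.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the parent socket; not a kernel type. */
struct mock_sock {
    int lock_depth;       /* 1 while "locked", 0 otherwise */
    int recverr;
    int recverr_rfc4884;
};

enum mock_opt {
    MOCK_RECVERR = 1,
    MOCK_RECVERR_RFC4884 = 2,
};

static void mock_lock(struct mock_sock *sk)
{
    assert(sk->lock_depth == 0);
    sk->lock_depth++;
}

static void mock_release(struct mock_sock *sk)
{
    assert(sk->lock_depth == 1);
    sk->lock_depth--;
}

/* Read the flag matching optname under the lock. Every path, success
 * or error, leaves through the single 'out' label so the lock is
 * released exactly once.
 */
static int mock_read_flag_locked(struct mock_sock *sk, int optname, int *val)
{
    int ret = 0;

    mock_lock(sk);
    switch (optname) {
    case MOCK_RECVERR:
        *val = sk->recverr;
        break;
    case MOCK_RECVERR_RFC4884:
        *val = sk->recverr_rfc4884;
        break;
    default:
        ret = -EOPNOTSUPP;
        goto out;
    }

    /* The real function would forward *val to the subflows here. */
out:
    mock_release(sk);
    return ret;
}
```

The point of the shape is that adding another early-exit condition later
cannot forget the unlock, since there is only one place that releases it.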
>> if (level == SOL_IP)
>> return mptcp_setsockopt_v4(msk, optname, optval, optlen);
>>
>> - if (level == SOL_IPV6)
>> + if (level == SOL_IPV6) {
>> + if (sk->sk_family != AF_INET6)
>> + return -ENOPROTOOPT;
>
> Same here: Was it required before this patch? [...]
Not needed — pre-patch the v6 path would already error out with
-EAFNOSUPPORT from the v6 layer further down. This is an unrelated
behaviour change; I'll drop both hunks (setsockopt + getsockopt).
Cheers