From: gang.yan@linux.dev
To: "Paolo Abeni" <pabeni@redhat.com>, "MPTCP Linux" <mptcp@lists.linux.dev>
Cc: "Gang Yan" <yangang@kylinos.cn>
Subject: Re: [PATCH mptcp-net v4 1/2] mptcp: use sockopt_lock(release)_sock in sockopt
Date: Fri, 15 May 2026 09:39:07 +0000 [thread overview]
Message-ID: <62bce132417a3c98a6ad6aea821dba9de1002916@linux.dev> (raw)
In-Reply-To: <2678c535-2b51-4a04-803e-14a2e3ac348b@redhat.com>
May 15, 2026 at 5:06 PM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
>
> Hi,
>
> On 5/15/26 3:27 AM, gang.yan@linux.dev wrote:
>
> >
> > May 9, 2026 at 2:44 PM, "Gang Yan" <gang.yan@linux.dev mailto:gang.yan@linux.dev?to=%22Gang%20Yan%22%20%3Cgang.yan%40linux.dev%3E > wrote:
> >
> > >
> > > From: Gang Yan <yangang@kylinos.cn>
> > >
> > > TCP and the core socket layer all use sockopt_lock_sock()
> > > sockopt_release_sock() in their setsockopt and getsockopt handlers. It
> > > is a BPF-aware wrapper that skips lock acquisition when invoked from a
> > > BPF program, where the socket lock is already held.
> > >
> > > Using lock_sock_fast() on subflows requires extra care: the fast path
> > > holds the socket spinlock with BH disabled, creating an atomic context
> > > where sleeping is not allowed. Switching to lock_sock() avoids the
> > > risk of accidentally introducing sleeping operations inside the
> > > lock_sock_fast() critical section.
> > >
> > > Fixes: 24426654ed3a ("bpf: net: Avoid sk_setsockopt() taking sk lock when called from bpf")
> > > Signed-off-by: Gang Yan <yangang@kylinos.cn>
> > > ---
> > > net/mptcp/sockopt.c | 97 ++++++++++++++++++++++++++---------------------------
> > > 1 file changed, 48 insertions(+), 49 deletions(-)
> > >
> > > diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
> > > index 1cf608e7357b..57ffde0d0b87 100644
> > > --- a/net/mptcp/sockopt.c
> > > +++ b/net/mptcp/sockopt.c
> > > @@ -72,12 +72,12 @@ static void mptcp_sol_socket_sync_intval(struct mptcp_sock *msk, int optname, in
> > > struct mptcp_subflow_context *subflow;
> > > struct sock *sk = (struct sock *)msk;
> > >
> > > - lock_sock(sk);
> > > + sockopt_lock_sock(sk);
> > > sockopt_seq_inc(msk);
> > >
> > > mptcp_for_each_subflow(msk, subflow) {
> > > struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> > > - bool slow = lock_sock_fast(ssk);
> > >
> >
> > Hi, Maintainers:
> >
> > Thanks for the review of sashiko. Let me address these two points.
> >
> > First, I think this patch does not introduce issues for existing MPTCP code.
> > Currently there is no scenario where a BPF program invokes the MPTCP
> > setsockopt/getsockopt handler — BPF helpers (bpf_setsockopt) bypass the
> > protocol-specific handler via __bpf_setsockopt(), and the cgroup BPF sockopt
> > filter clears the BPF context before calling ops->setsockopt(). So
> > has_current_bpf_ctx() is always false when the MPTCP handler executes. The
> > commit's primary purpose, as stated in the commit message, is to replace
> > lock_sock_fast() with lock_sock() to avoid sleeping within the
> > lock_sock_fast() critical section.
> >
> >
> > The subflow locking concern is valid for future BPF support.
> >
> > That said, the observation about the subflow locking gap is a valid design
> > concern. As sashiko pointed out, when the MPTCP setsockopt handler is
> > eventually called from BPF context, the MPTCP meta socket (msk) is already
> > locked externally, but the subflow sockets (ssk) are not. The internal
> > delegation via sk_setsockopt(ssk, ...) or tcp_setsockopt(ssk, ...) calls
> > sockopt_lock_sock(ssk) which skips the lock due to the per-thread
> > has_current_bpf_ctx() flag, leaving ssk unprotected.
> >
> > However, solving this is not straightforward. In BPF context we don't have a
> > good way to call lock_sock() on the subflow — as sashiko noted in first point,
> > lock_sock() unconditionally calls might_sleep(), and will cause 'sleeping in
> > atomic context' problems in BPF context. So properly locking the subflow in
> > BPF context remains an open question that needs to be addressed when MPTCP
> > BPF setsockopt support is actually introduced.
> >
Hi Paolo,
Thanks for your reply.
> I'm sorry for the late feedback. I think such concern is actually quite
> relevant. Note that replacing
>
> bool slow = lock_sock_fast(ssk);
>
> with:
>
> lock_sock(ssk);
>
> does not change much WRT the sleeping issue.
I'm not questioning your point, but just to confirm — when you mentioned the
"sleeping issue", are you referring to the BPF context?
In fact, this patch addresses an issue that occurs in non-BPF scenarios like
sashiko said in [1]:
https://sashiko.dev/#/patchset/20260420093343.16443-1-gang.yan@linux.dev
>
> I *think* the only viable path in the short term is making the sockopt
> requiring subflow-level lock fail when called by BPF.
>
> In a possible long term alternative could be delegating the
> subflow-level changes to a workqueue, but that really looks invasive and
> I would really like to avoid such option.
I agree that the short-term approach sounds reasonable for now. We could
either implement the sockopt failure for BPF calls that require subflow-level
locking, or simply add a comment to mark this limitation clearly.
As far as I can see, MPTCP currently doesn’t have any real-world use cases
for setting sockopt via BPF anyway. The workqueue approach also seems too complex
and invasive for the current stage, so I’d prefer to avoid it for now.
WDYT?
Cherrs,
Gang
>
> /P
>
next prev parent reply other threads:[~2026-05-15 9:39 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-09 6:44 [PATCH mptcp-net v4 0/2] mptcp: convert to sockopt_lock_sock Gang Yan
2026-05-09 6:44 ` [PATCH mptcp-net v4 1/2] mptcp: use sockopt_lock(release)_sock in sockopt Gang Yan
2026-05-15 1:27 ` gang.yan
2026-05-15 9:06 ` Paolo Abeni
2026-05-15 9:39 ` gang.yan [this message]
2026-05-15 10:50 ` Paolo Abeni
2026-05-09 6:44 ` [PATCH mptcp-net v4 2/2] mptcp: use sockopt_ns_capable() in setsockopt congestion control Gang Yan
2026-05-09 7:47 ` [PATCH mptcp-net v4 0/2] mptcp: convert to sockopt_lock_sock MPTCP CI
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=62bce132417a3c98a6ad6aea821dba9de1002916@linux.dev \
--to=gang.yan@linux.dev \
--cc=mptcp@lists.linux.dev \
--cc=pabeni@redhat.com \
--cc=yangang@kylinos.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.