From: "Jiayuan Chen" <jiayuan.chen@linux.dev>
To: "Sebastian Andrzej Siewior" <bigeasy@linutronix.de>
Cc: bpf@vger.kernel.org, "Jiayuan Chen" <jiayuan.chen@shopee.com>,
syzbot+2b3391f44313b3983e91@syzkaller.appspotmail.com,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"David S. Miller" <davem@davemloft.net>,
"Jakub Kicinski" <kuba@kernel.org>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Stanislav Fomichev" <sdf@fomichev.me>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Song Liu" <song@kernel.org>,
"Yonghong Song" <yonghong.song@linux.dev>,
"KP Singh" <kpsingh@kernel.org>, "Hao Luo" <haoluo@google.com>,
"Jiri Olsa" <jolsa@kernel.org>, "Kees Cook" <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
"Clark Williams" <clrkwllms@kernel.org>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Thomas Gleixner" <tglx@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-hardening@vger.kernel.org, linux-rt-devel@lists.linux.dev
Subject: Re: [PATCH bpf v1] bpf: cpumap: fix race in bq_flush_to_queue on PREEMPT_RT
Date: Wed, 11 Feb 2026 12:26:23 +0000
Message-ID: <82e29f76816971cfad92167c97afb437e1996aea@linux.dev>
In-Reply-To: <20260211114418.xnfx8M-t@linutronix.de>
February 11, 2026 at 19:44, "Sebastian Andrzej Siewior" <bigeasy@linutronix.de> wrote:
>
> On 2026-02-11 14:44:16 [+0800], Jiayuan Chen wrote:
>
> >
> > From: Jiayuan Chen <jiayuan.chen@shopee.com>
> >
> > On PREEMPT_RT kernels, the per-CPU xdp_bulk_queue (bq) can be accessed
> > concurrently by multiple preemptible tasks on the same CPU.
> >
> > The original code assumes bq_enqueue() and __cpu_map_flush() run
> > atomically with respect to each other on the same CPU, relying on
> > local_bh_disable() to prevent preemption. However, on PREEMPT_RT,
> > local_bh_disable() only calls migrate_disable() and does not disable
> > preemption. spin_lock() also becomes a sleeping rt_mutex. Together,
> > this allows CFS scheduling to preempt a task during bq_flush_to_queue(),
> > enabling another task on the same CPU to enter bq_enqueue() and operate
> > on the same per-CPU bq concurrently.
> >
> …
>
> >
> > Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")
> >
> Can you reproduce this? It should not trigger with the commit above.
> It should trigger starting with
> 3253cb49cbad4 ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT")
Thanks for the review, Sebastian.
You are right. The race only becomes possible after the softirq BKL is dropped.
> >
> > Reported-by: syzbot+2b3391f44313b3983e91@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/69369331.a70a0220.38f243.009d.GAE@google.com/T/
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > ---
> > kernel/bpf/cpumap.c | 16 +++++++++++++++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> > index 04171fbc39cb..7fda8421ec40 100644
> > --- a/kernel/bpf/cpumap.c
> > +++ b/kernel/bpf/cpumap.c
> > @@ -714,6 +717,7 @@ const struct bpf_map_ops cpu_map_ops = {
> > .map_redirect = cpu_map_redirect,
> > };
> >
> > +/* Caller must hold bq->bq_lock */
> >
> If this information is important please use lockdep_assert_held() in the
> function below. This can be used by lockdep and is understood by humans
> while the comment is only visible to humans.
Will add lockdep_assert_held() in bq_flush_to_queue() and drop the
comment.
> >
> > static void bq_flush_to_queue(struct xdp_bulk_queue *bq)
> > {
> > struct bpf_cpu_map_entry *rcpu = bq->obj;
> > @@ -750,10 +754,16 @@ static void bq_flush_to_queue(struct xdp_bulk_queue *bq)
> >
> > /* Runs under RCU-read-side, plus in softirq under NAPI protection.
> > * Thus, safe percpu variable access.
> >
> + PREEMPT_RT relies on local_lock_nested_bh().
>
> >
> > + *
> > + * On PREEMPT_RT, local_bh_disable() does not disable preemption,
> > + * so we use local_lock to serialize access to the per-CPU bq.
> > */
> > static void bq_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_frame *xdpf)
> > {
> > - struct xdp_bulk_queue *bq = this_cpu_ptr(rcpu->bulkq);
> > + struct xdp_bulk_queue *bq;
> > +
> > + local_lock(&rcpu->bulkq->bq_lock);
> >
> local_lock_nested_bh() & the matching unlock here and in the other
> places, please.
Makes sense. Since these paths already run under local_bh_disable(),
local_lock_nested_bh() is the correct primitive.
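
A minimal sketch of the pattern discussed above, written as a userspace
model rather than kernel code. The model_* names, MODEL_BULK_SIZE and the
"flushed" counter are hypothetical stand-ins introduced for illustration;
they only mimic the shape of local_lock_nested_bh(),
local_unlock_nested_bh() and lockdep_assert_held(), and say nothing about
how v2 of the patch will actually look.

```c
/*
 * Userspace model only, NOT kernel code. The model_* macros are
 * hypothetical stand-ins for local_lock_nested_bh(),
 * local_unlock_nested_bh() and lockdep_assert_held().
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_BULK_SIZE 8	/* assumed bulk size for this model */

struct model_lock {
	bool held;
};

/* Take the lock; asserts that nobody else is inside the section. */
#define model_lock_nested_bh(l) \
	do { assert(!(l)->held); (l)->held = true; } while (0)

/* Release the lock; asserts that it was actually held. */
#define model_unlock_nested_bh(l) \
	do { assert((l)->held); (l)->held = false; } while (0)

/* Machine-checkable replacement for a "caller must hold" comment. */
#define model_assert_held(l)	assert((l)->held)

struct model_bulk_queue {
	void *q[MODEL_BULK_SIZE];
	unsigned int count;	/* frames currently queued */
	unsigned int flushed;	/* frames handed over so far */
	struct model_lock bq_lock;
};

void model_bq_flush_to_queue(struct model_bulk_queue *bq)
{
	/* The locking rule is checked here instead of being documented. */
	model_assert_held(&bq->bq_lock);

	bq->flushed += bq->count;
	bq->count = 0;
}

void model_bq_enqueue(struct model_bulk_queue *bq, void *frame)
{
	/* Lock/unlock bracket every access to the shared queue. */
	model_lock_nested_bh(&bq->bq_lock);

	if (bq->count == MODEL_BULK_SIZE)
		model_bq_flush_to_queue(bq);
	bq->q[bq->count++] = frame;

	model_unlock_nested_bh(&bq->bq_lock);
}
```

In the kernel the lock would be a local_lock_t inside the per-CPU
xdp_bulk_queue, taken through the per-CPU pointer (&rcpu->bulkq->bq_lock
in the patch) before the queue itself is looked up.
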
> Sebastian
>