From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Christian Loehle <christian.loehle@arm.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH sched_ext/for-6.18] sched_ext: Acquire task reference in scx_bpf_cpu_curr()
Date: Tue, 9 Sep 2025 22:45:01 +0200
Message-ID: <aMCRzXgDv6PGqLwp@gpd4>
In-Reply-To: <aMCHjFwwjqvKsZBg@slm.duckdns.org>
Hi Tejun,
On Tue, Sep 09, 2025 at 10:01:16AM -1000, Tejun Heo wrote:
> Hello, Andrea.
>
> On Tue, Sep 09, 2025 at 09:57:09PM +0200, Andrea Righi wrote:
> > scx_bpf_cpu_curr() has been introduced to retrieve the current task of a
> > given runqueue, allowing schedulers to interact with that task.
> >
> > The kfunc assumes that it is always called in an RCU context, but this
> > is not always guaranteed and some BPF schedulers can trigger the
> > following warning:
> >
> > WARNING: suspicious RCU usage
> > sched_ext: BPF scheduler "cosmos_1.0.2_gd0e71ca_x86_64_unknown_linux_gnu_debug" enabled
> > 6.17.0-rc1 #1-NixOS Not tainted
> > -----------------------------
> > kernel/sched/ext.c:6415 suspicious rcu_dereference_check() usage!
> >
> > The correct behavior is to acquire a reference to the returned task, so
> > the scheduler can safely access it and then release it with
> > bpf_task_release().
> >
> > Update the kfunc and the corresponding compatibility helper to implement
> > reference acquisition and prevent potential RCU warnings.
>
> I think KF_RCU likely fits better for peeking kernel data structures than
> having to acquire/release them. Can you post the full backtrace? Is it being
> called from a sleepable bpf prog? Or is it that we just need to expand the
> rcu check scope to cover regular rcu, bh and sched? And, everything aside,
> if KF_RCU, should we be tripping on rcu_dereference() in the first place?
For the record, as discussed offline, we should be fine marking the kfunc
as KF_RCU_PROTECTED instead of acquiring a reference to the task.

Right now the kfunc is marked as KF_RCU, which is not really necessary,
because KF_RCU ensures that the kfunc *arguments* are either RCU-protected
or trusted.

KF_RCU_PROTECTED, instead, should ensure that the kfunc is called inside an
RCU read-side critical section, which is what we need.

This way the kfunc can safely return a pointer to the task, and sleepable
BPF programs can wrap the call in a
bpf_rcu_read_lock()/bpf_rcu_read_unlock() section. This should prevent the
RCU warning while still letting schedulers safely use the returned task.
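
To illustrate the calling pattern described above, here is a minimal
userspace mock. It is not kernel or BPF code: struct task_struct,
mock_rcu_read_lock(), mock_rcu_read_unlock(), mock_cpu_curr() and
peek_curr_pid() are all hypothetical stand-ins invented for this sketch.
The mock returns NULL at run time when called outside the read-side
section, whereas with KF_RCU_PROTECTED the enforcement is expected to
happen in the BPF verifier.

```c
/* Userspace mock of the calling pattern; NOT kernel or BPF code. */
#include <assert.h>
#include <stddef.h>

struct task_struct { int pid; };	/* hypothetical stand-in for the kernel type */

static int rcu_depth;			/* mock RCU read-side nesting depth */
static struct task_struct mock_curr[2] = { { .pid = 100 }, { .pid = 200 } };

/* Hypothetical mocks of bpf_rcu_read_lock() / bpf_rcu_read_unlock(). */
static void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);		/* unlock must pair with a lock */
	rcu_depth--;
}

/*
 * Hypothetical mock of a kfunc that must run inside an RCU read-side
 * critical section. Assumption: a real KF_RCU_PROTECTED kfunc is checked
 * by the verifier; this mock simply returns NULL when the rule is broken.
 */
static struct task_struct *mock_cpu_curr(int cpu)
{
	if (rcu_depth == 0 || cpu < 0 || cpu >= 2)
		return NULL;
	return &mock_curr[cpu];
}

/* What a sleepable caller would do: peek at the task only while locked. */
static int peek_curr_pid(int cpu)
{
	struct task_struct *p;
	int pid = -1;

	mock_rcu_read_lock();
	p = mock_cpu_curr(cpu);
	if (p)
		pid = p->pid;		/* copy out what we need while protected */
	mock_rcu_read_unlock();

	/* p must not be dereferenced past this point. */
	return pid;
}
```

The point of the sketch is that the caller copies what it needs while the
section is held, so no reference acquire/release pair is required.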
I'll send a new patch with a proper fix.

Thanks,
-Andrea