From: Peter Zijlstra <peterz@infradead.org>
To: "Puchert, Aaron" <aaron.puchert@sap.com>
Cc: Marco Elver <elver@google.com>,
Aaron Ballman <aaron@aaronballman.com>,
"linux-toolchains@vger.kernel.org"
<linux-toolchains@vger.kernel.org>,
"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
Bart Van Assche <bvanassche@acm.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>
Subject: Re: Thread Safety Analysis and the Linux kernel
Date: Thu, 6 Mar 2025 11:37:52 +0100
Message-ID: <20250306103752.GE16878@noisy.programming.kicks-ass.net>
In-Reply-To: <DB7PR02MB36260C79D8CD91925D0D3447E7CB2@DB7PR02MB3626.eurprd02.prod.outlook.com>
On Wed, Mar 05, 2025 at 11:54:14PM +0000, Puchert, Aaron wrote:
> > 5. Better control-flow handling. Basic understanding of conditional
> > locking, which is explicitly ruled out in:
> > https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#conditional-locks
> > - however, if there's some way to even get basic support, would vastly
> > improve things for the kernel.
>
> The paper goes into some detail why this is tricky with the current
> design. We don't want to explore execution paths, because that
> obviously results in a combinatorial explosion. The Clang static
> analyzer deals with that by limiting exploration, but that makes it
> fuzzy and of course slow. As a warning flag we're part of the compiler
> itself, and the recommended practice is -Werror=thread-safety, so both
> of these are highly problematic. We want the analysis to be
> predictable and fast.
>
> It would be interesting to see some patterns. One "trick" that I have
> played with is to move the conditional locking into the capability
> itself: it is then unconditionally acquired or released, but the
> underlying mutex is only acquired or released based on some global
> state or member. Something like this:
>
> struct __attribute__((capability("mutex"))) conditional_mutex
> {
> 	struct mutex mu;
> 	bool active;
> };
>
> void acquire_conditional_mutex(struct conditional_mutex *cmu)
> 	__attribute__((acquire_capability(cmu)))
> {
> 	if (cmu->active)
> 		acquire_mutex(&cmu->mu);
> }
>
> However, I realize that might be a difficult proposal for the kernel community.
Versions of this that came up were:
static __always_inline u32
bpf_prog_run_array_uprobe(const struct bpf_prog_array *array,
			  const void *ctx, bpf_prog_run_fn run_prog)
{
	const struct bpf_prog_array_item *item;
	const struct bpf_prog *prog;
	struct bpf_run_ctx *old_run_ctx;
	struct bpf_trace_run_ctx run_ctx;
	u32 ret = 1;

	might_fault();
	RCU_LOCKDEP_WARN(!rcu_read_lock_trace_held(), "no rcu lock held");

	if (unlikely(!array))
		return ret;

	migrate_disable();

	run_ctx.is_uprobe = true;

	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
	item = &array->items[0];
	while ((prog = READ_ONCE(item->prog))) {
		if (!prog->sleepable)
			rcu_read_lock();

		run_ctx.bpf_cookie = item->bpf_cookie;
		ret &= run_prog(prog, ctx);
		item++;

		if (!prog->sleepable)
			rcu_read_unlock();
	}
	bpf_reset_run_ctx(old_run_ctx);

	migrate_enable();

	return ret;
}
and
static void ath9k_hif_usb_firmware_fail(struct hif_device_usb *hif_dev)
{
	struct device *dev = &hif_dev->udev->dev;
	struct device *parent = dev->parent;

	complete_all(&hif_dev->fw_done);

	if (parent)
		device_lock(parent);

	device_release_driver(dev);

	if (parent)
		device_unlock(parent);
}
And I think we can expect a fair amount of resistance from maintainers
if we're going to go around rewriting their -- perfectly fine code,
thank you very much -- to please a static analyzer :/
> > Some of these are more complex than others, and any hints how we might
> > go about this are appreciated. I'm happy to try and implement some of
> > them, but if you find that you already know exactly how you'd like an
> > implementation to look like, rough drafts that we can take over and
> > polish would be very very helpful, too!
> >
> > In general, the Linux kernel has some of the most complex
> > synchronization patterns with numerous synchronization primitives.
> > Getting this to work for a good chunk of the more complex
> > synchronization code in the kernel will be quite the achievement if we
> > get there. :-)
>
> Examples are always welcome!
One of the more common patterns that I've talked about with Marco is
where modification requires two locks, while access requires either
lock.
The current __guarded_by() annotation cannot express this.
One possible way would be to construct a fake / 0-size capability
structure and have each of these locks acquire it in 'shared' mode (but
then you run into the recursion thing) and have a special annotation
that upgrades it to exclusive.
struct __attribute__((capability("phoney"))) phoney { } phoney;

struct task_struct {
	raw_spinlock_t	pi_lock;
	cpumask_t	cpus_allowed __guarded_by(phoney);
};

void raw_spin_pi_lock(struct task_struct *p)
	__acquires(p->pi_lock)
	__acquires_shared(phoney);

void raw_spin_rq_lock(struct rq *rq)
	__acquires(rq->__lock)
	__acquires_shared(phoney);

static inline void upgrade_phoney(struct task_struct *p, struct rq *rq)
	__must_hold(p->pi_lock)
	__must_hold(rq->__lock)
	__releases_shared(phoney)	/* drop the pi_lock share */
	__releases_shared(phoney)	/* drop the rq->__lock share */
	__acquires(phoney)
{ }
So this doesn't work because of the recursion thing, but it is also just
plain horrible to have to do.
I would much rather write something like:
struct task_struct {
	raw_spinlock_t	pi_lock;
	cpumask_t	cpus_allowed __guarded_by(pi_lock, rq->__lock);
};
And now I realize this case is doubly annoying because (rq) isn't
something that is in lexical scope at this point :-( These are the
locking rules though, and it is vexing not being able to express things
like this.
I should've taken another example; same thing, simpler setup:
struct perf_event_context {
	struct mutex		mutex;
	raw_spinlock_t		lock;

	struct list_head	event_list __guarded_by(mutex, lock);
	int			nr_events __guarded_by(mutex, lock);
};
This would allow me to iterate the list holding either lock, but only
modify the list while holding both.