From: Peter Zijlstra <peterz@infradead.org>
To: "Puchert, Aaron" <aaron.puchert@sap.com>
Cc: Marco Elver <elver@google.com>,
Aaron Ballman <aaron@aaronballman.com>,
"linux-toolchains@vger.kernel.org"
<linux-toolchains@vger.kernel.org>,
"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
Bart Van Assche <bvanassche@acm.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>
Subject: Re: Thread Safety Analysis and the Linux kernel
Date: Thu, 6 Mar 2025 11:37:52 +0100 [thread overview]
Message-ID: <20250306103752.GE16878@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <DB7PR02MB36260C79D8CD91925D0D3447E7CB2@DB7PR02MB3626.eurprd02.prod.outlook.com>
On Wed, Mar 05, 2025 at 11:54:14PM +0000, Puchert, Aaron wrote:
> > 5. Better control-flow handling. Basic understanding of conditional
> > locking, which is explicitly ruled out in:
> > https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#conditional-locks
> > - however, if there's some way to even get basic support, would vastly
> > improve things for the kernel.
>
> The paper goes into some detail why this is tricky with the current
> design. We don't want to explore execution paths, because that
> obviously results in a combinatorial explosion. The Clang static
> analyzer deals with that by limiting exploration, but that makes it
> fuzzy and of course slow. As a warning flag we're part of the compiler
> itself, and the recommended practice is -Werror=thread-safety, so both
> of these are highly problematic. We want the analysis to be
> predictable and fast.
>
> It would be interesting to see some patterns. One "trick" that I have
> played with is to move the conditional locking into the capability
> itself: it is then unconditionally acquired or released, but the
> underlying mutex is only acquired or released based on some global
> state or member. Something like this:
>
> struct __attribute__((capability("mutex"))) conditional_mutex
> {
> struct mutex mu;
> bool active;
> }
>
> void acquire_conditional_mutex(struct conditional_mutex* cmu)
> __attribute__((acquire_capability(cmu)))
> {
> if (cmu->active)
> acquire_mutex(&cmu->mu);
> }
>
> However, I realize that might be a difficult proposal for the kernel community.
Version of this that came up were:
static __always_inline u32
bpf_prog_run_array_uprobe(const struct bpf_prog_array *array,
const void *ctx, bpf_prog_run_fn run_prog)
{
const struct bpf_prog_array_item *item;
const struct bpf_prog *prog;
struct bpf_run_ctx *old_run_ctx;
struct bpf_trace_run_ctx run_ctx;
u32 ret = 1;
might_fault();
RCU_LOCKDEP_WARN(!rcu_read_lock_trace_held(), "no rcu lock held");
if (unlikely(!array))
return ret;
migrate_disable();
run_ctx.is_uprobe = true;
old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
item = &array->items[0];
while ((prog = READ_ONCE(item->prog))) {
if (!prog->sleepable)
rcu_read_lock();
run_ctx.bpf_cookie = item->bpf_cookie;
ret &= run_prog(prog, ctx);
item++;
if (!prog->sleepable)
rcu_read_unlock();
}
bpf_reset_run_ctx(old_run_ctx);
migrate_enable();
return ret;
}
and
static void ath9k_hif_usb_firmware_fail(struct hif_device_usb *hif_dev)
{
struct device *dev = &hif_dev->udev->dev;
struct device *parent = dev->parent;
complete_all(&hif_dev->fw_done);
if (parent)
device_lock(parent);
device_release_driver(dev);
if (parent)
device_unlock(parent);
}
And I think we can expect a fair amount of resistance from maintainers
if we're going to go around rewriting their -- perfectly fine code,
thank you very much -- to please a static analyzer :/
> > Some of these are more complex than others, and any hints how we might
> > go about this are appreciated. I'm happy to try and implement some of
> > them, but if you find that you already know exactly how you'd like an
> > implementation to look like, rough drafts that we can take over and
> > polish would be very very helpful, too!
> >
> > In general, the Linux kernel has some of the most complex
> > synchronization patterns with numerous synchronization primitives.
> > Getting this to work for a good chunk of the more complex
> > synchronization code in the kernel will be quite the achievement if we
> > get there. :-)
>
> Examples are always welcome!
One of the more common patterns that I've talked about with Marco is
where modification requires two locks, while access requires either
lock.
The current __guarded_by() annotation cannot express this.
One possible way would be to construct a fake / 0-size capability
structure and have each of these locks acquire it in 'shared' mode (but
then you run into the recursion thing) and have a special annotation
that upgrades it to exclusive.
struct phoney __attribute__((capability(phoney))) { } phoney;
struct task_struct {
raw_spinlock_t pi_lock;
cpumask_t cpus_allowed __guarded_by(phoney);
};
void raw_spin_pi_lock(struct task_struct *p)
__acquires(p->pi_lock)
__acquires_shared(phoney);
void raw_spin_rq_lock(struct rq *rq)
__acquires(rq->__lock)
__acquires_shared(phoney);
static inline void upgrade_phoney(struct task_struct *p, struct rq *rq)
__must_hold(p->pi_lock)
__must_hold(rq->__lock)
__releases_shared(phoney)
__releases_shared(phoney)
__acquires(phoney)
{ }
So this doesn't work because the recursion thing, but it is also just
plain horrible to have to do.
I would much rather write something like:
struct task_struct {
raw_spinlock_t pi_lock;
cpumask_t cpus_allowed __guarded_by(pi_lock, rq->__lock);
}
And now I realize this case is doubly annoying because (rq) isn't
something that is in lexical scope at this point :-( These are the
locking rules though, and it is vexing not being able to express things
like this.
I should've taken another example; same thing, simpler setup:
struct perf_event_context {
struct mutex mutex;
raw_spinlock_t lock;
struct list_head event_list __guarded_by(mutex, lock);
int nr_events __guarded_by(mutex, lock);
};
This would allow me to iterate the list holding either lock, but only
modify the list while holding both.
next prev parent reply other threads:[~2025-03-06 10:37 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-05 11:47 Thread Safety Analysis and the Linux kernel Marco Elver
2025-03-05 23:54 ` Puchert, Aaron
2025-03-06 9:47 ` Peter Zijlstra
2025-03-06 16:18 ` Bart Van Assche
2025-03-07 8:07 ` Peter Zijlstra
2025-03-07 21:50 ` Puchert, Aaron
2025-03-07 21:46 ` Puchert, Aaron
2025-03-06 10:08 ` Peter Zijlstra
2025-03-06 22:18 ` Puchert, Aaron
2025-03-07 7:59 ` Peter Zijlstra
2025-03-07 14:13 ` Peter Zijlstra
2025-03-06 10:37 ` Peter Zijlstra [this message]
2025-03-06 23:14 ` Puchert, Aaron
2025-03-07 8:52 ` Peter Zijlstra
2025-03-07 12:52 ` Peter Zijlstra
2025-03-07 14:22 ` Greg Kroah-Hartman
2025-03-07 14:35 ` Peter Zijlstra
2025-03-08 6:06 ` Greg Kroah-Hartman
2025-03-07 23:03 ` Puchert, Aaron
2025-03-06 17:11 ` Paul E. McKenney
2025-03-06 23:24 ` Puchert, Aaron
2025-03-06 23:44 ` Paul E. McKenney
2025-03-07 17:59 ` Puchert, Aaron
2025-03-07 18:24 ` Paul E. McKenney
2025-03-07 12:00 ` Marco Elver
2025-05-05 13:44 ` Marco Elver
2025-06-05 12:44 ` Marco Elver
2025-09-18 10:37 ` Marco Elver
2025-09-18 11:10 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250306103752.GE16878@noisy.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=aaron.puchert@sap.com \
--cc=aaron@aaronballman.com \
--cc=boqun.feng@gmail.com \
--cc=bvanassche@acm.org \
--cc=elver@google.com \
--cc=linux-toolchains@vger.kernel.org \
--cc=llvm@lists.linux.dev \
--cc=paulmck@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.