Linux block layer

* Re: [PATCH v6 09/14] block/blk-iocost: Split ioc_rqos_throttle()
From: Christoph Hellwig @ 2026-06-05  6:49 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Hannes Reinecke,
	Damien Le Moal, Tejun Heo, Josef Bacik
In-Reply-To: <8037f3bb5f5ecbbd635ef48804a1a9ad2d2be2e0.1780419600.git.bvanassche@acm.org>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v6 10/14] block/blk-iocost: Inline iocg_lock() and iocg_unlock()
From: Christoph Hellwig @ 2026-06-05  6:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Hannes Reinecke,
	Damien Le Moal, Tejun Heo, Josef Bacik
In-Reply-To: <e0e81fbd746bfb54c31cb84f8671a0c23dcdedb2.1780419600.git.bvanassche@acm.org>

On Tue, Jun 02, 2026 at 10:07:39AM -0700, Bart Van Assche wrote:
> +	if (ioc_locked) {
> +		guard(spinlock_irqsave)(&iocg->ioc->lock);
> +		guard(spinlock)(&iocg->waitq.lock);
> +		action = iocg_handle_over_budget(rqos, iocg, bio, &now, &wait,
> +						 use_debt, ioc_locked, abs_cost,
> +						 cost);
> +	} else {
> +		guard(spinlock_irqsave)(&iocg->waitq.lock);
> +		action = iocg_handle_over_budget(rqos, iocg, bio, &now, &wait,
> +						 use_debt, ioc_locked, abs_cost,
> +						 cost);

Please avoid the undebuggable guard mess.  This is a lot more readable with
good old locking calls.
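
For readers following along, the shape being asked for can be sketched in
plain C. This is only a userspace illustration under stated assumptions,
not the blk-iocost code: demo_lock, demo_ioc, demo_iocg, demo_throttle()
and demo_handle_over_budget() are hypothetical stand-ins, and the mock
lock merely tracks whether it is held instead of being a real spinlock.
It shows the outer lock being taken only when requested, the inner lock
always being taken, and both being released in reverse order with
explicit calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: it only tracks whether it is held. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);	/* catch double acquisition */
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);	/* catch unbalanced release */
	l->held = false;
}

struct demo_ioc {
	struct demo_lock lock;
};

struct demo_iocg {
	struct demo_ioc *ioc;
	struct demo_lock waitq_lock;
	long budget;
};

/* Placeholder for the real work; must run with waitq_lock held. */
static int demo_handle_over_budget(struct demo_iocg *iocg, long cost)
{
	assert(iocg->waitq_lock.held);
	if (iocg->budget < cost)
		return 1;	/* caller should wait */
	iocg->budget -= cost;
	return 0;		/* proceed */
}

int demo_throttle(struct demo_iocg *iocg, bool ioc_locked, long cost)
{
	int action;

	/* Outer lock first (only when requested), then the inner lock. */
	if (ioc_locked)
		demo_lock_acquire(&iocg->ioc->lock);
	demo_lock_acquire(&iocg->waitq_lock);

	action = demo_handle_over_budget(iocg, cost);

	/* Release in reverse order of acquisition. */
	demo_lock_release(&iocg->waitq_lock);
	if (ioc_locked)
		demo_lock_release(&iocg->ioc->lock);

	return action;
}
```

With explicit calls the helper is invoked once and every acquire has a
visible matching release, which is the readability argument made above.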



* Re: [PATCH v6 11/14] block/blk-mq-debugfs: Improve lock context annotations
From: Christoph Hellwig @ 2026-06-05  6:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Hannes Reinecke,
	Damien Le Moal, Nathan Chancellor
In-Reply-To: <66eff9a5e488fbe0b79db3a54518fe5d181a31ec.1780419600.git.bvanassche@acm.org>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: configurable block error injection
From: Christoph Hellwig @ 2026-06-05  7:23 UTC (permalink / raw)
  To: Daniel Gomez
  Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, linux-block,
	linux-doc, bpf, linux-kselftest, Luis Chamberlain,
	Masami Hiramatsu, Brendan Gregg, GOST, Shin'ichiro Kawasaki
In-Reply-To: <d9a5b911-7692-4df2-b67c-c33bc80b61aa@kernel.org>

On Thu, Jun 04, 2026 at 01:06:13PM +0200, Daniel Gomez wrote:
> If I understand the series correctly, and oversimplifying: when
> injection is enabled and a bio matches, the block layer completes the
> bio inline with the chosen blk_status_t (the status= rule from debugfs)
> via bio_endio_status(). The submission path returns to the caller
> immediately, with the bio already in the error state. Nothing is ever
> sent to the device, but the completion path sees the injected error.

Yes.

> > I'd have to allow access to certain bio
> 
> With struct_ops the bio is passed to the eBPF side as read-only: bio
> fields can be read to decide the policy but cannot be written. Is
> read-only access to bio fields itself a concern?

Yes, for the reason stated below.

> 
> > fields and would have to create a stable UAPI for commands and status
> > using the fake BTF struct access which really would not be a good
> > idea here as we need to be able to change internals.  
> 
> That should not be a problem at all. With CO-RE (compile once, run
> everywhere) the program resolves the bio field offsets against the
> BTF of the kernel it loads on, so it adapts dynamically if the layout
> changes. The contract is just the struct_ops callback signature: a
> struct bio * argument and a blk_status_t return. And that doesn't imply
> any UAPI commitment AFAIK.

That assumes the fields even exist in this form.  And we definitely
need to do something about the iter going forward.  In other words we have
to build up a huge abstraction here first.

> > Additionally
> > having fully BTF-enabled toolchains in test VMs is not great either.
> 
> Are you referring to the old BCC toolchain requirements [1]? This is
> solved in CO-RE [2]. The toolchain (Clang/LLVM, pahole) stays on the
> build host; the test VM only needs the prebuilt BPF object, libbpf at
> runtime, and the kernel's own BTF (CONFIG_DEBUG_INFO_BTF). No compiler
> or BTF toolchain is required inside the VM. Clang/LLVM 10+ is enough to
> build CO-RE libbpf tools [3].

libbpf and BTF support are what I mean.  And of course you need to write
code to communicate with it, as a simple shell script won't work any more.

> Are you referring to the bio sector range comparison in
> __blk_error_inject()?

Yes.

> I don't think that needs to be delegated to a
> BPF map (Documentation/bpf/maps.rst). The ebpf side has direct access
> to the bio fields, so it can apply the same sector/op filtering
> __blk_error_inject() does today. That match is already a linear list
> walk, so the ebpf program just runs the same [start, end] condition
> check.

Which means I need to write eBPF code for every specific range I
want to inject, which makes the thing very hard to use.

> In summary: what do you think of evolving this series
> into eBPF, but BPF_PROG_TYPE_STRUCT_OPS instead of
> ALLOW_ERROR_INJECTION/bpf_override_return()?

As said before, I prototyped it and it was a mess.  Having about 300
lines of simple code that can be directly used from a shell script
seems much preferable.
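
To make the "[start, end] condition check" discussed above concrete, here
is a minimal sketch of a rule-driven sector range match in plain C. It is
an illustration under assumptions, not code from the series:
demo_inject_rule and demo_match_rule() are hypothetical names, the rules
live in a plain array rather than whatever list the series uses, and
treating any overlap as a match is a guess about the intended semantics.
The point it illustrates is that the matching code is written once and
new ranges are added as data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical injection rule: a sector range plus the status to report. */
struct demo_inject_rule {
	uint64_t start;		/* first sector covered, inclusive */
	uint64_t end;		/* last sector covered, inclusive */
	int status;		/* nonzero status reported on a match */
};

/*
 * Walk the rules linearly and return the status of the first rule whose
 * range overlaps the I/O [sector, sector + nr_sectors - 1].  Returns 0
 * when no rule matches.  Wrap-around of the sector arithmetic is not
 * handled in this sketch.
 */
int demo_match_rule(const struct demo_inject_rule *rules, size_t nr_rules,
		    uint64_t sector, uint64_t nr_sectors)
{
	uint64_t last;
	size_t i;

	if (nr_sectors == 0)
		return 0;
	last = sector + nr_sectors - 1;

	for (i = 0; i < nr_rules; i++) {
		if (sector <= rules[i].end && last >= rules[i].start)
			return rules[i].status;
	}
	return 0;
}
```

Adding another range here means appending one array entry, which is
roughly what a rule written from a shell script would amount to.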



* Re: [PATCH] block/fops: fix refcount underflow in __blkdev_direct_IO()
From: Christoph Hellwig @ 2026-06-05  7:31 UTC (permalink / raw)
  To: Wentao Liang; +Cc: axboe, linux-block, linux-kernel, stable
In-Reply-To: <20260603021035.3690601-1-vulab@iscas.ac.cn>

On Wed, Jun 03, 2026 at 02:10:35AM +0000, Wentao Liang wrote:
> __blkdev_direct_IO() calls bio_get() and bio_put() around
> I/O operations, but if the I/O fails, the error path may call
> bio_put() twice, causing a refcount underflow.

What I/O failure do you observe?



* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Andreas Hindborg @ 2026-06-05  7:36 UTC (permalink / raw)
  To: Miguel Ojeda, penguin-kernel, Kentaro Takeda
  Cc: akpm, axboe, bvanassche, dlemoal, hch, hdanton, linux-block,
	linux-kernel, ojeda, tom.leiming, wqu, Boqun Feng, rust-for-linux,
	Mark Brown
In-Reply-To: <20260604210704.41751-1-ojeda@kernel.org>

"Miguel Ojeda" <ojeda@kernel.org> writes:

> On Wed, 03 Jun 2026 20:54:05 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>
>> Acknowledgment:
>>   Since I have no experience with Rust, changes needed by Rust block layer
>>   bindings and rnull module are made based on conversation with the Gemini
>>   AI collaborator.
>
> Then please do Cc the right people as `MAINTAINERS` mentions, including
> "BLOCK LAYER DEVICE DRIVER API [RUST]" and "RUST"...
>
> I am quite confused. Why was this added to linux-next?
>
> It doesn't go through block, nor has an Ack or review and breaks
> the `rustdoc` build in linux-next (and thus rust.docs.kernel.org):
>
>     error: unresolved link to `my_gendisk_lkclass`
>       --> rust/kernel/block/mq/gen_disk.rs:42:50
>        |
>     42 | /// This type can only be instantiated via the [`my_gendisk_lkclass!`] macro.
>
> It is also not Clippy-clean -- it doesn't follow our usual conventions
> for safety comments and sections:
>
>     error: unsafe function's docs are missing a `# Safety` section
>       --> rust/kernel/block/mq/gen_disk.rs:59:5
>        |
>     59 |     pub const unsafe fn new_lock_class(ptr: *mut bindings::gendisk_lkclass) -> GenDiskLockClass {
>
>     error: function has unnecessary safety comment
>       --> rust/kernel/block/mq/gen_disk.rs:59:5
>        |
>     58 |     /// SAFETY: `ptr` must point to a valid static `gendisk_lkclass` instance.
>        |         ------- help: consider changing it to a `# Safety` section: `# Safety`
>     59 |     pub const unsafe fn new_lock_class(ptr: *mut bindings::gendisk_lkclass) -> GenDiskLockClass {
>
> Please see:
>
>   https://rust-for-linux.com/contributing#submit-checklist-addendum
>
> In any case, it is also too late in the cycle to be experimenting in
> linux-next.
>
> So what am I missing? What is going on?
>
> (And on top of all that, for some reason I did not receive it even if I
> am apparently in Cc, so I have asked the admins about that.)
>
> Cheers,
> Miguel
>
> Cc: Andreas Hindborg <a.hindborg@kernel.org>
> Cc: Boqun Feng <boqun@kernel.org>
> Cc: rust-for-linux@vger.kernel.org
>
> Cc: Mark Brown <broonie@kernel.org>

Looks like this was pulled through the tomoyo tree:

tomoyo		git://git.code.sf.net/p/tomoyo/tomoyo.git#master

M:	Kentaro Takeda <takedakn@nttdata.co.jp>
M:	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>


Best regards,
Andreas Hindborg




* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Andreas Hindborg @ 2026-06-05  7:54 UTC (permalink / raw)
  To: Tetsuo Handa, Christoph Hellwig, Bart Van Assche, Jens Axboe
  Cc: linux-block, LKML, Andrew Morton, Ming Lei, Damien Le Moal,
	Qu Wenruo, Hillf Danton, Miguel Ojeda
In-Reply-To: <226152a3-1e4c-4eec-9a17-1d40426a7b18@I-love.SAKURA.ne.jp>

Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes:

> The block core currently allocates a single monolithic lockdep key for
> disk->open_mutex across all callers. This single key conflates locking
> hierarchies between independent block streams. For example, if a stacked
> driver like loop flushes its internal workqueues inside lo_release() while
> holding its own open_mutex, lockdep views this as a potential ABBA deadlock
> against the underlying storage stack, leading to numerous circular
> dependency splats.
>
> To structurally reduce false positives, this patch splits the global
> monolithic lock class into distinct, per-caller instances during disk
> allocation. This is done by replacing "struct lock_class_key" with
> "struct gendisk_lkclass", which contains two instances of
> "struct lock_class_key" for the legacy "(bio completion)" map and
> disk->open_mutex respectively.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

For the Rust part, we have existing infrastructure for lock class keys
[1]. Please take a look at how we generate lock class keys elsewhere [2].

Best regards,
Andreas Hindborg

[1] https://rust.docs.kernel.org/kernel/sync/struct.LockClassKey.html
[2] https://github.com/torvalds/linux/blob/ddd664bbff63e09e7a7f9acae9c43605d4cf185f/rust/kernel/sync/lock/mutex.rs#L12



* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Tetsuo Handa @ 2026-06-05 10:08 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: akpm, axboe, bvanassche, dlemoal, hch, hdanton, linux-block,
	linux-kernel, tom.leiming, wqu, Andreas Hindborg, Boqun Feng,
	rust-for-linux, Mark Brown
In-Reply-To: <20260604210704.41751-1-ojeda@kernel.org>

On 2026/06/05 6:07, Miguel Ojeda wrote:
> On Wed, 03 Jun 2026 20:54:05 +0900 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:
>>
>> Acknowledgment:
>>   Since I have no experience with Rust, changes needed by Rust block layer
>>   bindings and rnull module are made based on conversation with the Gemini
>>   AI collaborator.
> 
> Then please do Cc the right people as `MAINTAINERS` mentions, including
> "BLOCK LAYER DEVICE DRIVER API [RUST]" and "RUST"...

Oops, it seems that I overlooked the rust-for-linux@vger.kernel.org line.

> 
> I am quite confused. Why was this added to linux-next?

Since Sashiko did not find issues in this v3 patch
( https://sashiko.dev/#/patchset/226152a3-1e4c-4eec-9a17-1d40426a7b18%40I-love.SAKURA.ne.jp ),
I started pre-testing with syzbot while waiting for responses from maintainers.

I am using my tree for allowing syzbot to test debug/experimental patches in linux-next tree
(or to apply workaround patches for bugs that prevent syzbot from testing linux-next tree),
and therefore my tree is subject to "git reset --hard" changes.

> 
> It doesn't go through block, nor has an Ack or review and breaks
> the `rustdoc` build in linux-next (and thus rust.docs.kernel.org):
> 
>     error: unresolved link to `my_gendisk_lkclass`
>       --> rust/kernel/block/mq/gen_disk.rs:42:50
>        |
>     42 | /// This type can only be instantiated via the [`my_gendisk_lkclass!`] macro.
> 
> It is also not Clippy-clean -- it doesn't follow our usual conventions
> for safety comments and sections:
> 
>     error: unsafe function's docs are missing a `# Safety` section
>       --> rust/kernel/block/mq/gen_disk.rs:59:5
>        |
>     59 |     pub const unsafe fn new_lock_class(ptr: *mut bindings::gendisk_lkclass) -> GenDiskLockClass {
> 
>     error: function has unnecessary safety comment
>       --> rust/kernel/block/mq/gen_disk.rs:59:5
>        |
>     58 |     /// SAFETY: `ptr` must point to a valid static `gendisk_lkclass` instance.
>        |         ------- help: consider changing it to a `# Safety` section: `# Safety`
>     59 |     pub const unsafe fn new_lock_class(ptr: *mut bindings::gendisk_lkclass) -> GenDiskLockClass {

Hmm, I'm not familiar with comment styles for Rust...

Do you see any technical problems except comment style problem?

> 
> Please see:
> 
>   https://rust-for-linux.com/contributing#submit-checklist-addendum
> 
> In any case, it is also too late in the cycle to be experimenting in
> linux-next.
> 
> So what am I missing? What is going on?

Nothing bad is going on. The final patch will be sent to linux-next tree via
appropriate tree after getting acks from maintainers, and the current patch
in my tree will be dropped when maintainers accept the final patch.

> 
> (And on top of all that, for some reason I did not receive it even if I
> am apparently in Cc, so I have asked the admins about that.)

I had enabled only SPF, but it seems that Gmail has started rejecting such mail.
Therefore, last night I also enabled DKIM/ARC and DMARC. I hope this mail is
delivered to Gmail users.


* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Tetsuo Handa @ 2026-06-05 10:14 UTC (permalink / raw)
  To: Andreas Hindborg, Christoph Hellwig, Bart Van Assche, Jens Axboe
  Cc: linux-block, LKML, Andrew Morton, Ming Lei, Damien Le Moal,
	Qu Wenruo, Hillf Danton, Miguel Ojeda
In-Reply-To: <87wlwdifrv.fsf@kernel.org>

On 2026/06/05 16:54, Andreas Hindborg wrote:
> Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes:
> 
>> The block core currently allocates a single monolithic lockdep key for
>> disk->open_mutex across all callers. This single key conflates locking
>> hierarchies between independent block streams. For example, if a stacked
>> driver like loop flushes its internal workqueues inside lo_release() while
>> holding its own open_mutex, lockdep views this as a potential ABBA deadlock
>> against the underlying storage stack, leading to numerous circular
>> dependency splats.
>>
>> To structurally reduce false positives, this patch splits the global
>> monolithic lock class into distinct, per-caller instances during disk
>> allocation. This is done by replacing "struct lock_class_key" with
>> "struct gendisk_lkclass", which contains two instances of
>> "struct lock_class_key" for the legacy "(bio completion)" map and
>> disk->open_mutex respectively.
>>
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> 
> For the Rust part, we have existing infrastructure for lock class keys
> [1]. Please take a look at how we generate lock class keys elsewhere [2].

My understanding is that we don't have infrastructure for lock class keys
that can be applied to

  +struct gendisk_lkclass {
  +	struct lock_class_key bio_lkclass;
  +	struct lock_class_key open_mutex_lkclass;
  +};

  -	static struct lock_class_key __key;
  +	static struct gendisk_lkclass __key;

change. An alternative approach is welcome if you have one.

> 
> Best regards,
> Andreas Hindborg
> 
> [1] https://rust.docs.kernel.org/kernel/sync/struct.LockClassKey.html
> [2] https://github.com/torvalds/linux/blob/ddd664bbff63e09e7a7f9acae9c43605d4cf185f/rust/kernel/sync/lock/mutex.rs#L12
> 
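
As background for the C side of the change quoted above, the per-caller
key idea can be sketched in userspace C. This is an illustration under
assumptions and not the patch itself: demo_key, demo_lkclass, demo_disk,
demo_disk_init() and the demo_alloc_disk() macro are hypothetical names,
and demo_key is a dummy type standing in for struct lock_class_key. The
sketch relies on the fact that a lock class is identified by the address
of its key, so a static object declared inside a macro yields one
distinct pair of keys per expansion site.

```c
/* Dummy stand-in for struct lock_class_key; only its address matters. */
struct demo_key {
	char dummy;
};

/* Two keys per call site, mirroring the layout quoted in the mail. */
struct demo_lkclass {
	struct demo_key bio_lkclass;
	struct demo_key open_mutex_lkclass;
};

struct demo_disk {
	const struct demo_lkclass *lkclass;
};

static void demo_disk_init(struct demo_disk *disk,
			   const struct demo_lkclass *lkclass)
{
	disk->lkclass = lkclass;
}

/*
 * Each expansion of this macro declares its own static key pair, so
 * every call site gets a distinct address and hence a distinct class.
 */
#define demo_alloc_disk(disk)					\
	do {							\
		static struct demo_lkclass __key;		\
		demo_disk_init((disk), &__key);			\
	} while (0)

/* Two hypothetical drivers, each with its own call site. */
void demo_driver_a_probe(struct demo_disk *disk)
{
	demo_alloc_disk(disk);
}

void demo_driver_b_probe(struct demo_disk *disk)
{
	demo_alloc_disk(disk);
}
```

Disks created through the same call site share a key pair, while disks
from different call sites do not, which is the property the commit
message relies on to separate stacked drivers such as loop from the
devices underneath them.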



* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Miguel Ojeda @ 2026-06-05 11:02 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Miguel Ojeda, akpm, axboe, bvanassche, dlemoal, hch, hdanton,
	linux-block, linux-kernel, tom.leiming, wqu, Andreas Hindborg,
	Boqun Feng, rust-for-linux, Mark Brown
In-Reply-To: <89f0f308-2ee7-4c68-b811-8132e5d88530@I-love.SAKURA.ne.jp>

On Fri, Jun 5, 2026 at 12:15 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Oops, it seems that I overlooked the rust-for-linux@vger.kernel.org line.

I am not sure what you mean -- it is not just the mailing list, the
maintainers and the reviewers of both entries were not Cc'd either.

So like 9 Ccs are missing...

> I am using my tree for allowing syzbot to test debug/experimental patches in linux-next tree

linux-next is only meant for patches that are already reviewed, tested
and destined for the merge window.

It is not meant for experimental patches, and adding those, especially
this late in the kernel cycle, can break other people's CIs and
systems at the worst possible time. If every maintainer put random
patches in linux-next, then it would be unusable...

(There are exceptions, of course, but only if people are aware and in
agreement...)

> Do you see any technical problems except comment style problem?

The above errors are already a technical problem -- they break the
Clippy and docs build!

I assume Andreas or Boqun will eventually take a look at the patch now
that they know about its existence (is there a particular reason to
rush?).

> Nothing bad is going on. The final patch will be sent to linux-next tree via
> appropriate tree after getting acks from maintainers, and the current patch
> in my tree will be dropped when maintainers accept the final patch.

No, please drop it now -- the patch shouldn't have been applied to
linux-next yet, since the maintainers didn't ack it (they just
realized it exists...), it is not reviewed (apart from Sashiko), it is
not tested enough (given the errors above) and not destined for the
next merge window (for all the reasons before).

Thanks!

Cheers,
Miguel


* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Mark Brown @ 2026-06-05 12:04 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Tetsuo Handa, Miguel Ojeda, akpm, axboe, bvanassche, dlemoal, hch,
	hdanton, linux-block, linux-kernel, tom.leiming, wqu,
	Andreas Hindborg, Boqun Feng, rust-for-linux
In-Reply-To: <CANiq72nGo=F5D7b7Hx8A1jfAJw_72Np+UpRWBRQoPLAE+RAPpA@mail.gmail.com>

On Fri, Jun 05, 2026 at 01:02:59PM +0200, Miguel Ojeda wrote:
> On Fri, Jun 5, 2026 at 12:15 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:

> > I am using my tree for allowing syzbot to test debug/experimental patches in linux-next tree

> linux-next is only meant for patches that are already reviewed, tested
> and destined for the merge window.

> It is not meant for experimental patches, and adding those, especially
> this late in the kernel cycle, can break other people's CIs and
> systems at the worst possible time. If every maintainer put random
> patches in linux-next, then it would be unusable...

> (There are exceptions, of course, but only if people are aware and in
> agreement...)

In general the sort of testing that it's good for is "this needs more
exposure" sorts of things - things that look good locally but where
there's a wide variety of users or affected systems that might be
affected and where it's hard to judge the impact.

> > Do you see any technical problems except comment style problem?

> The above errors are already a technical problem -- they break the
> Clippy and docs build!

Oh, bah - something turned off RUST in allmodconfig again so we lost
coverage in -next, sorry about that.  I'll need to work something out to
make sure I notice that happening and can do something about it.  It's
kind of worrying that this keeps happening TBH, otherwise rust conflicts
are likely to result in broken builds.  The dependencies feel really
fragile here.

> No, please drop it now -- the patch shouldn't have been applied to
> linux-next yet, since the maintainers didn't ack it (they just
> realized it exists...), it is not reviewed (apart from Sashiko), it is
> not tested enough (given the errors above) and not destined for the
> next merge window (for all the reasons before).

Had the rust builds been enabled for allmodconfig as I had expected the
tree would have been kept out of -next as a result of this.


* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Miguel Ojeda @ 2026-06-05 12:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Tetsuo Handa, Miguel Ojeda, akpm, axboe, bvanassche, dlemoal, hch,
	hdanton, linux-block, linux-kernel, tom.leiming, wqu,
	Andreas Hindborg, Boqun Feng, rust-for-linux
In-Reply-To: <4ffdbe16-61ab-47e8-b696-e2fcf9a26385@sirena.org.uk>

On Fri, Jun 5, 2026 at 2:04 PM Mark Brown <broonie@kernel.org> wrote:
>
> Oh, bah - something turned off RUST in allmodconfig again so we lost
> coverage in -next, sorry about that.  I'll need to work something out to
> make sure I notice that happening and can do something about it.  It's
> kind of worrying that this keeps happening TBH, otherwise rust conflicts
> are likely to result in broken builds.  The dependencies feel really
> fragile here.
>
> Had the rust builds been enabled for allmodconfig as I had expected the
> tree would have been kept out of -next as a result of this.

No worries at all, and thanks a lot for testing it, as usual. I am
happy to be there as a backup to catch extra things.

I Cc'd you since I was wondering if your new `LLVM=1` build should
have caught it (I thought the GCC one wouldn't, since my understanding
is that we lost that last week due to the KASAN+RUST patch).

And, yeah, it is fragile... It is quite a complicated set of Kconfig
relationships, so years ago I also ended up in the same situation and
early on decided to add an explicit grep for `CONFIG_RUST=y` and
certain other bits that I wanted to ensure are in place after the
config phase.

Especially the toolchain side, i.e. the fact that `CONFIG_RUST` gets
automatically disabled if the Rust toolchain is not found, is
particularly subtle. We are considering finally changing that to fail
the build:

  https://lore.kernel.org/rust-for-linux/20260521-evolve-to-crab-v2-1-c18e0e98fc54@chaosmail.tech/

For you and those building with Rust enabled it would be great to
notice when it gets disabled by mistake, but I am not sure if most
maintainers (i.e. those without a Rust toolchain but using
`all*config` routinely) will appreciate it (or rather, tolerate it...
:).

It may be too soon -- I considered announcing it in linux-next and
adding the patch early next cycle to see what happens, but I will
probably take the safer route and ask at LPC first, unless you think
it is a good idea. Any feedback/suggestions appreciated!

Cheers,
Miguel


* Re: [PATCH v6 00/14] Enable lock context analysis for the block layer core
From: Nilay Shroff @ 2026-06-05 12:45 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal
In-Reply-To: <cover.1780419600.git.bvanassche@acm.org>

Hi Bart,

On 6/2/26 10:37 PM, Bart Van Assche wrote:
> Hi Jens,
> 
> Recently the following patch series has been merged: [PATCH v5 00/36]
> Compiler-Based Context- and Locking-Analysis
> (https://lore.kernel.org/lkml/20251219154418.3592607-1-elver@google.com/). That
> patch series drops support for verifying lock context annotations with sparse
> and introduces support for verifying lock context annotations with Clang. The
> support in Clang for lock context annotation and verification is better than
> that in sparse. As an example, __cond_acquires() and __guarded_by() are
> supported by Clang but not by sparse. Hence this patch series that enables lock
> context analysis for the block layer core.
> 
> Note: although the Linux kernel documentation specifies 22 as minimal
> version for Clang for context analysis support, this patch series requires
> Clang 23 because it annotates function pointers. As one can see here, a patch
> has been queued that fixes the kernel documentation:
> https://lore.kernel.org/all/177926568868.711.3058599932884307249.tip-bot2@tip-bot2/
> 
> Please consider this patch series for the next merge window.
> 
> Thanks,
> 
> Bart.

I have reviewed this series and it looks good to me.

One question: I still see a few block-layer functions that document locking
requirements with lockdep assertions. Should we also consider adding the
corresponding Clang context annotations for these where practical?

For example, blkg_create() requires that q->queue_lock be held by its caller, so
it could be annotated with __must_hold(&disk->queue->queue_lock). Similarly,
blk_report_disk_dead() asserts that disk->open_mutex must not be held, so it could
be annotated with __must_not_hold(&disk->open_mutex).

There are a few more similar cases. Would it make sense to cover these in this
series, or should that be left for follow-up patches?

Thanks,
--Nilay


* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Mark Brown @ 2026-06-05 13:03 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Tetsuo Handa, Miguel Ojeda, akpm, axboe, bvanassche, dlemoal, hch,
	hdanton, linux-block, linux-kernel, tom.leiming, wqu,
	Andreas Hindborg, Kees Cook, Gustavo A. R. Silva, Boqun Feng,
	rust-for-linux
In-Reply-To: <CANiq72nKDpzd9BTPfQ-NhWgcTgWdE=spED_n-idv1Jnekdo=Fw@mail.gmail.com>

On Fri, Jun 05, 2026 at 02:40:24PM +0200, Miguel Ojeda wrote:

Copying in Kees and Gustavo due to a rust/randstruct interaction in
-next.

> I Cc'd you since I was wondering if your new `LLVM=1` build should
> have caught it (I thought the GCC one wouldn't, since my understanding
> is that we lost that last week due to the KASAN+RUST patch).

Yes, you did lose it so I swapped to LLVM=1 in order to try to keep
coverage.  I didn't specifically verify that it worked though, I only
noticed that change as I'd been enabling rust for KUnit and that flags
any missing Kconfig symbol.

> And, yeah, it is fragile... It is quite a complicated set of Kconfig
> relationships, so years ago I also ended up in the same situation and
> early on decided to add an explicit grep for `CONFIG_RUST=y` and
> certain other bits that I wanted to ensure are in place after the
> config phase.

It looks like the breakage is due to 'depends on !RANDSTRUCT' because
clang supports randstruct and allmodconfig defaults that to
RANDSTRUCT_FULL.  The fact that rust depends on !RANDSTRUCT but
randstruct doesn't care about rust means that randstruct wins.  I've got
to say rust coverage seems more important than randstruct coverage for
-next, others might have different opinions though!

> For you and those building with Rust enabled it would be great to
> notice when it gets disabled by mistake, but I am not sure if most
> maintainers (i.e. those without a Rust toolchain but using
> `all*config` routinely) will appreciate it (or rather, tolerate it...
> :).

I guess there's a difference between !RUST_IS_AVAILABLE and
RUST_IS_AVAILABLE && !RUST which is interesting here.  

I know other people have been caught out by silently not having rust
coverage in their CI.


* [PATCH v2] configfs: rust: add an API for adding default groups from C
From: Andreas Hindborg @ 2026-06-05 13:15 UTC (permalink / raw)
  To: Jens Axboe, Miguel Ojeda, Gary Guo, Björn Roy Baron,
	Benno Lossin, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Breno Leitao, Boqun Feng
  Cc: linux-kernel, linux-block, rust-for-linux, Andreas Hindborg,
	Boqun Feng

Some C subsystems provide a feature to add configfs default groups to the
configfs hierarchy of other drivers or subsystems. Rust abstractions for
these subsystems will want a way to add these default groups via the
configfs Rust API. So add infrastructure to make this possible.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
Changes in v2:
- Drop the spurious `__` prefix on the symbol name of the
  `configfs_remove_default_groups` Rust helper (Alice).
- Wrap the body of `Group::new` in `pin_init::pin_init_scope` so that
  allocation failure from `KVec::push` is propagated with `?` instead
  of panicking via `unwrap` (Danilo).
- Only build configfs rust helpers when configfs is enabled.
- Link to v1: https://msgid.link/20260215-configfs-c-default-groups-v1-1-e967daef6c36@kernel.org

To: Andreas Hindborg <a.hindborg@kernel.org>
To: Boqun Feng <boqun@kernel.org>
To: Jens Axboe <axboe@kernel.dk>
To: Miguel Ojeda <ojeda@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Benno Lossin <lossin@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Trevor Gross <tmgross@umich.edu>
To: Danilo Krummrich <dakr@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: rust-for-linux@vger.kernel.org
---
 drivers/block/rnull/configfs.rs |  1 +
 rust/helpers/configfs.c         | 20 +++++++++++++
 rust/helpers/helpers.c          |  1 +
 rust/kernel/configfs.rs         | 63 +++++++++++++++++++++++++++++++++--------
 samples/rust/rust_configfs.rs   |  8 +++++-
 5 files changed, 80 insertions(+), 13 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 7c2eb5c0b722..b165347e9413 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -76,6 +76,7 @@ fn make_group(
                     name: name.try_into()?,
                 }),
             }),
+            core::iter::empty(),
         ))
     }
 }
diff --git a/rust/helpers/configfs.c b/rust/helpers/configfs.c
new file mode 100644
index 000000000000..4c71c33211bf
--- /dev/null
+++ b/rust/helpers/configfs.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/configfs.h>
+
+#ifdef CONFIG_CONFIGFS_FS
+
+__rust_helper void
+rust_helper_configfs_add_default_group(struct config_group *new_group,
+				       struct config_group *group)
+{
+	configfs_add_default_group(new_group, group);
+}
+
+__rust_helper void
+rust_helper_configfs_remove_default_groups(struct config_group *group)
+{
+	configfs_remove_default_groups(group);
+}
+
+#endif
diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 625921e27dfb..851dc77d76b1 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -51,6 +51,7 @@
 #include "build_bug.c"
 #include "clk.c"
 #include "completion.c"
+#include "configfs.c"
 #include "cpu.c"
 #include "cpufreq.c"
 #include "cpumask.c"
diff --git a/rust/kernel/configfs.rs b/rust/kernel/configfs.rs
index 2339c6467325..8891fa491496 100644
--- a/rust/kernel/configfs.rs
+++ b/rust/kernel/configfs.rs
@@ -241,12 +241,22 @@ unsafe fn container_of(group: *const bindings::config_group) -> *const Self {
 ///
 /// To add a subgroup to configfs, pass this type as `ctype` to
 /// [`crate::configfs_attrs`] when creating a group in [`GroupOperations::make_group`].
-#[pin_data]
+#[pin_data(PinnedDrop)]
 pub struct Group<Data> {
     #[pin]
     group: Opaque<bindings::config_group>,
     #[pin]
     data: Data,
+    default_groups: KVec<Arc<dyn CDefaultGroup>>,
+}
+
+#[pinned_drop]
+impl<Data> PinnedDrop for Group<Data> {
+    fn drop(self: Pin<&mut Self>) {
+        // SAFETY: We have exclusive access to `self` and we know the default groups are alive
+        // because we reference them through `self.default_groups`.
+        unsafe { bindings::configfs_remove_default_groups(self.group.get()) };
+    }
 }
 
 impl<Data> Group<Data> {
@@ -258,22 +268,51 @@ pub fn new(
         name: CString,
         item_type: &'static ItemType<Group<Data>, Data>,
         data: impl PinInit<Data, Error>,
+        default_groups: impl IntoIterator<Item = Arc<dyn CDefaultGroup>>,
     ) -> impl PinInit<Self, Error> {
-        try_pin_init!(Self {
-            group <- pin_init::init_zeroed().chain(|v: &mut Opaque<bindings::config_group>| {
-                let place = v.get();
-                let name = name.to_bytes_with_nul().as_ptr();
-                // SAFETY: It is safe to initialize a group once it has been zeroed.
-                unsafe {
-                    bindings::config_group_init_type_name(place, name.cast(), item_type.as_ptr())
-                };
-                Ok(())
-            }),
-            data <- data,
+        pin_init::pin_init_scope(move || {
+            let mut dg = KVec::new();
+            for group in default_groups {
+                dg.push(group, GFP_KERNEL)?;
+            }
+
+            Ok(try_pin_init!(Self {
+                group <- pin_init::init_zeroed().chain(|v: &mut Opaque<bindings::config_group>| {
+                    let place = v.get();
+                    let name = name.to_bytes_with_nul().as_ptr();
+                    // SAFETY: It is safe to initialize a group once it has been zeroed.
+                    unsafe {
+                        bindings::config_group_init_type_name(
+                            place,
+                            name.cast(),
+                            item_type.as_ptr(),
+                        )
+                    };
+
+                    for default_group in &dg {
+                        // SAFETY: We keep the default groups alive until `Self` is dropped.
+                        unsafe {
+                            bindings::configfs_add_default_group(default_group.group_ptr(), place)
+                        }
+                    }
+                    Ok(())
+                }),
+                data <- data,
+                default_groups: dg,
+            }))
         })
     }
 }
 
+/// A trait for default configfs groups added by C code.
+///
+/// Rust abstractions that work with C code that creates configfs groups can implement this trait to
+/// add the groups as default groups via the Rust configfs API.
+pub trait CDefaultGroup {
+    /// Return a raw pointer to the group definition.
+    fn group_ptr(&self) -> *mut bindings::config_group;
+}
+
 // SAFETY: `Group<Data>` embeds a field of type `bindings::config_group`
 // within the `group` field.
 unsafe impl<Data> HasGroup<Data> for Group<Data> {
diff --git a/samples/rust/rust_configfs.rs b/samples/rust/rust_configfs.rs
index a1bd9db6010d..ada378bc331b 100644
--- a/samples/rust/rust_configfs.rs
+++ b/samples/rust/rust_configfs.rs
@@ -83,7 +83,12 @@ fn make_group(&self, name: &CStr) -> Result<impl PinInit<configfs::Group<Child>,
             ],
         };
 
-        Ok(configfs::Group::new(name.try_into()?, tpe, Child::new()))
+        Ok(configfs::Group::new(
+            name.try_into()?,
+            tpe,
+            Child::new(),
+            core::iter::empty(),
+        ))
     }
 }
 
@@ -152,6 +157,7 @@ fn make_group(&self, name: &CStr) -> Result<impl PinInit<configfs::Group<GrandCh
             name.try_into()?,
             tpe,
             GrandChild::new(),
+            core::iter::empty(),
         ))
     }
 }

---
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
change-id: 20260215-configfs-c-default-groups-bdb0a44633a6

Best regards,
--  
Andreas Hindborg <a.hindborg@kernel.org>



^ permalink raw reply related

* Re: [PATCH v3] block: assign caller-specific lockdep class to disk->open_mutex
From: Mark Brown @ 2026-06-05 15:04 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Tetsuo Handa, Miguel Ojeda, akpm, axboe, bvanassche, dlemoal, hch,
	hdanton, linux-block, linux-kernel, tom.leiming, wqu,
	Andreas Hindborg, Kees Cook, Gustavo A. R. Silva, Boqun Feng,
	rust-for-linux
In-Reply-To: <14686d75-c481-42ba-bbcf-94a4423aeb42@sirena.org.uk>

On Fri, Jun 05, 2026 at 02:03:22PM +0100, Mark Brown wrote:
> On Fri, Jun 05, 2026 at 02:40:24PM +0200, Miguel Ojeda wrote:

> > And, yeah, it is fragile... It is quite a complicated set of Kconfig
> > relationships, so years ago I also ended up in the same situation and
> > early on decided to add an explicit grep for `CONFIG_RUST=y` and
> > certain other bits that I wanted to ensure are in place after the
> > config phase.
> 
> It looks like the breakage is due to 'depends on !RANDSTRUCT' because
> clang supports randstruct and allmodconfig defaults that to
> RANDSTRUCT_FULL.  The fact that rust depends on !RANDSTRUCT but
> randstruct doesn't care about rust means that randstruct wins.  I've got
> to say rust coverage seems more important than randstruct coverage for
> -next, others might have different opinions though!

I think I've got a fix which isn't too bad, I'll post it later today.

^ permalink raw reply

* Re: [PATCH v6 00/14] Enable lock context analysis for the block layer core
From: Bart Van Assche @ 2026-06-05 16:10 UTC (permalink / raw)
  To: Nilay Shroff, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal
In-Reply-To: <013ee998-7b98-4870-950b-806eef46cf9c@linux.ibm.com>

On 6/5/26 5:45 AM, Nilay Shroff wrote:
> I have reviewed this series and it looks good to me.

Thanks!

> One question: I still see a few block-layer functions that document locking
> requirements with lockdep assertions. Should we also consider adding the
> corresponding Clang context annotations for these where practical?

Compile-time checking is always better than run-time checking so I think
that would be useful. My goal with this patch series is to do the
minimum necessary to enable context analysis support for the block layer
core.

> For example, blkg_create() requires that q->queue_lock be held by its caller,
> so it could be annotated with __must_hold(&disk->queue->queue_lock).
> Similarly, blk_report_disk_dead() asserts that disk->open_mutex must not be
> held, so it could be annotated with __must_not_hold(&disk->open_mutex).

There are more lockdep_assert_held() calls for which the __must_hold()
annotation can be added, e.g. for the functions in block/blk-iocost.c.
However, adding these __must_hold() annotations without adding new
lockdep_assert_held() or __assume_ctx_lock() calls is not possible
without adding more arguments to the functions in that source file.
That's one of the reasons I'd like to keep such changes out of this
patch series and for a future patch series.

Thanks,

Bart.
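
The difference between run-time and compile-time checking discussed above can
be sketched in userspace C. This is only an illustration under stated
assumptions, not kernel code: every demo_* name and the DEMO_TSA switch are
hypothetical, the lock is a single-threaded toy, and the attributes are the
generic Clang thread-safety attributes rather than the kernel's __must_hold()
macro. The annotations are only checked when building with
clang -DDEMO_TSA -Wthread-safety; otherwise they expand to nothing.

```c
#include <assert.h>

/* Opt-in annotations; without DEMO_TSA (hypothetical switch) they vanish. */
#if defined(__clang__) && defined(DEMO_TSA)
#define DEMO_CAPABILITY   __attribute__((capability("mutex")))
#define DEMO_ACQUIRES(x)  __attribute__((acquire_capability(x)))
#define DEMO_RELEASES(x)  __attribute__((release_capability(x)))
#define DEMO_MUST_HOLD(x) __attribute__((requires_capability(x)))
#define DEMO_NO_ANALYSIS  __attribute__((no_thread_safety_analysis))
#else
#define DEMO_CAPABILITY
#define DEMO_ACQUIRES(x)
#define DEMO_RELEASES(x)
#define DEMO_MUST_HOLD(x)
#define DEMO_NO_ANALYSIS
#endif

/* Toy single-threaded lock that only records whether it is held. */
struct DEMO_CAPABILITY demo_lock { int held; };
struct demo_group {
	struct demo_lock lock;
	int nr_children;
};

void demo_lock_acquire(struct demo_lock *l) DEMO_ACQUIRES(l);
void demo_lock_release(struct demo_lock *l) DEMO_RELEASES(l);
int demo_create_child_annotated(struct demo_group *g)
	DEMO_MUST_HOLD(&g->lock);

DEMO_NO_ANALYSIS
void demo_lock_acquire(struct demo_lock *l)
{
	l->held = 1;
}

DEMO_NO_ANALYSIS
void demo_lock_release(struct demo_lock *l)
{
	l->held = 0;
}

/* Run-time check: only fires if this path is actually executed unlocked. */
int demo_create_child_checked(struct demo_group *g)
{
	assert(g->lock.held);	/* stand-in for lockdep_assert_held() */
	return ++g->nr_children;
}

/* Compile-time check: the prototype above lets the compiler flag callers. */
int demo_create_child_annotated(struct demo_group *g)
{
	return ++g->nr_children;
}

/* A correct caller: both helpers run with the lock held. */
int demo_add_two_children(struct demo_group *g)
{
	int n;

	demo_lock_acquire(&g->lock);
	demo_create_child_checked(g);
	n = demo_create_child_annotated(g);
	demo_lock_release(&g->lock);
	return n;
}
```

With the annotations enabled, removing the demo_lock_acquire() call from
demo_add_two_children() would be expected to produce a compiler warning for
the annotated helper, whereas the asserting helper would only fail once that
code path runs.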

^ permalink raw reply

* Re: [PATCH] block: Add WQ_PERCPU to alloc_workqueue users
From: Chaitanya Kulkarni @ 2026-06-05 17:15 UTC (permalink / raw)
  To: Marco Crivellari, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org
  Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Michal Hocko, Damien Le Moal,
	Jens Axboe
In-Reply-To: <20260604105347.168322-1-marco.crivellari@suse.com>

On 6/4/26 03:53, Marco Crivellari wrote:
> This continues the effort to refactor workqueue APIs, which began with
> the introduction of new workqueues and a new alloc_workqueue flag in:
>
>     commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>     commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>
> The refactoring is going to alter the default behavior of
> alloc_workqueue() to be unbound by default.
>
> With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
> any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
> must now use WQ_PERCPU. For more details see the Link tag below.
>
> In order to keep alloc_workqueue() behavior identical, explicitly request
> WQ_PERCPU.
>
> Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
> Suggested-by: Tejun Heo <tj@kernel.org>
>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>


Looks good.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck



^ permalink raw reply

* Re: [PATCH] block: Add WQ_PERCPU to alloc_workqueue users
From: Jens Axboe @ 2026-06-05 17:21 UTC (permalink / raw)
  To: linux-kernel, linux-block, Marco Crivellari
  Cc: Tejun Heo, Lai Jiangshan, Frederic Weisbecker,
	Sebastian Andrzej Siewior, Michal Hocko, Damien Le Moal
In-Reply-To: <20260604105347.168322-1-marco.crivellari@suse.com>


On Thu, 04 Jun 2026 12:53:47 +0200, Marco Crivellari wrote:
> This continues the effort to refactor workqueue APIs, which began with
> the introduction of new workqueues and a new alloc_workqueue flag in:
> 
>    commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>    commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
> 
> The refactoring is going to alter the default behavior of
> alloc_workqueue() to be unbound by default.
> 
> [...]

Applied, thanks!

[1/1] block: Add WQ_PERCPU to alloc_workqueue users
      commit: 7e712f292e7f01e91d09e83eb7b9526f77f66c71

Best regards,
-- 
Jens Axboe




^ permalink raw reply

* [PATCH v7 01/14] block: Annotate the queue limits functions
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>

Let the thread-safety checker verify whether every start of a queue
limits update is followed by a call to a function that finishes a queue
limits update.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blkdev.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 17270a28c66d..65efbd7fe1a3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1092,15 +1092,17 @@ static inline unsigned int blk_boundary_sectors_left(sector_t offset,
  */
 static inline struct queue_limits
 queue_limits_start_update(struct request_queue *q)
+	__acquires(&q->limits_lock)
 {
 	mutex_lock(&q->limits_lock);
 	return q->limits;
 }
 int queue_limits_commit_update_frozen(struct request_queue *q,
-		struct queue_limits *lim);
+		struct queue_limits *lim) __releases(&q->limits_lock);
 int queue_limits_commit_update(struct request_queue *q,
-		struct queue_limits *lim);
-int queue_limits_set(struct request_queue *q, struct queue_limits *lim);
+		struct queue_limits *lim) __releases(&q->limits_lock);
+int queue_limits_set(struct request_queue *q, struct queue_limits *lim)
+	__must_not_hold(&q->limits_lock);
 int blk_validate_limits(struct queue_limits *lim);
 
 /**
@@ -1112,6 +1114,7 @@ int blk_validate_limits(struct queue_limits *lim);
  * starting update.
  */
 static inline void queue_limits_cancel_update(struct request_queue *q)
+	__releases(&q->limits_lock)
 {
 	mutex_unlock(&q->limits_lock);
 }
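
The start/commit pairing that the annotations in this patch describe can be
sketched in userspace C. This is a hedged illustration only: the demo_* names
and the DEMO_TSA switch are hypothetical, the lock is a single-threaded toy
instead of a mutex, and the validation step is a placeholder. The attributes
are the generic Clang thread-safety attributes and are only checked when
building with clang -DDEMO_TSA -Wthread-safety.

```c
#include <assert.h>

/* Opt-in annotations; without DEMO_TSA (hypothetical switch) they vanish. */
#if defined(__clang__) && defined(DEMO_TSA)
#define DEMO_CAPABILITY   __attribute__((capability("mutex")))
#define DEMO_ACQUIRES(x)  __attribute__((acquire_capability(x)))
#define DEMO_RELEASES(x)  __attribute__((release_capability(x)))
#define DEMO_NO_ANALYSIS  __attribute__((no_thread_safety_analysis))
#else
#define DEMO_CAPABILITY
#define DEMO_ACQUIRES(x)
#define DEMO_RELEASES(x)
#define DEMO_NO_ANALYSIS
#endif

/* Toy single-threaded lock; a real queue would use a mutex. */
struct DEMO_CAPABILITY demo_mutex { int held; };
struct demo_limits { unsigned int max_sectors; };
struct demo_queue {
	struct demo_mutex limits_lock;
	struct demo_limits limits;
};

/* The prototypes carry the annotations, as the header does in the patch. */
struct demo_limits demo_limits_start_update(struct demo_queue *q)
	DEMO_ACQUIRES(&q->limits_lock);
int demo_limits_commit_update(struct demo_queue *q,
			      const struct demo_limits *lim)
	DEMO_RELEASES(&q->limits_lock);

DEMO_NO_ANALYSIS
struct demo_limits demo_limits_start_update(struct demo_queue *q)
{
	assert(!q->limits_lock.held);
	q->limits_lock.held = 1;
	return q->limits;	/* caller edits a private copy */
}

DEMO_NO_ANALYSIS
int demo_limits_commit_update(struct demo_queue *q,
			      const struct demo_limits *lim)
{
	int ret = lim->max_sectors ? 0 : -1;	/* placeholder validation */

	if (!ret)
		q->limits = *lim;
	q->limits_lock.held = 0;	/* released on success and on failure */
	return ret;
}

/* Every start is followed by exactly one commit on all paths. */
int demo_set_max_sectors(struct demo_queue *q, unsigned int n)
{
	struct demo_limits lim = demo_limits_start_update(q);

	lim.max_sectors = n;
	return demo_limits_commit_update(q, &lim);
}
```

If demo_set_max_sectors() returned early without committing, a build with the
annotations enabled would be expected to warn that the lock is still held at
the end of the function.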

^ permalink raw reply related

* [PATCH v7 00/14] Enable lock context analysis for the block layer core
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche

Hi Jens,

Recently the following patch series has been merged: [PATCH v5 00/36]
Compiler-Based Context- and Locking-Analysis
(https://lore.kernel.org/lkml/20251219154418.3592607-1-elver@google.com/). That
patch series drops support for verifying lock context annotations with sparse
and introduces support for verifying lock context annotations with Clang. The
support in Clang for lock context annotation and verification is better than
that in sparse. As an example, __cond_acquires() and __guarded_by() are
supported by Clang but not by sparse. Hence this patch series that enables lock
context analysis for the block layer core.

Note: although the Linux kernel documentation specifies Clang 22 as the
minimum version for context analysis support, this patch series requires
Clang 23 because it annotates function pointers. A patch that fixes the
kernel documentation has been queued:
https://lore.kernel.org/all/177926568868.711.3058599932884307249.tip-bot2@tip-bot2/

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v6:
 - In patch 10/14, change guard()() back to spin_lock() calls.

Changes compared to v5:
 - Made the description of patch "block/cgroup: Inline
   blkg_conf_{open,close}_bdev_frozen()" more clear.
 - Combined two error paths in ioc_qos_write().
 - Folded iocg_lock() and iocg_unlock() into their callers. Removed the token
   context lock "ioc_lock".
 - Introduced macros in the debugfs code that capture expressions with
   pointer casts.
 - Added two patches: "block/blk-iocost: Combine two error paths in
   ioc_qos_write()" and "block/blk-iocost: Split ioc_rqos_throttle()".

Changes compared to v4:
 - Rebased and retested on top of Jens' latest for-next branch.

Changes compared to v3:
 - Replaced the "block/bdev: Annotate the blk_holder_ops callback invocations"
   patch with a patch that adds __releases() annotations to the function
   pointers in struct blk_holder_ops.
 - Dropped the blk-zoned patch since a better patch from Christoph has
   been merged.

Changes compared to v2:
 - Retained the block layer core patches and left out the block driver patches.
 - Inlined blkg_conf_open_bdev_frozen() and blkg_conf_close_bdev_frozen().
 - In blkg_conf_open_bdev(), added a return statement if the
   WARN_ON_ONCE() statement triggers.
 - Replaced the "block/ioctl: Add lock context annotations" patch with a
   __release() annotation.
 - Replaced the blk-zoned patch with a patch from Christoph.

Changes compared to v1:
 - Rebased this patch series on top of Jens' for-next branch.
 - Included two patches that split blkg_conf_prep() and blkg_conf_exit().
 - Modified how patches are split. Split the block layer core patch into
   multiple patches and moved the CONTEXT_ANALYSIS := y assignments into the
   block driver patches.
 - Made the new source code comments easier to comprehend.
 - Introduced macros in the mq-deadline and Kyber I/O schedulers to make the
   __acquires() expressions easier to read.
 - Removed the changes from this series that are not block layer changes.

Bart Van Assche (14):
  block: Annotate the queue limits functions
  block/bdev: Annotate the blk_holder_ops callback functions
  block/cgroup: Split blkg_conf_prep()
  block/cgroup: Split blkg_conf_exit()
  block/cgroup: Improve lock context annotations
  block/blk-iocost: Combine two error paths in ioc_qos_write()
  block/cgroup: Inline blkg_conf_{open,close}_bdev_frozen()
  block/crypto: Annotate the crypto functions
  block/blk-iocost: Split ioc_rqos_throttle()
  block/blk-iocost: Inline iocg_lock() and iocg_unlock()
  block/blk-mq-debugfs: Improve lock context annotations
  block/Kyber: Make the lock context annotations compatible with Clang
  block/mq-deadline: Make the lock context annotations compatible with
    Clang
  block: Enable lock context analysis

 block/Makefile             |   2 +
 block/bfq-cgroup.c         |  11 +-
 block/blk-cgroup.c         |  98 +++---------
 block/blk-cgroup.h         |  13 +-
 block/blk-crypto-profile.c |   2 +
 block/blk-iocost.c         | 306 +++++++++++++++++++++----------------
 block/blk-iolatency.c      |  19 ++-
 block/blk-mq-debugfs.c     |  24 ++-
 block/blk-throttle.c       |  34 +++--
 block/blk.h                |   4 +
 block/kyber-iosched.c      |   7 +-
 block/mq-deadline.c        |  12 +-
 include/linux/blkdev.h     |  21 ++-
 13 files changed, 297 insertions(+), 256 deletions(-)


^ permalink raw reply

* [PATCH v7 02/14] block/bdev: Annotate the blk_holder_ops callback functions
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>

The four callback functions in blk_holder_ops all release the
bd_holder_lock. Annotate these functions accordingly.

Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blkdev.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 65efbd7fe1a3..57e84d59a642 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1746,22 +1746,26 @@ void blkdev_show(struct seq_file *seqf, off_t offset);
 #endif
 
 struct blk_holder_ops {
-	void (*mark_dead)(struct block_device *bdev, bool surprise);
+	void (*mark_dead)(struct block_device *bdev, bool surprise)
+		__releases(&bdev->bd_holder_lock);
 
 	/*
 	 * Sync the file system mounted on the block device.
 	 */
-	void (*sync)(struct block_device *bdev);
+	void (*sync)(struct block_device *bdev)
+		__releases(&bdev->bd_holder_lock);
 
 	/*
 	 * Freeze the file system mounted on the block device.
 	 */
-	int (*freeze)(struct block_device *bdev);
+	int (*freeze)(struct block_device *bdev)
+		__releases(&bdev->bd_holder_lock);
 
 	/*
 	 * Thaw the file system mounted on the block device.
 	 */
-	int (*thaw)(struct block_device *bdev);
+	int (*thaw)(struct block_device *bdev)
+		__releases(&bdev->bd_holder_lock);
 };
 
 /*

^ permalink raw reply related

* [PATCH v7 03/14] block/cgroup: Split blkg_conf_prep()
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche, Tejun Heo, Yu Kuai, Josef Bacik
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>

Move the blkg_conf_open_bdev() call out of blkg_conf_prep() to make it
possible to add lock context annotations to blkg_conf_prep(). Change an
if-statement in blkg_conf_open_bdev() into a WARN_ON_ONCE() call. Export
blkg_conf_open_bdev() because it is called by the BFQ I/O scheduler and
the BFQ I/O scheduler may be built as a kernel module.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/bfq-cgroup.c |  4 ++++
 block/blk-cgroup.c | 18 ++++++++----------
 block/blk-iocost.c |  4 ++++
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 37ab70930c8d..df7b5a646e96 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1052,6 +1052,10 @@ static ssize_t bfq_io_set_device_weight(struct kernfs_open_file *of,
 
 	blkg_conf_init(&ctx, buf);
 
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto out;
+
 	ret = blkg_conf_prep(blkcg, &blkcg_policy_bfq, &ctx);
 	if (ret)
 		goto out;
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 554c87bb4a86..a8d95d51b866 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -771,10 +771,7 @@ EXPORT_SYMBOL_GPL(blkg_conf_init);
  * @ctx->input and get and store the matching bdev in @ctx->bdev. @ctx->body is
  * set to point past the device node prefix.
  *
- * This function may be called multiple times on @ctx and the extra calls become
- * NOOPs. blkg_conf_prep() implicitly calls this function. Use this function
- * explicitly if bdev access is needed without resolving the blkcg / policy part
- * of @ctx->input. Returns -errno on error.
+ * Returns: -errno on error.
  */
 int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 {
@@ -783,8 +780,8 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 	struct block_device *bdev;
 	int key_len;
 
-	if (ctx->bdev)
-		return 0;
+	if (WARN_ON_ONCE(ctx->bdev))
+		return -EINVAL;
 
 	if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) != 2)
 		return -EINVAL;
@@ -813,6 +810,8 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 	ctx->bdev = bdev;
 	return 0;
 }
+EXPORT_SYMBOL_GPL(blkg_conf_open_bdev);
+
 /*
  * Similar to blkg_conf_open_bdev, but additionally freezes the queue,
  * ensures the correct locking order between freeze queue and q->rq_qos_mutex.
@@ -857,7 +856,7 @@ unsigned long __must_check blkg_conf_open_bdev_frozen(struct blkg_conf_ctx *ctx)
  * following MAJ:MIN, @ctx->bdev points to the target block device and
  * @ctx->blkg to the blkg being configured.
  *
- * blkg_conf_open_bdev() may be called on @ctx beforehand. On success, this
+ * blkg_conf_open_bdev() must be called on @ctx beforehand. On success, this
  * function returns with queue lock held and must be followed by
  * blkg_conf_exit().
  */
@@ -870,9 +869,8 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	struct blkcg_gq *blkg;
 	int ret;
 
-	ret = blkg_conf_open_bdev(ctx);
-	if (ret)
-		return ret;
+	if (WARN_ON_ONCE(!ctx->bdev))
+		return -EINVAL;
 
 	disk = ctx->bdev->bd_disk;
 	q = disk->queue;
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 0cca88a366dc..b34f820dedcc 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3140,6 +3140,10 @@ static ssize_t ioc_weight_write(struct kernfs_open_file *of, char *buf,
 
 	blkg_conf_init(&ctx, buf);
 
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;
+
 	ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx);
 	if (ret)
 		goto err;
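
The calling convention after this split can be sketched in userspace C: the
open step is explicit, opening twice is treated as a caller bug, and the prep
step refuses to run without a prior open. This is a minimal sketch under
stated assumptions. All demo_* names are hypothetical, the single hard-coded
device stands in for a real block device lookup, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct demo_bdev { unsigned int major, minor; };
struct demo_conf_ctx {
	const char *input;	/* "MAJ:MIN rest" */
	const char *body;	/* points past "MAJ:MIN" after a successful open */
	struct demo_bdev *bdev;
};

/* The only "device" known to this toy example. */
static struct demo_bdev demo_disk = { 8, 0 };

int demo_conf_open_bdev(struct demo_conf_ctx *ctx)
{
	unsigned int major, minor;
	int key_len;

	if (ctx->bdev)		/* a second open is a caller bug, not a no-op */
		return -EINVAL;
	if (sscanf(ctx->input, "%u:%u%n", &major, &minor, &key_len) != 2)
		return -EINVAL;
	if (major != demo_disk.major || minor != demo_disk.minor)
		return -ENODEV;

	ctx->body = ctx->input + key_len;
	ctx->bdev = &demo_disk;
	return 0;
}

int demo_conf_prep(struct demo_conf_ctx *ctx)
{
	if (!ctx->bdev)		/* the open step must have succeeded first */
		return -EINVAL;
	return 0;
}

/* Caller pattern: open explicitly, then prepare. */
int demo_set_weight(const char *input)
{
	struct demo_conf_ctx ctx = { .input = input };
	int ret;

	ret = demo_conf_open_bdev(&ctx);
	if (ret)
		return ret;
	return demo_conf_prep(&ctx);
}
```

Keeping the two steps separate means each function has one fixed effect,
which is what makes it possible to describe each of them with a single lock
context annotation.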

^ permalink raw reply related

* [PATCH v7 05/14] block/cgroup: Improve lock context annotations
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche, Tejun Heo, Josef Bacik
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>

Add lock context annotations where these are missing. Move the
blkg_conf_prep() annotation into block/blk-cgroup.h to make it visible
to all blkg_conf_prep() callers.

Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-cgroup.c |  1 -
 block/blk-cgroup.h | 15 ++++++++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 38d7bcfcbbe8..86513c54c217 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -862,7 +862,6 @@ unsigned long __must_check blkg_conf_open_bdev_frozen(struct blkg_conf_ctx *ctx)
  */
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   struct blkg_conf_ctx *ctx)
-	__acquires(&bdev->bd_queue->queue_lock)
 {
 	struct gendisk *disk;
 	struct request_queue *q;
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index ce90f5b60d52..f0a3af520c55 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -218,14 +218,19 @@ struct blkg_conf_ctx {
 };
 
 void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input);
-int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx);
+int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
+	__cond_acquires(0, &ctx->bdev->bd_queue->rq_qos_mutex);
 unsigned long blkg_conf_open_bdev_frozen(struct blkg_conf_ctx *ctx);
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
-		   struct blkg_conf_ctx *ctx);
-void blkg_conf_unprep(struct blkg_conf_ctx *ctx);
-void blkg_conf_close_bdev(struct blkg_conf_ctx *ctx);
+		   struct blkg_conf_ctx *ctx)
+	__cond_acquires(0, &ctx->bdev->bd_disk->queue->queue_lock);
+void blkg_conf_unprep(struct blkg_conf_ctx *ctx)
+	__releases(ctx->bdev->bd_disk->queue->queue_lock);
+void blkg_conf_close_bdev(struct blkg_conf_ctx *ctx)
+	__releases(&ctx->bdev->bd_queue->rq_qos_mutex);
 void blkg_conf_close_bdev_frozen(struct blkg_conf_ctx *ctx,
-				 unsigned long memflags);
+				 unsigned long memflags)
+	__releases(&ctx->bdev->bd_queue->rq_qos_mutex);
 
 /**
  * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg

^ permalink raw reply related

* [PATCH v7 06/14] block/blk-iocost: Combine two error paths in ioc_qos_write()
From: Bart Van Assche @ 2026-06-05 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal,
	Bart Van Assche, Tejun Heo, Josef Bacik
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>

Reduce code duplication by combining two error paths. No functionality
has been changed.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-iocost.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 3e4e28ecc21f..050bfbc6d806 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3239,7 +3239,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	bool enable, user;
 	char *body, *p;
 	unsigned long memflags;
-	int ret;
+	int ret = 0;
 
 	blkg_conf_init(&ctx, input);
 
@@ -3251,14 +3251,14 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	disk = ctx.bdev->bd_disk;
 	if (!queue_is_mq(disk->queue)) {
 		ret = -EOPNOTSUPP;
-		goto err;
+		goto close_bdev;
 	}
 
 	ioc = q_to_ioc(disk->queue);
 	if (!ioc) {
 		ret = blk_iocost_init(disk);
 		if (ret)
-			goto err;
+			goto close_bdev;
 		ioc = q_to_ioc(disk->queue);
 	}
 
@@ -3362,15 +3362,15 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 
 	blk_mq_unquiesce_queue(disk->queue);
 
+close_bdev:
 	blkg_conf_close_bdev_frozen(&ctx, memflags);
-	return nbytes;
+	return ret ?: nbytes;
+
 einval:
 	spin_unlock_irq(&ioc->lock);
 	blk_mq_unquiesce_queue(disk->queue);
 	ret = -EINVAL;
-err:
-	blkg_conf_close_bdev_frozen(&ctx, memflags);
-	return ret;
+	goto close_bdev;
 }
 
 static u64 ioc_cost_model_prfill(struct seq_file *sf,
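
The control flow that results from combining the two error paths can be
sketched in userspace C. This is an illustration only: the demo_* names are
hypothetical, the cleanup is reduced to a counter, and the standard
conditional operator is used where the patch uses the GNU "ret ?: nbytes"
shorthand.

```c
#include <assert.h>
#include <errno.h>

struct demo_ctx {
	int opened;
	int closed;
};

static void demo_close(struct demo_ctx *ctx)
{
	ctx->closed++;
}

/* Returns nbytes on success or a negative errno; cleanup runs exactly once. */
long demo_qos_write(struct demo_ctx *ctx, int supported, int valid,
		    long nbytes)
{
	long ret = 0;

	ctx->opened++;

	if (!supported) {
		ret = -EOPNOTSUPP;
		goto close_bdev;
	}

	if (!valid)
		goto einval;

close_bdev:
	/* Shared exit: reached by success and by every failure. */
	demo_close(ctx);
	return ret ? ret : nbytes;

einval:
	/* Failure-specific undo steps would go here before the shared exit. */
	ret = -EINVAL;
	goto close_bdev;
}
```

Because the success path falls through into the same label as the failures,
initializing ret to 0 is what lets the final return distinguish the two
cases.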

^ permalink raw reply related
