Linux-mm Archive on lore.kernel.org
* [PATCH v2 00/83] block: rnull: complete the rust null block driver
@ 2026-06-09 19:07 Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
                   ` (82 more replies)
  0 siblings, 83 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Yuan Tan, Andreas Hindborg

This series aims to bring the feature set of the Rust null block driver on
par with that of the C null_blk driver.

There are quite a few changes from v1 in this version. I tried to capture
everything in the change log, but I might have missed something along
the way.

I have prepared a tree with all dependencies applied at [1].

Best regards,
Andreas Hindborg

[1] git https://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux.git rnull-v7.1-rc2

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
Changes in v2:
- Fix shift direction in transfer length calculation.
- Retry page preload after reacquiring locks.
- Fix a bug where badblocks did not correctly limit IO size.
- Close TOCTOU window in configfs power-check stores (Alice).
- Use `bool` for semantically-boolean module parameters (Alice).
- Take `NumaNode` instead of a raw `i32` as the home node argument to `TagSet` (Alice).
- Use `c_void`, `c_uint`, and `c_int` from the prelude in `hctx` private data support (Alice).
- Use `size_of` from the prelude in `Request` private data support (Alice).
- Return `Ok(())` from `new_request_data` instead of `pin_init::zeroed` (Alice).
- Add `// CAST:` annotations to casts (Alice).
- Expand the comment on the `BLK_STS_.*` bindgen blocklist entry.
- Depend on "rust: module_param: return copy from value() for Copy types".
- block: rust: introduce `kernel::block::bio` module (Alice):
  - Use `kernel::fmt::Display` for `Bio` and cache `raw_iter()`.
  - Mark the `bio_advance_iter_single` helper `__rust_helper`.
  - Use a `srctree/` link for the C header.
  - Remove the stale reference-counting invariants from `Bio`.
  - Take `Pin<&mut Self>` in `Bio::segment_iter` and `Request::bio_mut`.
  - Document that the `bvec_iter` cursor can be copied and moved freely.
  - Use `&raw mut` instead of `core::ptr::from_mut`.
- Narrow the `unsafe` block in `Request::command()` using `BitAnd` (Alice, Gary).
- Use `c_void` from the prelude and drop a spurious blank line in the `TagSet` flags module (Alice).
- Drop the `Tree` type alias in favor of `XArray<TreeNode>` in rnull (Alice).
- Use a `NoIo` memory allocation scope in `queue_rq` rather than passing `GFP_NOIO`.
- Add the missing comma between `memory_backed` and `submit_queues` in the configfs feature listing (Alice).
- Fix the `use_per_node_hctx` store to set `submit_queues` to the online node count instead of multiplying by it (Alice).
- Use `static_assert!` instead of a `build_assert!` constant for the page/sector width check (Alice).
- Fix a typo in the `TagSet::new` doc comment (Ken).
- block: rust: add `BadBlocks` for bad block tracking (Alice):
  - Remove newline after `use` statements.
  - Add C header link.
  - Convert boolean to int with `into`.
  - Remove duplicated docs from `enabled`.
  - Use if/else rather than `then_some` in `set_bad`.
- Add a patch to rename `SECTOR_MASK` to `PAGE_SECTOR_MASK`.
- Use `pr_warn_once!` where applicable.
- Require `TagSet` private data to be `Send` for `TagSet` to be `Send`.
- Require `Operations::TagSetData: Sync`.
- Require `Operations::HwData: Send + Sync` and add a note on the bounds.
- Require `Operations::RequestData: Send` and add a note on the bound.
- Add `TagSet::flags` to obtain flags and fix a bug in zoned emulation caused by taking a mutex under rcu read lock.
- Link to v1: https://msgid.link/20260216-rnull-v6-19-rc5-send-v1-0-de9a7af4b469@kernel.org

Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: rust-for-linux@vger.kernel.org
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: "Liam R. Howlett" <liam@infradead.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Andreas Hindborg <a.hindborg@kernel.org>
To: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Benno Lossin <lossin@kernel.org>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Boqun Feng <boqun.feng@gmail.com>
To: Boqun Feng <boqun@kernel.org>
To: Danilo Krummrich <dakr@kernel.org>
To: FUJITA Tomonori <fujita.tomonori@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Jens Axboe <axboe@kernel.dk>
To: John Stultz <jstultz@google.com>
To: Lorenzo Stoakes <ljs@kernel.org>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Lyude Paul <lyude@redhat.com>
To: Miguel Ojeda <ojeda@kernel.org>
To: Stephen Boyd <sboyd@kernel.org>
To: Thomas Gleixner <tglx@kernel.org>
To: Trevor Gross <tmgross@umich.edu>

---
Andreas Hindborg (83):
      block: rust: fix `Send` bound for `GenDisk`
      rust: block: rename `SECTOR_MASK` to `PAGE_SECTOR_MASK`
      block: rnull: adopt new formatting guidelines
      block: rnull: add module parameters
      block: rnull: add macros to define configfs attributes
      block: rust: fix generation of bindings to `BLK_STS_.*`
      block: rust: change `queue_rq` request type to `Owned`
      block: rust: add `Request` private data support
      block: rust: document the lifetime of `Request`
      block: rust: allow `hrtimer::Timer` in `RequestData`
      block: rnull: add timer completion mode
      block: rust: introduce `kernel::block::bio` module
      block: rust: add `command` getter to `Request`
      block: rust: mq: use GFP_KERNEL from prelude
      block: rust: add `TagSet` flags
      block: rnull: add memory backing
      block: rnull: add submit queue count config option
      block: rnull: add `use_per_node_hctx` config option
      block: rust: allow specifying home node when constructing `TagSet`
      block: rnull: allow specifying the home numa node
      block: rust: add Request::sectors() method
      block: rust: mq: add max_hw_discard_sectors support to GenDiskBuilder
      block: rnull: add discard support
      block: rust: add `NoDefaultScheduler` flag for `TagSet`
      block: rnull: add no_sched module parameter and configfs attribute
      block: rust: change sector type from usize to u64
      block: rust: add `BadBlocks` for bad block tracking
      block: rust: mq: add Request::end() method for custom status codes
      block: rnull: add badblocks support
      block: rnull: add badblocks_once support
      block: rust: add `Segment::truncate`
      block: rnull: add partial I/O support for bad blocks
      block: rust: add `TagSet` private data support
      block: rust: add `hctx` private data support
      block: rnull: add volatile cache emulation
      block: rust: implement `Sync` for `GenDisk`.
      block: rust: add a back reference feature to `GenDisk`
      block: rust: introduce an idle type state for `Request`
      block: rust: add a request queue abstraction
      block: rust: add a method to get the request queue for a request
      block: rust: introduce `kernel::block::error`
      block: rust: require `queue_rq` to return a `BlkResult`
      block: rust: add `GenDisk::queue_data`
      block: rnull: add bandwidth limiting
      block: rnull: add blocking queue mode
      block: rnull: add shared tags
      block: rnull: add queue depth config option
      block: rust: add an abstraction for `bindings::req_op`
      block: rust: add a method to set the target sector of a request
      block: rust: move gendisk vtable construction to separate function
      block: rust: add zoned block device support
      block: rust: add `TagSet::flags`
      block: rnull: add zoned storage support
      block: rust: add `map_queues` support
      block: rust: add an abstraction for `struct blk_mq_queue_map`
      block: rust: add polled completion support
      block: rust: add accessors to `TagSet`
      block: rnull: add polled completion support
      block: rnull: add REQ_OP_FLUSH support
      block: rust: add request flags abstraction
      block: rust: add abstraction for block queue feature flags
      block: rust: allow setting write cache and FUA flags for `GenDisk`
      block: rust: add `Segment::copy_to_page_limit`
      block: rnull: add fua support
      block: rust: add `GenDisk::tag_set`
      block: rust: add `TagSet::update_hw_queue_count`
      block: rnull: add an option to change the number of hardware queues
      block: rust: add an abstraction for `struct rq_list`
      block: rust: add `queue_rqs` vtable hook
      block: rnull: support queue_rqs
      block: rust: remove the `is_poll` parameter from `queue_rq`
      block: rust: add a debug assert for refcounts
      block: rust: add `TagSet::tag_to_rq`
      block: rust: add `Request::queue_index`
      block: rust: add `Request::requeue`
      block: rust: add `request_timeout` hook
      block: rnull: add fault injection support
      block: rust: add max_sectors option to `GenDiskBuilder`
      block: rnull: allow configuration of the maximum IO size
      block: rust: add `virt_boundary_mask` option to `GenDiskBuilder`
      block: rnull: add `virt_boundary` option
      block: rnull: add `shared_tag_bitmap` config option
      block: rnull: add zone offline and readonly configfs files

 drivers/block/rnull/Kconfig              |   11 +
 drivers/block/rnull/configfs.rs          |  605 +++++++++++++--
 drivers/block/rnull/configfs/macros.rs   |  143 ++++
 drivers/block/rnull/disk_storage.rs      |  326 ++++++++
 drivers/block/rnull/disk_storage/page.rs |   78 ++
 drivers/block/rnull/rnull.rs             | 1198 ++++++++++++++++++++++++++++--
 drivers/block/rnull/util.rs              |   65 ++
 drivers/block/rnull/zoned.rs             |  696 +++++++++++++++++
 rust/bindgen_parameters                  |    6 +
 rust/bindings/bindings_helper.h          |   55 ++
 rust/helpers/blk.c                       |   47 ++
 rust/kernel/block.rs                     |  101 ++-
 rust/kernel/block/badblocks.rs           |  716 ++++++++++++++++++
 rust/kernel/block/bio.rs                 |  147 ++++
 rust/kernel/block/bio/vec.rs             |  448 +++++++++++
 rust/kernel/block/mq.rs                  |   78 +-
 rust/kernel/block/mq/feature.rs          |   76 ++
 rust/kernel/block/mq/gen_disk.rs         |  336 +++++++--
 rust/kernel/block/mq/operations.rs       |  489 +++++++++++-
 rust/kernel/block/mq/request.rs          |  677 ++++++++++++++---
 rust/kernel/block/mq/request/command.rs  |   65 ++
 rust/kernel/block/mq/request/flag.rs     |   65 ++
 rust/kernel/block/mq/request_list.rs     |  119 +++
 rust/kernel/block/mq/request_queue.rs    |   60 ++
 rust/kernel/block/mq/tag_set.rs          |  299 +++++++-
 rust/kernel/block/mq/tag_set/flags.rs    |   29 +
 rust/kernel/error.rs                     |    3 +-
 rust/kernel/page.rs                      |    2 +-
 rust/kernel/time/hrtimer.rs              |    5 +-
 29 files changed, 6603 insertions(+), 342 deletions(-)
---
base-commit: 9e0898f1c0f134c6bad146ca8578f73c3e40ac0a
change-id: 20260215-rnull-v6-19-rc5-send-98c33ec692d6
prerequisite-change-id: 20250305-unique-ref-29fcd675f9e9:v17
prerequisite-patch-id: 6c6a7fdd56627293ec3bba61c495f16a0858700c
prerequisite-patch-id: c1958590235ee32d6ddb31ea168105bd9cf248f2
prerequisite-patch-id: c5a4b231dc8adf37e93ebdce308dacbe6a244bf3
prerequisite-patch-id: 541dba7938ba874f8d17fee05a36b1cd9fa2c4d7
prerequisite-patch-id: 3668fd640e4c411bae0c8ea9d986c3fa5d3c9e82
prerequisite-patch-id: da1274864841e267697be9529a50531126c64872
prerequisite-patch-id: c1463b6578e94b56d2bad41f6e614b5286fb1db3
prerequisite-patch-id: a31185fe1abbf553377d6d695c5d206eebc84358
prerequisite-patch-id: 4f392b5736e55a354ec3022644389f89b52fda42
prerequisite-patch-id: b6388ff0ebdd54610010d72a5398842a3c668bbf
prerequisite-change-id: 20251203-xarray-entry-send-00230f0744e6:v4
prerequisite-patch-id: 5d797523ed1bb94597570b6faa4cacea8d94b4f7
prerequisite-patch-id: f82bffce83d85ad4dd0bc9dab876e31c4500d467
prerequisite-patch-id: bc00e3c0a3694d8d490c782bc24b2a5786350da7
prerequisite-patch-id: 39c26c865ad383b133a742e5998e2b1f54999908
prerequisite-patch-id: 4082a1ae45104c2f3170197e186d83db552f9302
prerequisite-patch-id: de0c55224727e169d151d68a5316f0ae4549e4b8
prerequisite-patch-id: 57c6d2464a380542b5283817666540d2c97b0b61
prerequisite-patch-id: c788013f9319aa91f51f74f92f43cf7f2c04496f
prerequisite-patch-id: 959c962400d8595cc55b4f1b3a5501c2290a7d0e
prerequisite-patch-id: 66ed5c6a31fe2d775b5bc70774e3148fa3d860e5
prerequisite-patch-id: 869aa913843e11b467890ed35a1455458dbf3de4
prerequisite-change-id: 20260206-xarray-lockdep-fix-10f1cc68e5d7:v2
prerequisite-patch-id: e871db17a721fede1b7419b8236229190449885b
prerequisite-change-id: 20260130-page-volatile-io-05ff595507d3:v4
prerequisite-patch-id: 09224764d69c35c18e6fec846d4b7ba33c0e9cac
prerequisite-patch-id: cfd909257db3f5811c94d52ac2fc31cf220560c3
prerequisite-change-id: 20260128-gfp-noio-fbd41e135088:v2
prerequisite-patch-id: 420a09fdd0f2758f4d46228f99f29ff82f2d05f3
prerequisite-change-id: 20260212-impl-flags-inner-c61974b27b18:v2
prerequisite-patch-id: 379fb78c07b554278fae3c42d84d62bcfcfa0d45
prerequisite-change-id: 20260214-pin-slice-init-e8ef96fc07b9:v2
prerequisite-patch-id: cdf4e4b2b8c43bcb54b3ddf13a02e28c0e11e9ce
prerequisite-change-id: 20260215-page-additions-bc36046e9ffd:v2
prerequisite-patch-id: 6c6a7fdd56627293ec3bba61c495f16a0858700c
prerequisite-patch-id: c1958590235ee32d6ddb31ea168105bd9cf248f2
prerequisite-patch-id: c5a4b231dc8adf37e93ebdce308dacbe6a244bf3
prerequisite-patch-id: 541dba7938ba874f8d17fee05a36b1cd9fa2c4d7
prerequisite-patch-id: 3668fd640e4c411bae0c8ea9d986c3fa5d3c9e82
prerequisite-patch-id: da1274864841e267697be9529a50531126c64872
prerequisite-patch-id: c1463b6578e94b56d2bad41f6e614b5286fb1db3
prerequisite-patch-id: a31185fe1abbf553377d6d695c5d206eebc84358
prerequisite-patch-id: 4f392b5736e55a354ec3022644389f89b52fda42
prerequisite-patch-id: b6388ff0ebdd54610010d72a5398842a3c668bbf
prerequisite-patch-id: 1f57b529c53f4a650cbeeb7c1ff81653cb95e7f3
prerequisite-patch-id: 4d71a95c2d1a6a36339a9feda6296c33ec86f258
prerequisite-change-id: 20260215-cpu-helpers-08efb2572487:v2
prerequisite-patch-id: fd7f24bed247075d1946f9f526390772afb45236
prerequisite-patch-id: 7d243f4cd29a08a1eb2ca0e0e976fa82f0760f11
prerequisite-change-id: 20260215-export-do-unlocked-00a6ac9373d4:v2
prerequisite-patch-id: c65f4a3078f1acc1b77ea28b531e54664187dbce
prerequisite-change-id: 20260215-impl-flags-additions-0340ffcba5b9:v2
prerequisite-patch-id: 379fb78c07b554278fae3c42d84d62bcfcfa0d45
prerequisite-patch-id: 04c7db66a06be7a2566a23328d2c485ce24f1bb8
prerequisite-patch-id: 4d78d6d7aae15c51e6a1df2cb393392fb7ea90de
prerequisite-change-id: 20260215-ringbuffer-42455964aaf2:v2
prerequisite-patch-id: 44924a030c52ae111983078f1225510e9dc0c009
prerequisite-change-id: 20260215-configfs-c-default-groups-bdb0a44633a6:v2
prerequisite-patch-id: 03b8e71b79be89a73946f3c1f7248671c28ccd42
prerequisite-change-id: 20260215-unique-arc-as-ptr-32eb209dde1b:v2
prerequisite-patch-id: 20f44fe6cfe6b0e52b614bd64469fbf1df5c1e94
prerequisite-change-id: 20260215-rust-fault-inject-bc62f1083502:v2
prerequisite-patch-id: 03b8e71b79be89a73946f3c1f7248671c28ccd42
prerequisite-patch-id: 8b287be6364945d10e661e0828ad17b023f487e1
prerequisite-change-id: 20260215-hrtimer-active-f183411fe56b:v2
prerequisite-patch-id: e029dd2cb097192e597417e40d7d23bedaa79370
prerequisite-change-id: 20260529-modules-value-ref-e95a7ab94fdb:v2
prerequisite-patch-id: 618f9f3cfea3f8a03db5e73229d77b48f6549ab4
prerequisite-message-id: <20260411130254.3510128-2-wenzhaoliao@ruc.edu.cn>
prerequisite-patch-id: f714b166f93e453dddd01ed17c976b53e6da4957
prerequisite-change-id: 20260608-queue-data-sync-80b66ab312ac:v1
prerequisite-patch-id: ec86c4ec1531441a2c19085bf24ecc06819d7420
prerequisite-change-id: 20260608-update-hw-nodes-arg-940ecec0380a:v1
prerequisite-patch-id: a1e95b0ec36bf18976553fb8a2e17fd1527a6a1a
prerequisite-change-id: 20260608-configfs-fix-offset-6b3117158901:v1
prerequisite-patch-id: e8355bdd4444f8bda2663aa0bdcf3336de126255
prerequisite-change-id: 20260608-numa-node-id-85de708d4e8d:v1
prerequisite-patch-id: 8b82a179a91cd3e0ca8396eff81dae7bf66e5349

Best regards,
--  
Andreas Hindborg <a.hindborg@kernel.org>





* [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 20:44   ` Yuan Tan
  2026-06-09 21:45   ` Yuan Tan
  2026-06-09 19:07 ` [PATCH v2 02/83] rust: block: rename `SECTOR_MASK` to `PAGE_SECTOR_MASK` Andreas Hindborg
                   ` (81 subsequent siblings)
  82 siblings, 2 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Yuan Tan

The `Send` implementation for `GenDisk<T>` was conditioned on `T: Send`.
This constrains the wrong type. `T` is the `Operations` implementation,
which is typically a zero-sized marker type that carries no data, so `T:
Send` says nothing about whether the data a `GenDisk` actually owns can be
moved to another thread.

A `GenDisk<T>` owns the queue data `T::QueueData` (stored as the
`gendisk`'s `queuedata` and dropped when the `GenDisk` is dropped) and an
`Arc<TagSet<T>>`. These are the values transferred when a `GenDisk` is sent
across a thread boundary, so the `Send` bound must constrain exactly them.
Bound `T::QueueData: Send` and `Arc<TagSet<T>>: Send` instead.

Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
Suggested-by: Yuan Tan <ytan089@ucr.edu>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---

Please take the patch from Yuan instead of this one if they send a fixed
version [1].

[1] https://lore.kernel.org/r/8839ddc5ff54bf454d508cde91d27d00fc3e2dd8.1780633578.git.ytan089@ucr.edu
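
For readers who want to see the bound in isolation, here is a minimal
userspace sketch of the idea described in the commit message. It is not
the kernel code: `NullOps`, the `PhantomData` layout of `TagSet`, the
`Box`-based storage of the queue data and the `assert_send` helper are
all assumptions made for illustration. The marker type is deliberately
not `Send`, yet the disk can still move across threads because the bound
constrains only the data the disk owns.

```rust
use std::marker::PhantomData;
use std::sync::Arc;

/// Stand-in for the `Operations` trait; `QueueData` is what a disk owns.
trait Operations {
    type QueueData;
}

/// Stand-in for `TagSet<T>`. `fn() -> T` keeps this type `Send + Sync`
/// regardless of `T` (an assumption of this sketch, not the kernel layout).
struct TagSet<T: Operations> {
    _ops: PhantomData<fn() -> T>,
}

impl<T: Operations> TagSet<T> {
    fn new() -> Self {
        Self { _ops: PhantomData }
    }
}

/// Stand-in for `GenDisk<T>`: owns the queue data through a raw pointer
/// and holds a reference-counted tag set.
struct GenDisk<T: Operations> {
    queue_data: *mut T::QueueData,
    _tag_set: Arc<TagSet<T>>,
}

impl<T: Operations> GenDisk<T> {
    fn new(data: T::QueueData, tag_set: Arc<TagSet<T>>) -> Self {
        Self {
            queue_data: Box::into_raw(Box::new(data)),
            _tag_set: tag_set,
        }
    }

    fn queue_data(&self) -> &T::QueueData {
        // SAFETY: `queue_data` came from `Box::into_raw` in `new` and stays
        // valid until `drop`.
        unsafe { &*self.queue_data }
    }
}

impl<T: Operations> Drop for GenDisk<T> {
    fn drop(&mut self) {
        // SAFETY: `queue_data` came from `Box::into_raw` and is freed only here.
        drop(unsafe { Box::from_raw(self.queue_data) });
    }
}

// SAFETY: the only values that move with a `GenDisk` are the queue data and
// the `Arc<TagSet<T>>`, so those are the types that must be `Send`.
unsafe impl<T> Send for GenDisk<T>
where
    T: Operations,
    T::QueueData: Send,
    Arc<TagSet<T>>: Send,
{
}

/// Hypothetical marker type that is deliberately *not* `Send`.
struct NullOps(PhantomData<*const ()>);

impl Operations for NullOps {
    type QueueData = u64;
}

/// Compile-time check helper: only instantiable for `Send` types.
fn assert_send<V: Send>() {}
```

With these definitions, `assert_send::<GenDisk<NullOps>>()` compiles even
though `NullOps` itself is not `Send`. Under the old `T: Operations + Send`
bound the same call would be rejected, although nothing of type `T` is
ever stored in the disk.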
---
 rust/kernel/block/mq/gen_disk.rs | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 912cb805caf5..b36d24382cc3 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -199,8 +199,14 @@ pub struct GenDisk<T: Operations> {
 }
 
 // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
-// `TagSet` It is safe to send this to other threads as long as T is Send.
-unsafe impl<T: Operations + Send> Send for GenDisk<T> {}
+// `TagSet`. It is safe to send this to other threads as long as these two are `Send`.
+unsafe impl<T> Send for GenDisk<T>
+where
+    T: Operations,
+    T::QueueData: Send,
+    Arc<TagSet<T>>: Send,
+{
+}
 
 impl<T: Operations> Drop for GenDisk<T> {
     fn drop(&mut self) {

-- 
2.51.2





* [PATCH v2 02/83] rust: block: rename `SECTOR_MASK` to `PAGE_SECTOR_MASK`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 03/83] block: rnull: adopt new formatting guidelines Andreas Hindborg
                   ` (80 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

The constant exposes `bindings::SECTOR_MASK`, which masks the sector
index within a page (`PAGE_SIZE / SECTOR_SIZE - 1`), not `SECTOR_SIZE`
itself as the original docstring suggested. The misleading name made
it easy for callers to reach for it when they wanted a byte-level
sector mask.

Rename the Rust constant to `PAGE_SECTOR_MASK` and fix the docstring.
The C binding is unchanged.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
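
To illustrate the distinction made above, here is a small standalone
sketch. The constants are redefined locally and assume 512-byte sectors
and 4 KiB pages; the two helper functions are hypothetical and exist
only for this example.

```rust
/// Sectors are `1 << SECTOR_SHIFT` bytes (512 here).
const SECTOR_SHIFT: u32 = 9;
const SECTOR_SIZE: u32 = 1 << SECTOR_SHIFT;

/// Assumption for this sketch: 4 KiB pages.
const PAGE_SIZE: u32 = 4096;

/// Number of sectors that fit in one page (8 under the assumptions above).
const PAGE_SECTORS: u32 = PAGE_SIZE / SECTOR_SIZE;

/// Masks the index of a sector within its page.
const PAGE_SECTOR_MASK: u32 = PAGE_SECTORS - 1;

/// Index of `sector` within the page that contains it.
fn sector_index_in_page(sector: u64) -> u64 {
    sector & u64::from(PAGE_SECTOR_MASK)
}

/// Byte offset within a sector. This is the "byte-level sector mask" that
/// the old name could be mistaken for; it uses `SECTOR_SIZE - 1` instead.
fn byte_offset_in_sector(byte_offset: u64) -> u64 {
    byte_offset & u64::from(SECTOR_SIZE - 1)
}
```

Under these assumptions `PAGE_SECTOR_MASK` is 7, so sector 10 has index 2
within its page. Applying the same mask to a byte offset gives a
meaningless result, which is the mistake the rename is meant to prevent.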
 rust/kernel/block.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
index 32c8d865afb6..b120e83d9425 100644
--- a/rust/kernel/block.rs
+++ b/rust/kernel/block.rs
@@ -4,8 +4,8 @@
 
 pub mod mq;
 
-/// Bit mask for masking out [`SECTOR_SIZE`].
-pub const SECTOR_MASK: u32 = bindings::SECTOR_MASK;
+/// Bit mask for masking out the sector index in a page.
+pub const PAGE_SECTOR_MASK: u32 = bindings::SECTOR_MASK;
 
 /// Sectors are size `1 << SECTOR_SHIFT`.
 pub const SECTOR_SHIFT: u32 = bindings::SECTOR_SHIFT;

-- 
2.51.2





* [PATCH v2 03/83] block: rnull: adopt new formatting guidelines
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 02/83] rust: block: rename `SECTOR_MASK` to `PAGE_SECTOR_MASK` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 04/83] block: rnull: add module parameters Andreas Hindborg
                   ` (79 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Reformat `use` statements to have one item per line as required by the
updated Rust formatting guidelines. Apply a formatting workaround to
ensure `rustfmt` produces the expected output.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 0ca8715febe8..d58d2c4c5f63 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -10,12 +10,19 @@
         self,
         mq::{
             self,
-            gen_disk::{self, GenDisk},
-            Operations, TagSet,
+            gen_disk::{
+                self,
+                GenDisk, //
+            },
+            Operations,
+            TagSet, //
         },
     },
     prelude::*,
-    sync::{aref::ARef, Arc},
+    sync::{
+        aref::ARef,
+        Arc, //
+    },
 };
 
 module! {

-- 
2.51.2





* [PATCH v2 04/83] block: rnull: add module parameters
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (2 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 03/83] block: rnull: adopt new formatting guidelines Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 05/83] block: rnull: add macros to define configfs attributes Andreas Hindborg
                   ` (78 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Module parameter support for Rust modules was merged a few releases back.
Add module parameter support to the rnull driver.

This allows the user to control the driver either via configfs or module
parameters, just like the C counterpart.

Semantically boolean flags, such as `rotational`, are declared as `bool`
module parameters.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
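
As a rough userspace model of how the parameters map to devices, consider
the sketch below. It is not the driver code: `DeviceSpec`, `IrqMode`,
`build_specs` and the `String` error type are hypothetical names used
only here. The sketch mirrors the `rnullb{}` naming, the GiB to MiB
conversion (`gb * 1024`) and the fallible `irqmode` conversion from the
patch. Unlike the patch, it uses a checked multiplication and converts
`irqmode` once up front.

```rust
use std::convert::TryFrom;

/// Completion modes named in the `irqmode` description: 0-none, 1-softirq.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IrqMode {
    None,
    Soft,
}

impl TryFrom<u8> for IrqMode {
    type Error = String;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(IrqMode::None),
            1 => Ok(IrqMode::Soft),
            other => Err(format!("invalid irqmode {other}")),
        }
    }
}

/// Hypothetical description of one device to create.
#[derive(Debug, PartialEq)]
struct DeviceSpec {
    name: String,
    block_size: u32,
    rotational: bool,
    capacity_mib: u64,
    irq_mode: IrqMode,
}

/// Builds one spec per requested device, or fails on the first bad value.
fn build_specs(
    nr_devices: u64,
    gb: u64,
    bs: u32,
    rotational: bool,
    irqmode: u8,
) -> Result<Vec<DeviceSpec>, String> {
    let irq_mode = IrqMode::try_from(irqmode)?;
    // The parameter is in GiB, the device capacity is tracked in MiB.
    let capacity_mib = gb
        .checked_mul(1024)
        .ok_or_else(|| String::from("capacity overflow"))?;

    let mut specs = Vec::new();
    for i in 0..nr_devices {
        specs.push(DeviceSpec {
            name: format!("rnullb{i}"),
            block_size: bs,
            rotational,
            capacity_mib,
            irq_mode,
        });
    }
    Ok(specs)
}
```

With the default parameter values from the patch (`nr_devices=1`, `gb=4`,
`bs=4096`, `rotational=false`, `irqmode=0`) this yields a single spec
named `rnullb0` with a capacity of 4096 MiB.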
 drivers/block/rnull/rnull.rs | 50 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index d58d2c4c5f63..77ccc6850961 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -18,10 +18,14 @@
             TagSet, //
         },
     },
+    error::Result,
+    new_mutex, pr_info,
     prelude::*,
+    str::CString,
     sync::{
         aref::ARef,
-        Arc, //
+        Arc,
+        Mutex, //
     },
 };
 
@@ -31,20 +35,64 @@
     authors: ["Andreas Hindborg"],
     description: "Rust implementation of the C null block driver",
     license: "GPL v2",
+    params: {
+        gb: u64 {
+            default: 4,
+            description: "Device capacity in GiB",
+        },
+        rotational: bool {
+            default: false,
+            description: "Set the rotational feature for the device.",
+        },
+        bs: u32 {
+            default: 4096,
+            description: "Block size (in bytes)",
+        },
+        nr_devices: u64 {
+            default: 1,
+            description: "Number of devices to register",
+        },
+        irqmode: u8 {
+            default: 0,
+            description:  "IRQ completion handler. 0-none, 1-softirq",
+        },
+    },
 }
 
 #[pin_data]
 struct NullBlkModule {
     #[pin]
     configfs_subsystem: kernel::configfs::Subsystem<configfs::Config>,
+    #[pin]
+    param_disks: Mutex<KVec<GenDisk<NullBlkDevice>>>,
 }
 
 impl kernel::InPlaceModule for NullBlkModule {
     fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
         pr_info!("Rust null_blk loaded\n");
 
+        let mut disks = KVec::new();
+
+        let defer_init = move || -> Result<_, Error> {
+            for i in 0..module_parameters::nr_devices.value() {
+                let name = CString::try_from_fmt(fmt!("rnullb{}", i))?;
+
+                let disk = NullBlkDevice::new(
+                    &name,
+                    module_parameters::bs.value(),
+                    module_parameters::rotational.value(),
+                    module_parameters::gb.value() * 1024,
+                    module_parameters::irqmode.value().try_into()?,
+                )?;
+                disks.push(disk, GFP_KERNEL)?;
+            }
+
+            Ok(disks)
+        };
+
         try_pin_init!(Self {
             configfs_subsystem <- configfs::subsystem(),
+            param_disks <- new_mutex!(defer_init()?),
         })
     }
 }

-- 
2.51.2





* [PATCH v2 05/83] block: rnull: add macros to define configfs attributes
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (3 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 04/83] block: rnull: add module parameters Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 06/83] block: rust: fix generation of bindings to `BLK_STS_.*` Andreas Hindborg
                   ` (77 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Defining configfs attributes in Rust is a bit verbose at the moment. Add
some macros to make the attribute definitions less verbose.

The configfs Rust abstractions should eventually provide procedural macros
for this task. We should revisit this once the configfs Rust abstractions
have more users.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
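
To give a feel for the boilerplate such a macro removes, here is a
userspace sketch of the general technique. It is not the macro added in
`macros.rs`: `simple_field!`, `AttributeOps`, the `Error` enum and
`validate_block_size` are hypothetical stand-ins, and a `std` mutex
replaces the kernel one. The sketch takes the lock once in `store` so
that the power check and the update happen under the same guard.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
enum Error {
    Busy,
    Invalid,
}

struct Inner {
    powered: bool,
    block_size: u32,
    capacity_mib: u64,
}

struct DeviceConfig {
    data: Mutex<Inner>,
}

/// Stand-in for an attribute vtable, indexed by attribute id.
trait AttributeOps<const ID: u64> {
    fn show(&self) -> String;
    fn store(&self, page: &[u8]) -> Result<(), Error>;
}

/// Generates `show`/`store` for one `Copy` field, with an optional check.
macro_rules! simple_field {
    ($container:ty, $id:tt, $field:ident, $ty:ty) => {
        simple_field!($container, $id, $field, $ty, check |_| Ok(()));
    };
    ($container:ty, $id:tt, $field:ident, $ty:ty, check $check:expr) => {
        impl AttributeOps<$id> for $container {
            fn show(&self) -> String {
                format!("{}\n", self.data.lock().unwrap().$field)
            }

            fn store(&self, page: &[u8]) -> Result<(), Error> {
                let text = std::str::from_utf8(page)
                    .map_err(|_| Error::Invalid)?
                    .trim();
                let value = text.parse::<$ty>().map_err(|_| Error::Invalid)?;
                let check: fn($ty) -> Result<(), Error> = $check;
                check(value)?;

                // One guard covers both the power check and the update.
                let mut guard = self.data.lock().unwrap();
                if guard.powered {
                    return Err(Error::Busy);
                }
                guard.$field = value;
                Ok(())
            }
        }
    };
}

/// Hypothetical validator: power of two between 512 and 4096 bytes.
fn validate_block_size(size: u32) -> Result<(), Error> {
    if size.is_power_of_two() && (512..=4096).contains(&size) {
        Ok(())
    } else {
        Err(Error::Invalid)
    }
}

simple_field!(DeviceConfig, 1, block_size, u32, check validate_block_size);
simple_field!(DeviceConfig, 3, capacity_mib, u64);
```

Each invocation replaces one hand-written `impl` block. A caller selects
the attribute through the const generic, for example
`<DeviceConfig as AttributeOps<1>>::store(&cfg, b"512\n")`.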
 drivers/block/rnull/configfs.rs        | 134 +++++++++---------------------
 drivers/block/rnull/configfs/macros.rs | 143 +++++++++++++++++++++++++++++++++
 2 files changed, 179 insertions(+), 98 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index b165347e9413..fd309fc17e66 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -1,18 +1,39 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use super::{NullBlkDevice, THIS_MODULE};
+use super::{
+    NullBlkDevice,
+    THIS_MODULE, //
+};
 use kernel::{
-    block::mq::gen_disk::{GenDisk, GenDiskBuilder},
-    configfs::{self, AttributeOperations},
+    block::mq::gen_disk::{
+        GenDisk,
+        GenDiskBuilder, //
+    },
+    configfs::{
+        self,
+        AttributeOperations, //
+    },
     configfs_attrs,
-    fmt::{self, Write as _},
+    fmt::{
+        self,
+        Write as _, //
+    },
     new_mutex,
     page::PAGE_SIZE,
     prelude::*,
-    str::{kstrtobool_bytes, CString},
-    sync::Mutex,
+    str::{
+        kstrtobool_bytes,
+        CString, //
+    },
+    sync::Mutex, //
+};
+use macros::{
+    configfs_simple_bool_field,
+    configfs_simple_field, //
 };
 
+mod macros;
+
 pub(crate) fn subsystem() -> impl PinInit<kernel::configfs::Subsystem<Config>, Error> {
     let item_type = configfs_attrs! {
         container: configfs::Subsystem<Config>,
@@ -164,99 +185,16 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     }
 }
 
-#[vtable]
-impl configfs::AttributeOperations<1> for DeviceConfig {
-    type Data = DeviceConfig;
-
-    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
-        let mut writer = kernel::str::Formatter::new(page);
-        writer.write_fmt(fmt!("{}\n", this.data.lock().block_size))?;
-        Ok(writer.bytes_written())
-    }
-
-    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
-        if this.data.lock().powered {
-            return Err(EBUSY);
-        }
-
-        let text = core::str::from_utf8(page)?.trim();
-        let value = text.parse::<u32>().map_err(|_| EINVAL)?;
-
-        GenDiskBuilder::validate_block_size(value)?;
-        this.data.lock().block_size = value;
-        Ok(())
-    }
-}
-
-#[vtable]
-impl configfs::AttributeOperations<2> for DeviceConfig {
-    type Data = DeviceConfig;
-
-    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
-        let mut writer = kernel::str::Formatter::new(page);
-
-        if this.data.lock().rotational {
-            writer.write_str("1\n")?;
-        } else {
-            writer.write_str("0\n")?;
-        }
-
-        Ok(writer.bytes_written())
-    }
-
-    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
-        if this.data.lock().powered {
-            return Err(EBUSY);
-        }
-
-        this.data.lock().rotational = kstrtobool_bytes(page)?;
-
-        Ok(())
-    }
-}
-
-#[vtable]
-impl configfs::AttributeOperations<3> for DeviceConfig {
-    type Data = DeviceConfig;
-
-    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
-        let mut writer = kernel::str::Formatter::new(page);
-        writer.write_fmt(fmt!("{}\n", this.data.lock().capacity_mib))?;
-        Ok(writer.bytes_written())
-    }
-
-    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
-        if this.data.lock().powered {
-            return Err(EBUSY);
-        }
-
-        let text = core::str::from_utf8(page)?.trim();
-        let value = text.parse::<u64>().map_err(|_| EINVAL)?;
-
-        this.data.lock().capacity_mib = value;
-        Ok(())
-    }
-}
-
-#[vtable]
-impl configfs::AttributeOperations<4> for DeviceConfig {
-    type Data = DeviceConfig;
-
-    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
-        let mut writer = kernel::str::Formatter::new(page);
-        writer.write_fmt(fmt!("{}\n", this.data.lock().irq_mode))?;
-        Ok(writer.bytes_written())
-    }
+configfs_simple_field!(DeviceConfig, 1, block_size, u32, check GenDiskBuilder::validate_block_size);
+configfs_simple_bool_field!(DeviceConfig, 2, rotational);
+configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
+configfs_simple_field!(DeviceConfig, 4, irq_mode, IRQMode);
 
-    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
-        if this.data.lock().powered {
-            return Err(EBUSY);
-        }
+impl core::str::FromStr for IRQMode {
+    type Err = Error;
 
-        let text = core::str::from_utf8(page)?.trim();
-        let value = text.parse::<u8>().map_err(|_| EINVAL)?;
-
-        this.data.lock().irq_mode = IRQMode::try_from(value)?;
-        Ok(())
+    fn from_str(s: &str) -> Result<Self> {
+        let value: u8 = s.parse().map_err(|_| EINVAL)?;
+        value.try_into()
     }
 }
diff --git a/drivers/block/rnull/configfs/macros.rs b/drivers/block/rnull/configfs/macros.rs
new file mode 100644
index 000000000000..30bb32238457
--- /dev/null
+++ b/drivers/block/rnull/configfs/macros.rs
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use super::{
+    DeviceConfig,
+    DeviceConfigInner, //
+};
+use core::str::FromStr;
+use kernel::{
+    fmt::{
+        self,
+        Write, //
+    },
+    page::PAGE_SIZE,
+    prelude::*,
+};
+
+pub(crate) fn show_field<T: fmt::Display>(value: T, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
+    let mut writer = kernel::str::Formatter::new(page);
+    writer.write_fmt(fmt!("{}\n", value))?;
+    Ok(writer.bytes_written())
+}
+
+// The lock guard is passed to `store_fn` so the powered check and the
+// store happen atomically. Releasing the lock between the two would
+// allow another writer to power the device on in the gap.
+pub(crate) fn store_with_power_check<F>(this: &DeviceConfig, page: &[u8], store_fn: F) -> Result
+where
+    F: FnOnce(&mut DeviceConfigInner, &[u8]) -> Result,
+{
+    let mut guard = this.data.lock();
+    if guard.powered {
+        return Err(EBUSY);
+    }
+    store_fn(&mut guard, page)
+}
+
+pub(crate) fn store_number_with_power_check<F, T>(
+    this: &DeviceConfig,
+    page: &[u8],
+    store_fn: F,
+) -> Result
+where
+    F: FnOnce(&mut DeviceConfigInner, T) -> Result,
+    T: FromStr,
+{
+    let text = core::str::from_utf8(page)?.trim();
+    let value = text.parse::<T>().map_err(|_| EINVAL)?;
+
+    let mut guard = this.data.lock();
+    if guard.powered {
+        return Err(EBUSY);
+    }
+
+    store_fn(&mut guard, value)
+}
+
+macro_rules! configfs_attribute {
+    (
+        $type:ty,
+        $id:literal,
+        show: |$show_this:ident, $show_page:ident| $show_block:expr,
+        store: |$store_this:ident, $store_page:ident| $store_block:expr
+        $(,)?
+    ) => {
+        #[vtable]
+        impl configfs::AttributeOperations<$id> for $type {
+            type Data = $type;
+
+            fn show($show_this: &$type, $show_page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
+                $show_block
+            }
+
+            fn store($store_this: &$type, $store_page: &[u8]) -> Result {
+                $store_block
+            }
+        }
+    };
+}
+pub(crate) use configfs_attribute;
+
+// Specialized macro for simple boolean fields that just store kstrtobool_bytes result.
+macro_rules! configfs_simple_bool_field {
+    ($type:ty, $id:literal, $field:ident) => {
+        crate::configfs::macros::configfs_attribute!($type, $id,
+            show: |this, page| crate::configfs::macros::show_field(this.data.lock().$field, page),
+            store: |this, page|
+              crate::configfs::macros::store_with_power_check(this, page, |data, page| {
+                data.$field = kstrtobool_bytes(page)?;
+                Ok(())
+            })
+        );
+    };
+}
+pub(crate) use configfs_simple_bool_field;
+
+// Specialized macro for simple numeric fields that just parse and assign
+macro_rules! configfs_simple_field {
+    // Simple direct assignment
+    ($type:ty, $id:literal, $field:ident, $field_type:ty) => {
+        crate::configfs::macros::configfs_attribute!($type, $id,
+            show: |this, page| crate::configfs::macros::show_field(this.data.lock().$field, page),
+            store: |this, page| crate::configfs::macros::store_number_with_power_check(
+                this,
+                page,
+                |data, value: $field_type| {
+                    data.$field = value;
+                    Ok(())
+                }
+            )
+        );
+    };
+    // With infallible conversion expression (direct value)
+    ($type:ty, $id:literal, $field:ident, $field_type:ty, into $convert:expr) => {
+        crate::configfs::macros::configfs_attribute!($type, $id,
+            show: |this, page|
+                crate::configfs::macros::show_field(this.data.lock().$field, page),
+            store: |this, page| crate::configfs::macros::store_number_with_power_check(
+                this,
+                page,
+                |data, value: $field_type| {
+                    data.$field = $convert(value);
+                    Ok(())
+                }
+            )
+        );
+    };
+    // With check, no conversion
+    ($type:ty, $id:literal, $field:ident, $field_type:ty, check $check:expr) => {
+        crate::configfs::macros::configfs_attribute!($type, $id,
+            show: |this, page| crate::configfs::macros::show_field(this.data.lock().$field, page),
+            store: |this, page| crate::configfs::macros::store_number_with_power_check(
+                this,
+                page,
+                |data, value: $field_type| {
+                    $check(value)?;
+                    data.$field = value;
+                    Ok(())
+                }
+            )
+        );
+    };
+}
+pub(crate) use configfs_simple_field;

-- 
2.51.2
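
The store helpers in this patch take a closure so that the `powered`
check and the field update run under a single lock acquisition. A minimal
userspace sketch of that pattern follows, assuming `std::sync::Mutex` in
place of the kernel `Mutex`. `Config`, `Inner`, `IrqMode`, `StoreError`
and `store_irq_mode` are hypothetical stand-ins and not names from the
patch; only the shape of `store_number_with_power_check` follows the diff.

```rust
use std::str::FromStr;
use std::sync::Mutex;

/// Hypothetical error type standing in for the kernel's `EBUSY` / `EINVAL`.
#[derive(Debug, PartialEq)]
enum StoreError {
    Busy,
    Invalid,
}

/// Hypothetical mode enum that is parsed from its numeric value.
#[derive(Debug, PartialEq, Clone, Copy)]
enum IrqMode {
    None,
    Soft,
}

impl FromStr for IrqMode {
    type Err = StoreError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Parse the number first, then map it to a variant.
        match s.parse::<u8>().map_err(|_| StoreError::Invalid)? {
            0 => Ok(IrqMode::None),
            1 => Ok(IrqMode::Soft),
            _ => Err(StoreError::Invalid),
        }
    }
}

struct Inner {
    powered: bool,
    irq_mode: IrqMode,
}

struct Config {
    data: Mutex<Inner>,
}

/// Parse the page, then check `powered` and store under one lock acquisition.
fn store_number_with_power_check<T, F>(
    cfg: &Config,
    page: &[u8],
    store_fn: F,
) -> Result<(), StoreError>
where
    T: FromStr,
    F: FnOnce(&mut Inner, T) -> Result<(), StoreError>,
{
    let text = std::str::from_utf8(page)
        .map_err(|_| StoreError::Invalid)?
        .trim();
    let value = text.parse::<T>().map_err(|_| StoreError::Invalid)?;

    let mut guard = cfg.data.lock().unwrap();
    if guard.powered {
        return Err(StoreError::Busy);
    }
    // The guard is still held here, so no other writer can set `powered`
    // between the check above and the store below.
    store_fn(&mut *guard, value)
}

/// Hypothetical attribute store built on the helper above.
fn store_irq_mode(cfg: &Config, page: &[u8]) -> Result<(), StoreError> {
    store_number_with_power_check(cfg, page, |inner, value: IrqMode| {
        inner.irq_mode = value;
        Ok(())
    })
}
```

Parsing happens before the lock is taken, so a malformed write never
holds the mutex; only the check and the assignment are serialized.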





* [PATCH v2 06/83] block: rust: fix generation of bindings to `BLK_STS_.*`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (4 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 05/83] block: rnull: add macros to define configfs attributes Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 07/83] block: rust: change `queue_rq` request type to `Owned` Andreas Hindborg
                   ` (76 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Bindgen generates constants for CPP integer literals as u32. The
`blk_status_t` type is defined as `u8` but the variants of the type are
defined as integer literals via CPP macros. Thus the defined variants of
the type are not of the same type as the type itself.

Prevent bindgen from emitting generated bindings for the `BLK_STS_.*`
defines and instead define the constants manually in `bindings_helper.h`.

Also remove casts that are no longer necessary.
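
The type mismatch can be illustrated outside the kernel. The sketch below
is standalone Rust and does not use the generated bindings;
`UNTYPED_BLK_STS_OK`, `status_before` and `status_after` are hypothetical
names chosen for illustration.

```rust
/// Mirrors the C typedef: `blk_status_t` is a `u8`.
#[allow(non_camel_case_types)]
type blk_status_t = u8;

/// A macro constant emitted as `u32` (hypothetical name, value of `BLK_STS_OK`).
const UNTYPED_BLK_STS_OK: u32 = 0;

/// A constant that is declared with the status type itself, as a
/// `const blk_status_t ...` definition in a helper header would yield.
const BLK_STS_OK: blk_status_t = 0;

fn status_before() -> blk_status_t {
    // A cast is needed because the constant is `u32`.
    UNTYPED_BLK_STS_OK as blk_status_t
}

fn status_after() -> blk_status_t {
    // No cast: the constant already has type `blk_status_t`.
    BLK_STS_OK
}
```

Both functions return the same value; the difference is only that the
second one type-checks without a cast at every use site.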

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/bindgen_parameters            |  6 ++++++
 rust/bindings/bindings_helper.h    | 19 +++++++++++++++++++
 rust/kernel/block/mq/operations.rs | 17 +++++++++++++----
 3 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/rust/bindgen_parameters b/rust/bindgen_parameters
index 6f02d9720ad2..128731e84775 100644
--- a/rust/bindgen_parameters
+++ b/rust/bindgen_parameters
@@ -5,6 +5,12 @@
 --blocklist-type __kernel_s?size_t
 --blocklist-type __kernel_ptrdiff_t
 
+# Bindgen cannot extract values from the `((__force blk_status_t)N)`
+# CPP-macro form used by most of these and emits the few it can extract
+# as `u32`. Block them entirely; the `RUST_CONST_HELPER_BLK_STS_*`
+# definitions in `bindings_helper.h` expose them as `blk_status_t`.
+--blocklist-item BLK_STS_.*
+
 --opaque-type xregs_state
 --opaque-type desc_struct
 --opaque-type arch_lbr_state
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 9da216faad51..b1fb3afee4ca 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -119,6 +119,25 @@ const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
 const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
 const gfp_t RUST_CONST_HELPER___GFP_NOWARN = ___GFP_NOWARN;
 const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_OK = BLK_STS_OK;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_NOTSUPP = BLK_STS_NOTSUPP;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_TIMEOUT = BLK_STS_TIMEOUT;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_NOSPC = BLK_STS_NOSPC;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_TRANSPORT = BLK_STS_TRANSPORT;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_TARGET = BLK_STS_TARGET;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_RESV_CONFLICT = BLK_STS_RESV_CONFLICT;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_MEDIUM = BLK_STS_MEDIUM;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_PROTECTION = BLK_STS_PROTECTION;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_RESOURCE = BLK_STS_RESOURCE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_IOERR = BLK_STS_IOERR;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_DM_REQUEUE = BLK_STS_DM_REQUEUE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_AGAIN = BLK_STS_AGAIN;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_DEV_RESOURCE = BLK_STS_DEV_RESOURCE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_ZONE_OPEN_RESOURCE = BLK_STS_ZONE_OPEN_RESOURCE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_ZONE_ACTIVE_RESOURCE = BLK_STS_ZONE_ACTIVE_RESOURCE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_OFFLINE = BLK_STS_OFFLINE;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_DURATION_LIMIT = BLK_STS_DURATION_LIMIT;
+const blk_status_t RUST_CONST_HELPER_BLK_STS_INVAL = BLK_STS_INVAL;
 const fop_flags_t RUST_CONST_HELPER_FOP_UNSIGNED_OFFSET = FOP_UNSIGNED_OFFSET;
 
 const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 89029f468f44..6b2fcd76372e 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -6,10 +6,19 @@
 
 use crate::{
     bindings,
-    block::mq::{request::RequestDataWrapper, Request},
-    error::{from_result, Result},
+    block::mq::{
+        request::RequestDataWrapper,
+        Request, //
+    },
+    error::{
+        from_result,
+        Result, //
+    },
     prelude::*,
-    sync::{aref::ARef, Refcount},
+    sync::{
+        aref::ARef,
+        Refcount, //
+    },
     types::ForeignOwnable,
 };
 use core::marker::PhantomData;
@@ -124,7 +133,7 @@ impl<T: Operations> OperationsVTable<T> {
         if let Err(e) = ret {
             e.to_blk_status()
         } else {
-            bindings::BLK_STS_OK as bindings::blk_status_t
+            bindings::BLK_STS_OK
         }
     }
 

-- 
2.51.2





* [PATCH v2 07/83] block: rust: change `queue_rq` request type to `Owned`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (5 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 06/83] block: rust: fix generation of bindings to `BLK_STS_.*` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 08/83] block: rust: add `Request` private data support Andreas Hindborg
                   ` (75 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Simplify the reference counting scheme for `Request` from 4 states to 3
states. This is achieved by letting a refcount of zero cover both the
"owned by the block layer" state and the "uniquely owned by the driver"
state.

Implement `Ownable` for `Request` and deliver `Request` to drivers as
`Owned<Request>`. In this process:

 - Move uniqueness assertions out of `rnull` as these are now guaranteed by
   the `Owned` type.
 - Move `start_unchecked` and `end_ok` from `Request` to `Owned<Request>`,
   relying on the type invariant for uniqueness. `try_set_end` is replaced
   by `OwnableRefCounted::try_from_shared`.
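
The three states can be modelled in userspace with a plain atomic. This
is a sketch under assumptions, not the kernel code: `RequestState` and
its methods are hypothetical, `std::sync::atomic` stands in for the
kernel atomics, and only the transitions and orderings follow the diff.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the refcount kept in the request private data.
struct RequestState {
    refcount: AtomicU32,
}

impl RequestState {
    /// A request as handed to `queue_rq`: uniquely owned, refcount 0.
    fn new_owned() -> Self {
        Self { refcount: AtomicU32::new(0) }
    }

    /// The unique owner goes away without ending the request: 0 -> 1.
    fn release(&self) {
        let old = self.refcount.fetch_add(1, Ordering::Release);
        debug_assert_eq!(old, 0);
    }

    /// Unique to shared: 0 -> 2.
    fn into_shared(&self) {
        let old = self.refcount.fetch_add(2, Ordering::Release);
        debug_assert_eq!(old, 0);
    }

    /// Another shared reference is created.
    fn inc_ref(&self) {
        let old = self.refcount.fetch_add(1, Ordering::Acquire);
        debug_assert!(old >= 1);
    }

    /// A shared reference is dropped. The count never reaches 0 this way.
    fn dec_ref(&self) {
        let old = self.refcount.fetch_sub(1, Ordering::Release);
        debug_assert!(old > 1);
    }

    /// Shared to unique: succeeds only when exactly one shared reference
    /// exists, that is when the count is 2. On success the count is 0.
    fn try_from_shared(&self) -> bool {
        self.refcount
            .compare_exchange(2, 0, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    fn count(&self) -> u32 {
        self.refcount.load(Ordering::Acquire)
    }
}
```

The compare-and-exchange from 2 to 0 is what makes the conversion back to
unique ownership safe against a concurrent lookup that would hand out
another shared reference.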

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       |  26 ++---
 rust/kernel/block/mq.rs            |  10 +-
 rust/kernel/block/mq/operations.rs |  32 +++--
 rust/kernel/block/mq/request.rs    | 231 ++++++++++++++++++++++---------------
 4 files changed, 176 insertions(+), 123 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 77ccc6850961..69cf62475446 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -19,7 +19,8 @@
         },
     },
     error::Result,
-    new_mutex, pr_info,
+    new_mutex,
+    pr_info,
     prelude::*,
     str::CString,
     sync::{
@@ -27,6 +28,10 @@
         Arc,
         Mutex, //
     },
+    types::{
+        OwnableRefCounted,
+        Owned, //
+    }, //
 };
 
 module! {
@@ -129,15 +134,10 @@ impl Operations for NullBlkDevice {
     type QueueData = KBox<QueueData>;
 
     #[inline(always)]
-    fn queue_rq(queue_data: &QueueData, rq: ARef<mq::Request<Self>>, _is_last: bool) -> Result {
+    fn queue_rq(queue_data: &QueueData, rq: Owned<mq::Request<Self>>, _is_last: bool) -> Result {
         match queue_data.irq_mode {
-            IRQMode::None => mq::Request::end_ok(rq)
-                .map_err(|_e| kernel::error::code::EIO)
-                // We take no refcounts on the request, so we expect to be able to
-                // end the request. The request reference must be unique at this
-                // point, and so `end_ok` cannot fail.
-                .expect("Fatal error - expected to be able to end request"),
-            IRQMode::Soft => mq::Request::complete(rq),
+            IRQMode::None => rq.end_ok(),
+            IRQMode::Soft => mq::Request::complete(rq.into()),
         }
         Ok(())
     }
@@ -145,11 +145,9 @@ fn queue_rq(queue_data: &QueueData, rq: ARef<mq::Request<Self>>, _is_last: bool)
     fn commit_rqs(_queue_data: &QueueData) {}
 
     fn complete(rq: ARef<mq::Request<Self>>) {
-        mq::Request::end_ok(rq)
+        OwnableRefCounted::try_from_shared(rq)
             .map_err(|_e| kernel::error::code::EIO)
-            // We take no refcounts on the request, so we expect to be able to
-            // end the request. The request reference must be unique at this
-            // point, and so `end_ok` cannot fail.
-            .expect("Fatal error - expected to be able to end request");
+            .expect("Failed to complete request")
+            .end_ok();
     }
 }
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 1fd0d54dd549..b8ecd69abe98 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -62,6 +62,7 @@
 //!     new_mutex,
 //!     prelude::*,
 //!     sync::{aref::ARef, Arc, Mutex},
+//!     types::{ForeignOwnable, OwnableRefCounted, Owned},
 //! };
 //!
 //! struct MyBlkDevice;
@@ -70,17 +71,18 @@
 //! impl Operations for MyBlkDevice {
 //!     type QueueData = ();
 //!
-//!     fn queue_rq(_queue_data: (), rq: ARef<Request<Self>>, _is_last: bool) -> Result {
-//!         Request::end_ok(rq);
+//!     fn queue_rq(_queue_data: (), rq: Owned<Request<Self>>, _is_last: bool) -> Result {
+//!         rq.end_ok();
 //!         Ok(())
 //!     }
 //!
 //!     fn commit_rqs(_queue_data: ()) {}
 //!
 //!     fn complete(rq: ARef<Request<Self>>) {
-//!         Request::end_ok(rq)
+//!         OwnableRefCounted::try_from_shared(rq)
 //!             .map_err(|_e| kernel::error::code::EIO)
-//!             .expect("Fatal error - expected to be able to end request");
+//!             .expect("Fatal error - expected to be able to end request")
+//!             .end_ok();
 //!     }
 //! }
 //!
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 6b2fcd76372e..bb23a32f3983 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -17,11 +17,18 @@
     prelude::*,
     sync::{
         aref::ARef,
+        atomic::ordering,
         Refcount, //
     },
-    types::ForeignOwnable,
+    types::{
+        ForeignOwnable,
+        Owned, //
+    },
+};
+use core::{
+    marker::PhantomData,
+    ptr::NonNull, //
 };
-use core::marker::PhantomData;
 
 type ForeignBorrowed<'a, T> = <T as ForeignOwnable>::Borrowed<'a>;
 
@@ -45,7 +52,7 @@ pub trait Operations: Sized {
     /// `false`, the driver is allowed to defer committing the request.
     fn queue_rq(
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
-        rq: ARef<Request<Self>>,
+        rq: Owned<Request<Self>>,
         is_last: bool,
     ) -> Result;
 
@@ -99,16 +106,23 @@ impl<T: Operations> OperationsVTable<T> {
         // this function.
         let request = unsafe { &*(*bd).rq.cast::<Request<T>>() };
 
-        // One refcount for the ARef, one for being in flight
-        request.wrapper_ref().refcount().set(2);
+        debug_assert!(
+            request
+                .wrapper_ref()
+                .refcount()
+                .as_atomic()
+                .load(ordering::Acquire)
+                == 0
+        );
 
         // SAFETY:
-        //  - We own a refcount that we took above. We pass that to `ARef`.
+        //  - By API contract, we own the request.
         //  - By the safety requirements of this function, `request` is a valid
         //    `struct request` and the private data is properly initialized.
         //  - `rq` will be alive until `blk_mq_end_request` is called and is
-        //    reference counted by `ARef` until then.
-        let rq = unsafe { Request::aref_from_raw((*bd).rq) };
+        //    reference counted until then.
+        let mut rq =
+            unsafe { Owned::from_raw(NonNull::<Request<T>>::new_unchecked((*bd).rq.cast())) };
 
         // SAFETY: `hctx` is valid as required by this function.
         let queue_data = unsafe { (*(*hctx).queue).queuedata };
@@ -120,7 +134,7 @@ impl<T: Operations> OperationsVTable<T> {
         let queue_data = unsafe { T::QueueData::borrow(queue_data) };
 
         // SAFETY: We have exclusive access and we just set the refcount above.
-        unsafe { Request::start_unchecked(&rq) };
+        unsafe { rq.start_unchecked() };
 
         let ret = T::queue_rq(
             queue_data,
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index cf013b9e2cac..7444de3c8522 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -7,39 +7,45 @@
 use crate::{
     bindings,
     block::mq::Operations,
-    error::Result,
     sync::{
-        aref::{ARef, AlwaysRefCounted, RefCounted},
-        atomic::Relaxed,
+        aref::{
+            ARef,
+            RefCounted, //
+        },
+        atomic::ordering,
         Refcount,
     },
-    types::Opaque,
+    types::{
+        Opaque,
+        Ownable,
+        OwnableRefCounted,
+        Owned, //
+    },
+};
+use core::{
+    marker::PhantomData,
+    ptr::NonNull, //
 };
-use core::{marker::PhantomData, ptr::NonNull};
 
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.
 ///
 /// # Implementation details
 ///
-/// There are four states for a request that the Rust bindings care about:
-///
-/// 1. Request is owned by block layer (refcount 0).
-/// 2. Request is owned by driver but with zero [`ARef`]s in existence
-///    (refcount 1).
-/// 3. Request is owned by driver with exactly one [`ARef`] in existence
-///    (refcount 2).
-/// 4. Request is owned by driver with more than one [`ARef`] in existence
-///    (refcount > 2).
+/// There are three states for a request that the Rust bindings care about:
 ///
+/// - 0: The request is owned by C block layer or is uniquely referenced (by [`Owned<_>`]).
+/// - 1: The request is owned by Rust abstractions but is not referenced.
+/// - 2+: There is one or more [`ARef`] instances referencing the request.
 ///
-/// We need to track 1 and 2 to ensure we fail tag to request conversions for
-/// requests that are not owned by the driver.
+/// We need to tell refcount 0 from 1 so that `tag_to_rq` does not issue an
+/// [`ARef`] to a request that is not owned by the driver, or to a request
+/// that is referenced by an [`Owned`].
 ///
-/// We need to track 3 and 4 to ensure that it is safe to end the request and hand
-/// back ownership to the block layer.
+/// We need to tell refcount 2 from larger values to know when it is safe to
+/// convert an [`ARef`] to an [`Owned`].
 ///
 /// Note that the driver can still obtain new `ARef` even if there is no `ARef`s in existence by
-/// using `tag_to_rq`, hence the need to distinguish B and C.
+/// using `tag_to_rq`, hence the need to distinguish refcount 1 from 2.
 ///
 /// The states are tracked through the private `refcount` field of
 /// `RequestDataWrapper`. This structure lives in the private data area of the C
@@ -66,6 +72,7 @@ impl<T: Operations> Request<T> {
     ///
     /// * The caller must own a refcount on `ptr` that is transferred to the
     ///   returned [`ARef`].
+    /// * The refcount must be >= 2.
     /// * The type invariants for [`Request`] must hold for the pointee of `ptr`.
     ///
     /// [`struct request`]: srctree/include/linux/blk-mq.h
@@ -76,72 +83,6 @@ pub(crate) unsafe fn aref_from_raw(ptr: *mut bindings::request) -> ARef<Self> {
         unsafe { ARef::from_raw(NonNull::new_unchecked(ptr.cast())) }
     }
 
-    /// Notify the block layer that a request is going to be processed now.
-    ///
-    /// The block layer uses this hook to do proper initializations such as
-    /// starting the timeout timer. It is a requirement that block device
-    /// drivers call this function when starting to process a request.
-    ///
-    /// # Safety
-    ///
-    /// The caller must have exclusive ownership of `self`, that is
-    /// `self.wrapper_ref().refcount() == 2`.
-    pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) {
-        // SAFETY: By type invariant, `self.0` is a valid `struct request` and
-        // we have exclusive access.
-        unsafe { bindings::blk_mq_start_request(this.0.get()) };
-    }
-
-    /// Try to take exclusive ownership of `this` by dropping the refcount to 0.
-    /// This fails if `this` is not the only [`ARef`] pointing to the underlying
-    /// [`Request`].
-    ///
-    /// If the operation is successful, [`Ok`] is returned with a pointer to the
-    /// C [`struct request`]. If the operation fails, `this` is returned in the
-    /// [`Err`] variant.
-    ///
-    /// [`struct request`]: srctree/include/linux/blk-mq.h
-    fn try_set_end(this: ARef<Self>) -> Result<*mut bindings::request, ARef<Self>> {
-        // To hand back the ownership, we need the current refcount to be 2.
-        // Since we can race with `TagSet::tag_to_rq`, this needs to atomically reduce
-        // refcount to 0. `Refcount` does not provide a way to do this, so use the underlying
-        // atomics directly.
-        if let Err(_old) = this
-            .wrapper_ref()
-            .refcount()
-            .as_atomic()
-            .cmpxchg(2, 0, Relaxed)
-        {
-            return Err(this);
-        }
-
-        let request_ptr = this.0.get();
-        core::mem::forget(this);
-
-        Ok(request_ptr)
-    }
-
-    /// Notify the block layer that the request has been completed without errors.
-    ///
-    /// This function will return [`Err`] if `this` is not the only [`ARef`]
-    /// referencing the request.
-    pub fn end_ok(this: ARef<Self>) -> Result<(), ARef<Self>> {
-        let request_ptr = Self::try_set_end(this)?;
-
-        // SAFETY: By type invariant, `this.0` was a valid `struct request`. The
-        // success of the call to `try_set_end` guarantees that there are no
-        // `ARef`s pointing to this request. Therefore it is safe to hand it
-        // back to the block layer.
-        unsafe {
-            bindings::blk_mq_end_request(
-                request_ptr,
-                bindings::BLK_STS_OK as bindings::blk_status_t,
-            )
-        };
-
-        Ok(())
-    }
-
     /// Complete the request by scheduling `Operations::complete` for
     /// execution.
     ///
@@ -234,27 +175,125 @@ unsafe impl<T: Operations> Sync for Request<T> {}
 // matching reference count decrement is executed.
 unsafe impl<T: Operations> RefCounted for Request<T> {
     fn inc_ref(&self) {
-        self.wrapper_ref().refcount().inc();
+        let refcount = &self.wrapper_ref().refcount().as_atomic();
+
+        // Load acquire, store relaxed. We sync with store release of
+        // `OwnableRefCounted::into_shared`. After that all unique references are dead and we have
+        // shared access. We can use relaxed ordering for the store.
+        #[cfg_attr(not(debug_assertions), allow(unused_variables))]
+        let old = refcount.fetch_add(1, ordering::Acquire);
+
+        debug_assert!(old >= 1, "Request refcount zero clone");
     }
 
     unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
-        // SAFETY: The type invariants of `ARef` guarantee that `obj` is valid
+        // SAFETY: The type invariants of `RefCounted` guarantee that `obj` is valid
         // for read.
         let wrapper_ptr = unsafe { Self::wrapper_ptr(obj.as_ptr()).as_ptr() };
         // SAFETY: The type invariant of `Request` guarantees that the private
         // data area is initialized and valid.
         let refcount = unsafe { &*RequestDataWrapper::refcount_ptr(wrapper_ptr) };
 
-        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
-        let is_zero = refcount.dec_and_test();
+        // Store release to sync with load acquire in
+        // `OwnableRefCounted::try_from_shared`.
+        #[cfg_attr(not(debug_assertions), allow(unused_variables))]
+        let old = refcount.as_atomic().fetch_sub(1, ordering::Release);
 
-        #[cfg(CONFIG_DEBUG_MISC)]
-        if is_zero {
-            panic!("Request reached refcount zero in Rust abstractions");
-        }
+        debug_assert!(
+            old > 1,
+            "Request reached refcount zero in Rust abstractions"
+        );
+    }
+}
+
+impl<T: Operations> Owned<Request<T>> {
+    /// Notify the block layer that a request is going to be processed now.
+    ///
+    /// The block layer uses this hook to do proper initializations such as
+    /// starting the timeout timer. It is a requirement that block device
+    /// drivers call this function when starting to process a request.
+    ///
+    /// # Safety
+    ///
+    /// The caller must have exclusive ownership of `self`, that is
+    /// `self.wrapper_ref().refcount() == 0`.
+    ///
+    /// This can only be called once in the request life cycle.
+    pub(crate) unsafe fn start_unchecked(&mut self) {
+        // SAFETY: By type invariant, `self.0` is a valid `struct request` and
+        // we have exclusive access.
+        unsafe { bindings::blk_mq_start_request(self.0.get()) };
+    }
+
+    /// Notify the block layer that the request has been completed without errors.
+    pub fn end_ok(self) {
+        let request_ptr = self.0.get().cast();
+        core::mem::forget(self);
+        // SAFETY: By type invariant, `self.0` was a valid `struct request`. The
+        // existence of `self` guarantees that there are no `ARef`s pointing to
+        // this request. Therefore it is safe to hand it back to the block
+        // layer.
+        unsafe { bindings::blk_mq_end_request(request_ptr, bindings::BLK_STS_OK) };
     }
 }
 
-// SAFETY: We currently do not implement `Ownable`, thus it is okay to obtain an `ARef<Request>`
-// from a `&Request` (but this will change in the future).
-unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {}
+impl<T: Operations> Ownable for Request<T> {
+    // The `release` implementation frees the underlying request according to the reference
+    // counting scheme for `Request`.
+    unsafe fn release(&mut self) {
+        // SAFETY: The safety requirements of this function guarantee that `self`
+        // is valid for read.
+        let wrapper_ptr = unsafe { Self::wrapper_ptr(self).as_ptr() };
+        // SAFETY: The type invariant of `Request` guarantees that the private
+        // data area is initialized and valid.
+        let refcount = unsafe { &*RequestDataWrapper::refcount_ptr(wrapper_ptr) };
+
+        // Store release to sync with load acquire when converting back to owned.
+        #[cfg_attr(not(debug_assertions), allow(unused_variables))]
+        let old = refcount.as_atomic().fetch_add(1, ordering::Release);
+
+        debug_assert!(
+            old == 0,
+            "Invalid refcount when releasing `Owned<Request<T>>`"
+        );
+    }
+}
+
+impl<T: Operations> OwnableRefCounted for Request<T> {
+    fn try_from_shared(this: ARef<Self>) -> core::result::Result<Owned<Self>, ARef<Self>> {
+        // Load acquire to sync with decrement store release to make sure all
+        // shared access has ended.
+        let updated = this
+            .wrapper_ref()
+            .refcount()
+            .as_atomic()
+            .cmpxchg(2, 0, ordering::Acquire);
+
+        match updated {
+            Ok(_) => Ok(
+                // SAFETY: We achieved unique ownership above.
+                unsafe { Owned::from_raw(ARef::into_raw(this)) },
+            ),
+            Err(_) => Err(this),
+        }
+    }
+
+    fn into_shared(this: Owned<Self>) -> ARef<Self> {
+        // Store release to sync with future increments using load acquire to
+        // make sure exclusive access has ended before shared access starts.
+        #[cfg_attr(not(debug_assertions), allow(unused_variables))]
+        let old = this
+            .wrapper_ref()
+            .refcount()
+            .as_atomic()
+            .fetch_add(2, ordering::Release);
+
+        debug_assert!(
+            old == 0,
+            "Invalid refcount when upgrading `Owned<Request<T>>`"
+        );
+
+        // SAFETY: We incremented the refcount above.
+        unsafe { ARef::from_raw(Owned::into_raw(this)) }
+    }
+}

-- 
2.51.2





* [PATCH v2 08/83] block: rust: add `Request` private data support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (6 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 07/83] block: rust: change `queue_rq` request type to `Owned` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 09/83] block: rust: document the lifetime of `Request` Andreas Hindborg
                   ` (74 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Andreas Hindborg

From: Andreas Hindborg <a.hindborg@samsung.com>

C block device drivers can attach private data to a `struct request`. This
data is stored next to the request structure and is part of the request
allocation set up during driver initialization.

Expose this private request data area to Rust block device drivers.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
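Note for readers: the layout described above can be modelled in userspace
roughly as follows. This is only a sketch under stated assumptions. The
names `Wrapper`, `cmd_size`, `init_in_place`, `Driver` and `demo` are
hypothetical, and `new_request_data` returns a plain value here instead of
the `impl PinInit<Self::RequestData>` used by the real trait.

```rust
use std::convert::TryFrom;
use std::mem::{size_of, MaybeUninit};
use std::ptr::addr_of_mut;
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical userspace stand-in for the kernel `Operations` trait.
trait Operations {
    type RequestData: Send + Sync;
    /// Simplification: returns a value rather than a pin-initializer.
    fn new_request_data() -> Self::RequestData;
}

/// Models `RequestDataWrapper<T>`: abstraction state first, driver data after.
#[repr(C)]
struct Wrapper<T: Operations> {
    refcount: AtomicU32,
    data: T::RequestData,
}

/// Models the tag set deriving `cmd_size`; `None` if it does not fit in `u32`.
fn cmd_size<T: Operations>() -> Option<u32> {
    u32::try_from(size_of::<Wrapper<T>>()).ok()
}

/// Models `init_request`: each field is written in place through a raw pointer.
///
/// Safety: `pdu` must be valid for writes of `Wrapper<T>` and properly aligned.
unsafe fn init_in_place<T: Operations>(pdu: *mut Wrapper<T>) {
    unsafe {
        addr_of_mut!((*pdu).refcount).write(AtomicU32::new(0));
        addr_of_mut!((*pdu).data).write(T::new_request_data());
    }
}

/// Hypothetical driver whose per-request data is a single `u64`.
struct Driver;

impl Operations for Driver {
    type RequestData = u64;
    fn new_request_data() -> u64 {
        7
    }
}

/// Returns (`cmd_size`, initial refcount, initial driver data).
fn demo() -> (u32, u32, u64) {
    let mut slot = MaybeUninit::<Wrapper<Driver>>::uninit();
    // SAFETY: `slot` is valid for writes and aligned for `Wrapper<Driver>`.
    unsafe { init_in_place(slot.as_mut_ptr()) };
    // SAFETY: Both fields were initialised above.
    let wrapper = unsafe { slot.assume_init() };
    let size = cmd_size::<Driver>().expect("wrapper size fits in u32");
    (size, wrapper.refcount.load(Ordering::Relaxed), wrapper.data)
}
```

The point of the sketch is that the driver data is a field of the same
allocation as the refcount, so the block layer only needs to be told one
size for the whole private area.
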
 drivers/block/rnull/rnull.rs       |  5 +++++
 rust/kernel/block/mq.rs            |  6 ++++++
 rust/kernel/block/mq/operations.rs | 26 +++++++++++++++++++++++++-
 rust/kernel/block/mq/request.rs    | 24 +++++++++++++++++++-----
 rust/kernel/block/mq/tag_set.rs    | 24 +++++++++++++++++++-----
 5 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 69cf62475446..dd7a30519870 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -132,6 +132,11 @@ struct QueueData {
 #[vtable]
 impl Operations for NullBlkDevice {
     type QueueData = KBox<QueueData>;
+    type RequestData = ();
+
+    fn new_request_data() -> impl PinInit<Self::RequestData> {
+        Ok(())
+    }
 
     #[inline(always)]
     fn queue_rq(queue_data: &QueueData, rq: Owned<mq::Request<Self>>, _is_last: bool) -> Result {
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index b8ecd69abe98..7718b106eb49 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -69,8 +69,14 @@
 //!
 //! #[vtable]
 //! impl Operations for MyBlkDevice {
+//!     type RequestData = ();
 //!     type QueueData = ();
 //!
+//!     fn new_request_data(
+//!     ) -> impl PinInit<()> {
+//!         Ok(())
+//!     }
+//!
 //!     fn queue_rq(_queue_data: (), rq: Owned<Request<Self>>, _is_last: bool) -> Result {
 //!         rq.end_ok();
 //!         Ok(())
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index bb23a32f3983..c49ca2e8bbb2 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -29,6 +29,7 @@
     marker::PhantomData,
     ptr::NonNull, //
 };
+use pin_init::PinInit;
 
 type ForeignBorrowed<'a, T> = <T as ForeignOwnable>::Borrowed<'a>;
 
@@ -44,10 +45,27 @@
 /// [module level documentation]: kernel::block::mq
 #[macros::vtable]
 pub trait Operations: Sized {
+    /// Data associated with a request. This data is located next to the request
+    /// structure.
+    ///
+    /// To be able to handle accessing this data from interrupt context, this
+    /// data must be `Sync`.
+    ///
+    /// Requests may be cleaned up by a thread different from the allocating thread, so
+    /// `RequestData` must be `Send`.
+    ///
+    /// The `RequestData` object is initialized when the requests are allocated
+    /// during queue initialization, and it is dropped when the requests are
+    /// dropped during queue teardown.
+    type RequestData: Sized + Sync + Send;
+
     /// Data associated with the `struct request_queue` that is allocated for
     /// the `GenDisk` associated with this `Operations` implementation.
     type QueueData: ForeignOwnable + Sync;
 
+    /// Called by the kernel to get an initializer for a `Pin<&mut RequestData>`.
+    fn new_request_data() -> impl PinInit<Self::RequestData>;
+
     /// Called by the kernel to queue a request with the driver. If `is_last` is
     /// `false`, the driver is allowed to defer committing the request.
     fn queue_rq(
@@ -252,6 +270,12 @@ impl<T: Operations> OperationsVTable<T> {
             // it is valid for writes.
             unsafe { RequestDataWrapper::refcount_ptr(pdu.as_ptr()).write(Refcount::new(0)) };
 
+            let initializer = T::new_request_data();
+
+            // SAFETY: `pdu` is a valid pointer as established above. We do not touch `pdu` if
+            // `__pinned_init` returns an error. We promise not to move the pointee of `pdu`.
+            unsafe { initializer.__pinned_init(RequestDataWrapper::data_ptr(pdu.as_ptr()))? };
+
             Ok(0)
         })
     }
@@ -271,7 +295,7 @@ impl<T: Operations> OperationsVTable<T> {
     ) {
         // SAFETY: The tagset invariants guarantee that all requests are allocated with extra memory
         // for the request data.
-        let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper>();
+        let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper<T>>();
 
         // SAFETY: `pdu` is valid for read and write and is properly initialised.
         unsafe { core::ptr::drop_in_place(pdu) };
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 7444de3c8522..1882d697dcf3 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -107,12 +107,12 @@ pub fn complete(this: ARef<Self>) {
     ///
     /// - `this` must point to a valid allocation of size at least size of
     ///   [`Self`] plus size of [`RequestDataWrapper`].
-    pub(crate) unsafe fn wrapper_ptr(this: *mut Self) -> NonNull<RequestDataWrapper> {
+    pub(crate) unsafe fn wrapper_ptr(this: *mut Self) -> NonNull<RequestDataWrapper<T>> {
         let request_ptr = this.cast::<bindings::request>();
         // SAFETY: By safety requirements for this function, `this` is a
         // valid allocation.
         let wrapper_ptr =
-            unsafe { bindings::blk_mq_rq_to_pdu(request_ptr).cast::<RequestDataWrapper>() };
+            unsafe { bindings::blk_mq_rq_to_pdu(request_ptr).cast::<RequestDataWrapper<T>>() };
         // SAFETY: By C API contract, `wrapper_ptr` points to a valid allocation
         // and is not null.
         unsafe { NonNull::new_unchecked(wrapper_ptr) }
@@ -120,7 +120,7 @@ pub(crate) unsafe fn wrapper_ptr(this: *mut Self) -> NonNull<RequestDataWrapper>
 
     /// Return a reference to the [`RequestDataWrapper`] stored in the private
     /// area of the request structure.
-    pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper {
+    pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper<T> {
         // SAFETY: By type invariant, `self.0` is a valid allocation. Further,
         // the private data associated with this request is initialized and
         // valid. The existence of `&self` guarantees that the private data is
@@ -132,16 +132,19 @@ pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper {
 /// A wrapper around data stored in the private area of the C [`struct request`].
 ///
 /// [`struct request`]: srctree/include/linux/blk-mq.h
-pub(crate) struct RequestDataWrapper {
+pub(crate) struct RequestDataWrapper<T: Operations> {
     /// The Rust request refcount has the following states:
     ///
     /// - 0: The request is owned by C block layer.
     /// - 1: The request is owned by Rust abstractions but there are no [`ARef`] references to it.
     /// - 2+: There are [`ARef`] references to the request.
     refcount: Refcount,
+
+    /// Driver-managed request data.
+    data: T::RequestData,
 }
 
-impl RequestDataWrapper {
+impl<T: Operations> RequestDataWrapper<T> {
     /// Return a reference to the refcount of the request that is embedding
     /// `self`.
     pub(crate) fn refcount(&self) -> &Refcount {
@@ -159,6 +162,17 @@ pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut Refcount {
         // field projection is safe.
         unsafe { &raw mut (*this).refcount }
     }
+
+    /// Return a pointer to the `data` field of the `Self` pointed to by `this`.
+    ///
+    /// # Safety
+    ///
+    /// - `this` must point to a live allocation of at least the size of `Self`.
+    pub(crate) unsafe fn data_ptr(this: *mut Self) -> *mut T::RequestData {
+        // SAFETY: Because of the safety requirements of this function, the
+        // field projection is safe.
+        unsafe { &raw mut (*this).data }
+    }
 }
 
 // SAFETY: Exclusive access is thread-safe for `Request`. `Request` has no `&mut
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index dae9df408a86..ec5cac48b83f 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -8,13 +8,27 @@
 
 use crate::{
     bindings,
-    block::mq::{operations::OperationsVTable, request::RequestDataWrapper, Operations},
-    error::{self, Result},
+    block::mq::{
+        operations::OperationsVTable,
+        request::RequestDataWrapper,
+        Operations, //
+    },
+    error::{
+        self,
+        Result, //
+    },
     prelude::try_pin_init,
     types::Opaque,
 };
-use core::{convert::TryInto, marker::PhantomData};
-use pin_init::{pin_data, pinned_drop, PinInit};
+use core::{
+    convert::TryInto,
+    marker::PhantomData, //
+};
+use pin_init::{
+    pin_data,
+    pinned_drop,
+    PinInit, //
+};
 
 /// A wrapper for the C `struct blk_mq_tag_set`.
 ///
@@ -39,7 +53,7 @@ pub fn new(
         num_maps: u32,
     ) -> impl PinInit<Self, error::Error> {
         let tag_set: bindings::blk_mq_tag_set = pin_init::zeroed();
-        let tag_set: Result<_> = core::mem::size_of::<RequestDataWrapper>()
+        let tag_set: Result<_> = size_of::<RequestDataWrapper<T>>()
             .try_into()
             .map(|cmd_size| {
                 bindings::blk_mq_tag_set {

-- 
2.51.2





* [PATCH v2 09/83] block: rust: document the lifetime of `Request`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (7 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 08/83] block: rust: add `Request` private data support Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 10/83] block: rust: allow `hrtimer::Timer` in `RequestData` Andreas Hindborg
                   ` (73 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

The `struct request` objects backing a `Request` are not allocated and
freed for each IO. Instead, a fixed pool of requests is allocated when
the tag set is initialized, and each request is reused to service many
distinct IO operations over the lifetime of the request queue. It is
easy to assume from the existing documentation that a request, and in
particular its private data, is fresh for each IO.

Add a `Lifetime` section to the `Request` documentation describing this
reuse and its consequence for the lifetime of the request private data.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
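Note for readers: the consequence of request reuse can be illustrated with
a small userspace model. This is a sketch only. `Pool`, `RequestData`,
`service_io` and the field names are hypothetical and do not correspond to
kernel types. It assumes a driver that keeps one persistent counter and one
per-IO scratch value in its request data.

```rust
/// Hypothetical per-request driver data; models `Operations::RequestData`.
#[derive(Default)]
struct RequestData {
    /// Persists across IOs: how many IOs this slot has serviced.
    ios_serviced: u32,
    /// Per-IO scratch state; the driver has to reset it explicitly.
    bytes_done: u32,
}

/// Models the fixed request pool of a tag set: one slot per tag.
struct Pool {
    slots: Vec<RequestData>,
    free_tags: Vec<usize>,
}

impl Pool {
    /// Request data is initialised exactly once, here, not once per IO.
    fn new(depth: usize) -> Self {
        Pool {
            slots: (0..depth).map(|_| RequestData::default()).collect(),
            free_tags: (0..depth).rev().collect(),
        }
    }

    /// Service one IO: take a free tag, use its slot, return the tag.
    fn service_io(&mut self, len: u32) -> Option<usize> {
        let tag = self.free_tags.pop()?;
        let data = &mut self.slots[tag];
        // Nothing resets the slot between IOs, so stale state from the
        // previous IO is still here unless the driver clears it.
        data.bytes_done = 0;
        data.bytes_done += len;
        data.ios_serviced += 1;
        self.free_tags.push(tag);
        Some(tag)
    }
}
```

In this model, two back-to-back IOs are served from the same slot, and
`ios_serviced` keeps counting across them. This is the behaviour the new
documentation section warns about.
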
 rust/kernel/block/mq/request.rs | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 1882d697dcf3..a6e757d8755d 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -29,6 +29,24 @@
 
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.
 ///
+/// # Lifetime
+///
+/// The [`struct request`] backing a [`Request`] is not allocated and freed for
+/// each IO. Instead, a fixed pool of requests is allocated up front when the
+/// [`TagSet`](crate::block::mq::TagSet) is initialized, with one request per
+/// available tag. A single request allocation is then reused to service many
+/// distinct IO operations over the lifetime of the request queue: when the
+/// block layer needs to process an IO, it assigns a free tag and hands the
+/// driver the associated request, and once that IO completes the request is
+/// returned to the pool to later be handed out again for an unrelated IO.
+///
+/// The private data area of the request, which holds the driver defined
+/// [`Operations::RequestData`], shares this lifetime. It is initialized once
+/// when the request pool is allocated and dropped once when the pool is torn
+/// down - not once per IO. As a result, [`Operations::RequestData`] persists
+/// across the many IO operations that reuse the same request, and a driver must
+/// not assume that it is reset to a fresh value at the start of each IO.
+///
 /// # Implementation details
 ///
 /// There are three states for a request that the Rust bindings care about:

-- 
2.51.2





* [PATCH v2 10/83] block: rust: allow `hrtimer::Timer` in `RequestData`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (8 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 09/83] block: rust: document the lifetime of `Request` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 11/83] block: rnull: add timer completion mode Andreas Hindborg
                   ` (72 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

`ARef<Request>` is essentially a smart pointer that gives access to
`Operations::RequestData`. To use an `HrTimer` embedded in
`Operations::RequestData` through such a pointer, the pointer type must
implement `HrTimerPointer`.

Thus, implement `HrTimerPointer` and friends for `ARef<Request>`.

Publicly export `HrTimer::raw_cancel` and `HrTimerRestart::into_c`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
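Note for readers: the timer callback has to walk back from the embedded
timer to the enclosing private data area before it can find the request.
A userspace sketch of that pointer arithmetic is shown below. `Timer`,
`Data`, `Wrapper`, `wrapper_from_timer` and `round_trip` are hypothetical
stand-ins, and the final `blk_mq_rq_from_pdu` step is left out because it
has no userspace equivalent.

```rust
use std::ptr::addr_of;

/// Stand-in for `HrTimer<T::RequestData>`.
#[repr(C)]
struct Timer {
    expires: u64,
}

/// Stand-in for a driver's `RequestData` with an embedded timer.
#[repr(C)]
struct Data {
    cookie: u32,
    timer: Timer,
}

/// Stand-in for `RequestDataWrapper<T>`.
#[repr(C)]
struct Wrapper {
    refcount: u32,
    data: Data,
}

/// Recover the wrapper from a pointer to the timer embedded in its data.
///
/// Safety: `timer` must point at the `timer` field of a live `Wrapper` and
/// must have been derived from a pointer to that whole `Wrapper`.
unsafe fn wrapper_from_timer(timer: *const Timer) -> *const Wrapper {
    let data_offset = std::mem::offset_of!(Data, timer);
    let wrapper_offset = std::mem::offset_of!(Wrapper, data);
    unsafe {
        // Step 1: timer field -> enclosing `Data`.
        let data = timer.cast::<u8>().sub(data_offset).cast::<Data>();
        // Step 2: `data` field -> enclosing `Wrapper`.
        data.cast::<u8>().sub(wrapper_offset).cast::<Wrapper>()
    }
}

/// Returns whether the recovered pointer equals the original, plus the
/// cookie read back through the recovered pointer.
fn round_trip(cookie: u32) -> (bool, u32) {
    let wrapper = Wrapper {
        refcount: 0,
        data: Data { cookie, timer: Timer { expires: 0 } },
    };
    let base: *const Wrapper = &wrapper;
    // Project through the raw pointer so the timer pointer keeps provenance
    // over the whole wrapper, not just the timer field.
    let timer_ptr = unsafe { addr_of!((*base).data.timer) };
    // SAFETY: `timer_ptr` was derived from `base`, which is live.
    let recovered = unsafe { wrapper_from_timer(timer_ptr) };
    // SAFETY: `recovered` equals `base`, which is valid for reads.
    (recovered == base, unsafe { (*recovered).data.cookie })
}
```

The sketch uses `offset_of!` for both steps. In the patch, the first step
is delegated to `timer_container_of` and only the second uses `offset_of!`.
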
 rust/kernel/block/mq.rs         |   5 +-
 rust/kernel/block/mq/request.rs | 142 ++++++++++++++++++++++++++++++++++++++++
 rust/kernel/time/hrtimer.rs     |   5 +-
 3 files changed, 149 insertions(+), 3 deletions(-)

diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 7718b106eb49..a03d46d274a5 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -107,5 +107,8 @@
 mod tag_set;
 
 pub use operations::Operations;
-pub use request::Request;
+pub use request::{
+    Request,
+    RequestTimerHandle, //
+};
 pub use tag_set::TagSet;
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index a6e757d8755d..0b14f584c9d9 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -15,6 +15,14 @@
         atomic::ordering,
         Refcount,
     },
+    time::hrtimer::{
+        HasHrTimer,
+        HrTimer,
+        HrTimerCallback,
+        HrTimerHandle,
+        HrTimerMode,
+        HrTimerPointer, //
+    },
     types::{
         Opaque,
         Ownable,
@@ -23,6 +31,7 @@
     },
 };
 use core::{
+    ffi::c_void,
     marker::PhantomData,
     ptr::NonNull, //
 };
@@ -145,6 +154,11 @@ pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper<T> {
         // valid as a shared reference.
         unsafe { Self::wrapper_ptr(core::ptr::from_ref(self).cast_mut()).as_ref() }
     }
+
+    /// Return a reference to the per-request data associated with this request.
+    pub fn data_ref(&self) -> &T::RequestData {
+        &self.wrapper_ref().data
+    }
 }
 
 /// A wrapper around data stored in the private area of the C [`struct request`].
@@ -329,3 +343,131 @@ fn into_shared(this: Owned<Self>) -> ARef<Self> {
         unsafe { ARef::from_raw(Owned::into_raw(this)) }
     }
 }
+
+/// A handle for a timer that is embedded in a [`Request`] private data area.
+pub struct RequestTimerHandle<T>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+{
+    inner: ARef<Request<T>>,
+}
+
+// SAFETY: The drop implementation of `RequestTimerHandle` calls `cancel`, which cancels the timer
+// if it is running. `drop` will block if the timer handler is running. This is ensured via a call
+// to `HrTimer::raw_cancel` in the implementation of `cancel`.
+unsafe impl<T> HrTimerHandle for RequestTimerHandle<T>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+{
+    fn cancel(&mut self) -> bool {
+        let request_data_ptr = &self.inner.wrapper_ref().data as *const T::RequestData;
+
+        // SAFETY: As we obtained `request_data_ptr` from a valid reference above, it
+        // must point to a valid `T::RequestData`.
+        let timer_ptr = unsafe {
+            <T::RequestData as HasHrTimer<T::RequestData>>::raw_get_timer(request_data_ptr)
+        };
+
+        // SAFETY: As `timer_ptr` points into a valid `T::RequestData`, it must
+        // point to a valid `HrTimer` instance.
+        unsafe { HrTimer::<T::RequestData>::raw_cancel(timer_ptr) }
+    }
+}
+
+impl<T> RequestTimerHandle<T>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+{
+    /// Drop the timer handle without cancelling the timer.
+    ///
+    /// This is safe because dropping the last [`ARef<Request>`] does not drop the [`Request`].
+    pub fn dismiss(mut self) {
+        let inner = core::ptr::from_mut(&mut self.inner);
+
+        // SAFETY: `inner` is valid for reads and writes, is properly aligned and nonnull. We have
+        // exclusive access to `inner` and we do not access `inner` after this call.
+        unsafe { core::ptr::drop_in_place(inner) };
+        core::mem::forget(self);
+    }
+}
+
+impl<T> Drop for RequestTimerHandle<T>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+{
+    fn drop(&mut self) {
+        self.cancel();
+    }
+}
+
+impl<T> HrTimerPointer for ARef<Request<T>>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+    T::RequestData: Sync,
+{
+    type TimerMode = <T::RequestData as HasHrTimer<T::RequestData>>::TimerMode;
+    type TimerHandle = RequestTimerHandle<T>;
+
+    fn start(self, expires: <Self::TimerMode as HrTimerMode>::Expires) -> RequestTimerHandle<T> {
+        let pdu_ptr = self.data_ref() as *const T::RequestData;
+
+        // SAFETY: `pdu_ptr` is coerced from a live reference to a `T::RequestData`, so it points
+        // to a valid `T::RequestData`. The pointee is valid until the `T::RequestData` is
+        // dropped, and the timer will be canceled before this.
+        unsafe { T::RequestData::start(pdu_ptr, expires) };
+
+        RequestTimerHandle { inner: self }
+    }
+}
+
+impl<T> kernel::time::hrtimer::RawHrTimerCallback for ARef<Request<T>>
+where
+    T: Operations,
+    T::RequestData: HasHrTimer<T::RequestData>,
+    T::RequestData: for<'a> HrTimerCallback<Pointer<'a> = ARef<Request<T>>>,
+    T::RequestData: Sync,
+{
+    type CallbackTarget<'a> = Self;
+
+    unsafe extern "C" fn run(ptr: *mut bindings::hrtimer) -> bindings::hrtimer_restart {
+        // `HrTimer` is `repr(transparent)`
+        let timer_ptr = ptr.cast::<kernel::time::hrtimer::HrTimer<T::RequestData>>();
+
+        // SAFETY: By C API contract `ptr` is the pointer we passed when enqueuing the timer, so
+        // it is a `HrTimer<T::RequestData>` embedded in a `T::RequestData`.
+        let request_data_ptr = unsafe { T::RequestData::timer_container_of(timer_ptr) };
+
+        let offset = core::mem::offset_of!(RequestDataWrapper<T>, data);
+
+        // SAFETY: This sub stays within the `bindings::request` allocation and does not wrap.
+        let pdu_ptr = unsafe {
+            request_data_ptr
+                .cast::<u8>()
+                .sub(offset)
+                .cast::<RequestDataWrapper<T>>()
+        };
+
+        // SAFETY: This request pointer was passed to us by the kernel in `init_request_callback`.
+        let request_ptr = unsafe { bindings::blk_mq_rq_from_pdu(pdu_ptr.cast::<c_void>()) };
+
+        // SAFETY: By C API contract, we have ownership of the request.
+        let request_ref = unsafe { &*(request_ptr as *const Request<T>) };
+
+        request_ref.inc_ref();
+        // SAFETY: We just incremented the refcount above.
+        let aref: ARef<Request<T>> = unsafe { ARef::from_raw(NonNull::from(request_ref)) };
+
+        // SAFETY:
+        // - By C API contract `timer_ptr` is the pointer that we passed when queuing the timer, so
+        //   it is a valid pointer to a `HrTimer<T::RequestData>` embedded in a `T::RequestData`.
+        // - We are within `RawHrTimerCallback::run`.
+        let context = unsafe { kernel::time::hrtimer::HrTimerCallbackContext::from_raw(timer_ptr) };
+
+        T::RequestData::run(aref, context).into_c()
+    }
+}
diff --git a/rust/kernel/time/hrtimer.rs b/rust/kernel/time/hrtimer.rs
index d57276496ed6..096b18523c73 100644
--- a/rust/kernel/time/hrtimer.rs
+++ b/rust/kernel/time/hrtimer.rs
@@ -496,7 +496,7 @@ unsafe fn raw_get(this: *const Self) -> *mut bindings::hrtimer {
     /// # Safety
     ///
     /// `this` must point to a valid `Self`.
-    pub(crate) unsafe fn raw_cancel(this: *const Self) -> bool {
+    pub unsafe fn raw_cancel(this: *const Self) -> bool {
         // SAFETY: `this` points to an allocation of at least `HrTimer` size.
         let c_timer_ptr = unsafe { HrTimer::raw_get(this) };
 
@@ -900,7 +900,8 @@ pub enum HrTimerRestart {
 }
 
 impl HrTimerRestart {
-    fn into_c(self) -> bindings::hrtimer_restart {
+    /// Convert `self` into an integer for FFI use.
+    pub fn into_c(self) -> bindings::hrtimer_restart {
         self as bindings::hrtimer_restart
     }
 }

-- 
2.51.2





* [PATCH v2 11/83] block: rnull: add timer completion mode
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (9 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 10/83] block: rust: allow `hrtimer::Timer` in `RequestData` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 12/83] block: rust: introduce `kernel::block::bio` module Andreas Hindborg
                   ` (71 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a timer completion mode to `rnull`. This will complete requests after a
specified time has elapsed. To use this mode of operation, set `irqmode` to
`2` and write a timeout in nanoseconds to `completion_nsec`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
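Note for readers: in timer mode the request moves from exclusive to shared
ownership in `queue_rq` and back to exclusive ownership in the timer
callback. The refcount transitions can be modelled in userspace as below.
This is a sketch under assumptions. `RequestRef` and `timer_path` are
hypothetical, and `inc_ref`/`dec_ref` are assumed to adjust the count by
one, which this series does not show. The trace is one sequential
interleaving where `dismiss` runs before the timer fires.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Userspace model of the Rust-side request refcount.
struct RequestRef {
    refcount: AtomicU32,
}

impl RequestRef {
    fn new() -> Self {
        RequestRef { refcount: AtomicU32::new(0) }
    }

    fn count(&self) -> u32 {
        self.refcount.load(Ordering::Relaxed)
    }

    /// Models `into_shared`: exclusive (0) to one shared reference (2).
    /// Returns the previous count, which is expected to be 0.
    fn into_shared(&self) -> u32 {
        self.refcount.fetch_add(2, Ordering::Release)
    }

    /// Assumed model of taking an extra `ARef`.
    fn inc_ref(&self) {
        self.refcount.fetch_add(1, Ordering::Relaxed);
    }

    /// Assumed model of dropping an `ARef`.
    fn dec_ref(&self) {
        self.refcount.fetch_sub(1, Ordering::Release);
    }

    /// Models `try_from_shared`: succeeds only for a single shared reference.
    fn try_from_shared(&self) -> bool {
        self.refcount
            .compare_exchange(2, 0, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }
}

/// Replays the timer completion path and records the count after each step.
fn timer_path(rq: &RequestRef) -> Vec<u32> {
    let mut trace = vec![rq.count()];
    // `queue_rq`: `into_shared(rq)` before starting the timer.
    rq.into_shared();
    trace.push(rq.count());
    // `dismiss()`: drops the handle's `ARef` without cancelling the timer.
    rq.dec_ref();
    trace.push(rq.count());
    // Timer callback: `inc_ref` to rebuild an `ARef` for the driver.
    rq.inc_ref();
    trace.push(rq.count());
    // Driver callback: regain exclusive ownership, then complete the request.
    assert!(rq.try_from_shared());
    trace.push(rq.count());
    trace
}
```

If another `ARef` is still alive when the callback runs, the compare and
exchange fails and the count is left untouched. In the driver this would
hit the `expect` in `Pdu::run`.
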
 drivers/block/rnull/configfs.rs | 34 ++++++++++++++++++++--
 drivers/block/rnull/rnull.rs    | 63 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 90 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index fd309fc17e66..83b474f6da60 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -25,11 +25,15 @@
         kstrtobool_bytes,
         CString, //
     },
-    sync::Mutex, //
+    sync::Mutex,
+    time, //
 };
 use macros::{
+    configfs_attribute,
     configfs_simple_bool_field,
-    configfs_simple_field, //
+    configfs_simple_field,
+    show_field,
+    store_number_with_power_check, //
 };
 
 mod macros;
@@ -56,7 +60,7 @@ impl AttributeOperations<0> for Config {
 
     fn show(_this: &Config, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
         let mut writer = kernel::str::Formatter::new(page);
-        writer.write_str("blocksize,size,rotational,irqmode\n")?;
+        writer.write_str("blocksize,size,rotational,irqmode,completion_nsec\n")?;
         Ok(writer.bytes_written())
     }
 }
@@ -79,6 +83,7 @@ fn make_group(
                 rotational: 2,
                 size: 3,
                 irqmode: 4,
+                completion_nsec: 5,
             ],
         };
 
@@ -94,6 +99,7 @@ fn make_group(
                     disk: None,
                     capacity_mib: 4096,
                     irq_mode: IRQMode::None,
+                    completion_time: time::Delta::ZERO,
                     name: name.try_into()?,
                 }),
             }),
@@ -106,6 +112,7 @@ fn make_group(
 pub(crate) enum IRQMode {
     None,
     Soft,
+    Timer,
 }
 
 impl TryFrom<u8> for IRQMode {
@@ -115,6 +122,7 @@ fn try_from(value: u8) -> Result<Self> {
         match value {
             0 => Ok(Self::None),
             1 => Ok(Self::Soft),
+            2 => Ok(Self::Timer),
             _ => Err(EINVAL),
         }
     }
@@ -125,11 +133,22 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         match self {
             Self::None => f.write_str("0")?,
             Self::Soft => f.write_str("1")?,
+            Self::Timer => f.write_str("2")?,
         }
         Ok(())
     }
 }
 
+/// Wraps [`time::Delta`] to render the value as a bare nanosecond count for
+/// configfs attributes that historically used this format.
+struct DeltaDisplay(time::Delta);
+
+impl kernel::fmt::Display for DeltaDisplay {
+    fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
+        f.write_fmt(kernel::prelude::fmt!("{}", self.0.as_nanos()))
+    }
+}
+
 #[pin_data]
 pub(crate) struct DeviceConfig {
     #[pin]
@@ -144,6 +163,7 @@ struct DeviceConfigInner {
     rotational: bool,
     capacity_mib: u64,
     irq_mode: IRQMode,
+    completion_time: time::Delta,
     disk: Option<GenDisk<NullBlkDevice>>,
 }
 
@@ -174,6 +194,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 guard.rotational,
                 guard.capacity_mib,
                 guard.irq_mode,
+                guard.completion_time,
             )?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -189,6 +210,13 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_bool_field!(DeviceConfig, 2, rotational);
 configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
 configfs_simple_field!(DeviceConfig, 4, irq_mode, IRQMode);
+configfs_attribute!(DeviceConfig, 5,
+    show: |this, page| show_field(DeltaDisplay(this.data.lock().completion_time), page),
+    store: |this, page| store_number_with_power_check(this, page, |data, value: i64| {
+        data.completion_time = time::Delta::from_nanos(value);
+        Ok(())
+    })
+);
 
 impl core::str::FromStr for IRQMode {
     type Err = Error;
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index dd7a30519870..3e7a47e6d0e5 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -28,6 +28,15 @@
         Arc,
         Mutex, //
     },
+    time::{
+        hrtimer::{
+            HrTimerCallback,
+            HrTimerCallbackContext,
+            HrTimerPointer,
+            HrTimerRestart, //
+        },
+        Delta,
+    },
     types::{
         OwnableRefCounted,
         Owned, //
@@ -59,7 +68,11 @@
         },
         irqmode: u8 {
             default: 0,
-            description:  "IRQ completion handler. 0-none, 1-softirq",
+            description:  "IRQ completion handler. 0-none, 1-softirq, 2-timer",
+        },
+        completion_nsec: u64 {
+            default: 10_000,
+            description:  "Time in ns to complete a request in hardware. Default: 10,000ns",
         },
     },
 }
@@ -79,6 +92,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
         let mut disks = KVec::new();
 
         let defer_init = move || -> Result<_, Error> {
+            let completion_time: i64 = module_parameters::completion_nsec.value().try_into()?;
             for i in 0..module_parameters::nr_devices.value() {
                 let name = CString::try_from_fmt(fmt!("rnullb{}", i))?;
 
@@ -88,6 +102,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     module_parameters::rotational.value(),
                     module_parameters::gb.value() * 1024,
                     module_parameters::irqmode.value().try_into()?,
+                    Delta::from_nanos(completion_time),
                 )?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -111,10 +126,17 @@ fn new(
         rotational: bool,
         capacity_mib: u64,
         irq_mode: IRQMode,
+        completion_time: Delta,
     ) -> Result<GenDisk<Self>> {
         let tagset = Arc::pin_init(TagSet::new(1, 256, 1), GFP_KERNEL)?;
 
-        let queue_data = Box::new(QueueData { irq_mode }, GFP_KERNEL)?;
+        let queue_data = Box::new(
+            QueueData {
+                irq_mode,
+                completion_time,
+            },
+            GFP_KERNEL,
+        )?;
 
         gen_disk::GenDiskBuilder::new()
             .capacity_sectors(capacity_mib << (20 - block::SECTOR_SHIFT))
@@ -127,15 +149,43 @@ fn new(
 
 struct QueueData {
     irq_mode: IRQMode,
+    completion_time: Delta,
+}
+
+#[pin_data]
+struct Pdu {
+    #[pin]
+    timer: kernel::time::hrtimer::HrTimer<Self>,
+}
+
+impl HrTimerCallback for Pdu {
+    type Pointer<'a> = ARef<mq::Request<NullBlkDevice>>;
+
+    fn run(this: Self::Pointer<'_>, _context: HrTimerCallbackContext<'_, Self>) -> HrTimerRestart {
+        OwnableRefCounted::try_from_shared(this)
+            .map_err(|_e| kernel::error::code::EIO)
+            .expect("Failed to complete request")
+            .end_ok();
+        HrTimerRestart::NoRestart
+    }
+}
+
+kernel::impl_has_hr_timer! {
+    impl HasHrTimer<Self> for Pdu {
+        mode: kernel::time::hrtimer::RelativeMode<kernel::time::Monotonic>,
+        field: self.timer,
+    }
 }
 
 #[vtable]
 impl Operations for NullBlkDevice {
     type QueueData = KBox<QueueData>;
-    type RequestData = ();
+    type RequestData = Pdu;
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
-        Ok(())
+        pin_init!(Pdu {
+            timer <- kernel::time::hrtimer::HrTimer::new(),
+        })
     }
 
     #[inline(always)]
@@ -143,6 +193,11 @@ fn queue_rq(queue_data: &QueueData, rq: Owned<mq::Request<Self>>, _is_last: bool
         match queue_data.irq_mode {
             IRQMode::None => rq.end_ok(),
             IRQMode::Soft => mq::Request::complete(rq.into()),
+            IRQMode::Timer => {
+                OwnableRefCounted::into_shared(rq)
+                    .start(queue_data.completion_time)
+                    .dismiss();
+            }
         }
         Ok(())
     }

-- 
2.51.2





* [PATCH v2 12/83] block: rust: introduce `kernel::block::bio` module
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (10 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 11/83] block: rnull: add timer completion mode Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 13/83] block: rust: add `command` getter to `Request` Andreas Hindborg
                   ` (70 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add Rust abstractions for working with `struct bio`, the core IO command
descriptor for the block layer.

The `Bio` type wraps `struct bio` and provides safe access to the IO
vector describing the data buffers associated with the IO command. The
data buffers are represented as a vector of `Segment`s, where each
segment is a contiguous region of physical memory backed by `Page`.

The `BioSegmentIterator` provides iteration over segments in a single
bio, while `BioIterator` allows traversing a chain of bios. The
`Segment` type offers methods for copying data to and from pages, as
well as zeroing page contents, which are the fundamental operations
needed by block device drivers to process IO requests.

The `Request` type is extended with methods to access the bio chain
associated with a request, allowing drivers to iterate over all data
buffers that need to be processed.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
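Note: the per-page splitting done by `BioSegmentIterator` can be sketched
in user space as below. This is a minimal model, not the kernel code:
`BioVec`, `BvecIter`, `SegmentIter` and `segments` are stand-ins invented
for illustration, pages are plain indices instead of `struct page`
pointers, and `PAGE_SIZE` is assumed to be 4096.

```rust
/// Page size assumed for this sketch; the kernel value depends on the architecture.
const PAGE_SIZE: u32 = 4096;

/// Stand-in for `struct bio_vec`, with a page index instead of a page pointer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BioVec {
    page: usize,
    len: u32,
    offset: u32,
}

/// Stand-in for `struct bvec_iter`: all iteration state lives here, not in the array.
#[derive(Debug, Clone, Copy)]
struct BvecIter {
    size: u32,      // bytes left in the whole bio
    idx: usize,     // index of the current `BioVec`
    bvec_done: u32, // bytes already consumed from the current `BioVec`
}

/// Yields segments that never cross a page boundary.
struct SegmentIter<'a> {
    vecs: &'a [BioVec],
    iter: BvecIter,
}

impl Iterator for SegmentIter<'_> {
    type Item = BioVec;

    fn next(&mut self) -> Option<BioVec> {
        if self.iter.size == 0 {
            return None;
        }
        let bvec = self.vecs[self.iter.idx];
        // Position and remaining length inside the (possibly multi-page) entry.
        let mp_offset = bvec.offset + self.iter.bvec_done;
        let mp_len = self.iter.size.min(bvec.len - self.iter.bvec_done);
        // Clamp to the page that `mp_offset` falls in.
        let offset = mp_offset % PAGE_SIZE;
        let len = mp_len.min(PAGE_SIZE - offset);
        let page = bvec.page + (mp_offset / PAGE_SIZE) as usize;
        // Advance the cursor by exactly one segment.
        self.iter.size -= len;
        self.iter.bvec_done += len;
        if self.iter.bvec_done == bvec.len {
            self.iter.bvec_done = 0;
            self.iter.idx += 1;
        }
        Some(BioVec { page, len, offset })
    }
}

/// Collect all single-page segments described by `vecs`.
fn segments(vecs: &[BioVec]) -> Vec<BioVec> {
    let size: u32 = vecs.iter().map(|v| v.len).sum();
    let iter = BvecIter { size, idx: 0, bvec_done: 0 };
    SegmentIter { vecs, iter }.collect()
}
```

In this model an 8192 byte entry starting at offset 512 of page 10 is
yielded as three segments covering pages 10, 11 and 12, each confined to
a single page.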
 rust/helpers/blk.c              |   8 +
 rust/kernel/block.rs            |   1 +
 rust/kernel/block/bio.rs        | 147 ++++++++++++++
 rust/kernel/block/bio/vec.rs    | 411 ++++++++++++++++++++++++++++++++++++++++
 rust/kernel/block/mq/request.rs |  49 +++++
 rust/kernel/page.rs             |   2 +-
 6 files changed, 617 insertions(+), 1 deletion(-)
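Note: the length calculation shared by `Segment::copy_to_page` and
`Segment::copy_from_page` can be illustrated with a small user-space
sketch. The helper names `copy_step_len` and `copy_steps` are invented
for this note and `PAGE_SIZE` is assumed to be 4096; only the three-way
`min` mirrors the patch.

```rust
/// Page size assumed for this sketch.
const PAGE_SIZE: usize = 4096;

/// Bytes a single copy step may move: bounded by the end of the segment's
/// current page, by the remaining segment length, and by the space left in
/// the other page after `other_offset`.
fn copy_step_len(seg_offset: usize, seg_len: usize, other_offset: usize) -> usize {
    debug_assert!(other_offset <= PAGE_SIZE);
    (PAGE_SIZE - seg_offset % PAGE_SIZE)
        .min(seg_len)
        .min(PAGE_SIZE - other_offset)
}

/// Drain a segment with repeated steps, moving to a fresh destination page
/// whenever the current one is full. Returns the length of each step.
fn copy_steps(mut seg_offset: usize, mut seg_len: usize, mut dst_offset: usize) -> Vec<usize> {
    // A full destination page would make every step zero bytes long.
    assert!(dst_offset < PAGE_SIZE);
    let mut steps = Vec::new();
    while seg_len > 0 {
        let n = copy_step_len(seg_offset, seg_len, dst_offset);
        steps.push(n);
        // Mirrors `Segment::advance`: the offset grows, the length shrinks.
        seg_offset += n;
        seg_len -= n;
        dst_offset = (dst_offset + n) % PAGE_SIZE;
    }
    steps
}
```

A caller is expected to loop until the segment is empty, since one call
may copy fewer bytes than the segment holds when the destination page
fills up first.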

diff --git a/rust/helpers/blk.c b/rust/helpers/blk.c
index 20c512e46a7a..6a70e1306a3a 100644
--- a/rust/helpers/blk.c
+++ b/rust/helpers/blk.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
+#include <linux/bio.h>
 #include <linux/blk-mq.h>
 #include <linux/blkdev.h>
 
@@ -12,3 +13,10 @@ __rust_helper struct request *rust_helper_blk_mq_rq_from_pdu(void *pdu)
 {
 	return blk_mq_rq_from_pdu(pdu);
 }
+
+__rust_helper void rust_helper_bio_advance_iter_single(const struct bio *bio,
+						       struct bvec_iter *iter,
+						       unsigned int bytes)
+{
+	bio_advance_iter_single(bio, iter, bytes);
+}
diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
index b120e83d9425..eb512dad031b 100644
--- a/rust/kernel/block.rs
+++ b/rust/kernel/block.rs
@@ -2,6 +2,7 @@
 
 //! Types for working with the block layer.
 
+pub mod bio;
 pub mod mq;
 
 /// Bit mask for masking out the sector index in a page.
diff --git a/rust/kernel/block/bio.rs b/rust/kernel/block/bio.rs
new file mode 100644
index 000000000000..af84f94a85fe
--- /dev/null
+++ b/rust/kernel/block/bio.rs
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Types for working with the bio layer.
+//!
+//! C header: [`include/linux/blk_types.h`](srctree/include/linux/blk_types.h)
+
+use crate::{
+    fmt,
+    types::Opaque, //
+};
+use core::{
+    marker::PhantomData,
+    pin::Pin,
+    ptr::NonNull, //
+};
+
+mod vec;
+
+pub use vec::{
+    BioSegmentIterator,
+    Segment, //
+};
+
+/// A block device IO descriptor (`struct bio`).
+///
+/// A `Bio` is the main unit of IO for the block layer. It describes an IO command and associated
+/// data buffers.
+///
+/// The data buffers associated with a `Bio` are represented by a vector of [`Segment`]s. These
+/// segments represent physically contiguous regions of memory. The memory is represented by
+/// [`Page`] descriptors internally.
+///
+/// The vector of [`Segment`]s can be iterated by obtaining a [`BioSegmentIterator`].
+#[repr(transparent)]
+pub struct Bio(Opaque<bindings::bio>);
+
+impl Bio {
+    /// Returns an iterator over segments in this `Bio`. Does not consider
+    /// segments of other bios in this bio chain.
+    #[inline(always)]
+    pub fn segment_iter(self: Pin<&mut Self>) -> BioSegmentIterator<'_> {
+        BioSegmentIterator::new(self)
+    }
+
+    /// Get the number of io vectors in this bio.
+    fn io_vec_count(&self) -> u16 {
+        // SAFETY: By the type invariant of `Bio` and existence of `&self`,
+        // `self.0` is valid for read.
+        unsafe { (*self.0.get()).bi_vcnt }
+    }
+
+    /// Get a pointer to the `bio_vec` array of this bio.
+    #[inline(always)]
+    fn io_vec(&self) -> NonNull<bindings::bio_vec> {
+        let this = self.0.get();
+
+        // SAFETY: By the type invariant of `Bio` and existence of `&self`,
+        // `this` is valid for read.
+        let vec_ptr = unsafe { (*this).bi_io_vec };
+
+        // SAFETY: By C API contract, bi_io_vec is always set, even if bi_vcnt
+        // is zero.
+        unsafe { NonNull::new_unchecked(vec_ptr) }
+    }
+
+    /// Return a copy of the `bvec_iter` for this `Bio`. This iterator always
+    /// indexes to a valid `bio_vec` entry.
+    #[inline(always)]
+    fn raw_iter(&self) -> bindings::bvec_iter {
+        // SAFETY: By the type invariant of `Bio` and existence of `&self`,
+        // `self` is valid for read.
+        unsafe { (*self.0.get()).bi_iter }
+    }
+
+    /// Create an instance of `Bio` from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that the `ptr` is valid for use as a reference to
+    /// `Bio` for the duration of `'a`.
+    #[inline(always)]
+    pub(crate) unsafe fn from_raw<'a>(ptr: *mut bindings::bio) -> Option<&'a Self> {
+        Some(
+            // SAFETY: by the safety requirement of this function, `ptr` is
+            // valid for read for the duration of the returned lifetime
+            unsafe { &*NonNull::new(ptr)?.as_ptr().cast::<Bio>() },
+        )
+    }
+
+    /// Create an instance of `Bio` from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that the `ptr` is valid for use as a unique reference
+    /// to `Bio` for the duration of `'a`.
+    #[inline(always)]
+    pub(crate) unsafe fn from_raw_mut<'a>(ptr: *mut bindings::bio) -> Option<Pin<&'a mut Self>> {
+        // SAFETY: by the safety requirement of this function, `ptr` is valid for
+        // use as a unique reference for the duration of the returned lifetime.
+        let bio = unsafe { &mut *NonNull::new(ptr)?.as_ptr().cast::<Bio>() };
+
+        // SAFETY: `bindings::bio` is pinned.
+        Some(unsafe { Pin::new_unchecked(bio) })
+    }
+}
+
+impl fmt::Display for Bio {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        let iter = self.raw_iter();
+        write!(
+            f,
+            "Bio({:?}, vcnt: {}, idx: {}, size: 0x{:x}, completed: 0x{:x})",
+            self.0.get(),
+            self.io_vec_count(),
+            iter.bi_idx,
+            iter.bi_size,
+            iter.bi_bvec_done
+        )
+    }
+}
+
+/// An iterator over `Bio` in a bio chain, yielding `&mut Bio`.
+///
+/// # Invariants
+///
+/// `bio` must be either `None` or be valid for use as a `&mut Bio`.
+pub struct BioIterator<'a> {
+    pub(crate) bio: Option<NonNull<Bio>>,
+    pub(crate) _p: PhantomData<&'a ()>,
+}
+
+impl<'a> core::iter::Iterator for BioIterator<'a> {
+    type Item = Pin<&'a mut Bio>;
+
+    #[inline(always)]
+    fn next(&mut self) -> Option<Pin<&'a mut Bio>> {
+        let mut current = self.bio.take()?;
+        // SAFETY: By the type invariant of `Bio` and type invariant on `Self`,
+        // `current` is valid for use as a unique reference.
+        let next = unsafe { (*current.as_ref().0.get()).bi_next };
+        self.bio = NonNull::new(next.cast());
+        // SAFETY:
+        // - By type invariant, `bio` is valid for use as a reference.
+        // - `bindings::bio` is pinned.
+        Some(unsafe { Pin::new_unchecked(current.as_mut()) })
+    }
+}
diff --git a/rust/kernel/block/bio/vec.rs b/rust/kernel/block/bio/vec.rs
new file mode 100644
index 000000000000..99ab164d4038
--- /dev/null
+++ b/rust/kernel/block/bio/vec.rs
@@ -0,0 +1,411 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Types for working with `struct bio_vec` IO vectors
+//!
+//! C header: [`include/linux/bvec.h`](srctree/include/linux/bvec.h)
+
+use super::Bio;
+use crate::{
+    error::{
+        code,
+        Result, //
+    },
+    page::{
+        Page,
+        SafePage,
+        PAGE_SIZE, //
+    },
+    prelude::*, //
+};
+use core::{
+    fmt,
+    mem::ManuallyDrop, //
+};
+
+/// A segment of an IO request.
+///
+/// [`Segment`] represents a contiguous range of physical memory addresses of an IO request. A
+/// segment has an offset and a length, representing the amount of data that needs to be processed.
+/// Processing the data increases the offset and reduces the length.
+///
+/// The data buffer of a [`Segment`] is borrowed from a `Bio`.
+///
+/// # Implementation details
+///
+/// In the context of user driven block IO, the pages backing a [`Segment`] are often mapped to user
+/// space concurrently with the IO operation. Further, the page backing a `Segment` may be part of
+/// multiple IO operations, if user space decides to issue multiple concurrent IO operations
+/// involving the same page. Thus, the data represented by a [`Segment`] must always be assumed to
+/// be subject to racy writes.
+///
+/// A [`Segment`] is a wrapper around a `struct bio_vec`.
+///
+/// # Invariants
+///
+/// `bio_vec` must always be initialized and valid for read and write
+pub struct Segment<'a> {
+    bio_vec: bindings::bio_vec,
+    _marker: core::marker::PhantomData<&'a ()>,
+}
+
+impl Segment<'_> {
+    /// Get the length of the segment in bytes.
+    #[inline(always)]
+    pub fn len(&self) -> u32 {
+        self.bio_vec.bv_len
+    }
+
+    /// Returns true if the length of the segment is 0.
+    #[inline(always)]
+    pub fn is_empty(&self) -> bool {
+        self.len() == 0
+    }
+
+    /// Get the offset field of the `bio_vec`.
+    #[inline(always)]
+    pub fn offset(&self) -> usize {
+        self.bio_vec.bv_offset as usize
+    }
+
+    /// Advance the offset of the segment.
+    ///
+    /// If `count` is greater than the remaining size of the segment, an error
+    /// is returned.
+    pub fn advance(&mut self, count: u32) -> Result {
+        if self.len() < count {
+            return Err(code::EINVAL);
+        }
+
+        self.bio_vec.bv_offset += count;
+        self.bio_vec.bv_len -= count;
+        Ok(())
+    }
+
+    /// Copy data of this segment into `dst_page`.
+    ///
+    /// Copies data from the current offset up to the next page boundary, that is at most
+    /// `PAGE_SIZE - (self.offset() % PAGE_SIZE)` bytes. Data is placed at offset `dst_offset` in
+    /// the target page. This call will advance offset and reduce length of `self`.
+    ///
+    /// Returns the number of bytes copied.
+    #[inline(always)]
+    pub fn copy_to_page(&mut self, dst_page: Pin<&mut SafePage>, dst_offset: usize) -> usize {
+        // SAFETY: We are not moving out of `dst_page`.
+        let dst_page = unsafe { Pin::into_inner_unchecked(dst_page) };
+        let src_offset = self.offset() % PAGE_SIZE;
+        debug_assert!(dst_offset <= PAGE_SIZE);
+        let length = (PAGE_SIZE - src_offset)
+            .min(self.len() as usize)
+            .min(PAGE_SIZE - dst_offset);
+        let page_idx = self.offset() / PAGE_SIZE;
+
+        // SAFETY: self.bio_vec is valid and thus bv_page must be a valid
+        // pointer to a `struct page` array.
+        let src_page = unsafe { Page::from_raw(self.bio_vec.bv_page.add(page_idx)) };
+
+        src_page
+            .with_pointer_into_page(src_offset, length, |src| {
+                // SAFETY:
+                // - If `with_pointer_into_page` calls this closure, it has performed bounds
+                //   checking and guarantees that `src` is valid for `length` bytes.
+                // - Any other operations to `src` are atomic or user space operations.
+                // - We have exclusive ownership of `dst_page` and thus this write will not race.
+                unsafe { dst_page.write_bytewise_atomic(src, dst_offset, length) }
+            })
+            .expect("Assertion failure, bounds check failed.");
+
+        self.advance(length as u32)
+            .expect("Assertion failure, bounds check failed.");
+
+        length
+    }
+
+    /// Copy data to the current page of this segment from `src_page`.
+    ///
+    /// Copies at most `PAGE_SIZE - (self.offset() % PAGE_SIZE)` bytes of data from `src_page`,
+    /// starting at `src_offset`, to this segment, starting at `self.offset()`. This call will
+    /// advance offset and reduce length of `self`.
+    ///
+    /// Returns the number of bytes copied.
+    pub fn copy_from_page(&mut self, src_page: &SafePage, src_offset: usize) -> usize {
+        let dst_offset = self.offset() % PAGE_SIZE;
+        debug_assert!(src_offset <= PAGE_SIZE);
+        let length = (PAGE_SIZE - dst_offset)
+            .min(self.len() as usize)
+            .min(PAGE_SIZE - src_offset);
+        let page_idx = self.offset() / PAGE_SIZE;
+
+        // SAFETY: self.bio_vec is valid and thus bv_page must be a valid
+        // pointer to a `struct page`.
+        let dst_page = unsafe { Page::from_raw(self.bio_vec.bv_page.add(page_idx)) };
+
+        dst_page
+            .with_pointer_into_page(dst_offset, length, |dst| {
+                // SAFETY:
+                // - If `with_pointer_into_page` calls this closure, then it has performed bounds
+                //   checks and guarantees that `dst` is valid for `length` bytes.
+                // - Any other operations to `dst` are atomic or user space operations.
+                // - Since we have a shared reference to `src_page`, the read cannot race with any
+                //   writes to `src_page`.
+                unsafe { src_page.read_bytewise_atomic(dst, src_offset, length) }
+            })
+            .expect("Assertion failure, bounds check failed.");
+
+        self.advance(length as u32)
+            .expect("Assertion failure, bounds check failed.");
+
+        length
+    }
+
+    /// Copy zeroes to the current page of this segment.
+    ///
+    /// Writes at most `PAGE_SIZE - (self.offset() % PAGE_SIZE)` bytes of zeroes to this
+    /// segment starting at `self.offset()`. This call will advance offset and reduce length of
+    /// `self`.
+    ///
+    /// Returns the number of bytes written to this segment.
+    pub fn zero_page(&mut self) -> usize {
+        let offset = self.offset() % PAGE_SIZE;
+        let length = (PAGE_SIZE - offset).min(self.len() as usize);
+        let page_idx = self.offset() / PAGE_SIZE;
+
+        // SAFETY: self.bio_vec is valid and thus bv_page must be a valid
+        // pointer to a `struct page`. We do not own the page, but we prevent
+        // drop by wrapping the `Page` in `ManuallyDrop`.
+        let dst_page =
+            ManuallyDrop::new(unsafe { Page::from_raw(self.bio_vec.bv_page.add(page_idx)) });
+
+        // SAFETY: TODO: This might race with user space writes.
+        unsafe { dst_page.fill_zero_raw(offset, length) }
+            .expect("Assertion failure, bounds check failed.");
+
+        self.advance(length as u32)
+            .expect("Assertion failure, bounds check failed.");
+
+        length
+    }
+}
+
+impl core::fmt::Display for Segment<'_> {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        write!(
+            f,
+            "Segment {:?} len: {}, offset: {}",
+            self.bio_vec.bv_page, self.bio_vec.bv_len, self.bio_vec.bv_offset
+        )
+    }
+}
+
+/// An iterator over `Segment`
+///
+/// The iterator takes a copy of the bio's `bvec_iter` when it is created and
+/// advances that copy as it yields [`Segment`]s, leaving the `Bio` untouched. A
+/// `struct bvec_iter` is a standalone cursor into the bio's `bio_vec` array: as
+/// described in the kernel's [immutable biovecs] documentation, the `bio_vec`
+/// array is immutable once the bio is submitted and all of the position that
+/// changes while iterating is held in the `bvec_iter`, not in the array. Such an
+/// iterator can therefore be freely copied and moved, and advancing one copy
+/// affects neither the `Bio` nor any other copy of the iterator.
+///
+/// [immutable biovecs]: srctree/Documentation/block/biovecs.rst
+///
+/// # Invariants
+///
+/// If `iter.bi_size` > 0, `iter` must always index a valid `bio_vec` in `bio.io_vec()`.
+pub struct BioSegmentIterator<'a> {
+    bio: Pin<&'a mut Bio>,
+    iter: bindings::bvec_iter,
+}
+
+impl<'a> BioSegmentIterator<'a> {
+    /// Create a new segment iterator for iterating the segments of `bio`. The
+    /// iterator starts at the beginning of `bio`.
+    #[inline(always)]
+    pub(crate) fn new(bio: Pin<&'a mut Bio>) -> BioSegmentIterator<'a> {
+        let iter = bio.raw_iter();
+
+        // INVARIANT: `bio.raw_iter()` returns an index that indexes into a valid
+        // `bio_vec` in `bio.io_vec()`.
+        Self { bio, iter }
+    }
+
+    // The accessors in this implementation block are modelled after C side
+    // macros and static functions `bvec_iter_*` and `mp_bvec_iter_*` from
+    // bvec.h.
+
+    /// Construct a `bio_vec` from the current iterator state.
+    ///
+    /// This will return a `bio_vec` of size <= `PAGE_SIZE`.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    unsafe fn io_vec(&self) -> bindings::bio_vec {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        unsafe {
+            bindings::bio_vec {
+                bv_page: self.page(),
+                bv_len: self.len(),
+                bv_offset: self.offset(),
+            }
+        }
+    }
+
+    /// Get the currently indexed `bio_vec` entry.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn bvec(&self) -> &bindings::bio_vec {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By the safety requirement of this function and the type
+        // invariant of `Self`, `self.iter.bi_idx` indexes into a valid
+        // `bio_vec`
+        unsafe { self.bio.io_vec().offset(self.iter.bi_idx as isize).as_ref() }
+    }
+
+    /// Get the currently indexed page, indexing into pages of order >= 0.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn page(&self) -> *mut bindings::page {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By C API contract, the following offset cannot exceed pages
+        // allocated to this bio.
+        unsafe { self.mp_page().add(self.mp_page_idx()) }
+    }
+
+    /// Get the remaining bytes in the current page. Never more than PAGE_SIZE.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn len(&self) -> u32 {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        unsafe {
+            self.mp_len()
+                .min((bindings::PAGE_SIZE as u32) - self.offset())
+        }
+    }
+
+    /// Get the offset from the last page boundary in the currently indexed
+    /// `bio_vec` entry. Never more than PAGE_SIZE.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn offset(&self) -> u32 {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        unsafe { self.mp_offset() % (bindings::PAGE_SIZE as u32) }
+    }
+
+    /// Return the first page of the currently indexed `bio_vec` entry. This
+    /// might be a multi-page entry, meaning that page might have order > 0.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn mp_page(&self) -> *mut bindings::page {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        unsafe { self.bvec().bv_page }
+    }
+
+    /// Get the offset in whole pages into the currently indexed `bio_vec`. This
+    /// can be more than 0 if the page has order > 0.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn mp_page_idx(&self) -> usize {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        (unsafe { self.mp_offset() } / (bindings::PAGE_SIZE as u32)) as usize
+    }
+
+    /// Get the offset in the currently indexed `bio_vec` multi-page entry. This
+    /// can be more than `PAGE_SIZE` if the page has order > 0.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn mp_offset(&self) -> u32 {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        unsafe { self.bvec().bv_offset + self.iter.bi_bvec_done }
+    }
+
+    /// Get the number of remaining bytes for the currently indexed `bio_vec`
+    /// entry. Can be more than PAGE_SIZE for `bio_vec` entries with pages of
+    /// order > 0.
+    ///
+    /// # Safety
+    ///
+    /// Caller must ensure that `self.iter.bi_size` > 0 before calling this
+    /// method.
+    #[inline(always)]
+    unsafe fn mp_len(&self) -> u32 {
+        debug_assert!(self.iter.bi_size > 0);
+        // SAFETY: By safety requirement of this function `self.iter.bi_size` is
+        // greater than 0.
+        self.iter
+            .bi_size
+            .min(unsafe { self.bvec().bv_len } - self.iter.bi_bvec_done)
+    }
+}
+
+impl<'a> core::iter::Iterator for BioSegmentIterator<'a> {
+    type Item = Segment<'a>;
+
+    #[inline(always)]
+    fn next(&mut self) -> Option<Self::Item> {
+        if self.iter.bi_size == 0 {
+            return None;
+        }
+
+        // SAFETY: We checked that `self.iter.bi_size` > 0 above.
+        let bio_vec_ret = unsafe { self.io_vec() };
+
+        // SAFETY: By existence of reference `&bio`, `bio.0` contains a valid
+        // `struct bio`. By type invariant of `BioSegmentIterator` `self.iter`
+        // indexes into a valid `bio_vec` entry. By C API contract, `bv_len`
+        // does not exceed the size of the bio.
+        unsafe {
+            bindings::bio_advance_iter_single(
+                self.bio.0.get(),
+                &raw mut self.iter,
+                bio_vec_ret.bv_len,
+            )
+        };
+
+        Some(Segment {
+            bio_vec: bio_vec_ret,
+            _marker: core::marker::PhantomData,
+        })
+    }
+}
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 0b14f584c9d9..98e54f0586d1 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -33,9 +33,15 @@
 use core::{
     ffi::c_void,
     marker::PhantomData,
+    pin::Pin,
     ptr::NonNull, //
 };
 
+use crate::block::bio::{
+    Bio,
+    BioIterator, //
+};
+
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.
 ///
 /// # Lifetime
@@ -127,6 +133,49 @@ pub fn complete(this: ARef<Self>) {
         }
     }
 
+    /// Get a reference to the first [`Bio`] in this request.
+    #[inline(always)]
+    pub fn bio(&self) -> Option<&Bio> {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and the deref
+        // is safe.
+        let ptr = unsafe { (*self.0.get()).bio };
+        // SAFETY: By C API contract, if `bio` is not null it will have a
+        // positive refcount at least for the duration of the lifetime of
+        // `&self`.
+        unsafe { Bio::from_raw(ptr) }
+    }
+
+    /// Get a mutable reference to the first [`Bio`] in this request.
+    #[inline(always)]
+    pub fn bio_mut(self: Pin<&mut Self>) -> Option<Pin<&mut Bio>> {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and the deref
+        // is safe.
+        let ptr = unsafe { (*self.0.get()).bio };
+        // SAFETY: By C API contract, if `bio` is not null it will have a
+        // positive refcount at least for the duration of the lifetime of
+        // `&mut self`.
+        unsafe { Bio::from_raw_mut(ptr) }
+    }
+
+    /// Get an iterator over all bio structures in this request.
+    #[inline(always)]
+    pub fn bio_iter_mut<'a>(self: &'a mut Owned<Self>) -> BioIterator<'a> {
+        // INVARIANT: By C API contract, if the bio pointer is not null, it is a valid `struct bio`.
+        // `NonNull::new` will return `None` if the pointer is null.
+        BioIterator {
+            // SAFETY: By type invariant `self.0` is a valid `struct request`.
+            bio: NonNull::new(unsafe { (*self.0.get()).bio.cast() }),
+            _p: PhantomData,
+        }
+    }
+
+    /// Get the target sector for the request.
+    #[inline(always)]
+    pub fn sector(&self) -> usize {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
+        unsafe { (*self.0.get()).__sector as usize }
+    }
+
     /// Return a pointer to the [`RequestDataWrapper`] stored in the private area
     /// of the request structure.
     ///
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index e4585e1dba0c..a3473dabf587 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -282,7 +282,7 @@ fn with_page_mapped<T>(&self, f: impl FnOnce(*mut u8) -> T) -> T {
     /// different addresses. However, even if the addresses are different, the underlying memory is
     /// still the same for these purposes (e.g., it's still a data race if they both write to the
     /// same underlying byte at the same time).
-    fn with_pointer_into_page<T>(
+    pub(crate) fn with_pointer_into_page<T>(
         &self,
         off: usize,
         len: usize,

-- 
2.51.2





* [PATCH v2 13/83] block: rust: add `command` getter to `Request`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (11 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 12/83] block: rust: introduce `kernel::block::bio` module Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 14/83] block: rust: mq: use GFP_KERNEL from prelude Andreas Hindborg
                   ` (69 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Andreas Hindborg

From: Andreas Hindborg <a.hindborg@samsung.com>

Add a method to extract the command operation code from a request. The
command is obtained by masking the lower bits of `cmd_flags` as defined by
`REQ_OP_BITS`. This allows Rust block drivers to determine the type of
operation being requested.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 +++++++
 1 file changed, 7 insertions(+)
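Note: the masking performed by `Request::command` can be tried out in
user space with the sketch below. The constants are copied by hand under
the assumption that they match `include/linux/blk_types.h`
(`REQ_OP_BITS` = 8, `REQ_OP_READ` = 0, `REQ_OP_WRITE` = 1,
`REQ_OP_FLUSH` = 2, `REQ_OP_DISCARD` = 3); `EXAMPLE_FLAG` is a
hypothetical flag bit chosen only to show that the upper bits are
dropped.

```rust
/// Number of low bits in `cmd_flags` that hold the operation code
/// (assumed to match the kernel's `REQ_OP_BITS`).
const REQ_OP_BITS: u32 = 8;

/// Mask selecting the operation code, built the same way as in the patch.
const REQ_OP_MASK: u32 = (1u32 << REQ_OP_BITS) - 1;

// Operation codes, assumed to match `enum req_op`.
const REQ_OP_READ: u32 = 0;
const REQ_OP_WRITE: u32 = 1;
const REQ_OP_FLUSH: u32 = 2;
const REQ_OP_DISCARD: u32 = 3;

/// Hypothetical request flag living above the operation bits.
const EXAMPLE_FLAG: u32 = 1 << 11;

/// Extract the operation code from a raw `cmd_flags` value.
fn command(cmd_flags: u32) -> u32 {
    cmd_flags & REQ_OP_MASK
}

/// Human readable name for the few operations modelled here.
fn op_name(cmd_flags: u32) -> &'static str {
    match command(cmd_flags) {
        REQ_OP_READ => "read",
        REQ_OP_WRITE => "write",
        REQ_OP_FLUSH => "flush",
        REQ_OP_DISCARD => "discard",
        _ => "other",
    }
}
```

A driver would typically match on the returned value in `queue_rq` to
decide between read, write and the remaining operations.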

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 98e54f0586d1..19bdf17de166 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -116,6 +116,13 @@ pub(crate) unsafe fn aref_from_raw(ptr: *mut bindings::request) -> ARef<Self> {
         unsafe { ARef::from_raw(NonNull::new_unchecked(ptr.cast())) }
     }
 
+    /// Get the command identifier for the request
+    pub fn command(&self) -> u32 {
+        use core::ops::BitAnd;
+        // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
+        unsafe { (*self.0.get()).cmd_flags }.bitand((1u32 << bindings::REQ_OP_BITS) - 1)
+    }
+
     /// Complete the request by scheduling `Operations::complete` for
     /// execution.
     ///

-- 
2.51.2





* [PATCH v2 14/83] block: rust: mq: use GFP_KERNEL from prelude
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (12 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 13/83] block: rust: add `command` getter to `Request` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 15/83] block: rust: add `TagSet` flags Andreas Hindborg
                   ` (68 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Remove the explicit import of kernel::alloc::flags and use GFP_KERNEL
directly from the prelude in the module documentation example.

This simplifies the import list and follows the pattern of using
commonly used constants from the prelude.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq.rs | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index a03d46d274a5..23660817df29 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -57,7 +57,6 @@
 //!
 //! ```rust
 //! use kernel::{
-//!     alloc::flags,
 //!     block::mq::*,
 //!     new_mutex,
 //!     prelude::*,
@@ -93,7 +92,7 @@
 //! }
 //!
 //! let tagset: Arc<TagSet<MyBlkDevice>> =
-//!     Arc::pin_init(TagSet::new(1, 256, 1), flags::GFP_KERNEL)?;
+//!     Arc::pin_init(TagSet::new(1, 256, 1), GFP_KERNEL)?;
 //! let mut disk = gen_disk::GenDiskBuilder::new()
 //!     .capacity_sectors(4096)
 //!     .build(fmt!("myblk"), tagset, ())?;

-- 
2.51.2





* [PATCH v2 15/83] block: rust: add `TagSet` flags
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (13 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 14/83] block: rust: mq: use GFP_KERNEL from prelude Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 16/83] block: rnull: add memory backing Andreas Hindborg
                   ` (67 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for `TagSet` flags by introducing a `Flags` type and adding
a flags parameter to `TagSet::new`. This allows configuring tagset
behavior such as blocking vs non-blocking operation.

The `Flags` type supports bitwise operations and provides values such as
`Flag::Blocking` for common use cases. The module documentation example
is updated to demonstrate the new API.

For now, only a single flag is added.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
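Note: for readers without the `impl_flags!` macro at hand, the flag set
described above can be modelled in plain userspace Rust. This is a
hand-written stand-in, not the macro's real expansion: `BLOCKING_BIT` is
a placeholder for `bindings::BLK_MQ_F_BLOCKING`, and the `contains`
helper and the exact set of operator impls are assumptions made for
illustration.

```rust
use std::ops::BitOr;

/// Placeholder bit. In the patch the value comes from
/// `bindings::BLK_MQ_F_BLOCKING`; the number here is only illustrative.
const BLOCKING_BIT: u32 = 1 << 4;

/// A single flag value (hypothetical model of `Flag`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum Flag {
    Blocking = BLOCKING_BIT,
}

/// A set of flags; the default is the empty set (model of `Flags`).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Flags(u32);

impl Flags {
    /// Hypothetical helper: true if `flag` is part of this set.
    fn contains(self, flag: Flag) -> bool {
        self.0 & flag as u32 != 0
    }
}

impl From<Flag> for Flags {
    fn from(flag: Flag) -> Self {
        Flags(flag as u32)
    }
}

impl From<Flags> for u32 {
    fn from(flags: Flags) -> Self {
        flags.0
    }
}

impl BitOr<Flag> for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flag) -> Flags {
        Flags(self.0 | rhs as u32)
    }
}

fn main() {
    let none = Flags::default();
    let blocking = none | Flag::Blocking;
    println!("default raw value: {}", u32::from(none));
    println!("blocking set: {}", blocking.contains(Flag::Blocking));
}
```

The `From<Flags> for u32` impl mirrors the `flags.into()` call used when
filling in the C `flags` field in `TagSet::new`.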
 drivers/block/rnull/rnull.rs          |  5 ++++-
 rust/kernel/block/mq.rs               |  6 +++---
 rust/kernel/block/mq/tag_set.rs       | 13 ++++++++++---
 rust/kernel/block/mq/tag_set/flags.rs | 21 +++++++++++++++++++++
 4 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 3e7a47e6d0e5..746ddadd11f0 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -128,7 +128,10 @@ fn new(
         irq_mode: IRQMode,
         completion_time: Delta,
     ) -> Result<GenDisk<Self>> {
-        let tagset = Arc::pin_init(TagSet::new(1, 256, 1), GFP_KERNEL)?;
+        let tagset = Arc::pin_init(
+            TagSet::new(1, 256, 1, mq::tag_set::Flags::default()),
+            GFP_KERNEL,
+        )?;
 
         let queue_data = Box::new(
             QueueData {
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 23660817df29..e556b3bb1191 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -57,7 +57,7 @@
 //!
 //! ```rust
 //! use kernel::{
-//!     block::mq::*,
+//!     block::mq::{self, *},
 //!     new_mutex,
 //!     prelude::*,
 //!     sync::{aref::ARef, Arc, Mutex},
@@ -92,7 +92,7 @@
 //! }
 //!
 //! let tagset: Arc<TagSet<MyBlkDevice>> =
-//!     Arc::pin_init(TagSet::new(1, 256, 1), GFP_KERNEL)?;
+//!     Arc::pin_init(TagSet::new(1, 256, 1, mq::tag_set::Flags::default()), GFP_KERNEL)?;
 //! let mut disk = gen_disk::GenDiskBuilder::new()
 //!     .capacity_sectors(4096)
 //!     .build(fmt!("myblk"), tagset, ())?;
@@ -103,7 +103,7 @@
 pub mod gen_disk;
 mod operations;
 mod request;
-mod tag_set;
+pub mod tag_set;
 
 pub use operations::Operations;
 pub use request::{
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index ec5cac48b83f..5b1a5bcc978d 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -17,7 +17,7 @@
         self,
         Result, //
     },
-    prelude::try_pin_init,
+    prelude::*,
     types::Opaque,
 };
 use core::{
@@ -30,6 +30,12 @@
     PinInit, //
 };
 
+mod flags;
+pub use flags::{
+    Flag,
+    Flags, //
+};
+
 /// A wrapper for the C `struct blk_mq_tag_set`.
 ///
 /// `struct blk_mq_tag_set` contains a `struct list_head` and so must be pinned.
@@ -51,6 +57,7 @@ pub fn new(
         nr_hw_queues: u32,
         num_tags: u32,
         num_maps: u32,
+        flags: Flags,
     ) -> impl PinInit<Self, error::Error> {
         let tag_set: bindings::blk_mq_tag_set = pin_init::zeroed();
         let tag_set: Result<_> = size_of::<RequestDataWrapper<T>>()
@@ -63,8 +70,8 @@ pub fn new(
                     numa_node: bindings::NUMA_NO_NODE,
                     queue_depth: num_tags,
                     cmd_size,
-                    flags: 0,
-                    driver_data: core::ptr::null_mut::<crate::ffi::c_void>(),
+                    flags: flags.into(),
+                    driver_data: core::ptr::null_mut::<c_void>(),
                     nr_maps: num_maps,
                     ..tag_set
                 }
diff --git a/rust/kernel/block/mq/tag_set/flags.rs b/rust/kernel/block/mq/tag_set/flags.rs
new file mode 100644
index 000000000000..b7eaccd200a2
--- /dev/null
+++ b/rust/kernel/block/mq/tag_set/flags.rs
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::{
+    bindings,
+    impl_flags, //
+};
+
+impl_flags! {
+    /// Flags to be used when creating [`super::TagSet`] objects.
+    #[derive(Debug, Clone, Default, Copy, PartialEq, Eq)]
+    pub struct Flags(u32);
+
+    /// Allowed values for [`Flags`].
+    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
+    pub enum Flag {
+        /// Indicate that the queues associated with this tag set might sleep when
+        /// processing IO. When this flag is not set, IO is processed in atomic
+        /// context. When this flag is set, IO is processed in process context.
+        Blocking = bindings::BLK_MQ_F_BLOCKING,
+    }
+}

-- 
2.51.2





* [PATCH v2 16/83] block: rnull: add memory backing
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (14 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 15/83] block: rust: add `TagSet` flags Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 17/83] block: rnull: add submit queue count config option Andreas Hindborg
                   ` (66 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add memory backing to the Rust null block driver. This implementation
always allocates a page on write, even if a page backing the written
sector is already allocated, in which case the new page is released
again. A later patch will fix this inefficiency.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
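Note: the sector-to-page arithmetic used by the read and write paths can
be tried out in userspace. The sketch below is a model under stated
assumptions, not the driver code: a `BTreeMap` stands in for the XArray,
the page size is assumed to be 4 KiB, transfers are assumed to be whole
512-byte sectors, and `locate`, `Store`, `write` and `read` are
hypothetical names. Unlike this patch, the sketch allocates a page only
when the index is missing.

```rust
use std::collections::BTreeMap;

const SECTOR_SHIFT: usize = 9; // 512-byte sectors
const PAGE_SHIFT: usize = 12; // assumed 4 KiB pages
const PAGE_SIZE: usize = 1 << PAGE_SHIFT;
const PAGE_SECTORS_SHIFT: usize = PAGE_SHIFT - SECTOR_SHIFT;
const PAGE_SECTOR_MASK: usize = (1 << PAGE_SECTORS_SHIFT) - 1;

/// Sparse backing store: page index -> page contents.
type Store = BTreeMap<usize, Box<[u8; PAGE_SIZE]>>;

/// Split a sector number into (page index, byte offset within the page).
fn locate(sector: usize) -> (usize, usize) {
    (
        sector >> PAGE_SECTORS_SHIFT,
        (sector & PAGE_SECTOR_MASK) << SECTOR_SHIFT,
    )
}

/// Copy `data` into the store starting at `sector`, page by page.
/// Assumes `data.len()` is a multiple of the sector size.
fn write(store: &mut Store, mut sector: usize, mut data: &[u8]) {
    while !data.is_empty() {
        let (idx, off) = locate(sector);
        let page = store
            .entry(idx)
            .or_insert_with(|| Box::new([0u8; PAGE_SIZE]));
        let n = data.len().min(PAGE_SIZE - off);
        page[off..off + n].copy_from_slice(&data[..n]);
        data = &data[n..];
        sector += n >> SECTOR_SHIFT;
    }
}

/// Fill `out` from the store starting at `sector`; holes read as zeroes.
fn read(store: &Store, mut sector: usize, mut out: &mut [u8]) {
    while !out.is_empty() {
        let (idx, off) = locate(sector);
        let n = out.len().min(PAGE_SIZE - off);
        let (head, tail) = std::mem::take(&mut out).split_at_mut(n);
        match store.get(&idx) {
            Some(page) => head.copy_from_slice(&page[off..off + n]),
            None => head.fill(0),
        }
        out = tail;
        sector += n >> SECTOR_SHIFT;
    }
}

fn main() {
    let mut store = Store::new();
    // Two sectors starting at sector 7 straddle pages 0 and 1.
    write(&mut store, 7, &[0xAB; 1024]);
    let mut buf = [0xFFu8; 1536];
    read(&store, 6, &mut buf);
    println!("pages allocated: {}", store.len());
    println!("sector 6 is zero: {}", buf[..512].iter().all(|&b| b == 0));
}
```

With 4 KiB pages, `PAGE_SECTORS_SHIFT` is 3, so sector 7 is the last
sector of page 0 and sector 8 is the first sector of page 1.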
 drivers/block/rnull/configfs.rs |   8 ++-
 drivers/block/rnull/rnull.rs    | 126 ++++++++++++++++++++++++++++++++++------
 2 files changed, 116 insertions(+), 18 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 83b474f6da60..8daf2ca409ba 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -60,7 +60,7 @@ impl AttributeOperations<0> for Config {
 
     fn show(_this: &Config, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
         let mut writer = kernel::str::Formatter::new(page);
-        writer.write_str("blocksize,size,rotational,irqmode,completion_nsec\n")?;
+        writer.write_str("blocksize,size,rotational,irqmode,completion_nsec,memory_backed\n")?;
         Ok(writer.bytes_written())
     }
 }
@@ -84,6 +84,7 @@ fn make_group(
                 size: 3,
                 irqmode: 4,
                 completion_nsec: 5,
+                memory_backed: 6,
             ],
         };
 
@@ -101,6 +102,7 @@ fn make_group(
                     irq_mode: IRQMode::None,
                     completion_time: time::Delta::ZERO,
                     name: name.try_into()?,
+                    memory_backed: false,
                 }),
             }),
             core::iter::empty(),
@@ -165,6 +167,7 @@ struct DeviceConfigInner {
     irq_mode: IRQMode,
     completion_time: time::Delta,
     disk: Option<GenDisk<NullBlkDevice>>,
+    memory_backed: bool,
 }
 
 #[vtable]
@@ -195,6 +198,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 guard.capacity_mib,
                 guard.irq_mode,
                 guard.completion_time,
+                guard.memory_backed,
             )?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -226,3 +230,5 @@ fn from_str(s: &str) -> Result<Self> {
         value.try_into()
     }
 }
+
+configfs_simple_bool_field!(DeviceConfig, 6, memory_backed);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 746ddadd11f0..8e4d2b270bcf 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -6,8 +6,10 @@
 
 use configfs::IRQMode;
 use kernel::{
+    bindings,
     block::{
         self,
+        bio::Segment,
         mq::{
             self,
             gen_disk::{
@@ -19,15 +21,12 @@
         },
     },
     error::Result,
-    new_mutex,
+    memalloc_scope, new_mutex, new_xarray,
+    page::SafePage,
     pr_info,
     prelude::*,
     str::CString,
-    sync::{
-        aref::ARef,
-        Arc,
-        Mutex, //
-    },
+    sync::{aref::ARef, Arc, Mutex},
     time::{
         hrtimer::{
             HrTimerCallback,
@@ -40,7 +39,8 @@
     types::{
         OwnableRefCounted,
         Owned, //
-    }, //
+    },
+    xarray::XArray,
 };
 
 module! {
@@ -74,6 +74,10 @@
             default: 10_000,
             description:  "Time in ns to complete a request in hardware. Default: 10,000ns",
         },
+        memory_backed: bool {
+            default: false,
+            description: "Create a memory-backed block device.",
+        },
     },
 }
 
@@ -103,6 +107,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     module_parameters::gb.value() * 1024,
                     module_parameters::irqmode.value().try_into()?,
                     Delta::from_nanos(completion_time),
+                    module_parameters::memory_backed.value(),
                 )?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -127,17 +132,23 @@ fn new(
         capacity_mib: u64,
         irq_mode: IRQMode,
         completion_time: Delta,
+        memory_backed: bool,
     ) -> Result<GenDisk<Self>> {
-        let tagset = Arc::pin_init(
-            TagSet::new(1, 256, 1, mq::tag_set::Flags::default()),
-            GFP_KERNEL,
-        )?;
+        let flags = if memory_backed {
+            mq::tag_set::Flag::Blocking.into()
+        } else {
+            mq::tag_set::Flags::default()
+        };
+
+        let tagset = Arc::pin_init(TagSet::new(1, 256, 1, flags), GFP_KERNEL)?;
 
-        let queue_data = Box::new(
-            QueueData {
+        let queue_data = Box::pin_init(
+            pin_init!(QueueData {
+                tree <- new_xarray!(kernel::xarray::AllocKind::Alloc),
                 irq_mode,
                 completion_time,
-            },
+                memory_backed,
+            }),
             GFP_KERNEL,
         )?;
 
@@ -148,11 +159,72 @@ fn new(
             .rotational(rotational)
             .build(fmt!("{}", name.to_str()?), tagset, queue_data)
     }
+
+    #[inline(always)]
+    fn write(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -> Result {
+        while !segment.is_empty() {
+            let page = SafePage::alloc_page(GFP_KERNEL)?;
+            let mut tree = tree.lock();
+
+            let page_idx = sector >> block::PAGE_SECTORS_SHIFT;
+
+            let page = if let Some(page) = tree.get_mut(page_idx) {
+                page
+            } else {
+                tree.store(page_idx, page, GFP_KERNEL)?;
+                tree.get_mut(page_idx).unwrap()
+            };
+
+            let page_offset = (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
+            sector += segment.copy_to_page(page, page_offset) >> block::SECTOR_SHIFT;
+        }
+        Ok(())
+    }
+
+    #[inline(always)]
+    fn read(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -> Result {
+        let tree = tree.lock();
+
+        while !segment.is_empty() {
+            let idx = sector >> block::PAGE_SECTORS_SHIFT;
+
+            if let Some(page) = tree.get(idx) {
+                let page_offset =
+                    (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
+                sector += segment.copy_from_page(page, page_offset) >> block::SECTOR_SHIFT;
+            } else {
+                sector += segment.zero_page() >> block::SECTOR_SHIFT;
+            }
+        }
+
+        Ok(())
+    }
+
+    #[inline(never)]
+    fn transfer(
+        command: bindings::req_op,
+        tree: &XArray<TreeNode>,
+        sector: usize,
+        segment: Segment<'_>,
+    ) -> Result {
+        match command {
+            bindings::req_op_REQ_OP_WRITE => Self::write(tree, sector, segment)?,
+            bindings::req_op_REQ_OP_READ => Self::read(tree, sector, segment)?,
+            _ => (),
+        }
+        Ok(())
+    }
 }
 
+type TreeNode = Owned<SafePage>;
+
+#[pin_data]
 struct QueueData {
+    #[pin]
+    tree: XArray<TreeNode>,
     irq_mode: IRQMode,
     completion_time: Delta,
+    memory_backed: bool,
 }
 
 #[pin_data]
@@ -182,7 +254,7 @@ impl HasHrTimer<Self> for Pdu {
 
 #[vtable]
 impl Operations for NullBlkDevice {
-    type QueueData = KBox<QueueData>;
+    type QueueData = Pin<KBox<QueueData>>;
     type RequestData = Pdu;
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
@@ -192,7 +264,27 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
     }
 
     #[inline(always)]
-    fn queue_rq(queue_data: &QueueData, rq: Owned<mq::Request<Self>>, _is_last: bool) -> Result {
+    fn queue_rq(
+        queue_data: Pin<&QueueData>,
+        mut rq: Owned<mq::Request<Self>>,
+        _is_last: bool,
+    ) -> Result {
+        if queue_data.memory_backed {
+            memalloc_scope!(let _noio: NoIo);
+            let tree = &queue_data.tree;
+            let command = rq.command();
+            let mut sector = rq.sector();
+
+            for bio in rq.bio_iter_mut() {
+                let segment_iter = bio.segment_iter();
+                for segment in segment_iter {
+                    let length = segment.len();
+                    Self::transfer(command, tree, sector, segment)?;
+                    sector += length as usize >> block::SECTOR_SHIFT;
+                }
+            }
+        }
+
         match queue_data.irq_mode {
             IRQMode::None => rq.end_ok(),
             IRQMode::Soft => mq::Request::complete(rq.into()),
@@ -205,7 +297,7 @@ fn queue_rq(queue_data: &QueueData, rq: Owned<mq::Request<Self>>, _is_last: bool
         Ok(())
     }
 
-    fn commit_rqs(_queue_data: &QueueData) {}
+    fn commit_rqs(_queue_data: Pin<&QueueData>) {}
 
     fn complete(rq: ARef<mq::Request<Self>>) {
         OwnableRefCounted::try_from_shared(rq)

-- 
2.51.2





* [PATCH v2 17/83] block: rnull: add submit queue count config option
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (15 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 16/83] block: rnull: add memory backing Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 18/83] block: rnull: add `use_per_node_hctx` " Andreas Hindborg
                   ` (65 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Allow user space to control the number of submission queues when creating
null block devices.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
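Note: the validation applied to the `submit_queues` attribute can be
modelled in userspace as below. This is a sketch, not the configfs code:
`StoreError`, `store_submit_queues` and the `max_queues` argument (which
stands in for the number of possible CPUs) are hypothetical, and
`std::sync::Mutex` replaces the kernel mutex. Holding one guard across
the power check and the update is this sketch's choice, so that the two
steps cannot interleave with a concurrent power-on.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq, Eq)]
enum StoreError {
    /// The device is powered on; the setting can no longer change.
    Busy,
    /// The input is not a number in 1..=max_queues.
    Invalid,
}

struct DeviceConfig {
    powered: bool,
    submit_queues: u32,
}

/// Parse `page` as a queue count and store it if the device is off.
fn store_submit_queues(
    cfg: &Mutex<DeviceConfig>,
    page: &[u8],
    max_queues: u32,
) -> Result<(), StoreError> {
    // One guard covers both the check and the update.
    let mut guard = cfg.lock().unwrap();
    if guard.powered {
        return Err(StoreError::Busy);
    }

    let text = std::str::from_utf8(page)
        .map_err(|_| StoreError::Invalid)?
        .trim();
    let value: u32 = text.parse().map_err(|_| StoreError::Invalid)?;

    if value == 0 || value > max_queues {
        return Err(StoreError::Invalid);
    }

    guard.submit_queues = value;
    Ok(())
}

fn main() {
    let cfg = Mutex::new(DeviceConfig {
        powered: false,
        submit_queues: 1,
    });
    println!("{:?}", store_submit_queues(&cfg, b"4\n", 8));
    println!("{:?}", store_submit_queues(&cfg, b"0\n", 8));
    println!("queues = {}", cfg.lock().unwrap().submit_queues);
}
```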
 drivers/block/rnull/configfs.rs | 56 +++++++++++++++++++++++++++++++++--------
 drivers/block/rnull/rnull.rs    | 56 +++++++++++++++++++++++++++--------------
 2 files changed, 83 insertions(+), 29 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 8daf2ca409ba..0dea92a9079b 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -60,7 +60,10 @@ impl AttributeOperations<0> for Config {
 
     fn show(_this: &Config, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
         let mut writer = kernel::str::Formatter::new(page);
-        writer.write_str("blocksize,size,rotational,irqmode,completion_nsec,memory_backed\n")?;
+        writer.write_str(
+            "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
+             submit_queues\n",
+        )?;
         Ok(writer.bytes_written())
     }
 }
@@ -85,6 +88,7 @@ fn make_group(
                 irqmode: 4,
                 completion_nsec: 5,
                 memory_backed: 6,
+                submit_queues: 7,
             ],
         };
 
@@ -103,6 +107,7 @@ fn make_group(
                     completion_time: time::Delta::ZERO,
                     name: name.try_into()?,
                     memory_backed: false,
+                    submit_queues: 1,
                 }),
             }),
             core::iter::empty(),
@@ -168,6 +173,7 @@ struct DeviceConfigInner {
     completion_time: time::Delta,
     disk: Option<GenDisk<NullBlkDevice>>,
     memory_backed: bool,
+    submit_queues: u32,
 }
 
 #[vtable]
@@ -191,15 +197,16 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         let mut guard = this.data.lock();
 
         if !guard.powered && power_op {
-            guard.disk = Some(NullBlkDevice::new(
-                &guard.name,
-                guard.block_size,
-                guard.rotational,
-                guard.capacity_mib,
-                guard.irq_mode,
-                guard.completion_time,
-                guard.memory_backed,
-            )?);
+            guard.disk = Some(NullBlkDevice::new(crate::NullBlkOptions {
+                name: &guard.name,
+                block_size: guard.block_size,
+                rotational: guard.rotational,
+                capacity_mib: guard.capacity_mib,
+                irq_mode: guard.irq_mode,
+                completion_time: guard.completion_time,
+                memory_backed: guard.memory_backed,
+                submit_queues: guard.submit_queues,
+            })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
             drop(guard.disk.take());
@@ -232,3 +239,32 @@ fn from_str(s: &str) -> Result<Self> {
 }
 
 configfs_simple_bool_field!(DeviceConfig, 6, memory_backed);
+
+#[vtable]
+impl configfs::AttributeOperations<7> for DeviceConfig {
+    type Data = DeviceConfig;
+
+    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
+        let mut writer = kernel::str::Formatter::new(page);
+        writer.write_fmt(fmt!("{}\n", this.data.lock().submit_queues))?;
+        Ok(writer.bytes_written())
+    }
+
+    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
+        if this.data.lock().powered {
+            return Err(EBUSY);
+        }
+
+        let text = core::str::from_utf8(page)?.trim();
+        let value = text
+            .parse::<u32>()
+            .map_err(|_| kernel::error::code::EINVAL)?;
+
+        if value == 0 || value > kernel::cpu::num_possible_cpus() {
+            return Err(kernel::error::code::EINVAL);
+        }
+
+        this.data.lock().submit_queues = value;
+        Ok(())
+    }
+}
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 8e4d2b270bcf..a7c35f33631a 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -78,6 +78,10 @@
             default: false,
             description: "Create a memory-backed block device.",
         },
+        submit_queues: u32 {
+            default: 1,
+            description: "Number of submission queues",
+        },
     },
 }
 
@@ -100,15 +104,16 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             for i in 0..module_parameters::nr_devices.value() {
                 let name = CString::try_from_fmt(fmt!("rnullb{}", i))?;
 
-                let disk = NullBlkDevice::new(
-                    &name,
-                    module_parameters::bs.value(),
-                    module_parameters::rotational.value(),
-                    module_parameters::gb.value() * 1024,
-                    module_parameters::irqmode.value().try_into()?,
-                    Delta::from_nanos(completion_time),
-                    module_parameters::memory_backed.value(),
-                )?;
+                let disk = NullBlkDevice::new(NullBlkOptions {
+                    name: &name,
+                    block_size: module_parameters::bs.value(),
+                    rotational: module_parameters::rotational.value(),
+                    capacity_mib: module_parameters::gb.value() * 1024,
+                    irq_mode: module_parameters::irqmode.value().try_into()?,
+                    completion_time: Delta::from_nanos(completion_time),
+                    memory_backed: module_parameters::memory_backed.value(),
+                    submit_queues: module_parameters::submit_queues.value(),
+                })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
 
@@ -122,25 +127,38 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
     }
 }
 
+struct NullBlkOptions<'a> {
+    name: &'a CStr,
+    block_size: u32,
+    rotational: bool,
+    capacity_mib: u64,
+    irq_mode: IRQMode,
+    completion_time: Delta,
+    memory_backed: bool,
+    submit_queues: u32,
+}
 struct NullBlkDevice;
 
 impl NullBlkDevice {
-    fn new(
-        name: &CStr,
-        block_size: u32,
-        rotational: bool,
-        capacity_mib: u64,
-        irq_mode: IRQMode,
-        completion_time: Delta,
-        memory_backed: bool,
-    ) -> Result<GenDisk<Self>> {
+    fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
+        let NullBlkOptions {
+            name,
+            block_size,
+            rotational,
+            capacity_mib,
+            irq_mode,
+            completion_time,
+            memory_backed,
+            submit_queues,
+        } = options;
+
         let flags = if memory_backed {
             mq::tag_set::Flag::Blocking.into()
         } else {
             mq::tag_set::Flags::default()
         };
 
-        let tagset = Arc::pin_init(TagSet::new(1, 256, 1, flags), GFP_KERNEL)?;
+        let tagset = Arc::pin_init(TagSet::new(submit_queues, 256, 1, flags), GFP_KERNEL)?;
 
         let queue_data = Box::pin_init(
             pin_init!(QueueData {

-- 
2.51.2





* [PATCH v2 18/83] block: rnull: add `use_per_node_hctx` config option
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (16 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 17/83] block: rnull: add submit queue count config option Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 19/83] block: rust: allow specifying home node when constructing `TagSet` Andreas Hindborg
                   ` (64 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a configfs attribute and a module parameter to enable per-NUMA-node
hardware contexts. When enabled, the driver creates one hardware queue
per online NUMA node instead of using the `submit_queues` count.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
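Note: the attribute does not keep a boolean of its own; it is derived
from `submit_queues`. A small userspace model of that behaviour follows.
The names `Invalid`, `show_use_per_node_hctx` and
`store_use_per_node_hctx` are hypothetical, and `online_nodes` is passed
in as a plain argument in place of the kernel's online node count.

```rust
#[derive(Debug, PartialEq, Eq)]
struct Invalid;

struct DeviceConfig {
    submit_queues: u32,
}

/// The attribute reads as true when there is one queue per online node.
fn show_use_per_node_hctx(cfg: &DeviceConfig, online_nodes: u32) -> bool {
    cfg.submit_queues == online_nodes
}

/// A non-zero value sets the queue count to the node count. A zero
/// value leaves the current queue count untouched.
fn store_use_per_node_hctx(
    cfg: &mut DeviceConfig,
    page: &[u8],
    online_nodes: u32,
) -> Result<(), Invalid> {
    let enable = std::str::from_utf8(page)
        .map_err(|_| Invalid)?
        .trim()
        .parse::<u8>()
        .map_err(|_| Invalid)?
        != 0;

    if enable {
        cfg.submit_queues = online_nodes;
    }
    Ok(())
}

fn main() {
    let mut cfg = DeviceConfig { submit_queues: 4 };
    let nodes = 2;
    println!("before: {}", show_use_per_node_hctx(&cfg, nodes));
    store_use_per_node_hctx(&mut cfg, b"1\n", nodes).unwrap();
    println!("after: {}", show_use_per_node_hctx(&cfg, nodes));
}
```

In this model, writing 0 after writing 1 does not restore the previous
queue count, so the attribute keeps reading as true until
`submit_queues` itself is changed.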
 drivers/block/rnull/configfs.rs | 24 ++++++++++++++++++++++--
 drivers/block/rnull/rnull.rs    | 28 ++++++++++++++++++++++------
 2 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 0dea92a9079b..71b38373be33 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -33,7 +33,8 @@
     configfs_simple_bool_field,
     configfs_simple_field,
     show_field,
-    store_number_with_power_check, //
+    store_number_with_power_check,
+    store_with_power_check, //
 };
 
 mod macros;
@@ -62,7 +63,7 @@ impl AttributeOperations<0> for Config {
         let mut writer = kernel::str::Formatter::new(page);
         writer.write_str(
             "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
-             submit_queues\n",
+             submit_queues,use_per_node_hctx\n",
         )?;
         Ok(writer.bytes_written())
     }
@@ -89,6 +90,7 @@ fn make_group(
                 completion_nsec: 5,
                 memory_backed: 6,
                 submit_queues: 7,
+                use_per_node_hctx: 8,
             ],
         };
 
@@ -268,3 +270,21 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         Ok(())
     }
 }
+
+configfs_attribute!(DeviceConfig, 8,
+    show: |this, page| show_field(
+        this.data.lock().submit_queues == kernel::numa::num_online_nodes(), page
+    ),
+    store: |this, page| store_with_power_check(this, page, |data, page| {
+        let value = core::str::from_utf8(page)?
+            .trim()
+            .parse::<u8>()
+            .map_err(|_| kernel::error::code::EINVAL)?
+            != 0;
+
+        if value {
+            data.submit_queues = kernel::numa::num_online_nodes();
+        }
+        Ok(())
+    })
+);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index a7c35f33631a..30de022146ec 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -21,12 +21,18 @@
         },
     },
     error::Result,
-    memalloc_scope, new_mutex, new_xarray,
+    memalloc_scope,
+    new_mutex,
+    new_xarray,
     page::SafePage,
     pr_info,
     prelude::*,
     str::CString,
-    sync::{aref::ARef, Arc, Mutex},
+    sync::{
+        aref::ARef,
+        Arc,
+        Mutex, //
+    },
     time::{
         hrtimer::{
             HrTimerCallback,
@@ -40,7 +46,7 @@
         OwnableRefCounted,
         Owned, //
     },
-    xarray::XArray,
+    xarray::XArray, //
 };
 
 module! {
@@ -71,8 +77,9 @@
             description:  "IRQ completion handler. 0-none, 1-softirq, 2-timer",
         },
         completion_nsec: u64 {
-            default: 10_000,
-            description:  "Time in ns to complete a request in hardware. Default: 10,000ns",
+                default: 10_000,
+                description:
+            "Time in ns to complete a request in hardware. Default: 10,000ns",
         },
         memory_backed: bool {
             default: false,
@@ -82,6 +89,10 @@
             default: 1,
             description: "Number of submission queues",
         },
+        use_per_node_hctx: bool {
+            default: false,
+            description: "Use per-node allocation for hardware context queues.",
+        },
     },
 }
 
@@ -104,6 +115,11 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             for i in 0..module_parameters::nr_devices.value() {
                 let name = CString::try_from_fmt(fmt!("rnullb{}", i))?;
 
+                let submit_queues = if module_parameters::use_per_node_hctx.value() {
+                    kernel::numa::num_online_nodes()
+                } else {
+                    module_parameters::submit_queues.value()
+                };
                 let disk = NullBlkDevice::new(NullBlkOptions {
                     name: &name,
                     block_size: module_parameters::bs.value(),
@@ -112,7 +128,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     irq_mode: module_parameters::irqmode.value().try_into()?,
                     completion_time: Delta::from_nanos(completion_time),
                     memory_backed: module_parameters::memory_backed.value(),
-                    submit_queues: module_parameters::submit_queues.value(),
+                    submit_queues,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }

-- 
2.51.2





* [PATCH v2 19/83] block: rust: allow specifying home node when constructing `TagSet`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (17 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 18/83] block: rnull: add `use_per_node_hctx` " Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:07 ` [PATCH v2 20/83] block: rnull: allow specifying the home numa node Andreas Hindborg
                   ` (63 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a `numa_node` parameter to `TagSet::new` to specify the home NUMA
node for tag set allocations. This allows drivers to optimize memory
placement for NUMA systems.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
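Note: the point of taking a `NumaNode` rather than a raw `i32` is that
an out-of-range node id cannot reach the C side. A userspace sketch of
that idea is below. It is a model only: this `NumaNode` is a local
stand-in, not `kernel::alloc::NumaNode`, the `MAX_NUMNODES` value is an
assumption (it is configuration dependent in the kernel), and
`TagSetConfig` and `tag_set_config` are hypothetical helpers.

```rust
/// The C side uses -1 to mean "no preferred node".
const NUMA_NO_NODE: i32 = -1;
/// Assumed upper bound on node ids for this sketch.
const MAX_NUMNODES: i32 = 64;

/// A validated NUMA node id, or the "no node" sentinel.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NumaNode(i32);

impl NumaNode {
    /// No preference; allocations may come from any node.
    const NO_NODE: NumaNode = NumaNode(NUMA_NO_NODE);

    /// Accept only ids in `0..MAX_NUMNODES`.
    fn new(id: i32) -> Option<Self> {
        (0..MAX_NUMNODES).contains(&id).then_some(NumaNode(id))
    }

    /// Raw id as expected by the C structure.
    fn id(self) -> i32 {
        self.0
    }
}

/// Subset of the fields that `TagSet::new` fills in.
#[derive(Debug, PartialEq, Eq)]
struct TagSetConfig {
    nr_hw_queues: u32,
    queue_depth: u32,
    numa_node: i32,
}

fn tag_set_config(nr_hw_queues: u32, num_tags: u32, node: NumaNode) -> TagSetConfig {
    TagSetConfig {
        nr_hw_queues,
        queue_depth: num_tags,
        numa_node: node.id(),
    }
}

fn main() {
    let any = tag_set_config(1, 256, NumaNode::NO_NODE);
    println!("{:?}", any);

    match NumaNode::new(1) {
        Some(node) => println!("{:?}", tag_set_config(4, 256, node)),
        None => println!("node id out of range"),
    }
}
```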
 drivers/block/rnull/rnull.rs    | 11 ++++++++++-
 rust/kernel/block/mq.rs         |  5 ++++-
 rust/kernel/block/mq/tag_set.rs |  4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 30de022146ec..6323327d4a5a 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -174,7 +174,16 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             mq::tag_set::Flags::default()
         };
 
-        let tagset = Arc::pin_init(TagSet::new(submit_queues, 256, 1, flags), GFP_KERNEL)?;
+        let tagset = Arc::pin_init(
+            TagSet::new(
+                submit_queues,
+                256,
+                1,
+                kernel::alloc::NumaNode::NO_NODE,
+                flags,
+            ),
+            GFP_KERNEL,
+        )?;
 
         let queue_data = Box::pin_init(
             pin_init!(QueueData {
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index e556b3bb1191..bac15b509d90 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -57,6 +57,7 @@
 //!
 //! ```rust
 //! use kernel::{
+//!     alloc::NumaNode,
 //!     block::mq::{self, *},
 //!     new_mutex,
 //!     prelude::*,
@@ -92,7 +93,9 @@
 //! }
 //!
 //! let tagset: Arc<TagSet<MyBlkDevice>> =
-//!     Arc::pin_init(TagSet::new(1, 256, 1, mq::tag_set::Flags::default()), GFP_KERNEL)?;
+//!     Arc::pin_init(
+//!         TagSet::new(1, 256, 1, NumaNode::NO_NODE, mq::tag_set::Flags::default()),
+//!         GFP_KERNEL)?;
 //! let mut disk = gen_disk::GenDiskBuilder::new()
 //!     .capacity_sectors(4096)
 //!     .build(fmt!("myblk"), tagset, ())?;
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index 5b1a5bcc978d..d6d104adf4aa 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -7,6 +7,7 @@
 use core::pin::Pin;
 
 use crate::{
+    alloc::NumaNode,
     bindings,
     block::mq::{
         operations::OperationsVTable,
@@ -57,6 +58,7 @@ pub fn new(
         nr_hw_queues: u32,
         num_tags: u32,
         num_maps: u32,
+        numa_node: NumaNode,
         flags: Flags,
     ) -> impl PinInit<Self, error::Error> {
         let tag_set: bindings::blk_mq_tag_set = pin_init::zeroed();
@@ -67,7 +69,7 @@ pub fn new(
                     ops: OperationsVTable::<T>::build(),
                     nr_hw_queues,
                     timeout: 0, // 0 means default which is 30Hz in C
-                    numa_node: bindings::NUMA_NO_NODE,
+                    numa_node: numa_node.id(),
                     queue_depth: num_tags,
                     cmd_size,
                     flags: flags.into(),

-- 
2.51.2
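
For readers outside the kernel tree, the effect of the new parameter can be sketched in plain userspace Rust. This is only an illustration under stated assumptions: `NodeId`, `TagSetConfig` and `tag_set_config` are hypothetical stand-ins, not the kernel's `NumaNode` or `TagSet` types. The only fact carried over from C is the convention that `NUMA_NO_NODE` is -1.

```rust
/// Userspace stand-in for a NUMA node id (hypothetical, not the kernel type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(i32);

impl NodeId {
    /// Mirrors the C convention `NUMA_NO_NODE == -1`.
    pub const NO_NODE: NodeId = NodeId(-1);

    /// Accept only non-negative ids; "no node" is expressed via `NO_NODE`.
    pub fn new(id: i32) -> Result<NodeId, &'static str> {
        if id < 0 {
            Err("node id must be non-negative")
        } else {
            Ok(NodeId(id))
        }
    }

    /// Raw id, as it would be stored in a C-style `int numa_node` field.
    pub fn id(self) -> i32 {
        self.0
    }
}

/// Hypothetical, trimmed-down description of a tag set.
#[derive(Debug)]
pub struct TagSetConfig {
    pub nr_hw_queues: u32,
    pub queue_depth: u32,
    pub numa_node: i32,
}

/// The caller chooses the home node instead of it being hard-coded.
pub fn tag_set_config(nr_hw_queues: u32, num_tags: u32, node: NodeId) -> TagSetConfig {
    TagSetConfig {
        nr_hw_queues,
        queue_depth: num_tags,
        numa_node: node.id(),
    }
}
```

Taking a typed node argument rather than a raw integer keeps the sentinel value out of driver code, which is presumably the motivation for the `NumaNode` argument here.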





* [PATCH v2 20/83] block: rnull: allow specifying the home numa node
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (18 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 19/83] block: rust: allow specifying home node when constructing `TagSet` Andreas Hindborg
@ 2026-06-09 19:07 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 21/83] block: rust: add Request::sectors() method Andreas Hindborg
                   ` (62 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a configfs attribute to specify the NUMA node for rnull tag set
and CPU map allocations. This allows testing NUMA-aware block device
behavior and optimizing memory placement for specific hardware
configurations.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs | 19 +++++++++++++++++++
 drivers/block/rnull/rnull.rs    | 30 ++++++++++++++++++++++--------
 2 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 71b38373be33..2f3fa81ea121 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -5,6 +5,7 @@
     THIS_MODULE, //
 };
 use kernel::{
+    bindings,
     block::mq::gen_disk::{
         GenDisk,
         GenDiskBuilder, //
@@ -91,6 +92,7 @@ fn make_group(
                 memory_backed: 6,
                 submit_queues: 7,
                 use_per_node_hctx: 8,
+                home_node: 9,
             ],
         };
 
@@ -110,6 +112,7 @@ fn make_group(
                     name: name.try_into()?,
                     memory_backed: false,
                     submit_queues: 1,
+                    home_node: bindings::NUMA_NO_NODE,
                 }),
             }),
             core::iter::empty(),
@@ -176,6 +179,7 @@ struct DeviceConfigInner {
     disk: Option<GenDisk<NullBlkDevice>>,
     memory_backed: bool,
     submit_queues: u32,
+    home_node: i32,
 }
 
 #[vtable]
@@ -208,6 +212,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 completion_time: guard.completion_time,
                 memory_backed: guard.memory_backed,
                 submit_queues: guard.submit_queues,
+                home_node: guard.home_node,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -288,3 +293,17 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         Ok(())
     })
 );
+
+configfs_simple_field!(
+    DeviceConfig,
+    9,
+    home_node,
+    i32,
+    check(|value| {
+        if value == 0 || value >= kernel::numa::num_online_nodes().try_into()? {
+            Err(kernel::error::code::EINVAL)
+        } else {
+            Ok(())
+        }
+    })
+);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 6323327d4a5a..1d0faf524f5c 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -20,7 +20,10 @@
             TagSet, //
         },
     },
-    error::Result,
+    error::{
+        code,
+        Result, //
+    },
     memalloc_scope,
     new_mutex,
     new_xarray,
@@ -93,6 +96,10 @@
             default: false,
             description: "Use per-node allocation for hardware context queues.",
         },
+        home_node: i32 {
+            default: -1,
+            description: "Home node for the device. Default: -1 (no node)",
+        },
     },
 }
 
@@ -129,6 +136,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     completion_time: Delta::from_nanos(completion_time),
                     memory_backed: module_parameters::memory_backed.value(),
                     submit_queues,
+                    home_node: module_parameters::home_node.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -152,6 +160,7 @@ struct NullBlkOptions<'a> {
     completion_time: Delta,
     memory_backed: bool,
     submit_queues: u32,
+    home_node: i32,
 }
 struct NullBlkDevice;
 
@@ -166,6 +175,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             completion_time,
             memory_backed,
             submit_queues,
+            home_node,
         } = options;
 
         let flags = if memory_backed {
@@ -174,14 +184,18 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             mq::tag_set::Flags::default()
         };
 
+        if home_node > kernel::numa::num_online_nodes().try_into()? {
+            return Err(code::EINVAL);
+        }
+
+        let numa_node = if home_node == -1 {
+            kernel::alloc::NumaNode::NO_NODE
+        } else {
+            kernel::alloc::NumaNode::new(home_node)?
+        };
+
         let tagset = Arc::pin_init(
-            TagSet::new(
-                submit_queues,
-                256,
-                1,
-                kernel::alloc::NumaNode::NO_NODE,
-                flags,
-            ),
+            TagSet::new(submit_queues, 256, 1, numa_node, flags),
             GFP_KERNEL,
         )?;
 

-- 
2.51.2
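
The mapping from a raw `home_node` value to a node selection can be sketched as below. This is a userspace illustration, not the driver code: `resolve_home_node` is a hypothetical helper, the number of online nodes is passed in explicitly instead of being queried from the kernel, and the sketch assumes that valid node ids are `0..online_nodes`, which may not match the bounds used in the patch exactly.

```rust
/// Resolve a raw `home_node` value into an optional node id.
///
/// Assumptions (hypothetical helper): -1 means "no node", and valid ids
/// are `0..online_nodes`.
pub fn resolve_home_node(
    home_node: i32,
    online_nodes: u32,
) -> Result<Option<u32>, &'static str> {
    // -1 is the "no preference" sentinel.
    if home_node == -1 {
        return Ok(None);
    }
    // Any other negative value is invalid.
    if home_node < 0 {
        return Err("negative node id");
    }
    let id = home_node as u32;
    // Reject ids that do not name an online node.
    if id >= online_nodes {
        return Err("node id out of range");
    }
    Ok(Some(id))
}
```

Doing the sentinel check before the range check means -1 is never compared against the node count.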





* [PATCH v2 21/83] block: rust: add Request::sectors() method
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (19 preceding siblings ...)
  2026-06-09 19:07 ` [PATCH v2 20/83] block: rnull: allow specifying the home numa node Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 22/83] block: rust: mq: add max_hw_discard_sectors support to GenDiskBuilder Andreas Hindborg
                   ` (61 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a new method to get the size of a request in number of sectors.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 19bdf17de166..54fe580b7b42 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -183,6 +183,13 @@ pub fn sector(&self) -> usize {
         unsafe { (*self.0.get()).__sector as usize }
     }
 
+    /// Get the size of the request in number of sectors.
+    #[inline(always)]
+    pub fn sectors(&self) -> usize {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
+        (unsafe { (*self.0.get()).__data_len as usize }) >> crate::block::SECTOR_SHIFT
+    }
+
     /// Return a pointer to the [`RequestDataWrapper`] stored in the private area
     /// of the request structure.
     ///

-- 
2.51.2
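
The conversion performed by `sectors()` is a right shift of the request's byte length by the sector shift. A minimal userspace sketch, assuming the usual 512-byte sector (`SECTOR_SHIFT` of 9) and a hypothetical free function in place of the method:

```rust
/// 512-byte sectors, as in the kernel's block layer.
pub const SECTOR_SHIFT: u32 = 9;

/// Convert a request length in bytes to a whole number of sectors.
/// Hypothetical helper; partial sectors are truncated by the shift.
pub fn bytes_to_sectors(data_len: u32) -> usize {
    (data_len as usize) >> SECTOR_SHIFT
}
```

For example, a 4096-byte request covers 8 sectors.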





* [PATCH v2 22/83] block: rust: mq: add max_hw_discard_sectors support to GenDiskBuilder
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (20 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 21/83] block: rust: add Request::sectors() method Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 23/83] block: rnull: add discard support Andreas Hindborg
                   ` (60 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for configuring the maximum hardware discard sectors
through GenDiskBuilder. This allows block devices to specify their
discard/trim capabilities.

Setting this value to 0 (the default) indicates that discard is not
supported by the device. Non-zero values specify the maximum number
of sectors that can be discarded in a single operation.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/gen_disk.rs | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index b36d24382cc3..2b204b0ed49a 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -7,14 +7,27 @@
 
 use crate::{
     bindings,
-    block::mq::{Operations, TagSet},
-    error::{self, from_err_ptr, Result},
-    fmt::{self, Write},
+    block::mq::{
+        Operations,
+        TagSet, //
+    },
+    error::{
+        self,
+        from_err_ptr,
+        Result, //
+    },
+    fmt::{
+        self,
+        Write, //
+    },
     prelude::*,
     static_lock_class,
     str::NullTerminatedFormatter,
     sync::Arc,
-    types::{ForeignOwnable, ScopeGuard},
+    types::{
+        ForeignOwnable,
+        ScopeGuard, //
+    },
 };
 
 /// A builder for [`GenDisk`].
@@ -25,6 +38,7 @@ pub struct GenDiskBuilder {
     logical_block_size: u32,
     physical_block_size: u32,
     capacity_sectors: u64,
+    max_hw_discard_sectors: u32,
 }
 
 impl Default for GenDiskBuilder {
@@ -34,6 +48,7 @@ fn default() -> Self {
             logical_block_size: bindings::PAGE_SIZE as u32,
             physical_block_size: bindings::PAGE_SIZE as u32,
             capacity_sectors: 0,
+            max_hw_discard_sectors: 0,
         }
     }
 }
@@ -94,6 +109,16 @@ pub fn capacity_sectors(mut self, capacity: u64) -> Self {
         self
     }
 
+    /// Set the maximum number of sectors the underlying hardware device can
+    /// discard/trim in a single operation.
+    ///
+    /// Setting 0 (default) here will cause the disk to report discard not
+    /// supported.
+    pub fn max_hw_discard_sectors(mut self, max_hw_discard_sectors: u32) -> Self {
+        self.max_hw_discard_sectors = max_hw_discard_sectors;
+        self
+    }
+
     /// Build a new `GenDisk` and add it to the VFS.
     pub fn build<T: Operations>(
         self,
@@ -111,6 +136,7 @@ pub fn build<T: Operations>(
 
         lim.logical_block_size = self.logical_block_size;
         lim.physical_block_size = self.physical_block_size;
+        lim.max_hw_discard_sectors = self.max_hw_discard_sectors;
         if self.rotational {
             lim.features = bindings::BLK_FEAT_ROTATIONAL;
         }

-- 
2.51.2
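
The "0 means unsupported" convention described above can be sketched with a small userspace builder. `DiskLimits` and `supports_discard` are hypothetical names used only for illustration; they are not part of `GenDiskBuilder`.

```rust
/// Hypothetical, userspace-only stand-in for the discard part of a
/// disk builder.
#[derive(Debug, Default)]
pub struct DiskLimits {
    max_hw_discard_sectors: u32,
}

impl DiskLimits {
    /// Start with discard disabled (limit of 0).
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: consume `self` and return the updated value.
    pub fn max_hw_discard_sectors(mut self, sectors: u32) -> Self {
        self.max_hw_discard_sectors = sectors;
        self
    }

    /// A limit of 0 is read as "discard not supported".
    pub fn supports_discard(&self) -> bool {
        self.max_hw_discard_sectors != 0
    }
}
```

Because the default is 0, callers that never touch the setter keep the existing behavior.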





* [PATCH v2 23/83] block: rnull: add discard support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (21 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 22/83] block: rust: mq: add max_hw_discard_sectors support to GenDiskBuilder Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-10 13:55   ` Malte Wechter
  2026-06-09 19:08 ` [PATCH v2 24/83] block: rust: add `NoDefaultScheduler` flag for `TagSet` Andreas Hindborg
                   ` (59 subsequent siblings)
  82 siblings, 1 reply; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for discard operations to the rnull block driver:
- Add discard module parameter and configfs attribute.
- Set max_hw_discard_sectors when discard is enabled.
- Add sector occupancy tracking.
- Add discard handling that frees sectors and removes empty pages.
- Require memory backing for discard to be enabled.

The discard feature uses a bitmap to track which sectors in each page are
occupied, allowing cleanup of pages when they are empty.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |  15 +++++
 drivers/block/rnull/rnull.rs    | 120 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 121 insertions(+), 14 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 2f3fa81ea121..e47399cd45a4 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -93,6 +93,7 @@ fn make_group(
                 submit_queues: 7,
                 use_per_node_hctx: 8,
                 home_node: 9,
+                discard: 10,
             ],
         };
 
@@ -113,6 +114,7 @@ fn make_group(
                     memory_backed: false,
                     submit_queues: 1,
                     home_node: bindings::NUMA_NO_NODE,
+                    discard: false,
                 }),
             }),
             core::iter::empty(),
@@ -180,6 +182,7 @@ struct DeviceConfigInner {
     memory_backed: bool,
     submit_queues: u32,
     home_node: i32,
+    discard: bool,
 }
 
 #[vtable]
@@ -213,6 +216,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 memory_backed: guard.memory_backed,
                 submit_queues: guard.submit_queues,
                 home_node: guard.home_node,
+                discard: guard.discard,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -307,3 +311,14 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         }
     })
 );
+
+configfs_attribute!(DeviceConfig, 10,
+    show: |this, page| show_field(this.data.lock().discard, page),
+    store: |this, page| store_with_power_check(this, page, |data, page| {
+        if !data.memory_backed {
+            return Err(EINVAL);
+        }
+        data.discard = kstrtobool_bytes(page)?;
+        Ok(())
+    })
+);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 1d0faf524f5c..bdc05b3f6072 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -19,15 +19,20 @@
             Operations,
             TagSet, //
         },
+        PAGE_SECTOR_MASK, SECTOR_SHIFT,
     },
     error::{
         code,
         Result, //
     },
+    ffi,
     memalloc_scope,
     new_mutex,
     new_xarray,
-    page::SafePage,
+    page::{
+        SafePage, //
+        PAGE_SIZE,
+    },
     pr_info,
     prelude::*,
     str::CString,
@@ -100,6 +105,11 @@
             default: -1,
             description: "Home node for the device. Default: -1 (no node)",
         },
+        discard: bool {
+            default: false,
+            description:
+                "Support discard operations (requires memory-backed null_blk device).",
+        },
     },
 }
 
@@ -137,6 +147,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     memory_backed: module_parameters::memory_backed.value(),
                     submit_queues,
                     home_node: module_parameters::home_node.value(),
+                    discard: module_parameters::discard.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -161,6 +172,7 @@ struct NullBlkOptions<'a> {
     memory_backed: bool,
     submit_queues: u32,
     home_node: i32,
+    discard: bool,
 }
 struct NullBlkDevice;
 
@@ -176,6 +188,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             memory_backed,
             submit_queues,
             home_node,
+            discard,
         } = options;
 
         let flags = if memory_backed {
@@ -205,22 +218,30 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
                 irq_mode,
                 completion_time,
                 memory_backed,
+                block_size: block_size as usize,
             }),
             GFP_KERNEL,
         )?;
 
-        gen_disk::GenDiskBuilder::new()
+        let mut builder = gen_disk::GenDiskBuilder::new()
             .capacity_sectors(capacity_mib << (20 - block::SECTOR_SHIFT))
             .logical_block_size(block_size)?
             .physical_block_size(block_size)?
-            .rotational(rotational)
-            .build(fmt!("{}", name.to_str()?), tagset, queue_data)
+            .rotational(rotational);
+
+        if memory_backed && discard {
+            builder = builder
+                // Max IO size is u32::MAX bytes
+                .max_hw_discard_sectors(ffi::c_uint::MAX >> block::SECTOR_SHIFT);
+        }
+
+        builder.build(fmt!("{}", name.to_str()?), tagset, queue_data)
     }
 
     #[inline(always)]
     fn write(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -> Result {
         while !segment.is_empty() {
-            let page = SafePage::alloc_page(GFP_KERNEL)?;
+            let page = NullBlockPage::new()?;
             let mut tree = tree.lock();
 
             let page_idx = sector >> block::PAGE_SECTORS_SHIFT;
@@ -232,8 +253,10 @@ fn write(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -
                 tree.get_mut(page_idx).unwrap()
             };
 
+            page.set_occupied(sector);
             let page_offset = (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
-            sector += segment.copy_to_page(page, page_offset) >> block::SECTOR_SHIFT;
+            sector +=
+                segment.copy_to_page(page.page.as_pin_mut(), page_offset) >> block::SECTOR_SHIFT;
         }
         Ok(())
     }
@@ -248,7 +271,7 @@ fn read(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) ->
             if let Some(page) = tree.get(idx) {
                 let page_offset =
                     (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
-                sector += segment.copy_from_page(page, page_offset) >> block::SECTOR_SHIFT;
+                sector += segment.copy_from_page(&page.page, page_offset) >> block::SECTOR_SHIFT;
             } else {
                 sector += segment.zero_page() >> block::SECTOR_SHIFT;
             }
@@ -257,6 +280,37 @@ fn read(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) ->
         Ok(())
     }
 
+    fn discard(
+        tree: &XArray<TreeNode>,
+        mut sector: usize,
+        sectors: usize,
+        block_size: usize,
+    ) -> Result {
+        let mut remaining_bytes = sectors << SECTOR_SHIFT;
+        let mut tree = tree.lock();
+
+        while remaining_bytes > 0 {
+            let page_idx = sector >> block::PAGE_SECTORS_SHIFT;
+            let mut remove = false;
+            if let Some(page) = tree.get_mut(page_idx) {
+                page.set_free(sector);
+                if page.is_empty() {
+                    remove = true;
+                }
+            }
+
+            if remove {
+                drop(tree.remove(page_idx))
+            }
+
+            let processed = remaining_bytes.min(block_size);
+            sector += processed >> SECTOR_SHIFT;
+            remaining_bytes -= processed;
+        }
+
+        Ok(())
+    }
+
     #[inline(never)]
     fn transfer(
         command: bindings::req_op,
@@ -273,7 +327,40 @@ fn transfer(
     }
 }
 
-type TreeNode = Owned<SafePage>;
+static_assert!((PAGE_SIZE >> SECTOR_SHIFT) <= 64);
+
+struct NullBlockPage {
+    page: Owned<SafePage>,
+    status: u64,
+}
+
+impl NullBlockPage {
+    fn new() -> Result<KBox<Self>> {
+        Ok(KBox::new(
+            Self {
+                page: SafePage::alloc_page(GFP_KERNEL | __GFP_ZERO)?,
+                status: 0,
+            },
+            GFP_KERNEL,
+        )?)
+    }
+
+    fn set_occupied(&mut self, sector: usize) {
+        let idx = sector & PAGE_SECTOR_MASK as usize;
+        self.status |= 1 << idx;
+    }
+
+    fn set_free(&mut self, sector: usize) {
+        let idx = sector & PAGE_SECTOR_MASK as usize;
+        self.status &= !(1 << idx);
+    }
+
+    fn is_empty(&self) -> bool {
+        self.status == 0
+    }
+}
+
+type TreeNode = KBox<NullBlockPage>;
 
 #[pin_data]
 struct QueueData {
@@ -282,6 +369,7 @@ struct QueueData {
     irq_mode: IRQMode,
     completion_time: Delta,
     memory_backed: bool,
+    block_size: usize,
 }
 
 #[pin_data]
@@ -332,12 +420,16 @@ fn queue_rq(
             let command = rq.command();
             let mut sector = rq.sector();
 
-            for bio in rq.bio_iter_mut() {
-                let segment_iter = bio.segment_iter();
-                for segment in segment_iter {
-                    let length = segment.len();
-                    Self::transfer(command, tree, sector, segment)?;
-                    sector += length as usize >> block::SECTOR_SHIFT;
+            if command == bindings::req_op_REQ_OP_DISCARD {
+                Self::discard(tree, sector, rq.sectors(), queue_data.block_size)?;
+            } else {
+                for bio in rq.bio_iter_mut() {
+                    let segment_iter = bio.segment_iter();
+                    for segment in segment_iter {
+                        let length = segment.len();
+                        Self::transfer(command, tree, sector, segment)?;
+                        sector += length as usize >> block::SECTOR_SHIFT;
+                    }
                 }
             }
         }

-- 
2.51.2
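
The occupancy tracking described in the commit message can be sketched in userspace Rust as follows. This is a simplified model under stated assumptions, not the driver code: a `BTreeMap` stands in for the XArray, no page data is stored, pages are assumed to be 4 KiB (8 sectors per page), and the discard loop walks sector by sector rather than in block-size steps. `PageSlot`, `write_sector` and `discard` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Assumes 4 KiB pages and 512-byte sectors: 8 sectors per page.
pub const PAGE_SECTORS_SHIFT: u32 = 3;
pub const PAGE_SECTOR_MASK: u64 = (1 << PAGE_SECTORS_SHIFT) - 1;

/// One bit per sector in the page; a set bit means "occupied".
#[derive(Default)]
pub struct PageSlot {
    status: u64,
}

impl PageSlot {
    pub fn set_occupied(&mut self, sector: u64) {
        self.status |= 1u64 << (sector & PAGE_SECTOR_MASK);
    }

    pub fn set_free(&mut self, sector: u64) {
        self.status &= !(1u64 << (sector & PAGE_SECTOR_MASK));
    }

    pub fn is_empty(&self) -> bool {
        self.status == 0
    }
}

/// Page index -> occupancy; stand-in for the driver's tree of pages.
pub type Tree = BTreeMap<u64, PageSlot>;

/// Mark one sector as written, creating its page slot on demand.
pub fn write_sector(tree: &mut Tree, sector: u64) {
    tree.entry(sector >> PAGE_SECTORS_SHIFT)
        .or_default()
        .set_occupied(sector);
}

/// Free `sectors` sectors starting at `start`, dropping pages that
/// become empty.
pub fn discard(tree: &mut Tree, start: u64, sectors: u64) {
    for sector in start..start + sectors {
        let page_idx = sector >> PAGE_SECTORS_SHIFT;
        let now_empty = match tree.get_mut(&page_idx) {
            Some(slot) => {
                slot.set_free(sector);
                slot.is_empty()
            }
            // Nothing stored for this page; nothing to free.
            None => false,
        };
        if now_empty {
            tree.remove(&page_idx);
        }
    }
}
```

A 64-bit status word is enough as long as a page holds at most 64 sectors, which is what the `static_assert!` in the patch checks.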





* [PATCH v2 24/83] block: rust: add `NoDefaultScheduler` flag for `TagSet`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (22 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 23/83] block: rnull: add discard support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 25/83] block: rnull: add no_sched module parameter and configfs attribute Andreas Hindborg
                   ` (58 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a flag that maps to `BLK_MQ_F_NO_SCHED_BY_DEFAULT`. When this flag is
set, the 'none' scheduler is selected during queue registration in case of
a single hwq or shared hwqs, instead of 'mq-deadline'.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/tag_set/flags.rs | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/rust/kernel/block/mq/tag_set/flags.rs b/rust/kernel/block/mq/tag_set/flags.rs
index b7eaccd200a2..2561d7090c49 100644
--- a/rust/kernel/block/mq/tag_set/flags.rs
+++ b/rust/kernel/block/mq/tag_set/flags.rs
@@ -17,5 +17,9 @@ pub enum Flag {
         /// processing IO. When this flag is not set, IO is processed in atomic
         /// context. When this flag is set, IO is processed in process context.
         Blocking = bindings::BLK_MQ_F_BLOCKING,
+
+        /// Select 'none' during queue registration in case of a single hwq or shared
+        /// hwqs instead of 'mq-deadline'.
+        NoDefaultScheduler = bindings::BLK_MQ_F_NO_SCHED_BY_DEFAULT,
     }
 }

-- 
2.51.2





* [PATCH v2 25/83] block: rnull: add no_sched module parameter and configfs attribute
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (23 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 24/83] block: rust: add `NoDefaultScheduler` flag for `TagSet` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 26/83] block: rust: change sector type from usize to u64 Andreas Hindborg
                   ` (57 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for disabling the default IO scheduler by adding:
- no_sched module parameter to control scheduler selection at device
  creation.
- no_sched configfs attribute (ID 11) for runtime configuration.
- Use of the `NoDefaultScheduler` flag when no_sched is enabled.

This allows bypassing the default 'mq-deadline' scheduler and using 'none'
instead, which can improve performance for certain workloads. The flag
selection logic is updated to use compound assignment operators for better
readability.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |  6 ++++++
 drivers/block/rnull/rnull.rs    | 25 ++++++++++++++++++-------
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index e47399cd45a4..d9aead646ae0 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -94,6 +94,7 @@ fn make_group(
                 use_per_node_hctx: 8,
                 home_node: 9,
                 discard: 10,
+                no_sched: 11,
             ],
         };
 
@@ -115,6 +116,7 @@ fn make_group(
                     submit_queues: 1,
                     home_node: bindings::NUMA_NO_NODE,
                     discard: false,
+                    no_sched: false,
                 }),
             }),
             core::iter::empty(),
@@ -183,6 +185,7 @@ struct DeviceConfigInner {
     submit_queues: u32,
     home_node: i32,
     discard: bool,
+    no_sched: bool,
 }
 
 #[vtable]
@@ -217,6 +220,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 submit_queues: guard.submit_queues,
                 home_node: guard.home_node,
                 discard: guard.discard,
+                no_sched: guard.no_sched,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -322,3 +326,5 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         Ok(())
     })
 );
+
+configfs_simple_bool_field!(DeviceConfig, 11, no_sched);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index bdc05b3f6072..cb5b642f68e5 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -30,8 +30,8 @@
     new_mutex,
     new_xarray,
     page::{
-        SafePage, //
-        PAGE_SIZE,
+        SafePage,
+        PAGE_SIZE, //
     },
     pr_info,
     prelude::*,
@@ -110,6 +110,10 @@
             description:
                 "Support discard operations (requires memory-backed null_blk device).",
         },
+        no_sched: bool {
+            default: false,
+            description: "No IO scheduler",
+        },
     },
 }
 
@@ -148,6 +152,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     submit_queues,
                     home_node: module_parameters::home_node.value(),
                     discard: module_parameters::discard.value(),
+                    no_sched: module_parameters::no_sched.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -173,6 +178,7 @@ struct NullBlkOptions<'a> {
     submit_queues: u32,
     home_node: i32,
     discard: bool,
+    no_sched: bool,
 }
 struct NullBlkDevice;
 
@@ -189,13 +195,18 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             submit_queues,
             home_node,
             discard,
+            no_sched,
         } = options;
 
-        let flags = if memory_backed {
-            mq::tag_set::Flag::Blocking.into()
-        } else {
-            mq::tag_set::Flags::default()
-        };
+        let mut flags = mq::tag_set::Flags::default();
+
+        if memory_backed {
+            flags |= mq::tag_set::Flag::Blocking;
+        }
+
+        if no_sched {
+            flags |= mq::tag_set::Flag::NoDefaultScheduler;
+        }
 
         if home_node > kernel::numa::num_online_nodes().try_into()? {
             return Err(code::EINVAL);

-- 
2.51.2
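
The incremental flag selection can be sketched in userspace Rust as below. This is an illustration only: `Flag`, `Flags` and `select_flags` are simplified stand-ins for the kernel types, and the bit values are placeholders, not the values of `BLK_MQ_F_BLOCKING` or `BLK_MQ_F_NO_SCHED_BY_DEFAULT`.

```rust
use std::ops::BitOrAssign;

/// Individual flags. The bit values are placeholders for illustration.
#[derive(Clone, Copy)]
pub enum Flag {
    Blocking = 1 << 0,
    NoDefaultScheduler = 1 << 1,
}

/// A set of flags, empty by default.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Flags(u32);

impl BitOrAssign<Flag> for Flags {
    fn bitor_assign(&mut self, rhs: Flag) {
        self.0 |= rhs as u32;
    }
}

impl Flags {
    pub fn contains(self, flag: Flag) -> bool {
        (self.0 & flag as u32) != 0
    }
}

/// Start from the empty set and add one flag per enabled option.
pub fn select_flags(memory_backed: bool, no_sched: bool) -> Flags {
    let mut flags = Flags::default();
    if memory_backed {
        flags |= Flag::Blocking;
    }
    if no_sched {
        flags |= Flag::NoDefaultScheduler;
    }
    flags
}
```

Compared with an `if`/`else` expression that yields a single flag, this shape lets the two options be enabled independently of each other.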





* [PATCH v2 26/83] block: rust: change sector type from usize to u64
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (24 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 25/83] block: rnull: add no_sched module parameter and configfs attribute Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 27/83] block: rust: add `BadBlocks` for bad block tracking Andreas Hindborg
                   ` (56 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Change the `sector()` and `sectors()` methods in `Request` to return
`u64` and `u32` respectively instead of `usize`. This matches the
underlying kernel types.

Update the rnull driver to handle the new sector types with appropriate
casts throughout the read, write, and discard operations.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
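A standalone sketch of the arithmetic behind the casts in this patch, in
case it helps review. It is not the driver code. The constants assume
512-byte sectors and 4 KiB pages, and the function names and the
`usize_bits` parameter are hypothetical (the driver checks `usize::BITS`
directly).

```rust
// Assumed values for illustration: 512-byte sectors, 4 KiB pages.
const SECTOR_SHIFT: u32 = 9;
const PAGE_SHIFT: u32 = 12;
const PAGE_SECTORS_SHIFT: u32 = PAGE_SHIFT - SECTOR_SHIFT;
const PAGE_SECTOR_MASK: u32 = (1 << PAGE_SECTORS_SHIFT) - 1;

/// Convert a capacity in MiB to sectors. Returns `None` when the sector
/// count would not fit in a 32-bit `usize`, mirroring the setup-time check.
/// `usize_bits` is a parameter here only so both cases can be exercised.
fn capacity_sectors(capacity_mib: u64, usize_bits: u32) -> Option<u64> {
    let sectors = capacity_mib << (20 - SECTOR_SHIFT);
    if usize_bits == 32 && sectors > u64::from(u32::MAX) {
        return None;
    }
    Some(sectors)
}

/// Index of the page that holds `sector`.
/// The cast relies on the capacity limit enforced above.
fn page_index(sector: u64) -> usize {
    (sector >> PAGE_SECTORS_SHIFT) as usize
}

/// Byte offset of `sector` within its page. The result is always smaller
/// than the page size, so the cast cannot truncate.
fn page_offset(sector: u64) -> usize {
    ((sector & u64::from(PAGE_SECTOR_MASK)) << SECTOR_SHIFT) as usize
}
```

With these assumed constants, sector 9 maps to page index 1 at byte offset
512, and a 4 TiB capacity is rejected when `usize` is 32 bits wide.
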
 drivers/block/rnull/rnull.rs    | 71 +++++++++++++++++++++++++----------------
 rust/kernel/block/mq/request.rs |  8 ++---
 2 files changed, 47 insertions(+), 32 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index cb5b642f68e5..73f14d6e379f 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -218,6 +218,13 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             kernel::alloc::NumaNode::new(home_node)?
         };
 
+        let capacity_sectors = capacity_mib << (20 - block::SECTOR_SHIFT);
+
+        // Prevent overflow in usize/u64 casts
+        if usize::BITS == 32 && capacity_sectors > u32::MAX.into() {
+            return Err(code::EINVAL);
+        }
+
         let tagset = Arc::pin_init(
             TagSet::new(submit_queues, 256, 1, numa_node, flags),
             GFP_KERNEL,
@@ -229,13 +236,13 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
                 irq_mode,
                 completion_time,
                 memory_backed,
-                block_size: block_size as usize,
+                block_size: block_size.into(),
             }),
             GFP_KERNEL,
         )?;
 
         let mut builder = gen_disk::GenDiskBuilder::new()
-            .capacity_sectors(capacity_mib << (20 - block::SECTOR_SHIFT))
+            .capacity_sectors(capacity_sectors)
             .logical_block_size(block_size)?
             .physical_block_size(block_size)?
             .rotational(rotational);
@@ -250,12 +257,13 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
     }
 
     #[inline(always)]
-    fn write(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -> Result {
+    fn write(tree: &XArray<TreeNode>, mut sector: u64, mut segment: Segment<'_>) -> Result {
         while !segment.is_empty() {
             let page = NullBlockPage::new()?;
             let mut tree = tree.lock();
 
-            let page_idx = sector >> block::PAGE_SECTORS_SHIFT;
+            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
+            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
 
             let page = if let Some(page) = tree.get_mut(page_idx) {
                 page
@@ -265,43 +273,50 @@ fn write(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -
             };
 
             page.set_occupied(sector);
-            let page_offset = (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
-            sector +=
-                segment.copy_to_page(page.page.as_pin_mut(), page_offset) >> block::SECTOR_SHIFT;
+
+            // CAST: Page offset always fits in 32 bits.
+            let page_offset =
+                ((sector & u64::from(block::PAGE_SECTOR_MASK)) << block::SECTOR_SHIFT) as usize;
+
+            // CAST: Casting from `usize` to `u64` never overflows.
+            sector += segment.copy_to_page(page.page.as_pin_mut(), page_offset) as u64
+                >> block::SECTOR_SHIFT;
         }
         Ok(())
     }
 
     #[inline(always)]
-    fn read(tree: &XArray<TreeNode>, mut sector: usize, mut segment: Segment<'_>) -> Result {
+    fn read(tree: &XArray<TreeNode>, mut sector: u64, mut segment: Segment<'_>) -> Result {
         let tree = tree.lock();
 
         while !segment.is_empty() {
-            let idx = sector >> block::PAGE_SECTORS_SHIFT;
+            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
+            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
 
-            if let Some(page) = tree.get(idx) {
+            if let Some(page) = tree.get(page_idx) {
+                // CAST: Page offset always fits in 32 bits.
                 let page_offset =
-                    (sector & block::PAGE_SECTOR_MASK as usize) << block::SECTOR_SHIFT;
-                sector += segment.copy_from_page(&page.page, page_offset) >> block::SECTOR_SHIFT;
+                    ((sector & u64::from(block::PAGE_SECTOR_MASK)) << block::SECTOR_SHIFT) as usize;
+
+                // CAST: Casting from `usize` to `u64` never overflows.
+                sector +=
+                    segment.copy_from_page(&page.page, page_offset) as u64 >> block::SECTOR_SHIFT;
             } else {
-                sector += segment.zero_page() >> block::SECTOR_SHIFT;
+                // CAST: Casting from `usize` to `u64` never overflows.
+                sector += segment.zero_page() as u64 >> block::SECTOR_SHIFT;
             }
         }
 
         Ok(())
     }
 
-    fn discard(
-        tree: &XArray<TreeNode>,
-        mut sector: usize,
-        sectors: usize,
-        block_size: usize,
-    ) -> Result {
+    fn discard(tree: &XArray<TreeNode>, mut sector: u64, sectors: u64, block_size: u64) -> Result {
         let mut remaining_bytes = sectors << SECTOR_SHIFT;
         let mut tree = tree.lock();
 
         while remaining_bytes > 0 {
-            let page_idx = sector >> block::PAGE_SECTORS_SHIFT;
+            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
+            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
             let mut remove = false;
             if let Some(page) = tree.get_mut(page_idx) {
                 page.set_free(sector);
@@ -326,7 +341,7 @@ fn discard(
     fn transfer(
         command: bindings::req_op,
         tree: &XArray<TreeNode>,
-        sector: usize,
+        sector: u64,
         segment: Segment<'_>,
     ) -> Result {
         match command {
@@ -356,13 +371,13 @@ fn new() -> Result<KBox<Self>> {
         )?)
     }
 
-    fn set_occupied(&mut self, sector: usize) {
-        let idx = sector & PAGE_SECTOR_MASK as usize;
+    fn set_occupied(&mut self, sector: u64) {
+        let idx = sector & u64::from(PAGE_SECTOR_MASK);
         self.status |= 1 << idx;
     }
 
-    fn set_free(&mut self, sector: usize) {
-        let idx = sector & PAGE_SECTOR_MASK as usize;
+    fn set_free(&mut self, sector: u64) {
+        let idx = sector & u64::from(PAGE_SECTOR_MASK);
         self.status &= !(1 << idx);
     }
 
@@ -380,7 +395,7 @@ struct QueueData {
     irq_mode: IRQMode,
     completion_time: Delta,
     memory_backed: bool,
-    block_size: usize,
+    block_size: u64,
 }
 
 #[pin_data]
@@ -432,14 +447,14 @@ fn queue_rq(
             let mut sector = rq.sector();
 
             if command == bindings::req_op_REQ_OP_DISCARD {
-                Self::discard(tree, sector, rq.sectors(), queue_data.block_size)?;
+                Self::discard(tree, sector, rq.sectors().into(), queue_data.block_size)?;
             } else {
                 for bio in rq.bio_iter_mut() {
                     let segment_iter = bio.segment_iter();
                     for segment in segment_iter {
                         let length = segment.len();
                         Self::transfer(command, tree, sector, segment)?;
-                        sector += length as usize >> block::SECTOR_SHIFT;
+                        sector += u64::from(length) >> block::SECTOR_SHIFT;
                     }
                 }
             }
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 54fe580b7b42..9e176f015ab8 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -178,16 +178,16 @@ pub fn bio_iter_mut<'a>(self: &'a mut Owned<Self>) -> BioIterator<'a> {
 
     /// Get the target sector for the request.
     #[inline(always)]
-    pub fn sector(&self) -> usize {
+    pub fn sector(&self) -> u64 {
         // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
-        unsafe { (*self.0.get()).__sector as usize }
+        unsafe { (*self.0.get()).__sector }
     }
 
     /// Get the size of the request in number of sectors.
     #[inline(always)]
-    pub fn sectors(&self) -> usize {
+    pub fn sectors(&self) -> u32 {
         // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
-        (unsafe { (*self.0.get()).__data_len as usize }) >> crate::block::SECTOR_SHIFT
+        (unsafe { (*self.0.get()).__data_len }) >> crate::block::SECTOR_SHIFT
     }
 
     /// Return a pointer to the [`RequestDataWrapper`] stored in the private area

-- 
2.51.2





* [PATCH v2 27/83] block: rust: add `BadBlocks` for bad block tracking
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (25 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 26/83] block: rust: change sector type from usize to u64 Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 28/83] block: rust: mq: add Request::end() method for custom status codes Andreas Hindborg
                   ` (55 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a safe Rust wrapper around the Linux kernel's badblocks infrastructure
to track and manage defective sectors on block devices. The BadBlocks type
provides methods to:

- Mark sectors as bad or good (set_bad/set_good)
- Check if sector ranges contain bad blocks (check)
- Automatically handle memory management with PinnedDrop

The implementation includes comprehensive documentation with examples for
block device drivers that need to avoid known bad sectors to maintain
data integrity. Bad blocks information is used by device drivers,
filesystem layers, and device management tools.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
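A standalone sketch of how a `RangeBounds<u64>` argument can be reduced to
the start/count pair that the C badblocks functions take. It uses `std`
rather than `core` so it can run outside the kernel. The names `normalize`
and `start_and_count` are hypothetical, and the saturating arithmetic is an
assumption of this sketch, not something the patch does.

```rust
use std::ops::{Bound, Range, RangeBounds};

/// Reduce any range form to a start-inclusive, end-exclusive `Range<u64>`.
fn normalize(range: impl RangeBounds<u64>) -> Range<u64> {
    let start = match range.start_bound() {
        Bound::Included(s) => *s,
        // Saturating add is this sketch's choice to avoid overflow at u64::MAX.
        Bound::Excluded(s) => s.saturating_add(1),
        Bound::Unbounded => u64::MIN,
    };

    let end = match range.end_bound() {
        Bound::Included(e) => e.saturating_add(1),
        Bound::Excluded(e) => *e,
        Bound::Unbounded => u64::MAX,
    };

    start..end
}

/// Produce the (first sector, sector count) pair for a range.
/// An empty or reversed range yields a count of zero.
fn start_and_count(range: impl RangeBounds<u64>) -> (u64, u64) {
    let r = normalize(range);
    (r.start, r.end.saturating_sub(r.start))
}
```

Under these assumptions `100..110` and `100..=109` both give a start of 100
and a count of 10, so callers can use whichever range form reads best.
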
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/block.rs            |   1 +
 rust/kernel/block/badblocks.rs  | 716 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 718 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index b1fb3afee4ca..eaf05d60dda9 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -38,6 +38,7 @@
 #include <drm/drm_ioctl.h>
 #include <kunit/test.h>
 #include <linux/auxiliary_bus.h>
+#include <linux/badblocks.h>
 #include <linux/bitmap.h>
 #include <linux/blk-mq.h>
 #include <linux/blk_types.h>
diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
index eb512dad031b..96e48a2e6116 100644
--- a/rust/kernel/block.rs
+++ b/rust/kernel/block.rs
@@ -2,6 +2,7 @@
 
 //! Types for working with the block layer.
 
+pub mod badblocks;
 pub mod bio;
 pub mod mq;
 
diff --git a/rust/kernel/block/badblocks.rs b/rust/kernel/block/badblocks.rs
new file mode 100644
index 000000000000..0aab661ed7be
--- /dev/null
+++ b/rust/kernel/block/badblocks.rs
@@ -0,0 +1,716 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Bad blocks tracking for block devices.
+//!
+//! This module provides a safe Rust wrapper around the badblocks
+//! infrastructure, which is used to track and manage bad sectors on block
+//! devices. Bad blocks are sectors that cannot reliably store data and should
+//! be avoided during I/O operations.
+//!
+//! C header: [`include/linux/badblocks.h`](srctree/include/linux/badblocks.h).
+
+use core::ops::{
+    Range,
+    RangeBounds, //
+};
+
+use crate::{
+    error::to_result,
+    page::PAGE_SIZE,
+    prelude::*,
+    sync::atomic::{
+        ordering,
+        Atomic, //
+    },
+    types::Opaque,
+};
+use pin_init::{
+    pin_data,
+    PinInit, //
+};
+
+/// A bad blocks tracker for managing defective sectors on a block device.
+///
+/// `BadBlocks` provides functionality to mark sectors as bad and check if
+/// ranges contain bad blocks. This is useful for some classes of drivers to
+/// maintain data integrity by avoiding known bad sectors.
+///
+/// # Storage Format
+///
+/// Bad blocks are stored in a compact format where each 64-bit entry contains:
+/// - **Sector offset** (54 bits): Starting sector of the bad range
+/// - **Length** (9 bits): Number of sectors (1-512) in the bad range
+/// - **Acknowledged flag** (1 bit): Whether the bad blocks have been acknowledged
+///
+/// The bad blocks tracker uses exactly one page ([`PAGE_SIZE`]) of memory to store
+/// bad block entries. This allows tracking up to `PAGE_SIZE/8` bad block ranges
+/// (typically 512 ranges on systems with 4KB pages).
+///
+/// # Locking
+///
+/// Operations on the structure are internally synchronized by a seqlock.
+///
+/// # Examples
+///
+/// Basic usage:
+///
+/// ```rust
+/// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+/// # use kernel::prelude::*;
+/// // Create a new bad blocks tracker
+/// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+///
+/// // Mark sectors 100-109 as bad (unacknowledged)
+/// bad_blocks.set_bad(100..110, false)?;
+///
+/// // Check if sector range 95-104 contains bad blocks
+/// match bad_blocks.check(95..105) {
+///     BlockStatus::None => pr_info!("No bad blocks found"),
+///     BlockStatus::Acknowledged(range) => pr_warn!("Acknowledged bad blocks: {:?}", range),
+///     BlockStatus::Unacknowledged(range) => pr_err!("Unacknowledged bad blocks: {:?}", range),
+/// }
+/// # Ok::<(), kernel::error::Error>(())
+/// ```
+/// # Invariants
+///
+/// - `self.blocks` is a valid `bindings::badblocks` struct.
+#[pin_data(PinnedDrop)]
+pub struct BadBlocks {
+    #[pin]
+    blocks: Opaque<bindings::badblocks>,
+}
+
+impl BadBlocks {
+    /// Creates a new bad blocks tracker.
+    ///
+    /// Initializes an empty bad blocks tracker that can manage defective sectors
+    /// on a block device. The tracker starts with no bad blocks recorded and
+    /// allocates a single page for storing bad block entries.
+    ///
+    /// # Returns
+    ///
+    /// Returns a [`PinInit`] that can be used to initialize a [`BadBlocks`] instance.
+    /// Initialization may fail with `ENOMEM` if memory allocation fails.
+    ///
+    /// # Examples
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// // Create and initialize a bad blocks tracker
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // The tracker is ready to use with no bad blocks initially
+    /// match bad_blocks.check(0..100) {
+    ///     BlockStatus::None => pr_info!("No bad blocks found initially"),
+    ///     _ => unreachable!(),
+    /// }
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn new(enable: bool) -> impl PinInit<Self, Error> {
+        // INVARIANT: We initialize `self.blocks` below. If initialization fails, an error is
+        // returned.
+        try_pin_init!(Self {
+            blocks <- Opaque::try_ffi_init(|slot| {
+                // SAFETY: `slot` is a valid pointer to uninitialized memory
+                // allocated by the Opaque type. `badblocks_init` is safe to
+                // call with uninitialized memory.
+                to_result(unsafe { bindings::badblocks_init(slot, enable.into()) })
+            }),
+        })
+    }
+
+    fn shift_ref(&self) -> &Atomic<c_int> {
+        // SAFETY: By type invariant self.blocks is valid.
+        let ptr = unsafe { &raw const (*self.blocks.get()).shift };
+        // SAFETY: `shift` is only written by C code using atomic operations after initialization.
+        unsafe { Atomic::from_ptr(ptr.cast_mut().cast()) }
+    }
+
+    /// Enables the bad blocks tracker if it was previously disabled.
+    ///
+    /// Attempts to enable bad block tracking by transitioning the tracker from
+    /// a disabled state to an enabled state.
+    ///
+    /// # Behavior
+    ///
+    /// - If the tracker is disabled, it will be enabled.
+    /// - If the tracker is already enabled, this operation has no effect.
+    /// - The operation is atomic and thread-safe.
+    ///
+    /// # Usage
+    ///
+    /// Bad blocks trackers can be created in a disabled state and enabled later
+    /// when needed. This is useful for conditional bad block tracking or for
+    /// deferring activation until the device is fully initialized.
+    ///
+    /// # Examples
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::BadBlocks;
+    /// # use kernel::prelude::*;
+    /// // Create a disabled bad blocks tracker
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(false), GFP_KERNEL)?;
+    /// assert!(!bad_blocks.enabled());
+    ///
+    /// // Enable it when needed
+    /// bad_blocks.enable();
+    /// assert!(bad_blocks.enabled());
+    ///
+    /// // Subsequent enable calls have no effect
+    /// bad_blocks.enable();
+    /// assert!(bad_blocks.enabled());
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn enable(&self) {
+        let _ = self.shift_ref().cmpxchg(-1, 0, ordering::Relaxed);
+    }
+
+    /// Checks whether the bad blocks tracker is currently enabled.
+    ///
+    /// Returns `true` if bad block tracking is active, `false` if it is disabled.
+    /// When disabled, the tracker will not perform bad block checks or operations.
+    ///
+    /// # Thread Safety
+    ///
+    /// This method is thread-safe and uses atomic operations to check the
+    /// tracker's state without requiring external synchronization.
+    ///
+    /// # Examples
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::BadBlocks;
+    /// # use kernel::prelude::*;
+    /// // Create an enabled tracker
+    /// let enabled_tracker = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    /// assert!(enabled_tracker.enabled());
+    ///
+    /// // Create a disabled tracker
+    /// let disabled_tracker = KBox::pin_init(BadBlocks::new(false), GFP_KERNEL)?;
+    /// assert!(!disabled_tracker.enabled());
+    ///
+    /// // Enable and verify
+    /// disabled_tracker.enable();
+    /// assert!(disabled_tracker.enabled());
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn enabled(&self) -> bool {
+        self.shift_ref().load(ordering::Relaxed) >= 0
+    }
+
+    /// Marks a range of sectors as bad.
+    ///
+    /// Records a contiguous range of sectors as defective in the bad blocks tracker.
+    /// Bad sectors should be avoided during I/O operations to prevent data corruption.
+    /// The implementation may merge, split, or extend existing ranges as needed.
+    ///
+    /// # Parameters
+    ///
+    /// - `range` - The range of sectors to mark as bad. Each individual range is limited to 512
+    ///   sectors maximum by the underlying implementation.
+    /// - `acknowledged` - Whether the bad blocks have been acknowledged to be bad. Acknowledged bad
+    ///   blocks may be handled differently by some subsystems.
+    ///
+    /// # Acknowledgment Semantics
+    ///
+    /// - **Unacknowledged** (`acknowledged = false`): Newly discovered bad blocks that
+    ///   need attention. These are often treated as errors by upper layers.
+    /// - **Acknowledged** (`acknowledged = true`): Blocks that have been confirmed bad. These may
+    ///   be handled by remapping.
+    ///
+    /// # Range Management
+    ///
+    /// The implementation automatically:
+    /// - **Merges** adjacent or overlapping ranges with the same acknowledgment status
+    /// - **Splits** ranges when acknowledgment status differs
+    /// - **Extends** existing ranges when new bad blocks are adjacent
+    /// - **Limits** individual ranges to 512 sectors maximum (BB_MAX_LEN)
+    ///
+    /// Please see [C documentation] for details.
+    ///
+    /// # Performance
+    ///
+    /// Executes in O(n) time where n is the number of entries in the bad blocks table.
+    ///
+    /// # Returns
+    ///
+    /// * `Ok(())` - Bad blocks were successfully recorded
+    /// * `Err(ENOMEM)` - Insufficient space in bad blocks table (table full)
+    ///
+    /// # Examples
+    ///
+    /// Basic usage:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Mark sectors 1000-1009 as bad (unacknowledged)
+    /// bad_blocks.set_bad(1000..1010, false)?;
+    ///
+    /// // Mark a single sector as bad and acknowledged
+    /// bad_blocks.set_bad(2000..2001, true)?;
+    ///
+    /// // Verify the bad blocks are recorded
+    /// assert!(matches!(bad_blocks.check(1000..1010), BlockStatus::Unacknowledged(_)));
+    /// assert!(matches!(bad_blocks.check(2000..2001), BlockStatus::Acknowledged(_)));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Range merging behavior:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Add adjacent ranges with same acknowledgment status
+    /// bad_blocks.set_bad(100..105, false)?;  // Sectors 100-104
+    /// bad_blocks.set_bad(105..108, false)?;  // Sectors 105-107
+    ///
+    /// // These will be merged into a single range 100-107
+    /// match bad_blocks.check(100..108) {
+    ///     BlockStatus::Unacknowledged(range) => {
+    ///         assert_eq!(range.start, 100);
+    ///         assert_eq!(range.end, 108);
+    ///     },
+    ///     _ => panic!("Expected unacknowledged bad blocks"),
+    /// }
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Handling acknowledgment conflicts:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Mark range as unacknowledged
+    /// bad_blocks.set_bad(200..210, false)?;
+    ///
+    /// // Acknowledge part of the range (will split)
+    /// bad_blocks.set_bad(205..208, true)?;
+    ///
+    /// // Now we have: unack[200-204], ack[205-207], unack[208-209]
+    /// assert!(matches!(bad_blocks.check(200..205), BlockStatus::Unacknowledged(_)));
+    /// assert!(matches!(bad_blocks.check(205..208), BlockStatus::Acknowledged(_)));
+    /// assert!(matches!(bad_blocks.check(208..210), BlockStatus::Unacknowledged(_)));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// [C documentation]: srctree/block/badblocks.c
+    pub fn set_bad(&self, range: impl RangeBounds<u64>, acknowledged: bool) -> Result {
+        let range = Self::range(range);
+
+        // SAFETY: By type invariant `self.blocks` is valid. The C function
+        // `badblocks_set` handles synchronization internally.
+        let status = unsafe {
+            bindings::badblocks_set(
+                self.blocks.get(),
+                range.start,
+                range.end - range.start,
+                if acknowledged { 1 } else { 0 },
+            )
+        };
+
+        if status {
+            Ok(())
+        } else {
+            Err(ENOMEM)
+        }
+    }
+
+    /// Marks a range of sectors as good.
+    ///
+    /// Removes a contiguous range of sectors from the bad blocks tracker,
+    /// indicating that these sectors are now reliable for I/O operations.
+    /// This is typically used after bad sectors have been repaired, remapped,
+    /// or determined to be false positives.
+    ///
+    /// # Parameters
+    ///
+    /// - `range` - The range of sectors to mark as good.
+    ///
+    /// # Behavior
+    ///
+    /// The implementation handles various scenarios automatically:
+    /// - **Complete removal**: If the range exactly matches a bad block range, it's removed
+    ///   entirely.
+    /// - **Partial removal**: If the range partially overlaps, the bad block range is split or
+    ///   trimmed.
+    /// - **No effect**: If the range doesn't overlap any bad blocks, the operation succeeds without
+    ///   changes.
+    /// - **Range splitting**: If the cleared range is in the middle of a bad block range, it may
+    ///   split the range in two.
+    ///
+    /// # Performance
+    ///
+    /// Executes in O(n) time where n is the number of entries in the bad blocks table.
+    ///
+    /// # Returns
+    ///
+    /// * `Ok(())` - Sectors were successfully marked as good (or were already good)
+    /// * `Err(EINVAL)` - Operation failed (typically due to table constraints)
+    ///
+    /// # Examples
+    ///
+    /// Basic usage after repair:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Mark some sectors as bad initially
+    /// bad_blocks.set_bad(100..110, false)?;
+    /// assert!(matches!(bad_blocks.check(100..110), BlockStatus::Unacknowledged(_)));
+    ///
+    /// // After successful repair, mark them as good
+    /// bad_blocks.set_good(100..110)?;
+    /// assert!(matches!(bad_blocks.check(100..110), BlockStatus::None));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Partial clearing:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Mark a large range as bad
+    /// bad_blocks.set_bad(200..220, false)?;
+    ///
+    /// // Clear only the middle portion
+    /// bad_blocks.set_good(205..215)?; // Clear sectors 205-214
+    ///
+    /// // Now we have bad blocks at the edges: 200-204 and 215-219
+    /// assert!(matches!(bad_blocks.check(200..205), BlockStatus::Unacknowledged(_)));
+    /// assert!(matches!(bad_blocks.check(205..215), BlockStatus::None));
+    /// assert!(matches!(bad_blocks.check(215..220), BlockStatus::Unacknowledged(_)));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Safe clearing of potentially good sectors:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // It's safe to clear sectors that were never marked as bad
+    /// bad_blocks.set_good(1000..1100)?; // No-op, but succeeds
+    /// assert!(matches!(bad_blocks.check(1000..1100), BlockStatus::None));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn set_good(&self, range: impl RangeBounds<u64>) -> Result {
+        let range = Self::range(range);
+        // SAFETY: By type invariant `self.blocks` is valid. The C function
+        // `badblocks_clear` handles synchronization internally.
+        unsafe {
+            bindings::badblocks_clear(self.blocks.get(), range.start, range.end - range.start)
+        }
+        .then_some(())
+        .ok_or(EINVAL)
+    }
+
+    // Transform a `RangeBounds` into a start-inclusive, end-exclusive range.
+    fn range(range: impl RangeBounds<u64>) -> Range<u64> {
+        let start = match range.start_bound() {
+            core::ops::Bound::Included(start) => *start,
+            core::ops::Bound::Excluded(start) => start + 1,
+            core::ops::Bound::Unbounded => u64::MIN,
+        };
+
+        let end = match range.end_bound() {
+            core::ops::Bound::Included(end) => end + 1,
+            core::ops::Bound::Excluded(end) => *end,
+            core::ops::Bound::Unbounded => u64::MAX,
+        };
+
+        start..end
+    }
+
+    /// Checks if a range of sectors contains any bad blocks.
+    ///
+    /// Examines the specified sector range to determine if it contains any sectors
+    /// that have been marked as bad. This is typically called before performing I/O
+    /// operations to avoid accessing defective sectors. The check uses seqlocks to
+    /// ensure consistent reads even under concurrent modifications.
+    ///
+    /// # Parameters
+    ///
+    /// - `range` - The range of sectors to check (supports any type implementing
+    ///   `RangeBounds<u64>`).
+    ///
+    /// # Returns
+    ///
+    /// Returns a [`BlockStatus`] indicating the state of the checked range:
+    ///
+    /// - `BlockStatus::None` - No bad blocks found in the specified range.
+    /// - `BlockStatus::Acknowledged(range)` - Contains acknowledged bad blocks.
+    /// - `BlockStatus::Unacknowledged(range)` - Contains unacknowledged bad blocks.
+    ///
+    /// The returned range indicates the **first bad block range** encountered that
+    /// overlaps with the checked area. If multiple separate bad ranges exist, only
+    /// the first is reported.
+    ///
+    /// # Performance
+    ///
+    /// The check operation uses binary search on the sorted bad blocks table,
+    /// providing O(log n) lookup time where n is the number of bad block ranges.
+    ///
+    /// # Examples
+    ///
+    /// Basic checking:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Initially no bad blocks
+    /// assert!(matches!(bad_blocks.check(0..1000), BlockStatus::None));
+    ///
+    /// // Mark some sectors as bad
+    /// bad_blocks.set_bad(100..110, false)?;
+    ///
+    /// // Check various ranges
+    /// match bad_blocks.check(90..120) {
+    ///     BlockStatus::Unacknowledged(range) => {
+    ///         assert_eq!(range.start, 100);
+    ///         assert_eq!(range.end, 110);
+    ///         pr_warn!("Found unacknowledged bad blocks: {}-{}", range.start, range.end - 1);
+    ///     },
+    ///     _ => panic!("Expected bad blocks"),
+    /// }
+    ///
+    /// // Check range that doesn't overlap
+    /// assert!(matches!(bad_blocks.check(0..50), BlockStatus::None));
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Handling different acknowledgment states:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    ///
+    /// // Add both acknowledged and unacknowledged bad blocks
+    /// bad_blocks.set_bad(100..105, true)?;   // Acknowledged
+    /// bad_blocks.set_bad(200..205, false)?;  // Unacknowledged
+    ///
+    /// match bad_blocks.check(95..105) {
+    ///     BlockStatus::Acknowledged(range) => {
+    ///         pr_info!("Acknowledged bad blocks found, can potentially remap: {:?}", range);
+    ///         // Continue with remapping logic
+    ///     },
+    ///     BlockStatus::Unacknowledged(range) => {
+    ///         pr_err!("Unacknowledged bad blocks found, requires attention: {:?}", range);
+    ///         // Handle as error condition
+    ///     },
+    ///     BlockStatus::None => {
+    ///         // Safe to proceed with I/O
+    ///     },
+    /// }
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    ///
+    /// Safe I/O operation pattern:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// # use core::ops::RangeBounds;
+    /// # fn perform_sector_read(range: impl RangeBounds<u64>) -> Result<()> { Ok(()) }
+    /// fn safe_read_sectors(
+    ///     bad_blocks: &BadBlocks,
+    ///     range: impl RangeBounds<u64> + Clone
+    /// ) -> Result<()> {
+    ///     // Check for bad blocks before attempting I/O
+    ///     match bad_blocks.check(range.clone()) {
+    ///         BlockStatus::None => {
+    ///             // No bad blocks in the checked range: safe to proceed
+    ///             // with the I/O operation.
+    ///             perform_sector_read(range)
+    ///         },
+    ///         BlockStatus::Acknowledged(range) => {
+    ///             pr_warn!("I/O intersects acknowledged bad blocks: {:?}", range);
+    ///             // Potentially remap or skip bad sectors
+    ///             Err(EIO)
+    ///         },
+    ///         BlockStatus::Unacknowledged(range) => {
+    ///             pr_err!("I/O intersects unacknowledged bad blocks: {:?}", range);
+    ///             // Treat as serious error
+    ///             Err(EIO)
+    ///         },
+    ///     }
+    /// }
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn check(&self, range: impl RangeBounds<u64>) -> BlockStatus {
+        let mut first_bad = 0;
+        let mut bad_count = 0;
+        let range = Self::range(range);
+
+        // SAFETY: By type invariant `self.blocks` is valid. `first_bad` and
+        // `bad_count` are valid mutable references. The C function
+        // `badblocks_check` handles synchronization internally.
+        let ret = unsafe {
+            bindings::badblocks_check(
+                self.blocks.get(),
+                range.start,
+                range.end - range.start,
+                &mut first_bad,
+                &mut bad_count,
+            )
+        };
+
+        match ret {
+            0 => BlockStatus::None,
+            1 => BlockStatus::Acknowledged(first_bad..first_bad + bad_count),
+            -1 => BlockStatus::Unacknowledged(first_bad..first_bad + bad_count),
+            _ => {
+                debug_assert!(false, "Illegal return value from `badblocks_check`");
+                BlockStatus::None
+            }
+        }
+    }
+
+    /// Formats bad blocks information into a human-readable string.
+    ///
+    /// Exports the current bad blocks table to a text representation suitable
+    /// for display via sysfs. The output format shows each bad block range
+    /// with sector numbers and acknowledgment status.
+    ///
+    /// # Parameters
+    ///
+    /// - `page` - A page-sized buffer to write the formatted output into.
+    /// - `show_unacknowledged` - Whether to include unacknowledged bad blocks in output.
+    ///   - `true`: Shows both acknowledged and unacknowledged bad blocks
+    ///   - `false`: Shows only acknowledged bad blocks
+    ///
+    /// # Output Format
+    ///
+    /// The output consists of space-separated entries, each representing a bad block range:
+    /// - Format: `start_sector length [acknowledgment_status]`
+    /// - Acknowledged blocks: Just sector and length (e.g., "100 10")
+    /// - Unacknowledged blocks: Sector, length, and "u" suffix (e.g., "200 5 u")
+    ///
+    /// # Returns
+    ///
+    /// Returns the number of bytes written to the buffer, or a negative value on error.
+    /// The returned length can be used to extract the valid portion of the buffer.
+    ///
+    /// # Examples
+    ///
+    /// Basic usage:
+    ///
+    /// ```rust
+    /// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+    /// # use kernel::prelude::*;
+    /// # use kernel::page::PAGE_SIZE;
+    /// let bad_blocks = KBox::pin_init(BadBlocks::new(true), GFP_KERNEL)?;
+    /// let mut page = [0u8; PAGE_SIZE];
+    ///
+    /// // Add some bad blocks
+    /// bad_blocks.set_bad(100..110, true)?;   // Acknowledged
+    /// bad_blocks.set_bad(200..205, false)?;   // Unacknowledged
+    ///
+    /// // Show all bad blocks (including unacknowledged)
+    /// let len = bad_blocks.show(&mut page, true);
+    /// if len > 0 {
+    ///     let output = core::str::from_utf8(&page[..len as usize]).unwrap_or("<invalid utf8>");
+    ///     pr_info!("Bad blocks: {}", output);
+    ///     // Output might be: "100 10 200 5 u"
+    /// }
+    /// # Ok::<(), kernel::error::Error>(())
+    /// ```
+    pub fn show(&self, page: &mut [u8; PAGE_SIZE], show_unacknowledged: bool) -> isize {
+        // SAFETY: By type invariant `self.blocks` is valid. The C function
+        // `badblocks_show` handles synchronization internally.
+        // `page.as_mut_ptr()` returns a valid pointer to a PAGE_SIZE buffer.
+        // The C function will not write beyond the provided buffer size.
+        unsafe {
+            bindings::badblocks_show(
+                self.blocks.get(),
+                page.as_mut_ptr(),
+                if show_unacknowledged { 1 } else { 0 },
+            )
+        }
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for BadBlocks {
+    fn drop(self: Pin<&mut Self>) {
+        // SAFETY: We do not move out of `self` before it is dropped.
+        let this = unsafe { self.get_unchecked_mut() };
+        // SAFETY: By type invariant `this.blocks` is valid. `badblocks_exit` is
+        // safe to call during destruction and will properly clean up allocated
+        // resources.
+        unsafe { bindings::badblocks_exit(this.blocks.get()) };
+    }
+}
+
+// SAFETY: `BadBlocks` can be safely dropped from other threads.
+unsafe impl Send for BadBlocks {}
+
+// SAFETY: All `BadBlocks` methods use internal synchronization.
+unsafe impl Sync for BadBlocks {}
+
+/// Status of a sector range after checking for bad blocks.
+///
+/// This enum represents the result of checking a sector range against the bad blocks
+/// table. It distinguishes between ranges with no bad blocks, ranges with acknowledged
+/// bad blocks, and ranges with unacknowledged bad blocks.
+///
+/// # Examples
+///
+/// ```rust
+/// # use kernel::block::badblocks::{BadBlocks, BlockStatus};
+/// # use kernel::prelude::*;
+/// # use core::ops::{Range, RangeBounds};
+/// # fn perform_io(range: impl RangeBounds<u64>) -> Result<()> { Ok(()) }
+/// # fn remap_and_retry(io_range: impl RangeBounds<u64>, bad_range: Range<u64>)
+/// #     -> Result<()> { Ok(()) }
+/// fn handle_io_request(bad_blocks: &BadBlocks, range: impl RangeBounds<u64> + Clone)
+///   -> Result<()>
+/// {
+///     match bad_blocks.check(range.clone()) {
+///         BlockStatus::None => {
+///             // No bad blocks in the checked range: safe to proceed with the I/O.
+///             perform_io(range)
+///         },
+///         BlockStatus::Acknowledged(bad_range) => {
+///             pr_warn!("I/O overlaps acknowledged bad blocks: {:?}", bad_range);
+///             // Attempt remapping or alternative strategy
+///             remap_and_retry(range, bad_range)
+///         },
+///         BlockStatus::Unacknowledged(bad_range) => {
+///             pr_err!("I/O overlaps unacknowledged bad blocks: {:?}", bad_range);
+///             // Treat as serious error
+///             Err(EIO)
+///         },
+///     }
+/// }
+/// # Ok::<(), kernel::error::Error>(())
+/// ```
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub enum BlockStatus {
+    /// No bad blocks found in the checked range.
+    None,
+    /// The range contains acknowledged bad blocks.
+    ///
+    /// The contained range represents the first bad block
+    /// range encountered.
+    Acknowledged(Range<u64>),
+    /// The range contains unacknowledged bad blocks that need attention.
+    ///
+    /// The contained range represents the boundaries of the first bad block
+    /// range encountered.
+    Unacknowledged(Range<u64>),
+}

-- 
2.51.2





* [PATCH v2 28/83] block: rust: mq: add Request::end() method for custom status codes
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (26 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 27/83] block: rust: add `BadBlocks` for bad block tracking Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 29/83] block: rnull: add badblocks support Andreas Hindborg
                   ` (54 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add an end() method to Request that accepts a custom status code, and
refactor end_ok() to call it with BLK_STS_OK.
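
As a rough illustration of the delegation described above, here is a
standalone sketch. It does not use the kernel crate: `Request`, `BlkStatus`,
the status constants and `end_request` are hypothetical stand-ins, and the
methods return what they would report instead of calling into the block
layer.

```rust
// Hypothetical stand-ins for illustration only; not the kernel bindings.
type BlkStatus = u8;
const BLK_STS_OK: BlkStatus = 0;
const BLK_STS_IOERR: BlkStatus = 10;

/// Stand-in for an owned request. Taking `self` by value models that a
/// request can be handed back to the block layer only once.
struct Request {
    tag: u32,
}

impl Request {
    /// Complete the request with an explicit status and return what was
    /// reported (the real method would call into the block layer instead).
    fn end(self, status: BlkStatus) -> (u32, BlkStatus) {
        (self.tag, status)
    }

    /// Complete the request without errors by delegating to `end`.
    fn end_ok(self) -> (u32, BlkStatus) {
        self.end(BLK_STS_OK)
    }
}

/// Pick the completion path from a per-request failure flag.
fn end_request(rq: Request, failed: bool) -> (u32, BlkStatus) {
    if failed {
        rq.end(BLK_STS_IOERR)
    } else {
        rq.end_ok()
    }
}
```

Keeping `end_ok()` as a thin wrapper means existing callers are unaffected,
while a driver that needs to report a failure can pass a different status.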

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 9e176f015ab8..c06907dfe5b5 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -336,13 +336,18 @@ pub(crate) unsafe fn start_unchecked(&mut self) {
 
     /// Notify the block layer that the request has been completed without errors.
     pub fn end_ok(self) {
+        self.end(bindings::BLK_STS_OK)
+    }
+
+    /// Notify the block layer that the request has been completed.
+    pub fn end(self, status: u8) {
         let request_ptr = self.0.get().cast();
         core::mem::forget(self);
         // SAFETY: By type invariant, `this.0` was a valid `struct request`. The
         // existence of `self` guarantees that there are no `ARef`s pointing to
         // this request. Therefore it is safe to hand it back to the block
         // layer.
-        unsafe { bindings::blk_mq_end_request(request_ptr, bindings::BLK_STS_OK) };
+        unsafe { bindings::blk_mq_end_request(request_ptr, status) };
     }
 }
 

-- 
2.51.2





* [PATCH v2 29/83] block: rnull: add badblocks support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (27 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 28/83] block: rust: mq: add Request::end() method for custom status codes Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 30/83] block: rnull: add badblocks_once support Andreas Hindborg
                   ` (53 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add badblocks support to the rnull driver with a configfs interface for
managing bad sectors.

- Configfs attribute for adding/removing bad blocks via "+start-end" and
  "-start-end" syntax.
- Request handling that checks for bad blocks and returns IO errors.
- Updated request completion to handle error status properly.

The badblocks functionality is disabled by default and is enabled when
the first bad block is added.
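
The "+start-end" / "-start-end" syntax can be sketched in standalone Rust as
follows. This is a minimal illustration, not the driver code: `BadBlockOp`
and `parse_line` are hypothetical names, errors are plain strings rather than
kernel error codes, and unlike the store handler in this patch the sketch
rejects a line that has no '-' separator. The end sector is treated as
inclusive.

```rust
/// Operation requested by one line of input (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum BadBlockOp {
    /// Mark sectors `start..=end` as bad.
    Add { start: u64, end: u64 },
    /// Mark sectors `start..=end` as good again.
    Remove { start: u64, end: u64 },
}

/// Parse one "+start-end" or "-start-end" line.
fn parse_line(line: &str) -> Result<BadBlockOp, &'static str> {
    let mut chars = line.chars();

    // The first character selects the operation.
    let sign = match chars.next() {
        Some(c @ ('+' | '-')) => c,
        _ => return Err("line must start with '+' or '-'"),
    };

    // The remainder is "start-end"; split at the first '-'.
    let (start, end) = chars
        .as_str()
        .split_once('-')
        .ok_or("missing '-' between start and end")?;
    let start: u64 = start.parse().map_err(|_| "invalid start sector")?;
    let end: u64 = end.parse().map_err(|_| "invalid end sector")?;

    if start > end {
        return Err("start sector is greater than end sector");
    }

    Ok(if sign == '+' {
        BadBlockOp::Add { start, end }
    } else {
        BadBlockOp::Remove { start, end }
    })
}
```

With this sketch, "+100-109" yields an `Add` covering ten sectors and
"-100-109" yields the matching `Remove`.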

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs | 63 ++++++++++++++++++++++++++++++++++++++---
 drivers/block/rnull/rnull.rs    | 46 ++++++++++++++++++++++++++----
 2 files changed, 100 insertions(+), 9 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index d9aead646ae0..4db3ba26c2d1 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -6,9 +6,12 @@
 };
 use kernel::{
     bindings,
-    block::mq::gen_disk::{
-        GenDisk,
-        GenDiskBuilder, //
+    block::{
+        badblocks::BadBlocks,
+        mq::gen_disk::{
+            GenDisk,
+            GenDiskBuilder, //
+        }, //
     },
     configfs::{
         self,
@@ -26,7 +29,10 @@
         kstrtobool_bytes,
         CString, //
     },
-    sync::Mutex,
+    sync::{
+        Arc,
+        Mutex, //
+    },
     time, //
 };
 use macros::{
@@ -95,6 +101,7 @@ fn make_group(
                 home_node: 9,
                 discard: 10,
                 no_sched:11,
+                badblocks: 12,
             ],
         };
 
@@ -117,6 +124,7 @@ fn make_group(
                     home_node: bindings::NUMA_NO_NODE,
                     discard: false,
                     no_sched: false,
+                    bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                 }),
             }),
             core::iter::empty(),
@@ -186,6 +194,7 @@ struct DeviceConfigInner {
     home_node: i32,
     discard: bool,
     no_sched: bool,
+    bad_blocks: Arc<BadBlocks>,
 }
 
 #[vtable]
@@ -221,6 +230,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 home_node: guard.home_node,
                 discard: guard.discard,
                 no_sched: guard.no_sched,
+                bad_blocks: guard.bad_blocks.clone(),
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -328,3 +338,48 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 );
 
 configfs_simple_bool_field!(DeviceConfig, 11, no_sched);
+
+#[vtable]
+impl configfs::AttributeOperations<12> for DeviceConfig {
+    type Data = DeviceConfig;
+
+    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
+        let ret = this.data.lock().bad_blocks.show(page, false);
+        if ret < 0 {
+            Err(Error::from_errno(ret as c_int))
+        } else {
+            Ok(ret as usize)
+        }
+    }
+
+    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
+        // This attribute can be set while the device is powered.
+
+        for line in core::str::from_utf8(page)?.lines() {
+            let mut chars = line.chars();
+            match chars.next() {
+                Some(sign @ '+' | sign @ '-') => {
+                    if let Some((start, end)) = chars.as_str().split_once('-') {
+                        let start: u64 = start.parse().map_err(|_| EINVAL)?;
+                        let end: u64 = end.parse().map_err(|_| EINVAL)?;
+
+                        if start > end {
+                            return Err(EINVAL);
+                        }
+
+                        this.data.lock().bad_blocks.enable();
+
+                        if sign == '+' {
+                            this.data.lock().bad_blocks.set_bad(start..=end, true)?;
+                        } else {
+                            this.data.lock().bad_blocks.set_good(start..=end)?;
+                        }
+                    }
+                }
+                _ => return Err(EINVAL),
+            }
+        }
+
+        Ok(())
+    }
+}
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 73f14d6e379f..90dbf318c2f8 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -9,6 +9,7 @@
     bindings,
     block::{
         self,
+        badblocks::{self, BadBlocks},
         bio::Segment,
         mq::{
             self,
@@ -38,6 +39,10 @@
     str::CString,
     sync::{
         aref::ARef,
+        atomic::{
+            ordering,
+            Atomic, //
+        },
         Arc,
         Mutex, //
     },
@@ -153,6 +158,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     home_node: module_parameters::home_node.value(),
                     discard: module_parameters::discard.value(),
                     no_sched: module_parameters::no_sched.value(),
+                    bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -179,6 +185,7 @@ struct NullBlkOptions<'a> {
     home_node: i32,
     discard: bool,
     no_sched: bool,
+    bad_blocks: Arc<BadBlocks>,
 }
 struct NullBlkDevice;
 
@@ -196,6 +203,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             home_node,
             discard,
             no_sched,
+            bad_blocks,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
@@ -237,6 +245,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
                 completion_time,
                 memory_backed,
                 block_size: block_size.into(),
+                bad_blocks,
             }),
             GFP_KERNEL,
         )?;
@@ -351,6 +360,16 @@ fn transfer(
         }
         Ok(())
     }
+
+    fn end_request(rq: Owned<mq::Request<Self>>) {
+        let status = rq.data_ref().error.load(ordering::Relaxed);
+        rq.data_ref().error.store(0, ordering::Relaxed);
+
+        match status {
+            0 => rq.end_ok(),
+            _ => rq.end(bindings::BLK_STS_IOERR),
+        }
+    }
 }
 
 static_assert!((PAGE_SIZE >> SECTOR_SHIFT) <= 64);
@@ -396,12 +415,14 @@ struct QueueData {
     completion_time: Delta,
     memory_backed: bool,
     block_size: u64,
+    bad_blocks: Arc<BadBlocks>,
 }
 
 #[pin_data]
 struct Pdu {
     #[pin]
     timer: kernel::time::hrtimer::HrTimer<Self>,
+    error: Atomic<u32>,
 }
 
 impl HrTimerCallback for Pdu {
@@ -431,6 +452,7 @@ impl Operations for NullBlkDevice {
     fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
             timer <- kernel::time::hrtimer::HrTimer::new(),
+            error: Atomic::new(0),
         })
     }
 
@@ -440,6 +462,19 @@ fn queue_rq(
         mut rq: Owned<mq::Request<Self>>,
         _is_last: bool,
     ) -> Result {
+        if queue_data.bad_blocks.enabled() {
+            let start = rq.sector();
+            let end = start + u64::from(rq.sectors());
+            if !matches!(
+                queue_data.bad_blocks.check(start..end),
+                badblocks::BlockStatus::None
+            ) {
+                rq.data_ref().error.store(1, ordering::Relaxed);
+            }
+        }
+
+        // TODO: Skip IO if bad block.
+
         if queue_data.memory_backed {
             memalloc_scope!(let _noio: NoIo);
             let tree = &queue_data.tree;
@@ -461,7 +496,7 @@ fn queue_rq(
         }
 
         match queue_data.irq_mode {
-            IRQMode::None => rq.end_ok(),
+            IRQMode::None => Self::end_request(rq),
             IRQMode::Soft => mq::Request::complete(rq.into()),
             IRQMode::Timer => {
                 OwnableRefCounted::into_shared(rq)
@@ -475,9 +510,10 @@ fn queue_rq(
     fn commit_rqs(_queue_data: Pin<&QueueData>) {}
 
     fn complete(rq: ARef<mq::Request<Self>>) {
-        OwnableRefCounted::try_from_shared(rq)
-            .map_err(|_e| kernel::error::code::EIO)
-            .expect("Failed to complete request")
-            .end_ok();
+        Self::end_request(
+            OwnableRefCounted::try_from_shared(rq)
+                .map_err(|_e| kernel::error::code::EIO)
+                .expect("Failed to complete request"),
+        )
     }
 }

-- 
2.51.2





* [PATCH v2 30/83] block: rnull: add badblocks_once support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (28 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 29/83] block: rnull: add badblocks support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 31/83] block: rust: add `Segment::truncate` Andreas Hindborg
                   ` (52 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for the badblocks_once feature, which automatically clears
bad blocks after they are encountered during I/O operations. This
matches the functionality in the C null_blk driver.

When badblocks_once is enabled:
- Bad blocks are checked during I/O requests as usual
- If a bad block is encountered, the I/O is marked as failed
- The bad block range is immediately cleared from the bad blocks table
- Subsequent I/O to the same sectors will succeed

This feature is useful for testing scenarios where bad blocks are
transient or where devices can recover from bad sectors after a single
access attempt.

The feature is configurable via the configfs badblocks_once attribute
and disabled by default, maintaining compatibility with existing
behavior.
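
The fail-once behavior listed above can be modeled with a toy table in
standalone Rust. This is only a sketch under simplifying assumptions:
`ToyBadBlocks` and `submit` are hypothetical, the table is a plain vector
of half-open sector ranges, and clearing removes the whole entry that was
hit. It does not reflect how the kernel badblocks table is stored.

```rust
use std::ops::Range;

/// Toy bad-block table: disjoint half-open sector ranges (hypothetical).
struct ToyBadBlocks {
    ranges: Vec<Range<u64>>,
}

impl ToyBadBlocks {
    /// Return the lowest-starting bad range that overlaps `io`, if any.
    fn check(&self, io: &Range<u64>) -> Option<Range<u64>> {
        self.ranges
            .iter()
            .filter(|r| r.start < io.end && io.start < r.end)
            .min_by_key(|r| r.start)
            .cloned()
    }

    /// Remove a range previously returned by `check`.
    fn set_good(&mut self, range: &Range<u64>) {
        self.ranges.retain(|r| r != range);
    }
}

/// Returns `true` if the I/O succeeds. With `once` set, a failing I/O
/// clears the bad range it hit, so a retry can succeed.
fn submit(table: &mut ToyBadBlocks, io: Range<u64>, once: bool) -> bool {
    match table.check(&io) {
        None => true,
        Some(bad) => {
            if once {
                table.set_good(&bad);
            }
            false
        }
    }
}
```

In this model, the first request that touches a bad range fails in both
modes; only with `once` set does the same request succeed when retried.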

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |  6 ++++++
 drivers/block/rnull/rnull.rs    | 21 +++++++++++++++------
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 4db3ba26c2d1..05229ba9173a 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -102,6 +102,7 @@ fn make_group(
                 discard: 10,
                 no_sched:11,
                 badblocks: 12,
+                badblocks_once: 13,
             ],
         };
 
@@ -125,6 +126,7 @@ fn make_group(
                     discard: false,
                     no_sched: false,
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
+                    bad_blocks_once: false,
                 }),
             }),
             core::iter::empty(),
@@ -195,6 +197,7 @@ struct DeviceConfigInner {
     discard: bool,
     no_sched: bool,
     bad_blocks: Arc<BadBlocks>,
+    bad_blocks_once: bool,
 }
 
 #[vtable]
@@ -231,6 +234,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 discard: guard.discard,
                 no_sched: guard.no_sched,
                 bad_blocks: guard.bad_blocks.clone(),
+                bad_blocks_once: guard.bad_blocks_once,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -383,3 +387,5 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         Ok(())
     }
 }
+
+configfs_simple_bool_field!(DeviceConfig, 13, bad_blocks_once);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 90dbf318c2f8..5486eb6dd921 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -159,6 +159,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     discard: module_parameters::discard.value(),
                     no_sched: module_parameters::no_sched.value(),
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
+                    bad_blocks_once: false,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -186,6 +187,7 @@ struct NullBlkOptions<'a> {
     discard: bool,
     no_sched: bool,
     bad_blocks: Arc<BadBlocks>,
+    bad_blocks_once: bool,
 }
 struct NullBlkDevice;
 
@@ -204,6 +206,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             discard,
             no_sched,
             bad_blocks,
+            bad_blocks_once,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
@@ -246,6 +249,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
                 memory_backed,
                 block_size: block_size.into(),
                 bad_blocks,
+                bad_blocks_once,
             }),
             GFP_KERNEL,
         )?;
@@ -416,6 +420,7 @@ struct QueueData {
     memory_backed: bool,
     block_size: u64,
     bad_blocks: Arc<BadBlocks>,
+    bad_blocks_once: bool,
 }
 
 #[pin_data]
@@ -465,12 +470,16 @@ fn queue_rq(
         if queue_data.bad_blocks.enabled() {
             let start = rq.sector();
             let end = start + u64::from(rq.sectors());
-            if !matches!(
-                queue_data.bad_blocks.check(start..end),
-                badblocks::BlockStatus::None
-            ) {
-                rq.data_ref().error.store(1, ordering::Relaxed);
-            }
+            match queue_data.bad_blocks.check(start..end) {
+                badblocks::BlockStatus::None => {}
+                badblocks::BlockStatus::Acknowledged(range)
+                | badblocks::BlockStatus::Unacknowledged(range) => {
+                    rq.data_ref().error.store(1, ordering::Relaxed);
+                    if queue_data.bad_blocks_once {
+                        queue_data.bad_blocks.set_good(range)?;
+                    }
+                }
+            };
         }
 
         // TODO: Skip IO if bad block.

-- 
2.51.2





* [PATCH v2 31/83] block: rust: add `Segment::truncate`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (29 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 30/83] block: rnull: add badblocks_once support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 32/83] block: rnull: add partial I/O support for bad blocks Andreas Hindborg
                   ` (51 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method that limits the remaining length of a `Segment` without
moving its offset. This complements `Segment::advance`, which can skip
data at the front but cannot trim data at the back, and gives callers a
way to clip a segment to a maximum byte count before handing it to the
existing `copy_to_page` / `copy_from_page` / `zero_page` helpers, which
already bound themselves by `Segment::len()`.

This is needed by rnull's partial bad-block I/O path, which needs to
clamp per-segment work to a sector boundary computed from the bad-block
range.
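
The difference between skipping at the front and clipping at the back can
be illustrated with a standalone model. `ToySegment` is a hypothetical
stand-in with a byte offset and a remaining length; it is not the kernel
`Segment` type, and the error behavior of `advance` here is an assumption
made for the sketch.

```rust
/// Hypothetical stand-in for a bio segment, measured in bytes.
#[derive(Debug, PartialEq, Eq)]
struct ToySegment {
    offset: u32,
    len: u32,
}

impl ToySegment {
    /// Skip `count` bytes at the front. Fails if `count` exceeds the
    /// remaining length (assumed behavior for this sketch).
    fn advance(&mut self, count: u32) -> Result<(), &'static str> {
        if count > self.len {
            return Err("cannot advance past the end of the segment");
        }
        self.offset += count;
        self.len -= count;
        Ok(())
    }

    /// Clip the remaining length to at most `new_len` bytes. The segment
    /// never grows and the offset is left untouched.
    fn truncate(&mut self, new_len: u32) {
        if new_len < self.len {
            self.len = new_len;
        }
    }
}
```

Truncating a 4096-byte segment to 1024 bytes leaves the offset at its
current position, and a later call with a larger value is a no-op.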

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/bio/vec.rs | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/rust/kernel/block/bio/vec.rs b/rust/kernel/block/bio/vec.rs
index 99ab164d4038..61d83a07397f 100644
--- a/rust/kernel/block/bio/vec.rs
+++ b/rust/kernel/block/bio/vec.rs
@@ -81,6 +81,18 @@ pub fn advance(&mut self, count: u32) -> Result {
         Ok(())
     }
 
+    /// Limit the remaining length of the segment.
+    ///
+    /// Shortens the segment to at most `new_len` bytes. If `new_len` is
+    /// greater than or equal to the current remaining length, the segment is
+    /// left unchanged. The offset is not modified, so subsequent copy
+    /// operations still start from the current position.
+    pub fn truncate(&mut self, new_len: u32) {
+        if new_len < self.len() {
+            self.bio_vec.bv_len = new_len;
+        }
+    }
+
     /// Copy data of this segment into `dst_page`.
     ///
     /// Copies data from the current offset to the next page boundary. That is `PAGE_SIZE -

-- 
2.51.2





* [PATCH v2 32/83] block: rnull: add partial I/O support for bad blocks
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (30 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 31/83] block: rust: add `Segment::truncate` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 33/83] block: rust: add `TagSet` private data support Andreas Hindborg
                   ` (50 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a bad_blocks_partial_io configuration option that allows partial I/O
completion when encountering bad blocks, rather than failing the entire
request.

When enabled, requests are truncated to stop before the first bad block
range, allowing the valid portion to be processed successfully. This
improves compatibility with applications that can handle partial
reads/writes.
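
The truncation rule can be sketched as a small standalone calculation. The
helper names below are hypothetical and the code is not taken from the
driver; it assumes half-open sector ranges with `start <= end` and the usual
512-byte sector size (`SECTOR_SHIFT` of 9).

```rust
use std::ops::Range;

/// 512-byte sectors.
const SECTOR_SHIFT: u32 = 9;

/// Number of sectors at the start of `io` that can be transferred before
/// reaching `first_bad`. Returns 0 when the request starts inside the bad
/// range, and the full request length when the ranges do not overlap.
fn sectors_before_bad(io: &Range<u64>, first_bad: &Range<u64>) -> u64 {
    if first_bad.start >= io.end || first_bad.end <= io.start {
        // No overlap: the whole request can proceed.
        return io.end - io.start;
    }
    // Overlap: stop at the first bad sector.
    first_bad.start.saturating_sub(io.start)
}

/// Convert a sector count to a byte count.
fn sectors_to_bytes(sectors: u64) -> u64 {
    sectors << SECTOR_SHIFT
}
```

For a request covering sectors 90..120 with a bad range at 100..110, the
sketch allows the first 10 sectors (5120 bytes) to be processed.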

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |   5 ++
 drivers/block/rnull/rnull.rs    | 126 +++++++++++++++++++++++++++++-----------
 2 files changed, 97 insertions(+), 34 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 05229ba9173a..0e9fe8cdc07f 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -103,6 +103,7 @@ fn make_group(
                 no_sched:11,
                 badblocks: 12,
                 badblocks_once: 13,
+                badblocks_partial_io: 14,
             ],
         };
 
@@ -127,6 +128,7 @@ fn make_group(
                     no_sched: false,
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
+                    bad_blocks_partial_io: false,
                 }),
             }),
             core::iter::empty(),
@@ -198,6 +200,7 @@ struct DeviceConfigInner {
     no_sched: bool,
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
+    bad_blocks_partial_io: bool,
 }
 
 #[vtable]
@@ -235,6 +238,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 no_sched: guard.no_sched,
                 bad_blocks: guard.bad_blocks.clone(),
                 bad_blocks_once: guard.bad_blocks_once,
+                bad_blocks_partial_io: guard.bad_blocks_partial_io,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -389,3 +393,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 }
 
 configfs_simple_bool_field!(DeviceConfig, 13, bad_blocks_once);
+configfs_simple_bool_field!(DeviceConfig, 14, bad_blocks_partial_io);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 5486eb6dd921..be0b4bd25e53 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -160,6 +160,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     no_sched: module_parameters::no_sched.value(),
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
+                    bad_blocks_partial_io: false,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -188,6 +189,7 @@ struct NullBlkOptions<'a> {
     no_sched: bool,
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
+    bad_blocks_partial_io: bool,
 }
 struct NullBlkDevice;
 
@@ -207,6 +209,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             no_sched,
             bad_blocks,
             bad_blocks_once,
+            bad_blocks_partial_io,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
@@ -250,6 +253,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
                 block_size: block_size.into(),
                 bad_blocks,
                 bad_blocks_once,
+                bad_blocks_partial_io,
             }),
             GFP_KERNEL,
         )?;
@@ -352,15 +356,66 @@ fn discard(tree: &XArray<TreeNode>, mut sector: u64, sectors: u64, block_size: u
 
     #[inline(never)]
     fn transfer(
-        command: bindings::req_op,
+        rq: &mut Owned<mq::Request<Self>>,
         tree: &XArray<TreeNode>,
-        sector: u64,
-        segment: Segment<'_>,
+        max_sectors: u32,
     ) -> Result {
-        match command {
-            bindings::req_op_REQ_OP_WRITE => Self::write(tree, sector, segment)?,
-            bindings::req_op_REQ_OP_READ => Self::read(tree, sector, segment)?,
-            _ => (),
+        let mut sector = rq.sector();
+        let max_end_sector = sector + <u32 as Into<u64>>::into(max_sectors);
+        let command = rq.command();
+
+        for bio in rq.bio_iter_mut() {
+            let segment_iter = bio.segment_iter();
+            for mut segment in segment_iter {
+                // Length might be limited by bad blocks.
+                let segment_length_sectors = segment.len() >> SECTOR_SHIFT;
+                let max_remaining_sectors = (max_end_sector - sector) as u32;
+                let length_sectors_allowed = segment_length_sectors.min(max_remaining_sectors);
+                segment.truncate(length_sectors_allowed << SECTOR_SHIFT);
+                match command {
+                    bindings::req_op_REQ_OP_WRITE => Self::write(tree, sector, segment)?,
+                    bindings::req_op_REQ_OP_READ => Self::read(tree, sector, segment)?,
+                    _ => (),
+                }
+                sector += u64::from(length_sectors_allowed);
+
+                if sector >= max_end_sector {
+                    return Ok(());
+                }
+            }
+        }
+        Ok(())
+    }
+
+    fn handle_bad_blocks(
+        rq: &mut Owned<mq::Request<Self>>,
+        queue_data: &QueueData,
+        sectors: &mut u32,
+    ) -> Result {
+        if queue_data.bad_blocks.enabled() {
+            let start = rq.sector();
+            let end = start + u64::from(*sectors);
+            match queue_data.bad_blocks.check(start..end) {
+                badblocks::BlockStatus::None => {}
+                badblocks::BlockStatus::Acknowledged(mut range)
+                | badblocks::BlockStatus::Unacknowledged(mut range) => {
+                    rq.data_ref().error.store(1, ordering::Relaxed);
+
+                    if queue_data.bad_blocks_once {
+                        queue_data.bad_blocks.set_good(range.clone())?;
+                    }
+
+                    if queue_data.bad_blocks_partial_io {
+                        let block_size_sectors = queue_data.block_size >> SECTOR_SHIFT;
+                        range.start = align_down(range.start, block_size_sectors);
+                        if start < range.start {
+                            *sectors = (range.start - start) as u32;
+                        }
+                    } else {
+                        *sectors = 0;
+                    }
+                }
+            };
         }
         Ok(())
     }
@@ -421,6 +476,7 @@ struct QueueData {
     block_size: u64,
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
+    bad_blocks_partial_io: bool,
 }
 
 #[pin_data]
@@ -449,6 +505,30 @@ impl HasHrTimer<Self> for Pdu {
     }
 }
 
+fn is_power_of_two<T>(value: T) -> bool
+where
+    T: core::ops::Sub<T, Output = T>,
+    T: core::ops::BitAnd<Output = T>,
+    T: core::cmp::PartialOrd<T>,
+    T: Copy,
+    T: From<u8>,
+{
+    (value > 0u8.into()) && (value & (value - 1u8.into())) == 0u8.into()
+}
+
+fn align_down<T>(value: T, to: T) -> T
+where
+    T: core::ops::Sub<T, Output = T>,
+    T: core::ops::Not<Output = T>,
+    T: core::ops::BitAnd<Output = T>,
+    T: core::cmp::PartialOrd<T>,
+    T: Copy,
+    T: From<u8>,
+{
+    debug_assert!(is_power_of_two(to));
+    value & !(to - 1u8.into())
+}
+
 #[vtable]
 impl Operations for NullBlkDevice {
     type QueueData = Pin<KBox<QueueData>>;
@@ -467,40 +547,18 @@ fn queue_rq(
         mut rq: Owned<mq::Request<Self>>,
         _is_last: bool,
     ) -> Result {
-        if queue_data.bad_blocks.enabled() {
-            let start = rq.sector();
-            let end = start + u64::from(rq.sectors());
-            match queue_data.bad_blocks.check(start..end) {
-                badblocks::BlockStatus::None => {}
-                badblocks::BlockStatus::Acknowledged(range)
-                | badblocks::BlockStatus::Unacknowledged(range) => {
-                    rq.data_ref().error.store(1, ordering::Relaxed);
-                    if queue_data.bad_blocks_once {
-                        queue_data.bad_blocks.set_good(range)?;
-                    }
-                }
-            };
-        }
+        let mut sectors = rq.sectors();
 
-        // TODO: Skip IO if bad block.
+        Self::handle_bad_blocks(&mut rq, queue_data.get_ref(), &mut sectors)?;
 
         if queue_data.memory_backed {
             memalloc_scope!(let _noio: NoIo);
             let tree = &queue_data.tree;
-            let command = rq.command();
-            let mut sector = rq.sector();
 
-            if command == bindings::req_op_REQ_OP_DISCARD {
-                Self::discard(tree, sector, rq.sectors().into(), queue_data.block_size)?;
+            if rq.command() == bindings::req_op_REQ_OP_DISCARD {
+                Self::discard(tree, rq.sector(), sectors.into(), queue_data.block_size)?;
             } else {
-                for bio in rq.bio_iter_mut() {
-                    let segment_iter = bio.segment_iter();
-                    for segment in segment_iter {
-                        let length = segment.len();
-                        Self::transfer(command, tree, sector, segment)?;
-                        sector += u64::from(length) >> block::SECTOR_SHIFT;
-                    }
-                }
+                Self::transfer(&mut rq, tree, sectors)?;
             }
         }
 

-- 
2.51.2





* [PATCH v2 33/83] block: rust: add `TagSet` private data support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (31 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 32/83] block: rnull: add partial I/O support for bad blocks Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 34/83] block: rust: add `hctx` " Andreas Hindborg
                   ` (49 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Andreas Hindborg

From: Andreas Hindborg <a.hindborg@samsung.com>

C block device drivers can attach private data to a `struct
blk_mq_tag_set`. Add support for this feature for Rust block device
drivers via the `Operations::TagSetData` associated type.

The private data is passed to `TagSet::new` and is stored in the
`driver_data` field of the underlying `struct blk_mq_tag_set`. It is
released when the `TagSet` is dropped.
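
The ownership hand-off described above can be sketched in plain Rust. This
is an illustration only: `Box::into_raw` and `Box::from_raw` stand in for
the kernel's `ForeignOwnable::into_foreign` and `from_foreign`, and
`RawTagSet` and `TagSetSketch` are hypothetical types, not the kernel API.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

/// Stand-in for a C struct that carries a `void *driver_data` field.
struct RawTagSet {
    driver_data: *mut c_void,
}

/// Owns a `RawTagSet` whose `driver_data` points at a heap-allocated `T`.
struct TagSetSketch<T> {
    raw: RawTagSet,
    _p: PhantomData<T>,
}

impl<T> TagSetSketch<T> {
    /// Move `data` to the heap and store it as an untyped pointer.
    fn new(data: T) -> Self {
        let ptr = Box::into_raw(Box::new(data)).cast::<c_void>();
        Self {
            raw: RawTagSet { driver_data: ptr },
            _p: PhantomData,
        }
    }

    /// Borrow the private data without taking ownership back.
    fn data(&self) -> &T {
        // SAFETY: `driver_data` came from `Box::into_raw` in `new` and is
        // only freed in `drop`.
        unsafe { &*self.raw.driver_data.cast::<T>() }
    }
}

impl<T> Drop for TagSetSketch<T> {
    fn drop(&mut self) {
        // SAFETY: `driver_data` came from `Box::into_raw` in `new` and has
        // not been freed before this point.
        drop(unsafe { Box::from_raw(self.raw.driver_data.cast::<T>()) });
    }
}
```

The typed pointer is recovered exactly once, in `drop`, which is what keeps
the private data alive for as long as the owning structure exists.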

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       |  3 ++-
 rust/kernel/block/mq.rs            |  6 ++++--
 rust/kernel/block/mq/operations.rs |  4 ++++
 rust/kernel/block/mq/tag_set.rs    | 26 ++++++++++++++++++++++----
 4 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index be0b4bd25e53..ad26a4a8dbbe 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -240,7 +240,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
         }
 
         let tagset = Arc::pin_init(
-            TagSet::new(submit_queues, 256, 1, numa_node, flags),
+            TagSet::new(submit_queues, (), 256, 1, numa_node, flags),
             GFP_KERNEL,
         )?;
 
@@ -533,6 +533,7 @@ fn align_down<T>(value: T, to: T) -> T
 impl Operations for NullBlkDevice {
     type QueueData = Pin<KBox<QueueData>>;
     type RequestData = Pdu;
+    type TagSetData = ();
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index bac15b509d90..28cee0d60846 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -71,6 +71,7 @@
 //! impl Operations for MyBlkDevice {
 //!     type RequestData = ();
 //!     type QueueData = ();
+//!     type TagSetData = ();
 //!
 //!     fn new_request_data(
 //!     ) -> impl PinInit<()> {
@@ -94,8 +95,9 @@
 //!
 //! let tagset: Arc<TagSet<MyBlkDevice>> =
 //!     Arc::pin_init(
-//!         TagSet::new(1, 256, 1, NumaNode::NO_NODE, mq::tag_set::Flags::default()),
-//!         GFP_KERNEL)?;
+//!         TagSet::new(1, (), 256, 1, NumaNode::NO_NODE, mq::tag_set::Flags::default()),
+//!         GFP_KERNEL
+//!     )?;
 //! let mut disk = gen_disk::GenDiskBuilder::new()
 //!     .capacity_sectors(4096)
 //!     .build(fmt!("myblk"), tagset, ())?;
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index c49ca2e8bbb2..093bb21fa1b2 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -63,6 +63,10 @@ pub trait Operations: Sized {
     /// the `GenDisk` associated with this `Operations` implementation.
     type QueueData: ForeignOwnable + Sync;
 
+    /// Data associated with a `TagSet`. This is stored as a pointer in `struct
+    /// blk_mq_tag_set`.
+    type TagSetData: ForeignOwnable + Sync;
+
     /// Called by the kernel to get an initializer for a `Pin<&mut RequestData>`.
     fn new_request_data() -> impl PinInit<Self::RequestData>;
 
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index d6d104adf4aa..bfb8f8af4ee1 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -19,7 +19,10 @@
         Result, //
     },
     prelude::*,
-    types::Opaque,
+    types::{
+        ForeignOwnable,
+        Opaque, //
+    },
 };
 use core::{
     convert::TryInto,
@@ -56,6 +59,7 @@ impl<T: Operations> TagSet<T> {
     /// Try to create a new tag set
     pub fn new(
         nr_hw_queues: u32,
+        tagset_data: T::TagSetData,
         num_tags: u32,
         num_maps: u32,
         numa_node: NumaNode,
@@ -73,7 +77,7 @@ pub fn new(
                     queue_depth: num_tags,
                     cmd_size,
                     flags: flags.into(),
-                    driver_data: core::ptr::null_mut::<c_void>(),
+                    driver_data: tagset_data.into_foreign(),
                     nr_maps: num_maps,
                     ..tag_set
                 }
@@ -86,7 +90,14 @@ pub fn new(
                 // SAFETY: we do not move out of `tag_set`.
                 let tag_set: &mut Opaque<_> = unsafe { Pin::get_unchecked_mut(tag_set) };
                 // SAFETY: `tag_set` is a reference to an initialized `blk_mq_tag_set`.
-                error::to_result( unsafe { bindings::blk_mq_alloc_tag_set(tag_set.get())})
+                let status = error::to_result(
+                    unsafe { bindings::blk_mq_alloc_tag_set(tag_set.get())}
+                );
+                if status.is_err() {
+                    // SAFETY: We created `driver_data` above with `into_foreign`
+                    unsafe { T::TagSetData::from_foreign((*tag_set.get()).driver_data) };
+                }
+                status
             }),
             _p: PhantomData,
         })
@@ -102,7 +113,14 @@ pub(crate) fn raw_tag_set(&self) -> *mut bindings::blk_mq_tag_set {
 impl<T: Operations> PinnedDrop for TagSet<T> {
     fn drop(self: Pin<&mut Self>) {
         // SAFETY: By type invariant `inner` is valid and has been properly
-        // initialized during construction.
+        // initialised during construction.
+        let tagset_data = unsafe { (*self.inner.get()).driver_data };
+
+        // SAFETY: `inner` is valid and has been properly initialised during construction.
         unsafe { bindings::blk_mq_free_tag_set(self.inner.get()) };
+
+        // SAFETY: `tagset_data` was created by a call to
+        // `ForeignOwnable::into_foreign` in `TagSet::new()`.
+        unsafe { T::TagSetData::from_foreign(tagset_data) };
     }
 }

-- 
2.51.2





* [PATCH v2 34/83] block: rust: add `hctx` private data support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (32 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 33/83] block: rust: add `TagSet` private data support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 35/83] block: rnull: add volatile cache emulation Andreas Hindborg
                   ` (48 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Andreas Hindborg

From: Andreas Hindborg <a.hindborg@samsung.com>

C block device drivers can attach private data to a hardware context
(`struct blk_mq_hw_ctx`). Add support for this feature for Rust block
device drivers via the `Operations::HwData` associated type.

The private data is created in the `init_hctx` callback and stored in
the `driver_data` field of `blk_mq_hw_ctx`. It is passed to `queue_rq`,
`commit_rqs`, and `poll` callbacks, and is released in `exit_hctx`.
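
The lifecycle above (create in `init_hctx`, borrow in the request path,
release in `exit_hctx`) can be sketched in plain Rust. `HwCtx`, `HwData`,
and the free functions below are hypothetical stand-ins for the C struct
and the vtable callbacks; `Box` stands in for a `ForeignOwnable` type.

```rust
use std::ffi::c_void;
use std::ptr;
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for `struct blk_mq_hw_ctx` with its `driver_data` field.
struct HwCtx {
    driver_data: *mut c_void,
}

/// Hypothetical per-hardware-queue private data.
struct HwData {
    index: u32,
    queued: AtomicU32,
}

/// Allocate the private data and publish it through `driver_data`.
fn init_hctx(hctx: &mut HwCtx, hctx_idx: u32) {
    let data = Box::new(HwData {
        index: hctx_idx,
        queued: AtomicU32::new(0),
    });
    hctx.driver_data = Box::into_raw(data).cast();
}

/// Borrow the private data for one request; ownership stays with `hctx`.
/// Returns the queue index and the number of requests seen so far.
fn queue_rq(hctx: &HwCtx) -> (u32, u32) {
    // SAFETY: `driver_data` was set by `init_hctx` and is not freed until
    // `exit_hctx` runs.
    let hw_data: &HwData = unsafe { &*hctx.driver_data.cast::<HwData>() };
    let seen = hw_data.queued.fetch_add(1, Ordering::Relaxed) + 1;
    (hw_data.index, seen)
}

/// Take ownership back and free the private data.
fn exit_hctx(hctx: &mut HwCtx) {
    let raw = std::mem::replace(&mut hctx.driver_data, ptr::null_mut());
    if !raw.is_null() {
        // SAFETY: `raw` came from `Box::into_raw` in `init_hctx`.
        drop(unsafe { Box::from_raw(raw.cast::<HwData>()) });
    }
}
```

The shared borrow in `queue_rq` is why the data uses an atomic counter
here: several callers may observe the same private data concurrently.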

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       |  8 +++-
 rust/kernel/block/mq.rs            | 23 +++++++++-
 rust/kernel/block/mq/operations.rs | 88 +++++++++++++++++++++++++++++++-------
 3 files changed, 100 insertions(+), 19 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index ad26a4a8dbbe..0c1bc2f5ae9c 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -534,6 +534,7 @@ impl Operations for NullBlkDevice {
     type QueueData = Pin<KBox<QueueData>>;
     type RequestData = Pdu;
     type TagSetData = ();
+    type HwData = ();
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
@@ -544,6 +545,7 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
 
     #[inline(always)]
     fn queue_rq(
+        _hw_data: (),
         queue_data: Pin<&QueueData>,
         mut rq: Owned<mq::Request<Self>>,
         _is_last: bool,
@@ -575,7 +577,11 @@ fn queue_rq(
         Ok(())
     }
 
-    fn commit_rqs(_queue_data: Pin<&QueueData>) {}
+    fn commit_rqs(_hw_data: (), _queue_data: Pin<&QueueData>) {}
+
+    fn init_hctx(_tagset_data: (), _hctx_idx: u32) -> Result {
+        Ok(())
+    }
 
     fn complete(rq: ARef<mq::Request<Self>>) {
         Self::end_request(
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 28cee0d60846..b095cc7f51ce 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -17,6 +17,12 @@
 //! - The [`GenDisk`] type that abstracts the C type `struct gendisk`.
 //! - The [`Request`] type that abstracts the C type `struct request`.
 //!
+//! Many of the C types that this module abstracts allow a driver to carry
+//! private data, either embedded in the struct directly, or as a C `void*`. In
+//! these abstractions, this data is typed. The types of the data are defined by
+//! associated types in `Operations`, see [`Operations::RequestData`] for an
+//! example.
+//!
 //! The kernel will interface with the block device driver by calling the method
 //! implementations of the `Operations` trait.
 //!
@@ -71,6 +77,7 @@
 //! impl Operations for MyBlkDevice {
 //!     type RequestData = ();
 //!     type QueueData = ();
+//!     type HwData = ();
 //!     type TagSetData = ();
 //!
 //!     fn new_request_data(
@@ -78,12 +85,17 @@
 //!         Ok(())
 //!     }
 //!
-//!     fn queue_rq(_queue_data: (), rq: Owned<Request<Self>>, _is_last: bool) -> Result {
+//!     fn queue_rq(
+//!         _hw_data: (),
+//!         _queue_data: (),
+//!         rq: Owned<Request<Self>>,
+//!         _is_last: bool
+//!     ) -> Result {
 //!         rq.end_ok();
 //!         Ok(())
 //!     }
 //!
-//!     fn commit_rqs(_queue_data: ()) {}
+//!     fn commit_rqs(_hw_data: (), _queue_data: ()) {}
 //!
 //!     fn complete(rq: ARef<Request<Self>>) {
 //!         OwnableRefCounted::try_from_shared(rq)
@@ -91,6 +103,13 @@
 //!             .expect("Fatal error - expected to be able to end request")
 //!             .end_ok();
 //!     }
+//!
+//!     fn init_hctx(
+//!         _tagset_data: (),
+//!         _hctx_idx: u32,
+//!     ) -> Result<Self::HwData> {
+//!         Ok(())
+//!     }
 //! }
 //!
 //! let tagset: Arc<TagSet<MyBlkDevice>> =
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 093bb21fa1b2..1b20df25d6df 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -63,6 +63,13 @@ pub trait Operations: Sized {
     /// the `GenDisk` associated with this `Operations` implementation.
     type QueueData: ForeignOwnable + Sync;
 
+    /// Data associated with a dispatch queue. This is stored as a pointer in the C `struct
+    /// blk_mq_hw_ctx` that represents a hardware queue.
+    ///
+    /// Hardware contexts may be cleaned up by a thread different from the allocating thread, so
+    /// `HwData` must be `Send`.
+    type HwData: ForeignOwnable + Sync + Send;
+
     /// Data associated with a `TagSet`. This is stored as a pointer in `struct
     /// blk_mq_tag_set`.
     type TagSetData: ForeignOwnable + Sync;
@@ -73,20 +80,30 @@ pub trait Operations: Sized {
     /// Called by the kernel to queue a request with the driver. If `is_last` is
     /// `false`, the driver is allowed to defer committing the request.
     fn queue_rq(
+        hw_data: ForeignBorrowed<'_, Self::HwData>,
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
         rq: Owned<Request<Self>>,
         is_last: bool,
     ) -> Result;
 
     /// Called by the kernel to indicate that queued requests should be submitted.
-    fn commit_rqs(queue_data: ForeignBorrowed<'_, Self::QueueData>);
+    fn commit_rqs(
+        hw_data: ForeignBorrowed<'_, Self::HwData>,
+        queue_data: ForeignBorrowed<'_, Self::QueueData>,
+    );
+
+    /// Called by the kernel to allocate and initialize driver-specific hardware context data.
+    fn init_hctx(
+        tagset_data: ForeignBorrowed<'_, Self::TagSetData>,
+        hctx_idx: u32,
+    ) -> Result<Self::HwData>;
 
     /// Called by the kernel when the request is completed.
     fn complete(rq: ARef<Request<Self>>);
 
     /// Called by the kernel to poll the device for completed requests. Only
     /// used for poll queues.
-    fn poll() -> bool {
+    fn poll(_hw_data: ForeignBorrowed<'_, Self::HwData>) -> bool {
         build_error!(crate::error::VTABLE_DEFAULT_ERROR)
     }
 }
@@ -146,6 +163,11 @@ impl<T: Operations> OperationsVTable<T> {
         let mut rq =
             unsafe { Owned::from_raw(NonNull::<Request<T>>::new_unchecked((*bd).rq.cast())) };
 
+        // SAFETY: The safety requirements for this function ensure that `hctx`
+        // is valid and that `driver_data` was produced by a call to
+        // `into_foreign` in `Self::init_hctx_callback`.
+        let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
+
         // SAFETY: `hctx` is valid as required by this function.
         let queue_data = unsafe { (*(*hctx).queue).queuedata };
 
@@ -159,6 +181,7 @@ impl<T: Operations> OperationsVTable<T> {
         unsafe { rq.start_unchecked() };
 
         let ret = T::queue_rq(
+            hw_data,
             queue_data,
             rq,
             // SAFETY: `bd` is valid as required by the safety requirement for
@@ -181,6 +204,10 @@ impl<T: Operations> OperationsVTable<T> {
     /// This function may only be called by blk-mq C infrastructure. The caller
     /// must ensure that `hctx` is valid.
     unsafe extern "C" fn commit_rqs_callback(hctx: *mut bindings::blk_mq_hw_ctx) {
+        // SAFETY: `driver_data` was installed by us in `init_hctx_callback` as
+        // the result of a call to `into_foreign`.
+        let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
+
         // SAFETY: `hctx` is valid as required by this function.
         let queue_data = unsafe { (*(*hctx).queue).queuedata };
 
@@ -189,7 +216,7 @@ impl<T: Operations> OperationsVTable<T> {
         // `ForeignOwnable::from_foreign()` is only called when the tagset is
         // dropped, which happens after we are dropped.
         let queue_data = unsafe { T::QueueData::borrow(queue_data) };
-        T::commit_rqs(queue_data)
+        T::commit_rqs(hw_data, queue_data)
     }
 
     /// This function is called by the C kernel. A pointer to this function is
@@ -213,12 +240,18 @@ impl<T: Operations> OperationsVTable<T> {
     ///
     /// # Safety
     ///
-    /// This function may only be called by blk-mq C infrastructure.
+    /// This function may only be called by blk-mq C infrastructure. `hctx` must
+    /// be a pointer to a valid and aligned `struct blk_mq_hw_ctx` that was
+    /// previously initialized by a call to `init_hctx_callback`.
     unsafe extern "C" fn poll_callback(
-        _hctx: *mut bindings::blk_mq_hw_ctx,
+        hctx: *mut bindings::blk_mq_hw_ctx,
         _iob: *mut bindings::io_comp_batch,
     ) -> crate::ffi::c_int {
-        T::poll().into()
+        // SAFETY: By function safety requirement, `hctx` was initialized by
+        // `init_hctx_callback` and thus `driver_data` came from a call to
+        // `into_foreign`.
+        let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
+        T::poll(hw_data).into()
     }
 
     /// This function is called by the C kernel. A pointer to this function is
@@ -226,15 +259,29 @@ impl<T: Operations> OperationsVTable<T> {
     ///
     /// # Safety
     ///
-    /// This function may only be called by blk-mq C infrastructure. This
-    /// function may only be called once before `exit_hctx_callback` is called
-    /// for the same context.
+    /// This function may only be called by blk-mq C infrastructure.
+    /// `tagset_data` must be initialized by the initializer returned by
+    /// `TagSet::new` as part of tag set initialization. `hctx` must be a
+    /// pointer to a valid `blk_mq_hw_ctx` where the `driver_data` field was not
+    /// yet initialized. This function may only be called once before
+    /// `exit_hctx_callback` is called for the same context.
     unsafe extern "C" fn init_hctx_callback(
-        _hctx: *mut bindings::blk_mq_hw_ctx,
-        _tagset_data: *mut crate::ffi::c_void,
-        _hctx_idx: crate::ffi::c_uint,
-    ) -> crate::ffi::c_int {
-        from_result(|| Ok(0))
+        hctx: *mut bindings::blk_mq_hw_ctx,
+        tagset_data: *mut c_void,
+        hctx_idx: c_uint,
+    ) -> c_int {
+        from_result(|| {
+            // SAFETY: By the safety requirements of this function,
+            // `tagset_data` came from a call to `into_foreign` when the
+            // `TagSet` was initialized.
+            let tagset_data = unsafe { T::TagSetData::borrow(tagset_data) };
+            let data = T::init_hctx(tagset_data, hctx_idx)?;
+
+            // SAFETY: by the safety requirements of this function, `hctx` is
+            // valid for write
+            unsafe { (*hctx).driver_data = data.into_foreign().cast() };
+            Ok(0)
+        })
     }
 
     /// This function is called by the C kernel. A pointer to this function is
@@ -242,11 +289,20 @@ impl<T: Operations> OperationsVTable<T> {
     ///
     /// # Safety
     ///
-    /// This function may only be called by blk-mq C infrastructure.
+    /// This function may only be called by blk-mq C infrastructure. `hctx` must
+    /// be a valid pointer that was previously initialized by a call to
+    /// `init_hctx_callback`. This function may be called only once after
+    /// `init_hctx_callback` was called.
     unsafe extern "C" fn exit_hctx_callback(
-        _hctx: *mut bindings::blk_mq_hw_ctx,
+        hctx: *mut bindings::blk_mq_hw_ctx,
         _hctx_idx: crate::ffi::c_uint,
     ) {
+        // SAFETY: By the safety requirements of this function, `hctx` is valid for read.
+        let ptr = unsafe { (*hctx).driver_data };
+
+        // SAFETY: By the safety requirements of this function, `ptr` came from
+        // a call to `into_foreign` in `init_hctx_callback`
+        unsafe { T::HwData::from_foreign(ptr) };
     }
 
     /// This function is called by the C kernel. A pointer to this function is

-- 
2.51.2





* [PATCH v2 35/83] block: rnull: add volatile cache emulation
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (33 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 34/83] block: rust: add `hctx` " Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 36/83] block: rust: implement `Sync` for `GenDisk` Andreas Hindborg
                   ` (47 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add volatile cache emulation to rnull. When enabled via the
`cache_size_mib` configfs attribute, writes are first stored in a volatile
cache before being written back to the simulated non-volatile storage.
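
A toy model of the write-back behaviour described above, in plain Rust.
Everything here is an assumption made for illustration: `CachedDisk`, the
block-count capacity, the lowest-block-first write-back order, and
`drop_cache` are hypothetical and are not taken from the driver.

```rust
use std::collections::BTreeMap;

/// Toy disk with a bounded volatile cache in front of persistent storage.
struct CachedDisk {
    cache: BTreeMap<u64, Vec<u8>>,
    disk: BTreeMap<u64, Vec<u8>>,
    /// Maximum number of blocks held in the cache (0 disables the cache).
    cache_capacity: usize,
}

impl CachedDisk {
    fn new(cache_capacity: usize) -> Self {
        Self {
            cache: BTreeMap::new(),
            disk: BTreeMap::new(),
            cache_capacity,
        }
    }

    /// Writes land in the cache first; a full cache writes one block back.
    fn write(&mut self, block: u64, data: Vec<u8>) {
        if self.cache_capacity == 0 {
            self.disk.insert(block, data);
            return;
        }
        if !self.cache.contains_key(&block) && self.cache.len() >= self.cache_capacity {
            // Make room by writing back the lowest-numbered cached block.
            if let Some((b, d)) = self.cache.pop_first() {
                self.disk.insert(b, d);
            }
        }
        self.cache.insert(block, data);
    }

    /// Reads prefer the cache and fall back to persistent storage.
    fn read(&self, block: u64) -> Option<&[u8]> {
        self.cache
            .get(&block)
            .or_else(|| self.disk.get(&block))
            .map(|v| v.as_slice())
    }

    /// Write every cached block back to persistent storage.
    fn flush(&mut self) {
        while let Some((b, d)) = self.cache.pop_first() {
            self.disk.insert(b, d);
        }
    }

    /// Model losing the volatile contents without a flush.
    fn drop_cache(&mut self) {
        self.cache.clear();
    }
}
```

In this model, data written but not flushed disappears after `drop_cache`,
while flushed or evicted blocks remain readable.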

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs          |  35 +++-
 drivers/block/rnull/disk_storage.rs      | 260 +++++++++++++++++++++++++
 drivers/block/rnull/disk_storage/page.rs |  77 ++++++++
 drivers/block/rnull/rnull.rs             | 314 ++++++++++++++++++-------------
 4 files changed, 545 insertions(+), 141 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 0e9fe8cdc07f..504bb477c2d0 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -1,9 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 
 use super::{
+    DiskStorage,
     NullBlkDevice,
     THIS_MODULE, //
 };
+use core::fmt::Write;
 use kernel::{
     bindings,
     block::{
@@ -18,10 +20,7 @@
         AttributeOperations, //
     },
     configfs_attrs,
-    fmt::{
-        self,
-        Write as _, //
-    },
+    fmt,
     new_mutex,
     page::PAGE_SIZE,
     prelude::*,
@@ -104,17 +103,19 @@ fn make_group(
                 badblocks: 12,
                 badblocks_once: 13,
                 badblocks_partial_io: 14,
+                cache_size_mib: 15,
             ],
         };
 
+        let block_size = 4096;
         Ok(configfs::Group::new(
             name.try_into()?,
             item_type,
             // TODO: cannot coerce new_mutex!() to impl PinInit<_, Error>, so put mutex inside
-            try_pin_init!( DeviceConfig {
+            try_pin_init!(DeviceConfig {
                 data <- new_mutex!(DeviceConfigInner {
                     powered: false,
-                    block_size: 4096,
+                    block_size,
                     rotational: false,
                     disk: None,
                     capacity_mib: 4096,
@@ -129,6 +130,11 @@ fn make_group(
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
+                    disk_storage: Arc::pin_init(
+                        DiskStorage::new(0, block_size as usize),
+                        GFP_KERNEL
+                    )?,
+                    cache_size_mib: 0,
                 }),
             }),
             core::iter::empty(),
@@ -201,6 +207,8 @@ struct DeviceConfigInner {
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
+    cache_size_mib: u64,
+    disk_storage: Arc<DiskStorage>,
 }
 
 #[vtable]
@@ -239,6 +247,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 bad_blocks: guard.bad_blocks.clone(),
                 bad_blocks_once: guard.bad_blocks_once,
                 bad_blocks_partial_io: guard.bad_blocks_partial_io,
+                storage: guard.disk_storage.clone(),
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -250,6 +259,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     }
 }
 
+// DiskStorage::new(cache_size_mib << 20, block_size as usize),
 configfs_simple_field!(DeviceConfig, 1, block_size, u32, check GenDiskBuilder::validate_block_size);
 configfs_simple_bool_field!(DeviceConfig, 2, rotational);
 configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
@@ -394,3 +404,16 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 
 configfs_simple_bool_field!(DeviceConfig, 13, bad_blocks_once);
 configfs_simple_bool_field!(DeviceConfig, 14, bad_blocks_partial_io);
+configfs_attribute!(DeviceConfig, 15,
+    show: |this, page| show_field(this.data.lock().cache_size_mib, page),
+    store: |this, page| store_with_power_check(this, page, |data, page| {
+        let text = core::str::from_utf8(page)?.trim();
+        let value = text.parse::<u64>().map_err(|_| EINVAL)?;
+        data.disk_storage = Arc::pin_init(
+            DiskStorage::new(value.checked_mul(1 << 20).ok_or(EINVAL)?, data.block_size as usize),
+            GFP_KERNEL
+        )?;
+        data.cache_size_mib = value;
+        Ok(())
+    })
+);
diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
new file mode 100644
index 000000000000..b8fef411fffe
--- /dev/null
+++ b/drivers/block/rnull/disk_storage.rs
@@ -0,0 +1,260 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use super::HwQueueContext;
+use core::pin::Pin;
+use kernel::{
+    block,
+    new_spinlock,
+    new_xarray,
+    page::PAGE_SIZE,
+    prelude::*,
+    sync::{
+        atomic::{ordering, Atomic},
+        SpinLock, SpinLockGuard,
+    },
+    uapi::PAGE_SECTORS,
+    xarray::{
+        self,
+        XArray,
+        XArraySheaf, //
+    }, //
+};
+pub(crate) use page::NullBlockPage;
+
+mod page;
+
+#[pin_data]
+pub(crate) struct DiskStorage {
+    // TODO: Get rid of this pointer indirection.
+    #[pin]
+    trees: SpinLock<Pin<KBox<TreeContainer>>>,
+    cache_size: u64,
+    cache_size_used: Atomic<u64>,
+    next_flush_sector: Atomic<u64>,
+    block_size: usize,
+}
+
+impl DiskStorage {
+    pub(crate) fn new(cache_size: u64, block_size: usize) -> impl PinInit<Self, Error> {
+        try_pin_init!( Self {
+            // TODO: Get rid of the box
+            // https://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git/commit/?h=locking&id=a5d84cafb3e253a11d2e078902c5b090be2f4227
+            trees <- new_spinlock!(KBox::pin_init(TreeContainer::new(), GFP_KERNEL)?),
+            cache_size,
+            cache_size_used: Atomic::new(0),
+            next_flush_sector: Atomic::new(0),
+            block_size
+        })
+    }
+
+    pub(crate) fn access<'a, 'b, 'c>(
+        &'a self,
+        tree_guard: &'a mut SpinLockGuard<'b, Pin<KBox<TreeContainer>>>,
+        hw_data_guard: &'a mut SpinLockGuard<'b, HwQueueContext>,
+        sheaf: Option<XArraySheaf<'c>>,
+    ) -> DiskStorageAccess<'a, 'b, 'c> {
+        DiskStorageAccess::new(self, tree_guard, hw_data_guard, sheaf)
+    }
+
+    pub(crate) fn lock(&self) -> SpinLockGuard<'_, Pin<KBox<TreeContainer>>> {
+        self.trees.lock()
+    }
+}
+
+pub(crate) struct DiskStorageAccess<'a, 'b, 'c> {
+    cache_guard: xarray::Guard<'a, TreeNode>,
+    disk_guard: xarray::Guard<'a, TreeNode>,
+    hw_data_guard: &'a mut SpinLockGuard<'b, HwQueueContext>,
+    disk_storage: &'a DiskStorage,
+    pub(crate) sheaf: Option<XArraySheaf<'c>>,
+}
+
+impl<'a, 'b, 'c> DiskStorageAccess<'a, 'b, 'c> {
+    fn new(
+        disk_storage: &'a DiskStorage,
+        tree_guard: &'a mut SpinLockGuard<'b, Pin<KBox<TreeContainer>>>,
+        hw_data_guard: &'a mut SpinLockGuard<'b, HwQueueContext>,
+        sheaf: Option<XArraySheaf<'c>>,
+    ) -> Self {
+        Self {
+            cache_guard: tree_guard.cache_tree.lock(),
+            disk_guard: tree_guard.disk_tree.lock(),
+            hw_data_guard,
+            disk_storage,
+            sheaf,
+        }
+    }
+    fn to_index(sector: u64) -> usize {
+        // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
+        (sector >> block::PAGE_SECTORS_SHIFT) as usize
+    }
+
+    fn to_sector(index: usize) -> u64 {
+        // CAST: `usize` to `u64` never truncates. Cast before shifting to avoid overflow.
+        (index as u64) << block::PAGE_SECTORS_SHIFT
+    }
+
+    fn extract_cache_page_inner<'g>(
+        cache_guard: &mut xarray::Guard<'g, TreeNode>,
+        disk_guard: &mut xarray::Guard<'g, TreeNode>,
+        disk_storage: &DiskStorage,
+        hw_data: &mut HwQueueContext,
+        sheaf: Option<&mut XArraySheaf<'_>>,
+    ) -> Result<KBox<NullBlockPage>> {
+        let cache_entry = cache_guard
+            .find_next_entry_circular(
+                Self::to_index(disk_storage.next_flush_sector.load(ordering::Relaxed))
+            )
+            .expect("Expected to find a page in the cache");
+
+        let index = cache_entry.index();
+
+        disk_storage
+            .next_flush_sector
+            .store(Self::to_sector(index).wrapping_add(1), ordering::Relaxed);
+
+        disk_storage.cache_size_used.store(
+            disk_storage.cache_size_used.load(ordering::Relaxed) - PAGE_SIZE as u64,
+            ordering::Relaxed,
+        );
+
+        let page = match disk_guard.entry(index) {
+            xarray::Entry::Vacant(disk_entry) => {
+                disk_entry
+                    .insert(cache_entry.remove(), sheaf)
+                    .expect("Preload is set up to allow insert without failure");
+                hw_data.page.take().expect("Preload has allocated for us")
+            }
+            xarray::Entry::Occupied(mut disk_entry) => {
+                let mut page = if cache_entry.is_full() {
+                    disk_entry.insert(cache_entry.remove())
+                } else {
+                    let mut src = cache_entry;
+                    let mut offset = 0;
+                    for _ in 0..PAGE_SECTORS {
+                        src.page_mut().as_pin_mut().copy_to_page(
+                            disk_entry.page_mut().as_pin_mut(),
+                            offset,
+                            block::SECTOR_SIZE as usize,
+                        )?;
+                        offset += block::SECTOR_SIZE as usize;
+                    }
+                    src.remove()
+                };
+                page.reset();
+                page
+            }
+        };
+
+        Ok(page)
+    }
+
+    fn get_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
+        let index = Self::to_index(sector);
+
+        match self.cache_guard.entry(index) {
+            xarray::Entry::Occupied(occupied_entry) => Ok(occupied_entry.into_mut()),
+            xarray::Entry::Vacant(vacant_entry) => {
+                let cache_guard = vacant_entry.into_guard();
+                let page = if self.disk_storage.cache_size_used.load(ordering::Relaxed)
+                    < self.disk_storage.cache_size
+                {
+                    self.hw_data_guard
+                        .page
+                        .take()
+                        .expect("Expected to have a page available")
+                } else {
+                    Self::extract_cache_page_inner(
+                        cache_guard,
+                        &mut self.disk_guard,
+                        self.disk_storage,
+                        self.hw_data_guard,
+                        self.sheaf.as_mut(),
+                    )?
+                };
+                let xarray::Entry::Vacant(vacant_entry) = cache_guard.entry(index) else {
+                    unreachable!("slot was vacant and we hold the lock")
+                };
+                Ok(vacant_entry
+                    .insert(page, self.sheaf.as_mut())
+                    .expect("Should be able to insert"))
+            }
+        }
+    }
+
+    fn get_disk_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
+        let index = Self::to_index(sector);
+
+        let page = match self.disk_guard.entry(index) {
+            xarray::Entry::Vacant(e) => e.insert(
+                self.hw_data_guard
+                    .page
+                    .take()
+                    .expect("Expected page to be available"),
+                self.sheaf.as_mut(),
+            )?,
+            xarray::Entry::Occupied(e) => e.into_mut(),
+        };
+
+        Ok(page)
+    }
+
+    pub(crate) fn get_write_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
+        let page = if self.disk_storage.cache_size > 0 {
+            self.get_cache_page(sector)?
+        } else {
+            self.get_disk_page(sector)?
+        };
+
+        Ok(page)
+    }
+
+    pub(crate) fn get_read_page(&self, sector: u64) -> Option<&NullBlockPage> {
+        let index = Self::to_index(sector);
+        if self.disk_storage.cache_size > 0 {
+            self.cache_guard
+                .get(index)
+                .or_else(|| self.disk_guard.get(index))
+        } else {
+            self.disk_guard.get(index)
+        }
+    }
+
+    fn free_sector_tree(tree_access: &mut xarray::Guard<'_, TreeNode>, sector: u64) {
+        let index = Self::to_index(sector);
+        if let Some(page) = tree_access.get_mut(index) {
+            page.set_free(sector);
+
+            if page.is_empty() {
+                tree_access.remove(index);
+            }
+        }
+    }
+
+    pub(crate) fn free_sector(&mut self, sector: u64) {
+        if self.disk_storage.cache_size > 0 {
+            Self::free_sector_tree(&mut self.cache_guard, sector);
+        }
+
+        Self::free_sector_tree(&mut self.disk_guard, sector);
+    }
+}
+
+type TreeNode = KBox<NullBlockPage>;
+
+#[pin_data]
+pub(crate) struct TreeContainer {
+    #[pin]
+    disk_tree: XArray<TreeNode>,
+    #[pin]
+    cache_tree: XArray<TreeNode>,
+}
+
+impl TreeContainer {
+    fn new() -> impl PinInit<Self> {
+        pin_init!(TreeContainer {
+            disk_tree <- new_xarray!(xarray::AllocKind::Alloc),
+            cache_tree <- new_xarray!(xarray::AllocKind::Alloc),
+        })
+    }
+}
diff --git a/drivers/block/rnull/disk_storage/page.rs b/drivers/block/rnull/disk_storage/page.rs
new file mode 100644
index 000000000000..bc78973ad5d4
--- /dev/null
+++ b/drivers/block/rnull/disk_storage/page.rs
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use kernel::{
+    block::{
+        PAGE_SECTOR_MASK,
+        SECTOR_SHIFT, //
+    },
+    memalloc_scope,
+    page::{
+        SafePage,
+        PAGE_SIZE, //
+    },
+    prelude::*,
+    types::Owned,
+    uapi::PAGE_SECTORS, //
+};
+
+static_assert!((PAGE_SIZE >> SECTOR_SHIFT) <= 64);
+
+pub(crate) struct NullBlockPage {
+    page: Owned<SafePage>,
+    status: u64,
+    block_size: usize,
+}
+
+impl NullBlockPage {
+    pub(crate) fn new(block_size: usize) -> Result<KBox<Self>> {
+        memalloc_scope!(let _noio: NoIo);
+        Ok(KBox::new(
+            Self {
+                page: SafePage::alloc_page(GFP_KERNEL | __GFP_ZERO)?,
+                status: 0,
+                block_size,
+            },
+            GFP_KERNEL,
+        )?)
+    }
+
+    pub(crate) fn set_occupied(&mut self, sector: u64) {
+        let idx = sector & u64::from(PAGE_SECTOR_MASK);
+        self.status |= 1 << idx;
+    }
+
+    pub(crate) fn set_free(&mut self, sector: u64) {
+        let idx = sector & u64::from(PAGE_SECTOR_MASK);
+        self.status &= !(1 << idx);
+    }
+
+    pub(crate) fn is_empty(&self) -> bool {
+        self.status == 0
+    }
+
+    pub(crate) fn reset(&mut self) {
+        self.status = 0;
+    }
+
+    pub(crate) fn is_full(&self) -> bool {
+        let blocks_per_page = PAGE_SIZE >> self.block_size.trailing_zeros();
+        let sectors_per_block = PAGE_SECTORS as usize / blocks_per_page;
+
+        for i in 0..blocks_per_page {
+            if self.status & (1 << (i * sectors_per_block)) == 0 {
+                return false;
+            }
+        }
+
+        true
+    }
+
+    pub(crate) fn page_mut(&mut self) -> &mut Owned<SafePage> {
+        &mut self.page
+    }
+
+    pub(crate) fn page(&self) -> &Owned<SafePage> {
+        &self.page
+    }
+}
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 0c1bc2f5ae9c..877683dba0ac 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -3,13 +3,22 @@
 //! This is a Rust implementation of the C null block driver.
 
 mod configfs;
+mod disk_storage;
 
 use configfs::IRQMode;
+use disk_storage::{
+    DiskStorage,
+    NullBlockPage,
+    TreeContainer, //
+};
 use kernel::{
     bindings,
     block::{
         self,
-        badblocks::{self, BadBlocks},
+        badblocks::{
+            self,
+            BadBlocks, //
+        },
         bio::Segment,
         mq::{
             self,
@@ -20,7 +29,7 @@
             Operations,
             TagSet, //
         },
-        PAGE_SECTOR_MASK, SECTOR_SHIFT,
+        SECTOR_SHIFT,
     },
     error::{
         code,
@@ -29,11 +38,7 @@
     ffi,
     memalloc_scope,
     new_mutex,
-    new_xarray,
-    page::{
-        SafePage,
-        PAGE_SIZE, //
-    },
+    new_spinlock,
     pr_info,
     prelude::*,
     str::CString,
@@ -42,9 +47,11 @@
         atomic::{
             ordering,
             Atomic, //
-        },
+        }, //
         Arc,
-        Mutex, //
+        Mutex,
+        SpinLock,
+        SpinLockGuard,
     },
     time::{
         hrtimer::{
@@ -59,7 +66,7 @@
         OwnableRefCounted,
         Owned, //
     },
-    xarray::XArray, //
+    xarray::XArraySheaf, //
 };
 
 module! {
@@ -146,9 +153,11 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                 } else {
                     module_parameters::submit_queues.value()
                 };
+
+                let block_size = module_parameters::bs.value();
                 let disk = NullBlkDevice::new(NullBlkOptions {
                     name: &name,
-                    block_size: module_parameters::bs.value(),
+                    block_size,
                     rotational: module_parameters::rotational.value(),
                     capacity_mib: module_parameters::gb.value() * 1024,
                     irq_mode: module_parameters::irqmode.value().try_into()?,
@@ -161,6 +170,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
+                    storage: Arc::pin_init(DiskStorage::new(0, block_size as usize), GFP_KERNEL)?,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -190,8 +200,20 @@ struct NullBlkOptions<'a> {
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
+    storage: Arc<DiskStorage>,
+}
+
+#[pin_data]
+struct NullBlkDevice {
+    storage: Arc<DiskStorage>,
+    irq_mode: IRQMode,
+    completion_time: Delta,
+    memory_backed: bool,
+    block_size: usize,
+    bad_blocks: Arc<BadBlocks>,
+    bad_blocks_once: bool,
+    bad_blocks_partial_io: bool,
 }
-struct NullBlkDevice;
 
 impl NullBlkDevice {
     fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
@@ -210,6 +232,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             bad_blocks,
             bad_blocks_once,
             bad_blocks_partial_io,
+            storage,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
@@ -244,13 +267,13 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
             GFP_KERNEL,
         )?;
 
-        let queue_data = Box::pin_init(
-            pin_init!(QueueData {
-                tree <- new_xarray!(kernel::xarray::AllocKind::Alloc),
+        let queue_data = Box::try_pin_init(
+            try_pin_init!(Self {
+                storage,
                 irq_mode,
                 completion_time,
                 memory_backed,
-                block_size: block_size.into(),
+                block_size: block_size as usize,
                 bad_blocks,
                 bad_blocks_once,
                 bad_blocks_partial_io,
@@ -273,22 +296,68 @@ fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
         builder.build(fmt!("{}", name.to_str()?), tagset, queue_data)
     }
 
+    fn sheaf_size() -> usize {
+        2 * ((usize::BITS as usize / bindings::XA_CHUNK_SHIFT)
+            + if (usize::BITS as usize % bindings::XA_CHUNK_SHIFT) == 0 {
+                0
+            } else {
+                1
+            })
+    }
+
+    fn preload<'b, 'c>(
+        tree_guard: &'b mut SpinLockGuard<'c, Pin<KBox<TreeContainer>>>,
+        hw_data_guard: &'b mut SpinLockGuard<'c, HwQueueContext>,
+        block_size: usize,
+        sheaf: &'b mut Option<XArraySheaf<'c>>,
+    ) -> Result {
+        match sheaf {
+            Some(sheaf) => {
+                tree_guard.do_unlocked(|| {
+                    hw_data_guard.do_unlocked(|| sheaf.refill(GFP_KERNEL, Self::sheaf_size()))
+                })?;
+            }
+            None => {
+                let _ = sheaf.insert(
+                    kernel::xarray::xarray_kmem_cache()
+                        .sheaf(Self::sheaf_size(), GFP_NOWAIT)
+                        .or_else(|_| tree_guard.do_unlocked(|| {
+                            hw_data_guard.do_unlocked(|| -> Result<_> {
+                                kernel::xarray::xarray_kmem_cache()
+                                    .sheaf(Self::sheaf_size(), GFP_KERNEL)
+                            })
+                        }))?,
+                );
+            }
+        }
+
+        // Another thread may get the lock after we allocate. If this happens, retry.
+        while hw_data_guard.page.is_none() {
+            hw_data_guard.page =
+                Some(tree_guard.do_unlocked(|| {
+                    hw_data_guard.do_unlocked(|| NullBlockPage::new(block_size))
+                })?);
+        }
+
+        Ok(())
+    }
+
     #[inline(always)]
-    fn write(tree: &XArray<TreeNode>, mut sector: u64, mut segment: Segment<'_>) -> Result {
-        while !segment.is_empty() {
-            let page = NullBlockPage::new()?;
-            let mut tree = tree.lock();
+    fn write<'a, 'b, 'c>(
+        &'a self,
+        tree_guard: &'b mut SpinLockGuard<'c, Pin<KBox<TreeContainer>>>,
+        hw_data_guard: &'b mut SpinLockGuard<'c, HwQueueContext>,
+        mut sector: u64,
+        mut segment: Segment<'_>,
+    ) -> Result {
+        let mut sheaf: Option<XArraySheaf<'_>> = None;
 
-            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
-            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
+        while !segment.is_empty() {
+            Self::preload(tree_guard, hw_data_guard, self.block_size, &mut sheaf)?;
 
-            let page = if let Some(page) = tree.get_mut(page_idx) {
-                page
-            } else {
-                tree.store(page_idx, page, GFP_KERNEL)?;
-                tree.get_mut(page_idx).unwrap()
-            };
+            let mut access = self.storage.access(tree_guard, hw_data_guard, sheaf);
 
+            let page = access.get_write_page(sector)?;
             page.set_occupied(sector);
 
             // CAST: Page offset always fits in 32 bits.
@@ -296,58 +365,73 @@ fn write(tree: &XArray<TreeNode>, mut sector: u64, mut segment: Segment<'_>) ->
                 ((sector & u64::from(block::PAGE_SECTOR_MASK)) << block::SECTOR_SHIFT) as usize;
 
             // CAST: Casting from `usize` to `u64` never overflows.
-            sector += segment.copy_to_page(page.page.as_pin_mut(), page_offset) as u64
+            sector += segment.copy_to_page(page.page_mut().as_pin_mut(), page_offset) as u64
                 >> block::SECTOR_SHIFT;
+
+            sheaf = access.sheaf;
         }
+
+        if let Some(sheaf) = sheaf {
+            tree_guard.do_unlocked(|| {
+                hw_data_guard.do_unlocked(|| {
+                    sheaf.return_refill(GFP_KERNEL);
+                })
+            });
+        }
+
         Ok(())
     }
 
     #[inline(always)]
-    fn read(tree: &XArray<TreeNode>, mut sector: u64, mut segment: Segment<'_>) -> Result {
-        let tree = tree.lock();
+    fn read<'a, 'b, 'c>(
+        &'a self,
+        tree_guard: &'b mut SpinLockGuard<'c, Pin<KBox<TreeContainer>>>,
+        hw_data_guard: &'b mut SpinLockGuard<'c, HwQueueContext>,
+        mut sector: u64,
+        mut segment: Segment<'_>,
+    ) -> Result {
+        let access = self.storage.access(tree_guard, hw_data_guard, None);
 
         while !segment.is_empty() {
-            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
-            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
+            let page = access.get_read_page(sector);
 
-            if let Some(page) = tree.get(page_idx) {
-                // CAST: Page offset always fits in 32 bits.
-                let page_offset =
-                    ((sector & u64::from(block::PAGE_SECTOR_MASK)) << block::SECTOR_SHIFT) as usize;
+            match page {
+                Some(page) => {
+                    // CAST: Page offset always fits in 32 bits.
+                    let page_offset = ((sector & u64::from(block::PAGE_SECTOR_MASK))
+                        << block::SECTOR_SHIFT) as usize;
 
+                    // CAST: Casting from `usize` to `u64` never overflows.
+                    sector += segment.copy_from_page(page.page(), page_offset) as u64
+                        >> block::SECTOR_SHIFT;
+                }
                 // CAST: Casting from `usize` to `u64` never overflows.
-                sector +=
-                    segment.copy_from_page(&page.page, page_offset) as u64 >> block::SECTOR_SHIFT;
-            } else {
-                // CAST: Casting from `usize` to `u64` never overflows.
-                sector += segment.zero_page() as u64 >> block::SECTOR_SHIFT;
+                None => sector += segment.zero_page() as u64 >> block::SECTOR_SHIFT,
             }
         }
 
         Ok(())
     }
 
-    fn discard(tree: &XArray<TreeNode>, mut sector: u64, sectors: u64, block_size: u64) -> Result {
-        let mut remaining_bytes = sectors << SECTOR_SHIFT;
-        let mut tree = tree.lock();
+    fn discard(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        mut sector: u64,
+        sectors: u32,
+    ) -> Result {
+        let mut hw_data_guard = hw_data.lock();
+        let mut tree_guard = self.storage.lock();
 
-        while remaining_bytes > 0 {
-            // CAST: Device size limited during setup to (2^32)-1 on 32 bit systems.
-            let page_idx = (sector >> block::PAGE_SECTORS_SHIFT) as usize;
-            let mut remove = false;
-            if let Some(page) = tree.get_mut(page_idx) {
-                page.set_free(sector);
-                if page.is_empty() {
-                    remove = true;
-                }
-            }
+        let mut access = self
+            .storage
+            .access(&mut tree_guard, &mut hw_data_guard, None);
 
-            if remove {
-                drop(tree.remove(page_idx))
-            }
+        let mut remaining_bytes = (sectors as usize) << SECTOR_SHIFT;
 
-            let processed = remaining_bytes.min(block_size);
-            sector += processed >> SECTOR_SHIFT;
+        while remaining_bytes > 0 {
+            access.free_sector(sector);
+            let processed = remaining_bytes.min(self.block_size);
+            sector += (processed >> SECTOR_SHIFT) as u64;
             remaining_bytes -= processed;
         }
 
@@ -356,14 +440,19 @@ fn discard(tree: &XArray<TreeNode>, mut sector: u64, sectors: u64, block_size: u
 
     #[inline(never)]
     fn transfer(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
         rq: &mut Owned<mq::Request<Self>>,
-        tree: &XArray<TreeNode>,
         max_sectors: u32,
     ) -> Result {
         let mut sector = rq.sector();
         let max_end_sector = sector + <u32 as Into<u64>>::into(max_sectors);
         let command = rq.command();
 
+        // TODO: Use `PerCpu` to get rid of this lock
+        let mut hw_data_guard = hw_data.lock();
+        let mut tree_guard = self.storage.lock();
+
         for bio in rq.bio_iter_mut() {
             let segment_iter = bio.segment_iter();
             for mut segment in segment_iter {
@@ -373,8 +462,12 @@ fn transfer(
                 let length_sectors_allowed = segment_length_sectors.min(max_remaining_sectors);
                 segment.truncate(length_sectors_allowed << SECTOR_SHIFT);
                 match command {
-                    bindings::req_op_REQ_OP_WRITE => Self::write(tree, sector, segment)?,
-                    bindings::req_op_REQ_OP_READ => Self::read(tree, sector, segment)?,
+                    bindings::req_op_REQ_OP_WRITE => {
+                        self.write(&mut tree_guard, &mut hw_data_guard, sector, segment)?
+                    }
+                    bindings::req_op_REQ_OP_READ => {
+                        self.read(&mut tree_guard, &mut hw_data_guard, sector, segment)?
+                    }
                     _ => (),
                 }
                 sector += u64::from(length_sectors_allowed);
@@ -384,29 +477,26 @@ fn transfer(
                 }
             }
         }
+
         Ok(())
     }
 
-    fn handle_bad_blocks(
-        rq: &mut Owned<mq::Request<Self>>,
-        queue_data: &QueueData,
-        sectors: &mut u32,
-    ) -> Result {
-        if queue_data.bad_blocks.enabled() {
+    fn handle_bad_blocks(&self, rq: &mut Owned<mq::Request<Self>>, sectors: &mut u32) -> Result {
+        if self.bad_blocks.enabled() {
             let start = rq.sector();
             let end = start + u64::from(*sectors);
-            match queue_data.bad_blocks.check(start..end) {
+            match self.bad_blocks.check(start..end) {
                 badblocks::BlockStatus::None => {}
                 badblocks::BlockStatus::Acknowledged(mut range)
                 | badblocks::BlockStatus::Unacknowledged(mut range) => {
                     rq.data_ref().error.store(1, ordering::Relaxed);
 
-                    if queue_data.bad_blocks_once {
-                        queue_data.bad_blocks.set_good(range.clone())?;
+                    if self.bad_blocks_once {
+                        self.bad_blocks.set_good(range.clone())?;
                     }
 
-                    if queue_data.bad_blocks_partial_io {
-                        let block_size_sectors = queue_data.block_size >> SECTOR_SHIFT;
+                    if self.bad_blocks_partial_io {
+                        let block_size_sectors = (self.block_size >> SECTOR_SHIFT) as u64;
                         range.start = align_down(range.start, block_size_sectors);
                         if start < range.start {
                             *sectors = (range.start - start) as u32;
@@ -431,52 +521,8 @@ fn end_request(rq: Owned<mq::Request<Self>>) {
     }
 }
 
-static_assert!((PAGE_SIZE >> SECTOR_SHIFT) <= 64);
-
-struct NullBlockPage {
-    page: Owned<SafePage>,
-    status: u64,
-}
-
-impl NullBlockPage {
-    fn new() -> Result<KBox<Self>> {
-        Ok(KBox::new(
-            Self {
-                page: SafePage::alloc_page(GFP_KERNEL | __GFP_ZERO)?,
-                status: 0,
-            },
-            GFP_KERNEL,
-        )?)
-    }
-
-    fn set_occupied(&mut self, sector: u64) {
-        let idx = sector & u64::from(PAGE_SECTOR_MASK);
-        self.status |= 1 << idx;
-    }
-
-    fn set_free(&mut self, sector: u64) {
-        let idx = sector & u64::from(PAGE_SECTOR_MASK);
-        self.status &= !(1 << idx);
-    }
-
-    fn is_empty(&self) -> bool {
-        self.status == 0
-    }
-}
-
-type TreeNode = KBox<NullBlockPage>;
-
-#[pin_data]
-struct QueueData {
-    #[pin]
-    tree: XArray<TreeNode>,
-    irq_mode: IRQMode,
-    completion_time: Delta,
-    memory_backed: bool,
-    block_size: u64,
-    bad_blocks: Arc<BadBlocks>,
-    bad_blocks_once: bool,
-    bad_blocks_partial_io: bool,
+struct HwQueueContext {
+    page: Option<KBox<disk_storage::NullBlockPage>>,
 }
 
 #[pin_data]
@@ -531,10 +577,10 @@ fn align_down<T>(value: T, to: T) -> T
 
 #[vtable]
 impl Operations for NullBlkDevice {
-    type QueueData = Pin<KBox<QueueData>>;
+    type QueueData = Pin<KBox<Self>>;
     type RequestData = Pdu;
     type TagSetData = ();
-    type HwData = ();
+    type HwData = Pin<KBox<SpinLock<HwQueueContext>>>;
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
@@ -545,42 +591,40 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
 
     #[inline(always)]
     fn queue_rq(
-        _hw_data: (),
-        queue_data: Pin<&QueueData>,
+        hw_data: Pin<&SpinLock<HwQueueContext>>,
+        this: Pin<&Self>,
         mut rq: Owned<mq::Request<Self>>,
         _is_last: bool,
     ) -> Result {
         let mut sectors = rq.sectors();
 
-        Self::handle_bad_blocks(&mut rq, queue_data.get_ref(), &mut sectors)?;
+        Self::handle_bad_blocks(this.get_ref(), &mut rq, &mut sectors)?;
 
-        if queue_data.memory_backed {
+        if this.memory_backed {
             memalloc_scope!(let _noio: NoIo);
-            let tree = &queue_data.tree;
-
             if rq.command() == bindings::req_op_REQ_OP_DISCARD {
-                Self::discard(tree, rq.sector(), sectors.into(), queue_data.block_size)?;
+                this.discard(&hw_data, rq.sector(), sectors)?;
             } else {
-                Self::transfer(&mut rq, tree, sectors)?;
+                this.transfer(&hw_data, &mut rq, sectors)?;
             }
         }
 
-        match queue_data.irq_mode {
+        match this.irq_mode {
             IRQMode::None => Self::end_request(rq),
             IRQMode::Soft => mq::Request::complete(rq.into()),
             IRQMode::Timer => {
                 OwnableRefCounted::into_shared(rq)
-                    .start(queue_data.completion_time)
+                    .start(this.completion_time)
                     .dismiss();
             }
         }
         Ok(())
     }
 
-    fn commit_rqs(_hw_data: (), _queue_data: Pin<&QueueData>) {}
+    fn commit_rqs(_hw_data: Pin<&SpinLock<HwQueueContext>>, _queue_data: Pin<&Self>) {}
 
-    fn init_hctx(_tagset_data: (), _hctx_idx: u32) -> Result {
-        Ok(())
+    fn init_hctx(_tagset_data: (), _hctx_idx: u32) -> Result<Self::HwData> {
+        KBox::pin_init(new_spinlock!(HwQueueContext { page: None }), GFP_KERNEL)
     }
 
     fn complete(rq: ARef<mq::Request<Self>>) {

-- 
2.51.2





* [PATCH v2 36/83] block: rust: implement `Sync` for `GenDisk`.
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (34 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 35/83] block: rnull: add volatile cache emulation Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 37/83] block: rust: add a back reference feature to `GenDisk` Andreas Hindborg
                   ` (46 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

`GenDisk` is a pointer to a `struct gendisk`. It is safe to reference this
struct from multiple threads, provided the queue data and the tag set
reference are `Sync`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
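
A note for readers, not part of the commit: the bound pattern used in this
patch can be illustrated in userspace Rust. The sketch below is only an
analogy under stated assumptions. `RawDisk` (standing in for the C
`struct gendisk`), `Disk<Q>` (standing in for `GenDisk<T>`) and
`sum_sectors_across_threads` are hypothetical names that do not exist in the
kernel crate. It shows why a type holding a raw pointer needs an explicit,
conditional `unsafe impl Sync` before it can be shared by reference across
threads.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the C `struct gendisk`.
struct RawDisk {
    sectors: u64,
}

// Hypothetical stand-in for `GenDisk<T>`: an owned raw pointer plus shared data.
// The raw pointer makes this type `!Sync` unless we opt in explicitly.
struct Disk<Q> {
    raw: *mut RawDisk,
    queue_data: Arc<Q>,
}

impl<Q> Disk<Q> {
    fn new(sectors: u64, queue_data: Q) -> Self {
        let raw = Box::into_raw(Box::new(RawDisk { sectors }));
        Self { raw, queue_data: Arc::new(queue_data) }
    }

    fn sectors(&self) -> u64 {
        // SAFETY: `raw` came from `Box::into_raw` in `new`, is freed only in
        // `drop`, and is never mutated after construction.
        unsafe { (*self.raw).sectors }
    }

    fn queue_data(&self) -> &Q {
        &self.queue_data
    }
}

impl<Q> Drop for Disk<Q> {
    fn drop(&mut self) {
        // SAFETY: `raw` is the unique owner of the allocation made in `new`.
        unsafe { drop(Box::from_raw(self.raw)) };
    }
}

// SAFETY: Shared access only reads through `raw`, and the remaining field is
// an `Arc` that is itself required to be `Sync`.
unsafe impl<Q> Sync for Disk<Q> where Arc<Q>: Sync {}

// Shares `&Disk` with several scoped threads. This compiles only because of
// the `Sync` impl above.
fn sum_sectors_across_threads(disk: &Disk<u32>, threads: usize) -> u64 {
    thread::scope(|s| {
        let mut handles = Vec::with_capacity(threads);
        for _ in 0..threads {
            handles.push(s.spawn(move || disk.sectors()));
        }
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}
```

Removing the `unsafe impl` line makes `sum_sectors_across_threads` fail to
compile, which is the situation this patch addresses for `GenDisk`.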
 rust/kernel/block/mq/gen_disk.rs | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 2b204b0ed49a..94af85fe1716 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -234,6 +234,17 @@ unsafe impl<T> Send for GenDisk<T>
 {
 }
 
+// SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a `TagSet`. It is
+// safe to reference these from multiple threads if the `Arc` and the `gendisk` private data are
+// `Sync`.
+unsafe impl<T> Sync for GenDisk<T>
+where
+    T: Operations,
+    T::QueueData: Sync,
+    Arc<TagSet<T>>: Sync,
+{
+}
+
 impl<T: Operations> Drop for GenDisk<T> {
     fn drop(&mut self) {
         // SAFETY: By type invariant of `Self`, `self.gendisk` points to a valid

-- 
2.51.2





* [PATCH v2 37/83] block: rust: add a back reference feature to `GenDisk`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (35 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 36/83] block: rust: implement `Sync` for `GenDisk` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 38/83] block: rust: introduce an idle type state for `Request` Andreas Hindborg
                   ` (45 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

During certain block layer callbacks, drivers may need access to the Rust
`GenDisk` representing a disk the driver is managing. In some situations
only a pointer to the C `struct gendisk` is available, and with the current
setup there is no way to obtain the `GenDisk` for that C `gendisk`. To work
around this, add a back reference feature to `GenDisk` so that a
reference-counted reference to the `GenDisk` can be stored somewhere easily
accessible.
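
The general shape of a back reference can be sketched in userspace Rust.
This is only an analogy: it swaps the kernel `Revocable` and `NonNull`
used by the patch for the standard library `Weak`, and `Disk`, `build`
and `name_from_callback` are hypothetical names that do not exist in the
kernel crate.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the Rust `GenDisk`; not the kernel type.
pub struct Disk {
    pub name: String,
    /// Handle that can be stored elsewhere and resolved back to this `Disk`.
    backref: Weak<Disk>,
}

impl Disk {
    /// Build the disk and its back reference in one step.
    pub fn build(name: &str) -> Arc<Disk> {
        // `new_cyclic` provides a `Weak` to the allocation before the value
        // exists, so no placeholder needs to be patched afterwards.
        Arc::new_cyclic(|weak| Disk {
            name: name.to_string(),
            backref: weak.clone(),
        })
    }

    /// Get a handle that resolves to this disk while the disk is alive.
    pub fn get_ref(&self) -> Weak<Disk> {
        self.backref.clone()
    }
}

/// Hypothetical callback that only has the stored handle available.
pub fn name_from_callback(handle: &Weak<Disk>) -> Option<String> {
    handle.upgrade().map(|disk| disk.name.clone())
}
```

In this sketch the handle stops resolving once the last `Arc<Disk>` is
dropped, so a late callback observes `None` instead of a dangling
pointer.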

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs  |  2 +-
 drivers/block/rnull/rnull.rs     |  4 +--
 rust/kernel/block/mq/gen_disk.rs | 65 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 504bb477c2d0..4df0b748596a 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -198,7 +198,7 @@ struct DeviceConfigInner {
     capacity_mib: u64,
     irq_mode: IRQMode,
     completion_time: time::Delta,
-    disk: Option<GenDisk<NullBlkDevice>>,
+    disk: Option<Arc<GenDisk<NullBlkDevice>>>,
     memory_backed: bool,
     submit_queues: u32,
     home_node: i32,
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 877683dba0ac..fd9b770965a6 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -134,7 +134,7 @@ struct NullBlkModule {
     #[pin]
     configfs_subsystem: kernel::configfs::Subsystem<configfs::Config>,
     #[pin]
-    param_disks: Mutex<KVec<GenDisk<NullBlkDevice>>>,
+    param_disks: Mutex<KVec<Arc<GenDisk<NullBlkDevice>>>>,
 }
 
 impl kernel::InPlaceModule for NullBlkModule {
@@ -216,7 +216,7 @@ struct NullBlkDevice {
 }
 
 impl NullBlkDevice {
-    fn new(options: NullBlkOptions<'_>) -> Result<GenDisk<Self>> {
+    fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
         let NullBlkOptions {
             name,
             block_size,
diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 94af85fe1716..f51bccb0d2ef 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -21,14 +21,19 @@
         Write, //
     },
     prelude::*,
+    revocable::Revocable,
     static_lock_class,
     str::NullTerminatedFormatter,
-    sync::Arc,
+    sync::{
+        Arc,
+        UniqueArc, //
+    },
     types::{
         ForeignOwnable,
         ScopeGuard, //
     },
 };
+use core::ptr::NonNull;
 
 /// A builder for [`GenDisk`].
 ///
@@ -125,7 +130,7 @@ pub fn build<T: Operations>(
         name: fmt::Arguments<'_>,
         tagset: Arc<TagSet<T>>,
         queue_data: T::QueueData,
-    ) -> Result<GenDisk<T>> {
+    ) -> Result<Arc<GenDisk<T>>> {
         let data = queue_data.into_foreign();
         let recover_data = ScopeGuard::new(|| {
             // SAFETY: T::QueueData was created by the call to `into_foreign()` above
@@ -204,10 +209,28 @@ pub fn build<T: Operations>(
         // INVARIANT: `gendisk` was added to the VFS via `device_add_disk` above.
         // INVARIANT: `gendisk.queue.queue_data` is set to `data` in the call to
         // `__blk_mq_alloc_disk` above.
-        Ok(GenDisk {
-            _tagset: tagset,
-            gendisk,
-        })
+        let mut disk = UniqueArc::new(
+            GenDisk {
+                _tagset: tagset,
+                gendisk,
+                backref: Arc::pin_init(
+                    // INVARIANT: We break `GenDiskRef` invariant here, but we restore it below.
+                    Revocable::new(GenDiskRef(NonNull::dangling())),
+                    GFP_KERNEL,
+                )?,
+            },
+            GFP_KERNEL,
+        )?;
+
+        disk.backref = Arc::pin_init(
+            // INVARIANT: The `GenDisk` in `disk` is valid for use as a reference.
+            Revocable::new(GenDiskRef(
+                NonNull::new(UniqueArc::as_ptr(&disk).cast_mut()).expect("Should not be null"),
+            )),
+            GFP_KERNEL,
+        )?;
+
+        Ok(disk.into())
     }
 }
 
@@ -222,6 +245,14 @@ pub fn build<T: Operations>(
 pub struct GenDisk<T: Operations> {
     _tagset: Arc<TagSet<T>>,
     gendisk: *mut bindings::gendisk,
+    backref: Arc<Revocable<GenDiskRef<T>>>,
+}
+
+impl<T: Operations> GenDisk<T> {
+    /// Get a `GenDiskRef` referencing this `GenDisk`.
+    pub fn get_ref(&self) -> Arc<Revocable<GenDiskRef<T>>> {
+        self.backref.clone()
+    }
 }
 
 // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
@@ -264,3 +295,25 @@ fn drop(&mut self) {
         drop(unsafe { T::QueueData::from_foreign(queue_data) });
     }
 }
+
+/// A reference to a `GenDisk`.
+///
+/// # Invariants
+///
+/// `self.0` is valid for use as a reference.
+pub struct GenDiskRef<T: Operations>(NonNull<GenDisk<T>>);
+
+// SAFETY: It is safe to transfer ownership of `GenDiskRef` across thread boundaries.
+unsafe impl<T: Operations> Send for GenDiskRef<T> {}
+
+// SAFETY: It is safe to share references to `GenDiskRef` across thread boundaries.
+unsafe impl<T: Operations> Sync for GenDiskRef<T> {}
+
+impl<T: Operations> core::ops::Deref for GenDiskRef<T> {
+    type Target = GenDisk<T>;
+
+    fn deref(&self) -> &Self::Target {
+        // SAFETY: By type invariant, `self.0` is valid for use as a reference.
+        unsafe { self.0.as_ref() }
+    }
+}

-- 
2.51.2





* [PATCH v2 38/83] block: rust: introduce an idle type state for `Request`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (36 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 37/83] block: rust: add a back reference feature to `GenDisk` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 39/83] block: rust: add a request queue abstraction Andreas Hindborg
                   ` (44 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Block device drivers need to invoke `blk_mq_start_request` on a request to
indicate that they have started processing the request. This function may
only be called once after a request has been issued to a driver. For Rust
block device drivers, the Rust abstractions handle this call. However, in
some situations a driver may want to control when a request is started.
Thus, expose the start method to Rust block device drivers.

To ensure the method is not called more than once, introduce a type state
for `Request`. Requests are issued as `IdleRequest` and transition to
`Request` when the `start` method is called.
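
A minimal userspace sketch of the type state idea, assuming simplified
stand-in types rather than the kernel `Owned<IdleRequest<T>>` and
`Owned<Request<T>>`. The `sectors` field and the return value of
`end_ok` are made up for illustration.

```rust
/// Hypothetical stand-in for a request that has not been started.
pub struct IdleRequest {
    sectors: u32,
}

/// Hypothetical stand-in for a request that the driver is processing.
pub struct Request {
    sectors: u32,
}

impl IdleRequest {
    /// Create an idle request covering `sectors` sectors.
    pub fn new(sectors: u32) -> Self {
        IdleRequest { sectors }
    }

    /// Size of the request; available before the request is started.
    pub fn sectors(&self) -> u32 {
        self.sectors
    }

    /// Start the request. Taking `self` by value consumes the idle
    /// request, so this method cannot be called twice on the same request.
    pub fn start(self) -> Request {
        Request {
            sectors: self.sectors,
        }
    }
}

impl Request {
    /// Size of the request; also available after the request is started.
    pub fn sectors(&self) -> u32 {
        self.sectors
    }

    /// Complete the request and report how many sectors were handled.
    pub fn end_ok(self) -> u32 {
        self.sectors
    }
}

/// Hypothetical driver entry point mirroring the shape of `queue_rq`.
pub fn queue_rq(rq: IdleRequest) -> u32 {
    let rq = rq.start();
    rq.end_ok()
}
```

Calling `start` a second time on the same binding is a use-after-move
error at compile time, and `end_ok` is not available on `IdleRequest` at
all, so the ordering is enforced by the type checker rather than at run
time.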

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       |   3 +-
 rust/kernel/block/mq.rs            |   5 +-
 rust/kernel/block/mq/operations.rs |  15 ++--
 rust/kernel/block/mq/request.rs    | 149 +++++++++++++++++++++++++++++++------
 4 files changed, 137 insertions(+), 35 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index fd9b770965a6..bb8c4df08218 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -593,9 +593,10 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
     fn queue_rq(
         hw_data: Pin<&SpinLock<HwQueueContext>>,
         this: Pin<&Self>,
-        mut rq: Owned<mq::Request<Self>>,
+        rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
     ) -> Result {
+        let mut rq = rq.start();
         let mut sectors = rq.sectors();
 
         Self::handle_bad_blocks(this.get_ref(), &mut rq, &mut sectors)?;
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index b095cc7f51ce..77e3593e8626 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -88,10 +88,10 @@
 //!     fn queue_rq(
 //!         _hw_data: (),
 //!         _queue_data: (),
-//!         rq: Owned<Request<Self>>,
+//!         rq: Owned<IdleRequest<Self>>,
 //!         _is_last: bool
 //!     ) -> Result {
-//!         rq.end_ok();
+//!         rq.start().end_ok();
 //!         Ok(())
 //!     }
 //!
@@ -131,6 +131,7 @@
 
 pub use operations::Operations;
 pub use request::{
+    IdleRequest,
     Request,
     RequestTimerHandle, //
 };
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 1b20df25d6df..01917ef213d1 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -8,6 +8,7 @@
     bindings,
     block::mq::{
         request::RequestDataWrapper,
+        IdleRequest,
         Request, //
     },
     error::{
@@ -25,10 +26,7 @@
         Owned, //
     },
 };
-use core::{
-    marker::PhantomData,
-    ptr::NonNull, //
-};
+use core::marker::PhantomData;
 use pin_init::PinInit;
 
 type ForeignBorrowed<'a, T> = <T as ForeignOwnable>::Borrowed<'a>;
@@ -82,7 +80,7 @@ pub trait Operations: Sized {
     fn queue_rq(
         hw_data: ForeignBorrowed<'_, Self::HwData>,
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
-        rq: Owned<Request<Self>>,
+        rq: Owned<IdleRequest<Self>>,
         is_last: bool,
     ) -> Result;
 
@@ -154,14 +152,14 @@ impl<T: Operations> OperationsVTable<T> {
                 == 0
         );
 
+        // INVARIANT: By C API contract, `bd.rq` has not been started yet.
         // SAFETY:
         //  - By API contract, we own the request.
         //  - By the safety requirements of this function, `request` is a valid
         //    `struct request` and the private data is properly initialized.
         //  - `rq` will be alive until `blk_mq_end_request` is called and is
         //    reference counted by until then.
-        let mut rq =
-            unsafe { Owned::from_raw(NonNull::<Request<T>>::new_unchecked((*bd).rq.cast())) };
+        let rq = unsafe { IdleRequest::from_raw((*bd).rq) };
 
         // SAFETY: The safety requirement for this function ensure that `hctx`
         // is valid and that `driver_data` was produced by a call to
@@ -177,9 +175,6 @@ impl<T: Operations> OperationsVTable<T> {
         // dropped, which happens after we are dropped.
         let queue_data = unsafe { T::QueueData::borrow(queue_data) };
 
-        // SAFETY: We have exclusive access and we just set the refcount above.
-        unsafe { rq.start_unchecked() };
-
         let ret = T::queue_rq(
             hw_data,
             queue_data,
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index c06907dfe5b5..f94e9c2181d0 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -24,6 +24,7 @@
         HrTimerPointer, //
     },
     types::{
+        ForeignOwnable,
         Opaque,
         Ownable,
         OwnableRefCounted,
@@ -33,6 +34,7 @@
 use core::{
     ffi::c_void,
     marker::PhantomData,
+    ops::Deref,
     pin::Pin,
     ptr::NonNull, //
 };
@@ -42,6 +44,104 @@
     BioIterator, //
 };
 
+/// A [`Request`] that a driver has not yet begun to process.
+///
+/// A driver can convert an `IdleRequest` to a [`Request`] by calling [`IdleRequest::start`].
+///
+/// # Invariants
+///
+/// - This request has not been started yet.
+#[repr(transparent)]
+pub struct IdleRequest<T>(RequestInner<T>);
+
+impl<T: Operations> IdleRequest<T> {
+    /// Mark the request as processing.
+    ///
+    /// This converts the [`IdleRequest`] into a [`Request`].
+    pub fn start(self: Owned<Self>) -> Owned<Request<T>> {
+        // SAFETY: By type invariant `self.0.0` is a valid request. Because we have an `Owned<_>`,
+        // the refcount is zero.
+        let mut request = unsafe { Request::from_raw(self.0 .0.get()) };
+
+        debug_assert!(
+            request
+                .wrapper_ref()
+                .refcount()
+                .as_atomic()
+                .load(ordering::Acquire)
+                == 0
+        );
+
+        // SAFETY: We have exclusive access and the refcount is 0. By type invariant `request` was
+        // not started yet.
+        unsafe { request.start_unchecked() };
+
+        request
+    }
+
+    /// Create a [`Self`] from a raw request pointer.
+    ///
+    /// # Safety
+    ///
+    /// - The request pointed to by `ptr` must satisfy the invariants of both [`Request`] and
+    ///   [`Self`].
+    /// - The refcount of the request pointed to by `ptr` must be 0.
+    pub(crate) unsafe fn from_raw(ptr: *mut bindings::request) -> Owned<Self> {
+        // SAFETY: By function safety requirements, `ptr` is valid for use as an `IdleRequest`.
+        unsafe { Owned::from_raw(NonNull::<Self>::new_unchecked(ptr.cast())) }
+    }
+}
+
+impl<T: Operations> Ownable for IdleRequest<T> {
+    // The `release` implementation leaks the `IdleRequest`, which is a valid state for a
+    // [`Request`] with refcount 0.
+    unsafe fn release(&mut self) {}
+}
+
+impl<T: Operations> Deref for IdleRequest<T> {
+    type Target = RequestInner<T>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
+
+pub struct RequestInner<T>(Opaque<bindings::request>, PhantomData<T>);
+
+impl<T: Operations> RequestInner<T> {
+    /// Get the command identifier for the request
+    pub fn command(&self) -> u32 {
+        // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
+        unsafe { (*self.0.get()).cmd_flags & ((1 << bindings::REQ_OP_BITS) - 1) }
+    }
+
+    /// Get the target sector for the request.
+    #[inline(always)]
+    pub fn sector(&self) -> u64 {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
+        unsafe { (*self.0.get()).__sector }
+    }
+
+    /// Get the size of the request in number of sectors.
+    #[inline(always)]
+    pub fn sectors(&self) -> u32 {
+        self.bytes() >> crate::block::SECTOR_SHIFT
+    }
+
+    /// Get the size of the request in bytes.
+    #[inline(always)]
+    pub fn bytes(&self) -> u32 {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
+        unsafe { (*self.0.get()).__data_len }
+    }
+
+    /// Borrow the queue data from the request queue associated with this request.
+    pub fn queue_data(&self) -> <T::QueueData as ForeignOwnable>::Borrowed<'_> {
+        // SAFETY: By type invariants of `Request`, `self.0` is a valid request.
+        unsafe { T::QueueData::borrow((*(*self.0.get()).q).queuedata) }
+    }
+}
+
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.
 ///
 /// # Lifetime
@@ -96,9 +196,28 @@
 /// [`struct request`]: srctree/include/linux/blk-mq.h
 ///
 #[repr(transparent)]
-pub struct Request<T>(Opaque<bindings::request>, PhantomData<T>);
+pub struct Request<T>(RequestInner<T>);
+
+impl<T: Operations> Deref for Request<T> {
+    type Target = RequestInner<T>;
+
+    fn deref(&self) -> &Self::Target {
+        &self.0
+    }
+}
 
 impl<T: Operations> Request<T> {
+    /// Create an `Owned<Request>` from a request pointer.
+    ///
+    /// # Safety
+    ///
+    /// - `ptr` must satisfy invariants of `Request`.
+    /// - The refcount of the request pointed to by `ptr` must be 0.
+    pub(crate) unsafe fn from_raw(ptr: *mut bindings::request) -> Owned<Self> {
+        // SAFETY: By function safety requirements, `ptr` is valid for use as `Owned<Request>`.
+        unsafe { Owned::from_raw(NonNull::<Self>::new_unchecked(ptr.cast())) }
+    }
+
     /// Create an [`ARef<Request>`] from a [`struct request`] pointer.
     ///
     /// # Safety
@@ -120,7 +239,7 @@ pub(crate) unsafe fn aref_from_raw(ptr: *mut bindings::request) -> ARef<Self> {
     pub fn command(&self) -> u32 {
         use core::ops::BitAnd;
         // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
-        unsafe { (*self.0.get()).cmd_flags }.bitand((1u32 << bindings::REQ_OP_BITS) - 1)
+        unsafe { (*self.0 .0.get()).cmd_flags }.bitand((1u32 << bindings::REQ_OP_BITS) - 1)
     }
 
     /// Complete the request by scheduling `Operations::complete` for
@@ -145,7 +264,7 @@ pub fn complete(this: ARef<Self>) {
     pub fn bio(&self) -> Option<&Bio> {
         // SAFETY: By type invariant of `Self`, `self.0` is valid and the deref
         // is safe.
-        let ptr = unsafe { (*self.0.get()).bio };
+        let ptr = unsafe { (*self.0 .0.get()).bio };
         // SAFETY: By C API contract, if `bio` is not null it will have a
         // positive refcount at least for the duration of the lifetime of
         // `&self`.
@@ -157,7 +276,7 @@ pub fn bio(&self) -> Option<&Bio> {
     pub fn bio_mut(self: Pin<&mut Self>) -> Option<Pin<&mut Bio>> {
         // SAFETY: By type invariant of `Self`, `self.0` is valid and the deref
         // is safe.
-        let ptr = unsafe { (*self.0.get()).bio };
+        let ptr = unsafe { (*self.0 .0.get()).bio };
         // SAFETY: By C API contract, if `bio` is not null it will have a
         // positive refcount at least for the duration of the lifetime of
         // `&mut self`.
@@ -171,25 +290,11 @@ pub fn bio_iter_mut<'a>(self: &'a mut Owned<Self>) -> BioIterator<'a> {
         // `NonNull::new` will return `None` if the pointer is null.
         BioIterator {
             // SAFETY: By type invariant `self.0` is a valid `struct request`.
-            bio: NonNull::new(unsafe { (*self.0.get()).bio.cast() }),
+            bio: NonNull::new(unsafe { (*self.0 .0.get()).bio.cast() }),
             _p: PhantomData,
         }
     }
 
-    /// Get the target sector for the request.
-    #[inline(always)]
-    pub fn sector(&self) -> u64 {
-        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
-        unsafe { (*self.0.get()).__sector }
-    }
-
-    /// Get the size of the request in number of sectors.
-    #[inline(always)]
-    pub fn sectors(&self) -> u32 {
-        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
-        (unsafe { (*self.0.get()).__data_len }) >> crate::block::SECTOR_SHIFT
-    }
-
     /// Return a pointer to the [`RequestDataWrapper`] stored in the private area
     /// of the request structure.
     ///
@@ -328,10 +433,10 @@ impl<T: Operations> Owned<Request<T>> {
     /// `self.wrapper_ref().refcount() == 0`.
     ///
     /// This can only be called once in the request life cycle.
-    pub(crate) unsafe fn start_unchecked(&mut self) {
+    pub unsafe fn start_unchecked(&mut self) {
         // SAFETY: By type invariant, `self.0` is a valid `struct request` and
         // we have exclusive access.
-        unsafe { bindings::blk_mq_start_request(self.0.get()) };
+        unsafe { bindings::blk_mq_start_request(self.0 .0.get()) };
     }
 
     /// Notify the block layer that the request has been completed without errors.
@@ -341,7 +446,7 @@ pub fn end_ok(self) {
 
     /// Notify the block layer that the request has been completed.
     pub fn end(self, status: u8) {
-        let request_ptr = self.0.get().cast();
+        let request_ptr = self.0 .0.get().cast();
         core::mem::forget(self);
         // SAFETY: By type invariant, `this.0` was a valid `struct request`. The
         // existence of `self` guarantees that there are no `ARef`s pointing to

-- 
2.51.2





* [PATCH v2 39/83] block: rust: add a request queue abstraction
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (37 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 38/83] block: rust: introduce an idle type state for `Request` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 40/83] block: rust: add a method to get the request queue for a request Andreas Hindborg
                   ` (43 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `RequestQueue` type as a Rust abstraction for `struct
request_queue`, and add `GenDisk::queue` to obtain the request queue
associated with a `GenDisk`.

The abstraction exposes the driver private queue data and methods to stop
and restart the hardware queues.
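
The wrapper pattern used here can be sketched outside the kernel as
follows. This assumes a hypothetical `RawQueue` struct in place of
`bindings::request_queue` and a plain field in place of `Opaque`; the
`nr_hw_queues` accessor is only there to make the example observable.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the C `struct request_queue`.
#[repr(C)]
pub struct RawQueue {
    pub nr_hw_queues: u32,
}

/// Typed view of a `RawQueue`. `repr(transparent)` gives the wrapper the
/// same layout as `RawQueue`, which is what makes the pointer cast valid.
#[repr(transparent)]
pub struct Queue<T>(RawQueue, PhantomData<T>);

impl<T> Queue<T> {
    /// Borrow a raw queue pointer as a typed wrapper.
    ///
    /// # Safety
    ///
    /// `ptr` must be non-null, aligned, and valid for reads for `'a`.
    pub unsafe fn from_raw<'a>(ptr: *const RawQueue) -> &'a Self {
        // SAFETY: The caller guarantees validity, and the layouts match
        // because of `repr(transparent)`.
        unsafe { &*ptr.cast::<Self>() }
    }

    /// Read a field through the wrapper.
    pub fn nr_hw_queues(&self) -> u32 {
        self.0.nr_hw_queues
    }
}
```

The type parameter costs nothing at run time. It only records which
driver the queue belongs to, so that accessors such as `queue_data` can
return the right driver type.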

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq.rs               |  2 ++
 rust/kernel/block/mq/gen_disk.rs      |  7 ++++
 rust/kernel/block/mq/request_queue.rs | 60 +++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+)

diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 77e3593e8626..e89eb394001f 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -127,6 +127,7 @@
 pub mod gen_disk;
 mod operations;
 mod request;
+mod request_queue;
 pub mod tag_set;
 
 pub use operations::Operations;
@@ -135,4 +136,5 @@
     Request,
     RequestTimerHandle, //
 };
+pub use request_queue::RequestQueue;
 pub use tag_set::TagSet;
diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index f51bccb0d2ef..6ba8d88f63a9 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -9,6 +9,7 @@
     bindings,
     block::mq::{
         Operations,
+        RequestQueue,
         TagSet, //
     },
     error::{
@@ -253,6 +254,12 @@ impl<T: Operations> GenDisk<T> {
     pub fn get_ref(&self) -> Arc<Revocable<GenDiskRef<T>>> {
         self.backref.clone()
     }
+
+    /// Get the [`RequestQueue`] associated with this [`GenDisk`].
+    pub fn queue(&self) -> &RequestQueue<T> {
+        // SAFETY: By type invariant, self is a valid gendisk.
+        unsafe { RequestQueue::from_raw((*self.gendisk).queue) }
+    }
 }
 
 // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
diff --git a/rust/kernel/block/mq/request_queue.rs b/rust/kernel/block/mq/request_queue.rs
new file mode 100644
index 000000000000..45fb55b1a310
--- /dev/null
+++ b/rust/kernel/block/mq/request_queue.rs
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use super::Operations;
+use crate::types::{
+    ForeignOwnable,
+    Opaque, //
+};
+use core::marker::PhantomData;
+
+/// A structure describing the queues associated with a block device.
+///
+/// Owned by a [`GenDisk`].
+///
+/// # Invariants
+///
+/// - `self.0` is a valid `bindings::request_queue`.
+/// - `self.0.queuedata` is a valid `T::QueueData`.
+#[repr(transparent)]
+pub struct RequestQueue<T>(Opaque<bindings::request_queue>, PhantomData<T>);
+
+impl<T> RequestQueue<T>
+where
+    T: Operations,
+{
+    /// Create a [`RequestQueue`] from a raw `bindings::request_queue` pointer
+    ///
+    /// # Safety
+    ///
+    /// - `ptr` must be valid for use as a reference for the duration of `'a`.
+    /// - `ptr` must have been initialized as part of [`GenDiskBuilder::build`].
+    pub(crate) unsafe fn from_raw<'a>(ptr: *const bindings::request_queue) -> &'a Self {
+        // INVARIANT:
+        // - By function safety requirements, `ptr` is a valid `request_queue`.
+        // - By function safety requirement `ptr` was initialized by [`GenDiskBuilder::build`], and
+        //   thus `queuedata` was set to point to a valid `T::QueueData`.
+        //
+        // SAFETY: By function safety requirements `ptr` is valid for use as a reference.
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Get the driver private data associated with this [`RequestQueue`].
+    pub fn queue_data(&self) -> <T::QueueData as ForeignOwnable>::Borrowed<'_> {
+        // SAFETY: By type invariant, `queuedata` is a valid `T::QueueData`.
+        unsafe { T::QueueData::borrow((*self.0.get()).queuedata) }
+    }
+
+    /// Stop all hardware queues of this [`RequestQueue`].
+    pub fn stop_hw_queues(&self) {
+        // SAFETY: By type invariant, `self.0` is a valid `request_queue`.
+        unsafe { bindings::blk_mq_stop_hw_queues(self.0.get()) }
+    }
+
+    /// Start all hardware queues of this [`RequestQueue`].
+    ///
+    /// This function will mark the queues as ready and if necessary, schedule the queues to run.
+    pub fn start_stopped_hw_queues_async(&self) {
+        // SAFETY: By type invariant, `self.0` is a valid `request_queue`.
+        unsafe { bindings::blk_mq_start_stopped_hw_queues(self.0.get(), true) }
+    }
+}

-- 
2.51.2





* [PATCH v2 40/83] block: rust: add a method to get the request queue for a request
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (38 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 39/83] block: rust: add a request queue abstraction Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 41/83] block: rust: introduce `kernel::block::error` Andreas Hindborg
                   ` (42 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method to `Request` for obtaining the associated `RequestQueue`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index f94e9c2181d0..a05df2351c2c 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -39,6 +39,7 @@
     ptr::NonNull, //
 };
 
+use super::RequestQueue;
 use crate::block::bio::{
     Bio,
     BioIterator, //
@@ -140,6 +141,12 @@ pub fn queue_data(&self) -> <T::QueueData as ForeignOwnable>::Borrowed<'_> {
         // SAFETY: By type invariants of `Request`, `self.0` is a valid request.
         unsafe { T::QueueData::borrow((*(*self.0.get()).q).queuedata) }
     }
+
+    /// Get the request queue associated with this request.
+    pub fn queue(&self) -> &RequestQueue<T> {
+        // SAFETY: By type invariant, self.0 is guaranteed to be valid.
+        unsafe { RequestQueue::from_raw((*self.0.get()).q) }
+    }
 }
 
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.

-- 
2.51.2





* [PATCH v2 41/83] block: rust: introduce `kernel::block::error`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (39 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 40/83] block: rust: add a method to get the request queue for a request Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 42/83] block: rust: require `queue_rq` to return a `BlkResult` Andreas Hindborg
                   ` (41 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Block layer status codes, represented by `blk_status_t`, are only one
byte. This is different from the general kernel error codes.

Add `BlkError` and `BlkResult` to handle these status codes.
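
A userspace sketch of the same idea, assuming plain `u8` status values
instead of `bindings::blk_status_t`. The constants `STS_OK` and
`STS_IOERR` are placeholders for this example; the real values come from
the C bindings.

```rust
use std::num::NonZeroU8;

/// Placeholder status values for illustration only.
pub const STS_OK: u8 = 0;
pub const STS_IOERR: u8 = 10;

/// A one byte error code that can never hold the "OK" value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BlkError(NonZeroU8);

impl BlkError {
    /// Returns `None` for the zero ("OK") status, since that is not an error.
    pub const fn try_from_status(status: u8) -> Option<Self> {
        match NonZeroU8::new(status) {
            Some(code) => Some(BlkError(code)),
            None => None,
        }
    }

    /// Convert back to the raw status byte.
    pub fn to_status(self) -> u8 {
        self.0.get()
    }
}

/// A result with a `BlkError` error type.
pub type BlkResult<T = ()> = Result<T, BlkError>;

/// Map a raw status byte to a `BlkResult`.
pub fn to_result(status: u8) -> BlkResult {
    match BlkError::try_from_status(status) {
        Some(err) => Err(err),
        None => Ok(()),
    }
}
```

Storing the code as `NonZeroU8` makes "an error with status 0"
unrepresentable, so the success case is carried only by `Ok(())`.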

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block.rs | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 rust/kernel/error.rs |  3 +-
 2 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
index 96e48a2e6116..b3578f28871a 100644
--- a/rust/kernel/block.rs
+++ b/rust/kernel/block.rs
@@ -18,3 +18,97 @@
 /// The difference between the size of a page and the size of a sector,
 /// expressed as a power of two.
 pub const PAGE_SECTORS_SHIFT: u32 = bindings::PAGE_SECTORS_SHIFT;
+
+pub mod error {
+    //! Block layer errors.
+
+    use core::num::NonZeroU8;
+
+    pub mod code {
+        //! C compatible error codes for the block subsystem.
+        macro_rules! declare_err {
+            ($err:tt $(,)? $($doc:expr),+) => {
+                $(
+                    #[doc = $doc]
+                )*
+                    pub const $err: super::BlkError =
+                    match super::BlkError::try_from_blk_status(crate::bindings::$err as u8) {
+                        Some(err) => err,
+                        None => panic!("Invalid errno in `declare_err!`"),
+                    };
+            };
+        }
+
+        declare_err!(BLK_STS_NOTSUPP, "Operation not supported.");
+        declare_err!(BLK_STS_IOERR, "Generic IO error.");
+        declare_err!(BLK_STS_DEV_RESOURCE, "Device resource busy. Retry later.");
+    }
+
+    /// A wrapper around a 1 byte block layer error code.
+    #[derive(Clone, Copy, PartialEq, Eq)]
+    pub struct BlkError(NonZeroU8);
+
+    impl BlkError {
+        /// Create a [`BlkError`] from a `blk_status_t`.
+        ///
+        /// If the code is not known, this function will warn and return [`code::BLK_STS_IOERR`].
+        pub fn from_blk_status(status: bindings::blk_status_t) -> Self {
+            if let Some(error) = Self::try_from_blk_status(status) {
+                error
+            } else {
+                kernel::pr_warn!("Attempted to create `BlkError` from invalid value\n");
+                code::BLK_STS_IOERR
+            }
+        }
+
+        /// Convert `Self` to the underlying type.
+        pub fn to_blk_status(self) -> bindings::blk_status_t {
+            self.0.into()
+        }
+
+        /// Try to create a `Self` from a `blk_status_t`.
+        ///
+        /// Returns `None` if the conversion fails.
+        const fn try_from_blk_status(errno: bindings::blk_status_t) -> Option<Self> {
+            if errno == 0 {
+                None
+            } else {
+                Some(BlkError(
+                    // SAFETY: We just checked that `errno` is nonzero.
+                    unsafe { NonZeroU8::new_unchecked(errno) },
+                ))
+            }
+        }
+    }
+
+    impl From<BlkError> for u8 {
+        fn from(value: BlkError) -> Self {
+            value.0.into()
+        }
+    }
+
+    impl From<BlkError> for u32 {
+        fn from(value: BlkError) -> Self {
+            let value: u8 = value.0.into();
+            value.into()
+        }
+    }
+
+    impl From<kernel::error::Error> for BlkError {
+        fn from(_value: kernel::error::Error) -> Self {
+            code::BLK_STS_IOERR
+        }
+    }
+
+    /// A result with a [`BlkError`] error type.
+    pub type BlkResult<T = ()> = Result<T, BlkError>;
+
+    /// Convert a `blk_status_t` to a `BlkResult`.
+    pub fn to_result(status: bindings::blk_status_t) -> BlkResult {
+        if status == bindings::BLK_STS_OK {
+            Ok(())
+        } else {
+            Err(BlkError::from_blk_status(status))
+        }
+    }
+}
diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 05cf869ac090..6dd14a72526f 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -163,8 +163,9 @@ pub fn to_errno(self) -> crate::ffi::c_int {
         self.0.get()
     }
 
+    /// Convert a generic kernel error to a block layer error.
     #[cfg(CONFIG_BLOCK)]
-    pub(crate) fn to_blk_status(self) -> bindings::blk_status_t {
+    pub fn to_blk_status(self) -> bindings::blk_status_t {
         // SAFETY: `self.0` is a valid error due to its invariant.
         unsafe { bindings::errno_to_blk_status(self.0.get()) }
     }

-- 
2.51.2
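
For readers following along outside the kernel tree, the wrapper pattern in
this patch can be sketched in plain Rust. This is a minimal userspace model,
not the kernel code: `STS_OK` and `STS_IOERR` are placeholder constants
standing in for the bindings, and the sketch uses the safe `NonZeroU8::new`
where the patch uses `new_unchecked` after an explicit zero check.

```rust
use core::num::NonZeroU8;

/// Placeholder status values (assumption); the patch takes them from `bindings`.
pub const STS_OK: u8 = 0;
pub const STS_IOERR: u8 = 10;

/// A nonzero status byte. The success value (zero) is unrepresentable.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BlkError(NonZeroU8);

impl BlkError {
    /// Returns `None` for the success value and `Some` for anything else.
    pub const fn try_from_status(status: u8) -> Option<Self> {
        match NonZeroU8::new(status) {
            Some(code) => Some(BlkError(code)),
            None => None,
        }
    }

    /// Convert back to the raw status byte.
    pub const fn to_status(self) -> u8 {
        self.0.get()
    }
}

/// Constant built at compile time, in the same spirit as `declare_err!`.
pub const IOERR: BlkError = match BlkError::try_from_status(STS_IOERR) {
    Some(err) => err,
    None => panic!("status must be nonzero"),
};

/// A result with a `BlkError` error type.
pub type BlkResult<T = ()> = Result<T, BlkError>;

/// Map a raw status byte to a `BlkResult`.
pub fn to_result(status: u8) -> BlkResult {
    match BlkError::try_from_status(status) {
        Some(err) => Err(err),
        None => Ok(()),
    }
}
```

Because zero is excluded from the wrapped type, `Option<BlkError>` can reuse
the zero byte as its `None` representation and stays one byte wide.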





* [PATCH v2 42/83] block: rust: require `queue_rq` to return a `BlkResult`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (40 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 41/83] block: rust: introduce `kernel::block::error` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 43/83] block: rust: add `GenDisk::queue_data` Andreas Hindborg
                   ` (40 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Change the return type of `Operations::queue_rq` from `Result` to
`BlkResult`. This ensures that drivers return proper block layer status
codes that can be translated to the appropriate `blk_status_t` value.
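
As a rough illustration of what this buys driver code, the sketch below
shows a handler returning a block-layer result while still using `?` on a
generic error. Everything here is a userspace stand-in: `GenericError`,
`allocate`, `to_raw` and the numeric codes are hypothetical names and values
chosen for the example, and the handler signature is simplified.

```rust
use core::num::NonZeroU8;

/// Hypothetical stand-in for `kernel::error::Error`.
#[derive(Debug)]
pub struct GenericError;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BlkError(NonZeroU8);

/// Build a code at compile time; fails the build if `raw` is zero.
const fn code(raw: u8) -> BlkError {
    match NonZeroU8::new(raw) {
        Some(c) => BlkError(c),
        None => panic!("status must be nonzero"),
    }
}

/// Placeholder values (assumption), standing in for the bindings constants.
pub const NOTSUPP: BlkError = code(1);
pub const IOERR: BlkError = code(10);

/// Any generic error collapses to a generic IO error, as in the series.
impl From<GenericError> for BlkError {
    fn from(_: GenericError) -> Self {
        IOERR
    }
}

pub type BlkResult<T = ()> = Result<T, BlkError>;

/// Hypothetical helper that can fail with a generic error.
fn allocate(fail: bool) -> Result<(), GenericError> {
    if fail {
        Err(GenericError)
    } else {
        Ok(())
    }
}

/// Simplified shape of a `queue_rq`-like handler.
pub fn queue_rq(supported: bool, alloc_fails: bool) -> BlkResult {
    if !supported {
        return Err(NOTSUPP); // explicit block-layer status
    }
    allocate(alloc_fails)?; // `?` converts through `From<GenericError>`
    Ok(())
}

/// What a C-facing wrapper might hand back to the block layer (assumption).
pub fn to_raw(result: BlkResult) -> u8 {
    match result {
        Ok(()) => 0,
        Err(e) => e.0.get(),
    }
}
```

The handler chooses a specific status where it has one and falls back to the
generic IO error only when a lower layer reports a non-block error.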

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       |  3 ++-
 rust/kernel/block/mq.rs            |  4 ++--
 rust/kernel/block/mq/operations.rs | 13 ++++++++-----
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index bb8c4df08218..6ceba23a4d3e 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -20,6 +20,7 @@
             BadBlocks, //
         },
         bio::Segment,
+        error::BlkResult,
         mq::{
             self,
             gen_disk::{
@@ -595,7 +596,7 @@ fn queue_rq(
         this: Pin<&Self>,
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
-    ) -> Result {
+    ) -> BlkResult {
         let mut rq = rq.start();
         let mut sectors = rq.sectors();
 
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index e89eb394001f..503623267b19 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -64,7 +64,7 @@
 //! ```rust
 //! use kernel::{
 //!     alloc::NumaNode,
-//!     block::mq::{self, *},
+//!     block::{error::BlkResult, mq::{self, *}},
 //!     new_mutex,
 //!     prelude::*,
 //!     sync::{aref::ARef, Arc, Mutex},
@@ -90,7 +90,7 @@
 //!         _queue_data: (),
 //!         rq: Owned<IdleRequest<Self>>,
 //!         _is_last: bool
-//!     ) -> Result {
+//!     ) -> BlkResult {
 //!         rq.start().end_ok();
 //!         Ok(())
 //!     }
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 01917ef213d1..b9a2bf6592b3 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -6,10 +6,13 @@
 
 use crate::{
     bindings,
-    block::mq::{
-        request::RequestDataWrapper,
-        IdleRequest,
-        Request, //
+    block::{
+        error::BlkResult,
+        mq::{
+            request::RequestDataWrapper,
+            IdleRequest,
+            Request, //
+        },
     },
     error::{
         from_result,
@@ -82,7 +85,7 @@ fn queue_rq(
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
         rq: Owned<IdleRequest<Self>>,
         is_last: bool,
-    ) -> Result;
+    ) -> BlkResult;
 
     /// Called by the kernel to indicate that queued requests should be submitted.
     fn commit_rqs(

-- 
2.51.2





* [PATCH v2 43/83] block: rust: add `GenDisk::queue_data`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (41 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 42/83] block: rust: require `queue_rq` to return a `BlkResult` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 44/83] block: rnull: add bandwidth limiting Andreas Hindborg
                   ` (39 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method to borrow the private queue data of the queue a `GenDisk` is
associated with.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/gen_disk.rs | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 6ba8d88f63a9..49ce5ac4774d 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -260,6 +260,12 @@ pub fn queue(&self) -> &RequestQueue<T> {
         // SAFETY: By type invariant, self is a valid gendisk.
         unsafe { RequestQueue::from_raw((*self.gendisk).queue) }
     }
+
+    /// Get the queue data associated with this [`GenDisk`].
+    pub fn queue_data(&self) -> <T::QueueData as ForeignOwnable>::Borrowed<'_> {
+        // SAFETY: By type invariant, self is a valid gendisk.
+        unsafe { T::QueueData::borrow((*(*self.gendisk).queue).queuedata) }
+    }
 }
 
 // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a

-- 
2.51.2





* [PATCH v2 44/83] block: rnull: add bandwidth limiting
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (42 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 43/83] block: rust: add `GenDisk::queue_data` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 45/83] block: rnull: add blocking queue mode Andreas Hindborg
                   ` (38 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add bandwidth limiting support to rnull via the `mbps` configfs
attribute and module parameter. When set to a non-zero value, the driver
limits I/O throughput to the specified rate in MiB/s (the value is
multiplied by 2^20 to obtain bytes per second).

The implementation uses a token bucket algorithm to enforce the rate
limit, delaying request completion when the limit is exceeded.
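
To make the accounting concrete, here is a small userspace model of the
byte budget as it appears in the diff: the per-second limit is divided by 50
to match the 20 ms timer interval, each request is charged with a relaxed
`fetch_add`, and the timer resets the counter. The names `Limiter`,
`window_budget`, `try_charge` and `tick` are hypothetical, and the sketch
leaves out the queue stop/restart and the timer itself.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// A 20 ms interval gives 50 accounting windows per second.
pub const WINDOWS_PER_SECOND: u64 = 50;

/// Bytes permitted per window for a limit given in MiB/s.
pub fn window_budget(mbps: u32) -> u64 {
    u64::from(mbps) * 2u64.pow(20) / WINDOWS_PER_SECOND
}

pub struct Limiter {
    budget: u64,
    used: AtomicU64,
}

impl Limiter {
    pub fn new(mbps: u32) -> Self {
        Self {
            budget: window_budget(mbps),
            used: AtomicU64::new(0),
        }
    }

    /// Charge `bytes` to the current window. Returns `false` when the
    /// request pushes the window past its budget; in the driver this is
    /// the point where the queues are stopped and a busy status returned.
    pub fn try_charge(&self, bytes: u64) -> bool {
        if self.budget == 0 {
            return true; // 0 means no limit
        }
        // `fetch_add` returns the previous value, so add `bytes` again.
        let before = self.used.fetch_add(bytes, Ordering::Relaxed);
        before.saturating_add(bytes) <= self.budget
    }

    /// Timer tick: open a fresh window.
    pub fn tick(&self) {
        self.used.store(0, Ordering::Relaxed);
    }
}
```

With `mbps = 100` the budget works out to 100 * 2^20 / 50 = 2097152 bytes
per window. As in the diff, a rejected request is still counted against the
current window, so the counter can exceed the budget until the next tick.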

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |   7 ++-
 drivers/block/rnull/rnull.rs    | 111 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 105 insertions(+), 13 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 4df0b748596a..59217d75f46b 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -104,6 +104,7 @@ fn make_group(
                 badblocks_once: 13,
                 badblocks_partial_io: 14,
                 cache_size_mib: 15,
+                mbps: 16,
             ],
         };
 
@@ -135,6 +136,7 @@ fn make_group(
                         GFP_KERNEL
                     )?,
                     cache_size_mib: 0,
+                    mbps: 0,
                 }),
             }),
             core::iter::empty(),
@@ -209,6 +211,7 @@ struct DeviceConfigInner {
     bad_blocks_partial_io: bool,
     cache_size_mib: u64,
     disk_storage: Arc<DiskStorage>,
+    mbps: u32,
 }
 
 #[vtable]
@@ -248,6 +251,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 bad_blocks_once: guard.bad_blocks_once,
                 bad_blocks_partial_io: guard.bad_blocks_partial_io,
                 storage: guard.disk_storage.clone(),
+                bandwidth_limit: u64::from(guard.mbps) * 2u64.pow(20),
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -259,7 +263,6 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     }
 }
 
-// DiskStorage::new(cache_size_mib << 20, block_size as usize),
 configfs_simple_field!(DeviceConfig, 1, block_size, u32, check GenDiskBuilder::validate_block_size);
 configfs_simple_bool_field!(DeviceConfig, 2, rotational);
 configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
@@ -417,3 +420,5 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         Ok(())
     })
 );
+
+configfs_simple_field!(DeviceConfig, 16, mbps, u32);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 6ceba23a4d3e..1dda8d717b95 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -25,7 +25,8 @@
             self,
             gen_disk::{
                 self,
-                GenDisk, //
+                GenDisk,
+                GenDiskRef, //
             },
             Operations,
             TagSet, //
@@ -37,25 +38,32 @@
         Result, //
     },
     ffi,
+    impl_has_hr_timer,
     memalloc_scope,
     new_mutex,
     new_spinlock,
     pr_info,
     prelude::*,
+    revocable::Revocable,
     str::CString,
     sync::{
         aref::ARef,
         atomic::{
             ordering,
             Atomic, //
-        }, //
+        },
         Arc,
+        ArcBorrow,
         Mutex,
+        SetOnce,
         SpinLock,
-        SpinLockGuard,
+        SpinLockGuard, //
     },
     time::{
         hrtimer::{
+            self,
+            ArcHrTimerHandle,
+            HrTimer,
             HrTimerCallback,
             HrTimerCallbackContext,
             HrTimerPointer,
@@ -127,6 +135,10 @@
             default: false,
             description: "No IO scheduler",
         },
+        mbps: u32 {
+            default: 0,
+            description: "Max bandwidth in MiB/s. 0 means no limit.",
+        },
     },
 }
 
@@ -172,6 +184,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
                     storage: Arc::pin_init(DiskStorage::new(0, block_size as usize), GFP_KERNEL)?,
+                    bandwidth_limit: u64::from(module_parameters::mbps.value()) * 2u64.pow(20),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -202,6 +215,7 @@ struct NullBlkOptions<'a> {
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
     storage: Arc<DiskStorage>,
+    bandwidth_limit: u64,
 }
 
 #[pin_data]
@@ -214,9 +228,18 @@ struct NullBlkDevice {
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
+    bandwidth_limit: u64,
+    #[pin]
+    bandwidth_timer: HrTimer<Self>,
+    bandwidth_bytes: Atomic<u64>,
+    #[pin]
+    bandwidth_timer_handle: SpinLock<Option<ArcHrTimerHandle<Self>>>,
+    disk: SetOnce<Arc<Revocable<GenDiskRef<Self>>>>,
 }
 
 impl NullBlkDevice {
+    const BANDWIDTH_TIMER_INTERVAL: Delta = Delta::from_millis(20);
+
     fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
         let NullBlkOptions {
             name,
@@ -234,6 +257,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             bad_blocks_once,
             bad_blocks_partial_io,
             storage,
+            bandwidth_limit,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
@@ -268,7 +292,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             GFP_KERNEL,
         )?;
 
-        let queue_data = Box::try_pin_init(
+        let queue_data = Arc::try_pin_init(
             try_pin_init!(Self {
                 storage,
                 irq_mode,
@@ -278,6 +302,11 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
                 bad_blocks,
                 bad_blocks_once,
                 bad_blocks_partial_io,
+                bandwidth_limit: bandwidth_limit / 50,
+                bandwidth_timer <- HrTimer::new(),
+                bandwidth_bytes: Atomic::new(0),
+                bandwidth_timer_handle <- new_spinlock!(None),
+                disk: SetOnce::new(),
             }),
             GFP_KERNEL,
         )?;
@@ -294,7 +323,10 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
                 .max_hw_discard_sectors(ffi::c_uint::MAX >> block::SECTOR_SHIFT);
         }
 
-        builder.build(fmt!("{}", name.to_str()?), tagset, queue_data)
+        let disk = builder.build(fmt!("{}", name.to_str()?), tagset, queue_data)?;
+        let queue_data: ArcBorrow<'_, Self> = disk.queue_data();
+        queue_data.disk.populate(disk.get_ref());
+        Ok(disk)
     }
 
     fn sheaf_size() -> usize {
@@ -522,6 +554,36 @@ fn end_request(rq: Owned<mq::Request<Self>>) {
     }
 }
 
+impl_has_hr_timer! {
+    impl HasHrTimer<Self> for NullBlkDevice {
+        mode: hrtimer::RelativeHardMode<kernel::time::Monotonic>,
+        field: self.bandwidth_timer,
+    }
+}
+
+impl HrTimerCallback for NullBlkDevice {
+    type Pointer<'a> = Arc<Self>;
+
+    fn run(
+        this: ArcBorrow<'_, Self>,
+        mut context: HrTimerCallbackContext<'_, Self>,
+    ) -> HrTimerRestart {
+        if this.bandwidth_bytes.load(ordering::Relaxed) == 0 {
+            return HrTimerRestart::NoRestart;
+        }
+
+        this.disk.as_ref().map(|disk| {
+            disk.try_access()
+                .map(|disk| disk.queue().start_stopped_hw_queues_async())
+        });
+
+        this.bandwidth_bytes.store(0, ordering::Relaxed);
+
+        context.forward_now(Self::BANDWIDTH_TIMER_INTERVAL);
+        HrTimerRestart::Restart
+    }
+}
+
 struct HwQueueContext {
     page: Option<KBox<disk_storage::NullBlockPage>>,
 }
@@ -529,7 +591,7 @@ struct HwQueueContext {
 #[pin_data]
 struct Pdu {
     #[pin]
-    timer: kernel::time::hrtimer::HrTimer<Self>,
+    timer: HrTimer<Self>,
     error: Atomic<u32>,
 }
 
@@ -578,14 +640,14 @@ fn align_down<T>(value: T, to: T) -> T
 
 #[vtable]
 impl Operations for NullBlkDevice {
-    type QueueData = Pin<KBox<Self>>;
+    type QueueData = Arc<Self>;
     type RequestData = Pdu;
     type TagSetData = ();
     type HwData = Pin<KBox<SpinLock<HwQueueContext>>>;
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
-            timer <- kernel::time::hrtimer::HrTimer::new(),
+            timer <- HrTimer::new(),
             error: Atomic::new(0),
         })
     }
@@ -593,14 +655,39 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
     #[inline(always)]
     fn queue_rq(
         hw_data: Pin<&SpinLock<HwQueueContext>>,
-        this: Pin<&Self>,
+        this: ArcBorrow<'_, Self>,
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
     ) -> BlkResult {
-        let mut rq = rq.start();
         let mut sectors = rq.sectors();
 
-        Self::handle_bad_blocks(this.get_ref(), &mut rq, &mut sectors)?;
+        if this.bandwidth_limit != 0 {
+            if !this.bandwidth_timer.active() {
+                drop(this.bandwidth_timer_handle.lock().take());
+                let arc: Arc<_> = this.into();
+                *this.bandwidth_timer_handle.lock() =
+                    Some(arc.start(Self::BANDWIDTH_TIMER_INTERVAL));
+            }
+
+            if this
+                .bandwidth_bytes
+                .fetch_add(u64::from(rq.bytes()), ordering::Relaxed)
+                + u64::from(rq.bytes())
+                > this.bandwidth_limit
+            {
+                rq.queue().stop_hw_queues();
+                if this.bandwidth_bytes.load(ordering::Relaxed) <= this.bandwidth_limit {
+                    rq.queue().start_stopped_hw_queues_async();
+                }
+
+                return Err(kernel::block::error::code::BLK_STS_DEV_RESOURCE);
+            }
+        }
+
+        let mut rq = rq.start();
+
+        use core::ops::Deref;
+        Self::handle_bad_blocks(this.deref(), &mut rq, &mut sectors)?;
 
         if this.memory_backed {
             memalloc_scope!(let _noio: NoIo);
@@ -623,7 +710,7 @@ fn queue_rq(
         Ok(())
     }
 
-    fn commit_rqs(_hw_data: Pin<&SpinLock<HwQueueContext>>, _queue_data: Pin<&Self>) {}
+    fn commit_rqs(_hw_data: Pin<&SpinLock<HwQueueContext>>, _queue_data: ArcBorrow<'_, Self>) {}
 
     fn init_hctx(_tagset_data: (), _hctx_idx: u32) -> Result<Self::HwData> {
         KBox::pin_init(new_spinlock!(HwQueueContext { page: None }), GFP_KERNEL)

-- 
2.51.2





* [PATCH v2 45/83] block: rnull: add blocking queue mode
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (43 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 44/83] block: rnull: add bandwidth limiting Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 46/83] block: rnull: add shared tags Andreas Hindborg
                   ` (37 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for blocking queue mode via the `blocking` configfs
attribute. When enabled, the tag set is created with the
`BLK_MQ_F_BLOCKING` flag.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs | 7 ++++++-
 drivers/block/rnull/rnull.rs    | 9 ++++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 59217d75f46b..5e6bcf9d31d8 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -69,7 +69,7 @@ impl AttributeOperations<0> for Config {
         let mut writer = kernel::str::Formatter::new(page);
         writer.write_str(
             "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
-             submit_queues,use_per_node_hctx\n",
+             submit_queues,use_per_node_hctx,discard,blocking\n",
         )?;
         Ok(writer.bytes_written())
     }
@@ -105,6 +105,7 @@ fn make_group(
                 badblocks_partial_io: 14,
                 cache_size_mib: 15,
                 mbps: 16,
+                blocking: 17,
             ],
         };
 
@@ -137,6 +138,7 @@ fn make_group(
                     )?,
                     cache_size_mib: 0,
                     mbps: 0,
+                    blocking: false,
                 }),
             }),
             core::iter::empty(),
@@ -212,6 +214,7 @@ struct DeviceConfigInner {
     cache_size_mib: u64,
     disk_storage: Arc<DiskStorage>,
     mbps: u32,
+    blocking: bool,
 }
 
 #[vtable]
@@ -252,6 +255,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 bad_blocks_partial_io: guard.bad_blocks_partial_io,
                 storage: guard.disk_storage.clone(),
                 bandwidth_limit: u64::from(guard.mbps) * 2u64.pow(20),
+                blocking: guard.blocking,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -422,3 +426,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 );
 
 configfs_simple_field!(DeviceConfig, 16, mbps, u32);
+configfs_simple_bool_field!(DeviceConfig, 17, blocking);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 1dda8d717b95..181fce551a91 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -139,6 +139,10 @@
             default: 0,
             description: "Max bandwidth in MiB/s. 0 means no limit.",
         },
+        blocking: bool {
+            default: false,
+            description: "Register as a blocking blk-mq driver device",
+        },
     },
 }
 
@@ -185,6 +189,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     bad_blocks_partial_io: false,
                     storage: Arc::pin_init(DiskStorage::new(0, block_size as usize), GFP_KERNEL)?,
                     bandwidth_limit: u64::from(module_parameters::mbps.value()) * 2u64.pow(20),
+                    blocking: module_parameters::blocking.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -216,6 +221,7 @@ struct NullBlkOptions<'a> {
     bad_blocks_partial_io: bool,
     storage: Arc<DiskStorage>,
     bandwidth_limit: u64,
+    blocking: bool,
 }
 
 #[pin_data]
@@ -258,11 +264,12 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             bad_blocks_partial_io,
             storage,
             bandwidth_limit,
+            blocking,
         } = options;
 
         let mut flags = mq::tag_set::Flags::default();
 
-        if memory_backed {
+        if blocking || memory_backed {
             flags |= mq::tag_set::Flag::Blocking;
         }
 

-- 
2.51.2





* [PATCH v2 46/83] block: rnull: add shared tags
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (44 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 45/83] block: rnull: add blocking queue mode Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 47/83] block: rnull: add queue depth config option Andreas Hindborg
                   ` (36 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for sharing tags between multiple rnull devices. When
enabled via the `shared_tags` configfs attribute, all devices in the
group share a single tag set, reducing memory usage.

This feature requires creating a shared `TagSet` that can be referenced
by multiple devices.
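
The selection logic can be pictured with a userspace sketch using
`std::sync::Arc`. `TagSet` and `Device` below are hypothetical stand-ins,
and the fallback to a private tag set inside `Device::new` is an assumption
about how the optional shared set is consumed; only the
`bool::then(|| ...clone())` step is taken directly from the diff.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the kernel's `TagSet`.
#[derive(Debug)]
pub struct TagSet {
    pub submit_queues: u32,
}

pub struct Device {
    pub tag_set: Arc<TagSet>,
}

impl Device {
    /// Use the shared set when one is supplied, otherwise build a private
    /// one from the per-device options (assumed behaviour).
    pub fn new(shared: Option<Arc<TagSet>>, submit_queues: u32) -> Self {
        let tag_set = shared.unwrap_or_else(|| Arc::new(TagSet { submit_queues }));
        Self { tag_set }
    }
}

/// Mirrors `shared_tags.then(|| shared_tag_set.clone())` from the diff:
/// the reference count is only bumped when sharing is enabled.
pub fn pick(shared_tags: bool, shared: &Arc<TagSet>) -> Option<Arc<TagSet>> {
    shared_tags.then(|| Arc::clone(shared))
}
```

Devices created with sharing enabled hold clones of the same `Arc`, so the
tag set lives until the last of them (and the original holder) is dropped.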

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs |  44 +++++++++----
 drivers/block/rnull/rnull.rs    | 136 +++++++++++++++++++++++++---------------
 rust/kernel/block/mq/tag_set.rs |  18 ++++++
 3 files changed, 136 insertions(+), 62 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 5e6bcf9d31d8..a84854e7c358 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -10,9 +10,12 @@
     bindings,
     block::{
         badblocks::BadBlocks,
-        mq::gen_disk::{
-            GenDisk,
-            GenDiskBuilder, //
+        mq::{
+            gen_disk::{
+                GenDisk,
+                GenDiskBuilder, //
+            },
+            TagSet, //
         }, //
     },
     configfs::{
@@ -45,7 +48,9 @@
 
 mod macros;
 
-pub(crate) fn subsystem() -> impl PinInit<kernel::configfs::Subsystem<Config>, Error> {
+pub(crate) fn subsystem(
+    shared_tag_set: Arc<TagSet<NullBlkDevice>>,
+) -> impl PinInit<kernel::configfs::Subsystem<Config>, Error> {
     let item_type = configfs_attrs! {
         container: configfs::Subsystem<Config>,
         data: Config,
@@ -55,11 +60,17 @@ pub(crate) fn subsystem() -> impl PinInit<kernel::configfs::Subsystem<Config>, E
         ],
     };
 
-    kernel::configfs::Subsystem::new(c"rnull", item_type, try_pin_init!(Config {}))
+    kernel::configfs::Subsystem::new(
+        c"rnull",
+        item_type,
+        try_pin_init!(Config { shared_tag_set }),
+    )
 }
 
 #[pin_data]
-pub(crate) struct Config {}
+pub(crate) struct Config {
+    shared_tag_set: Arc<TagSet<NullBlkDevice>>,
+}
 
 #[vtable]
 impl AttributeOperations<0> for Config {
@@ -69,7 +80,7 @@ impl AttributeOperations<0> for Config {
         let mut writer = kernel::str::Formatter::new(page);
         writer.write_str(
             "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
-             submit_queues,use_per_node_hctx,discard,blocking\n",
+             submit_queues,use_per_node_hctx,discard,blocking,shared_tags\n",
         )?;
         Ok(writer.bytes_written())
     }
@@ -106,6 +117,7 @@ fn make_group(
                 cache_size_mib: 15,
                 mbps: 16,
                 blocking: 17,
+                shared_tags: 18,
             ],
         };
 
@@ -139,6 +151,8 @@ fn make_group(
                     cache_size_mib: 0,
                     mbps: 0,
                     blocking: false,
+                    shared_tags: false,
+                    shared_tag_set: self.shared_tag_set.clone(),
                 }),
             }),
             core::iter::empty(),
@@ -215,6 +229,8 @@ struct DeviceConfigInner {
     disk_storage: Arc<DiskStorage>,
     mbps: u32,
     blocking: bool,
+    shared_tags: bool,
+    shared_tag_set: Arc<TagSet<NullBlkDevice>>,
 }
 
 #[vtable]
@@ -245,17 +261,20 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 capacity_mib: guard.capacity_mib,
                 irq_mode: guard.irq_mode,
                 completion_time: guard.completion_time,
-                memory_backed: guard.memory_backed,
-                submit_queues: guard.submit_queues,
-                home_node: guard.home_node,
                 discard: guard.discard,
-                no_sched: guard.no_sched,
                 bad_blocks: guard.bad_blocks.clone(),
                 bad_blocks_once: guard.bad_blocks_once,
                 bad_blocks_partial_io: guard.bad_blocks_partial_io,
                 storage: guard.disk_storage.clone(),
                 bandwidth_limit: u64::from(guard.mbps) * 2u64.pow(20),
-                blocking: guard.blocking,
+                shared_tag_set: guard.shared_tags.then(|| guard.shared_tag_set.clone()),
+                tag_set: crate::TagSetOptions {
+                    submit_queues: guard.submit_queues,
+                    home_node: guard.home_node,
+                    blocking: guard.blocking,
+                    memory_backed: guard.memory_backed,
+                    no_sched: guard.no_sched,
+                },
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -427,3 +446,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 
 configfs_simple_field!(DeviceConfig, 16, mbps, u32);
 configfs_simple_bool_field!(DeviceConfig, 17, blocking);
+configfs_simple_bool_field!(DeviceConfig, 18, shared_tags);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 181fce551a91..bcf6a85f1cbc 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -143,6 +143,10 @@
             default: false,
             description: "Register as a blocking blk-mq driver device",
         },
+        shared_tags: bool {
+            default: false,
+            description: "Share tag set between devices for blk-mq",
+        },
     },
 }
 
@@ -158,19 +162,30 @@ impl kernel::InPlaceModule for NullBlkModule {
     fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
         pr_info!("Rust null_blk loaded\n");
 
-        let mut disks = KVec::new();
+        pin_init::pin_init_scope(move || -> Result<_, Error> {
+            let submit_queues = if module_parameters::use_per_node_hctx.value() {
+                kernel::numa::num_online_nodes()
+            } else {
+                module_parameters::submit_queues.value()
+            };
+            let home_node = module_parameters::home_node.value();
+            let blocking = module_parameters::blocking.value();
+            let memory_backed = module_parameters::memory_backed.value();
+            let no_sched = module_parameters::no_sched.value();
+
+            let shared_tag_set = NullBlkDevice::build_tag_set(TagSetOptions {
+                submit_queues,
+                home_node,
+                blocking,
+                memory_backed,
+                no_sched,
+            })?;
 
-        let defer_init = move || -> Result<_, Error> {
+            let mut disks = KVec::new();
             let completion_time: i64 = module_parameters::completion_nsec.value().try_into()?;
             for i in 0..module_parameters::nr_devices.value() {
                 let name = CString::try_from_fmt(fmt!("rnullb{}", i))?;
 
-                let submit_queues = if module_parameters::use_per_node_hctx.value() {
-                    kernel::numa::num_online_nodes()
-                } else {
-                    module_parameters::submit_queues.value()
-                };
-
                 let block_size = module_parameters::bs.value();
                 let disk = NullBlkDevice::new(NullBlkOptions {
                     name: &name,
@@ -179,27 +194,30 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     capacity_mib: module_parameters::gb.value() * 1024,
                     irq_mode: module_parameters::irqmode.value().try_into()?,
                     completion_time: Delta::from_nanos(completion_time),
-                    memory_backed: module_parameters::memory_backed.value(),
-                    submit_queues,
-                    home_node: module_parameters::home_node.value(),
                     discard: module_parameters::discard.value(),
-                    no_sched: module_parameters::no_sched.value(),
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
                     storage: Arc::pin_init(DiskStorage::new(0, block_size as usize), GFP_KERNEL)?,
                     bandwidth_limit: u64::from(module_parameters::mbps.value()) * 2u64.pow(20),
-                    blocking: module_parameters::blocking.value(),
+                    shared_tag_set: module_parameters::shared_tags
+                        .value()
+                        .then(|| shared_tag_set.clone()),
+                    tag_set: TagSetOptions {
+                        submit_queues,
+                        home_node,
+                        blocking,
+                        memory_backed,
+                        no_sched,
+                    },
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
 
-            Ok(disks)
-        };
-
-        try_pin_init!(Self {
-            configfs_subsystem <- configfs::subsystem(),
-            param_disks <- new_mutex!(defer_init()?),
+            Ok(try_pin_init!(Self {
+                configfs_subsystem <- configfs::subsystem(shared_tag_set),
+                param_disks <- new_mutex!(disks),
+            }))
         })
     }
 }
@@ -211,17 +229,14 @@ struct NullBlkOptions<'a> {
     capacity_mib: u64,
     irq_mode: IRQMode,
     completion_time: Delta,
-    memory_backed: bool,
-    submit_queues: u32,
-    home_node: i32,
     discard: bool,
-    no_sched: bool,
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
     storage: Arc<DiskStorage>,
     bandwidth_limit: u64,
-    blocking: bool,
+    shared_tag_set: Option<Arc<TagSet<NullBlkDevice>>>,
+    tag_set: TagSetOptions,
 }
 
 #[pin_data]
@@ -243,9 +258,50 @@ struct NullBlkDevice {
     disk: SetOnce<Arc<Revocable<GenDiskRef<Self>>>>,
 }
 
+struct TagSetOptions {
+    submit_queues: u32,
+    home_node: i32,
+    blocking: bool,
+    memory_backed: bool,
+    no_sched: bool,
+}
+
 impl NullBlkDevice {
     const BANDWIDTH_TIMER_INTERVAL: Delta = Delta::from_millis(20);
 
+    fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
+        let TagSetOptions {
+            submit_queues,
+            home_node,
+            blocking,
+            memory_backed,
+            no_sched,
+        } = options;
+
+        if home_node > kernel::numa::num_online_nodes().try_into()? {
+            return Err(code::EINVAL);
+        }
+
+        let numa_node = if home_node == -1 {
+            kernel::alloc::NumaNode::NO_NODE
+        } else {
+            kernel::alloc::NumaNode::new(home_node)?
+        };
+
+        let mut flags = mq::tag_set::Flags::default();
+        if blocking || memory_backed {
+            flags |= mq::tag_set::Flag::Blocking;
+        }
+        if no_sched {
+            flags |= mq::tag_set::Flag::NoDefaultScheduler;
+        }
+
+        Arc::pin_init(
+            TagSet::new(submit_queues, (), 256, 1, numa_node, flags),
+            GFP_KERNEL,
+        )
+    }
+
     fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
         let NullBlkOptions {
             name,
@@ -254,37 +310,22 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             capacity_mib,
             irq_mode,
             completion_time,
-            memory_backed,
-            submit_queues,
-            home_node,
             discard,
-            no_sched,
             bad_blocks,
             bad_blocks_once,
             bad_blocks_partial_io,
             storage,
             bandwidth_limit,
-            blocking,
+            shared_tag_set,
+            tag_set,
         } = options;
 
-        let mut flags = mq::tag_set::Flags::default();
+        let memory_backed = tag_set.memory_backed;
 
-        if blocking || memory_backed {
-            flags |= mq::tag_set::Flag::Blocking;
-        }
-
-        if no_sched {
-            flags |= mq::tag_set::Flag::NoDefaultScheduler;
-        }
-
-        if home_node > kernel::numa::num_online_nodes().try_into()? {
-            return Err(code::EINVAL);
-        }
-
-        let numa_node = if home_node == -1 {
-            kernel::alloc::NumaNode::NO_NODE
+        let tagset = if let Some(shared) = shared_tag_set {
+            shared
         } else {
-            kernel::alloc::NumaNode::new(home_node)?
+            Self::build_tag_set(tag_set)?
         };
 
         let capacity_sectors = capacity_mib << (20 - block::SECTOR_SHIFT);
@@ -294,11 +335,6 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             return Err(code::EINVAL);
         }
 
-        let tagset = Arc::pin_init(
-            TagSet::new(submit_queues, (), 256, 1, numa_node, flags),
-            GFP_KERNEL,
-        )?;
-
         let queue_data = Arc::try_pin_init(
             try_pin_init!(Self {
                 storage,
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index bfb8f8af4ee1..5359e60fb5a5 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -124,3 +124,21 @@ fn drop(self: Pin<&mut Self>) {
         unsafe { T::TagSetData::from_foreign(tagset_data) };
     }
 }
+
+// SAFETY: It is safe to share references to `TagSet` across thread boundaries as long as
+// `TagSetData` is `Sync`.
+unsafe impl<T> Sync for TagSet<T>
+where
+    T: Operations,
+    T::TagSetData: Sync,
+{
+}
+
+// SAFETY: It is safe to transfer ownership of `TagSet` across thread boundaries if the associated
+// private data is `Send` (it will be dropped with the `TagSet`).
+unsafe impl<T> Send for TagSet<T>
+where
+    T: Operations,
+    T::TagSetData: Send,
+{
+}

-- 
2.51.2





* [PATCH v2 47/83] block: rnull: add queue depth config option
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (45 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 46/83] block: rnull: add shared tags Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 48/83] block: rust: add an abstraction for `bindings::req_op` Andreas Hindborg
                   ` (35 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

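Add a `hw_queue_depth` configfs attribute and module parameter to
configure the queue depth (number of tags per hardware queue) of the
rnull block device. Both default to 64, replacing the hard-coded depth
of 256.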

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
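A note on how this option interacts with `shared_tags` from the previous
patch, as far as the visible hunks show: a device that uses the shared tag
set inherits the depth that set was built with, so its own `hw_queue_depth`
only takes effect when a private tag set is built. Below is a minimal
user-space sketch of that selection. `Rc` stands in for the kernel `Arc`,
and `ModelTagSet` and `select_tag_set` are hypothetical names, not driver
code.

```rust
use std::rc::Rc;

/// Subset of the driver's `TagSetOptions`; only the fields needed here.
#[derive(Clone, Copy, Debug)]
struct TagSetOptions {
    submit_queues: u32,
    hw_queue_depth: u32,
}

/// Hypothetical stand-in for `TagSet<NullBlkDevice>`.
#[derive(Debug, PartialEq)]
struct ModelTagSet {
    nr_hw_queues: u32,
    queue_depth: u32,
}

/// Mirrors the shape of `NullBlkDevice::build_tag_set`.
fn build_tag_set(options: TagSetOptions) -> Rc<ModelTagSet> {
    Rc::new(ModelTagSet {
        nr_hw_queues: options.submit_queues,
        queue_depth: options.hw_queue_depth,
    })
}

/// Mirrors the `if let Some(shared) = shared_tag_set` branch in `new`:
/// a shared tag set wins, otherwise a private one is built from `options`.
fn select_tag_set(shared: Option<Rc<ModelTagSet>>, options: TagSetOptions) -> Rc<ModelTagSet> {
    match shared {
        Some(shared) => shared,
        None => build_tag_set(options),
    }
}
```

In this model a device configured with depth 128 still ends up with depth 64
when it is handed a shared tag set that was built with the default.
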
 drivers/block/rnull/configfs.rs |  5 +++++
 drivers/block/rnull/rnull.rs    | 11 ++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index a84854e7c358..2dfc87dff66a 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -118,6 +118,7 @@ fn make_group(
                 mbps: 16,
                 blocking: 17,
                 shared_tags: 18,
+                hw_queue_depth: 19
             ],
         };
 
@@ -153,6 +154,7 @@ fn make_group(
                     blocking: false,
                     shared_tags: false,
                     shared_tag_set: self.shared_tag_set.clone(),
+                    hw_queue_depth: 64,
                 }),
             }),
             core::iter::empty(),
@@ -231,6 +233,7 @@ struct DeviceConfigInner {
     blocking: bool,
     shared_tags: bool,
     shared_tag_set: Arc<TagSet<NullBlkDevice>>,
+    hw_queue_depth: u32,
 }
 
 #[vtable]
@@ -274,6 +277,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                     blocking: guard.blocking,
                     memory_backed: guard.memory_backed,
                     no_sched: guard.no_sched,
+                    hw_queue_depth: guard.hw_queue_depth,
                 },
             })?);
             guard.powered = true;
@@ -447,3 +451,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_field!(DeviceConfig, 16, mbps, u32);
 configfs_simple_bool_field!(DeviceConfig, 17, blocking);
 configfs_simple_bool_field!(DeviceConfig, 18, shared_tags);
+configfs_simple_field!(DeviceConfig, 19, hw_queue_depth, u32);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index bcf6a85f1cbc..491979daa50e 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -147,6 +147,10 @@
             default: false,
             description: "Share tag set between devices for blk-mq",
         },
+        hw_queue_depth: u32 {
+            default: 64,
+            description: "Queue depth for each hardware queue. Default: 64",
+        },
     },
 }
 
@@ -172,6 +176,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             let blocking = module_parameters::blocking.value();
             let memory_backed = module_parameters::memory_backed.value();
             let no_sched = module_parameters::no_sched.value();
+            let hw_queue_depth = module_parameters::hw_queue_depth.value();
 
             let shared_tag_set = NullBlkDevice::build_tag_set(TagSetOptions {
                 submit_queues,
@@ -179,6 +184,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                 blocking,
                 memory_backed,
                 no_sched,
+                hw_queue_depth,
             })?;
 
             let mut disks = KVec::new();
@@ -209,6 +215,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         blocking,
                         memory_backed,
                         no_sched,
+                        hw_queue_depth,
                     },
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
@@ -264,6 +271,7 @@ struct TagSetOptions {
     blocking: bool,
     memory_backed: bool,
     no_sched: bool,
+    hw_queue_depth: u32,
 }
 
 impl NullBlkDevice {
@@ -276,6 +284,7 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
             blocking,
             memory_backed,
             no_sched,
+            hw_queue_depth,
         } = options;
 
         if home_node > kernel::numa::num_online_nodes().try_into()? {
@@ -297,7 +306,7 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
         }
 
         Arc::pin_init(
-            TagSet::new(submit_queues, (), 256, 1, numa_node, flags),
+            TagSet::new(submit_queues, (), hw_queue_depth, 1, numa_node, flags),
             GFP_KERNEL,
         )
     }

-- 
2.51.2





* [PATCH v2 48/83] block: rust: add an abstraction for `bindings::req_op`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (46 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 47/83] block: rnull: add queue depth config option Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 49/83] block: rust: add a method to set the target sector of a request Andreas Hindborg
                   ` (34 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `Command` enum as a Rust abstraction for block request operation
codes. The enum variants correspond to the C `REQ_OP_*` defines and
include read, write, flush, discard, and zone management operations.

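Also change `command()` to return a `Command` instead of a raw `u32`,
keeping the masked raw value behind a private `command_raw()` helper, and
update rnull to match on the enum variants.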

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
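For readers unfamiliar with the `cmd_flags` layout, here is a small
user-space sketch of the decoding described in the `Command` doc comment:
mask off the flag bits, then interpret the low bits as the operation. This
is not what the patch does (it transmutes the masked value in an unsafe
`from_raw`); the checked `try_from_cmd_flags` is a hypothetical alternative
shown for illustration only. The discriminants are assumed to match
`enum req_op` in include/linux/blk_types.h; the real enum takes them from
`bindings`.

```rust
/// Number of low bits of `cmd_flags` that hold the operation (8, per the doc comment).
const REQ_OP_BITS: u32 = 8;
const REQ_OP_MASK: u32 = (1 << REQ_OP_BITS) - 1;

/// Reduced stand-in for `Command`. Discriminants are assumed, see above.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u32)]
enum Op {
    Read = 0,
    Write = 1,
    Flush = 2,
    Discard = 3,
}

impl Op {
    /// Checked decode: mask off the flag bits, then match on the operation.
    /// Returns `None` for any value this reduced enum does not know.
    fn try_from_cmd_flags(cmd_flags: u32) -> Option<Self> {
        match cmd_flags & REQ_OP_MASK {
            0 => Some(Op::Read),
            1 => Some(Op::Write),
            2 => Some(Op::Flush),
            3 => Some(Op::Discard),
            _ => None,
        }
    }

    /// Reports whether the least significant bit of the operation is set.
    /// For data-transferring operations a set bit means "to the device";
    /// for operations that move no data the bit has no meaning.
    fn lsb_set(self) -> bool {
        (self as u32) & 1 == 1
    }
}
```

The flag bits above bit 7 do not influence the result: a `cmd_flags` value
of `0x801` decodes to `Op::Write` in this model.
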
 drivers/block/rnull/rnull.rs            |  6 +--
 rust/kernel/block/mq.rs                 |  1 +
 rust/kernel/block/mq/request.rs         | 18 +++++----
 rust/kernel/block/mq/request/command.rs | 65 +++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 491979daa50e..5ec17a2674b7 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -547,10 +547,10 @@ fn transfer(
                 let length_sectors_allowed = segment_length_sectors.min(max_remaining_sectors);
                 segment.truncate(length_sectors_allowed << SECTOR_SHIFT);
                 match command {
-                    bindings::req_op_REQ_OP_WRITE => {
+                    mq::Command::Write => {
                         self.write(&mut tree_guard, &mut hw_data_guard, sector, segment)?
                     }
-                    bindings::req_op_REQ_OP_READ => {
+                    mq::Command::Read => {
                         self.read(&mut tree_guard, &mut hw_data_guard, sector, segment)?
                     }
                     _ => (),
@@ -743,7 +743,7 @@ fn queue_rq(
 
         if this.memory_backed {
             memalloc_scope!(let _noio: NoIo);
-            if rq.command() == bindings::req_op_REQ_OP_DISCARD {
+            if rq.command() == mq::Command::Discard {
                 this.discard(&hw_data, rq.sector(), sectors)?;
             } else {
                 this.transfer(&hw_data, &mut rq, sectors)?;
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 503623267b19..5bf2cf2736a5 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -132,6 +132,7 @@
 
 pub use operations::Operations;
 pub use request::{
+    Command,
     IdleRequest,
     Request,
     RequestTimerHandle, //
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index a05df2351c2c..63e248970ab1 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -45,6 +45,9 @@
     BioIterator, //
 };
 
+mod command;
+pub use command::Command;
+
 /// A [`Request`] that a driver has not yet begun to process.
 ///
 /// A driver can convert an `IdleRequest` to a [`Request`] by calling [`IdleRequest::start`].
@@ -111,11 +114,17 @@ fn deref(&self) -> &Self::Target {
 
 impl<T: Operations> RequestInner<T> {
     /// Get the command identifier for the request
-    pub fn command(&self) -> u32 {
+    fn command_raw(&self) -> u32 {
         // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
         unsafe { (*self.0.get()).cmd_flags & ((1 << bindings::REQ_OP_BITS) - 1) }
     }
 
+    /// Get the command of this request.
+    pub fn command(&self) -> Command {
+        // SAFETY: By C API contract, the op field of `cmd_flags` holds a valid `req_op`.
+        unsafe { Command::from_raw(self.command_raw()) }
+    }
+
     /// Get the target sector for the request.
     #[inline(always)]
     pub fn sector(&self) -> u64 {
@@ -242,13 +251,6 @@ pub(crate) unsafe fn aref_from_raw(ptr: *mut bindings::request) -> ARef<Self> {
         unsafe { ARef::from_raw(NonNull::new_unchecked(ptr.cast())) }
     }
 
-    /// Get the command identifier for the request
-    pub fn command(&self) -> u32 {
-        use core::ops::BitAnd;
-        // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
-        unsafe { (*self.0 .0.get()).cmd_flags }.bitand((1u32 << bindings::REQ_OP_BITS) - 1)
-    }
-
     /// Complete the request by scheduling `Operations::complete` for
     /// execution.
     ///
diff --git a/rust/kernel/block/mq/request/command.rs b/rust/kernel/block/mq/request/command.rs
new file mode 100644
index 000000000000..70a8d67fa35c
--- /dev/null
+++ b/rust/kernel/block/mq/request/command.rs
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/// Block I/O operation codes.
+///
+/// This is the Rust abstraction for the C [`enum req_op`].
+///
+/// Operations common to the bio and request structures. The kernel uses 8 bits
+/// for encoding the operation, and the remaining 24 bits for flags.
+///
+/// The least significant bit of the operation number indicates the data
+/// transfer direction:
+///
+/// - If the least significant bit is set, transfers are TO the device.
+/// - If the least significant bit is not set, transfers are FROM the device.
+///
+/// If an operation does not transfer data, the least significant bit has no
+/// meaning.
+///
+/// [`enum req_op`]: srctree/include/linux/blk_types.h
+#[derive(Copy, Clone, Debug, PartialEq, Eq)]
+#[repr(u32)]
+pub enum Command {
+    /// Read sectors from the device.
+    Read = bindings::req_op_REQ_OP_READ,
+    /// Write sectors to the device.
+    Write = bindings::req_op_REQ_OP_WRITE,
+    /// Flush the volatile write cache.
+    Flush = bindings::req_op_REQ_OP_FLUSH,
+    /// Discard sectors.
+    Discard = bindings::req_op_REQ_OP_DISCARD,
+    /// Securely erase sectors.
+    SecureErase = bindings::req_op_REQ_OP_SECURE_ERASE,
+    /// Write data at the current zone write pointer.
+    ZoneAppend = bindings::req_op_REQ_OP_ZONE_APPEND,
+    /// Write zeroes. This allows zeroing to be implemented for devices that don't use either
+    /// discard with a predictable zero pattern or WRITE SAME of zeroes.
+    WriteZeroes = bindings::req_op_REQ_OP_WRITE_ZEROES,
+    /// Open a zone.
+    ZoneOpen = bindings::req_op_REQ_OP_ZONE_OPEN,
+    /// Close a zone.
+    ZoneClose = bindings::req_op_REQ_OP_ZONE_CLOSE,
+    /// Transition a zone to full.
+    ZoneFinish = bindings::req_op_REQ_OP_ZONE_FINISH,
+    /// Reset a zone write pointer.
+    ZoneReset = bindings::req_op_REQ_OP_ZONE_RESET,
+    /// Reset all the zones present on the device.
+    ZoneResetAll = bindings::req_op_REQ_OP_ZONE_RESET_ALL,
+    /// Driver private request for data transfer to the driver.
+    DriverIn = bindings::req_op_REQ_OP_DRV_IN,
+    /// Driver private request for data transfer from the driver.
+    DriverOut = bindings::req_op_REQ_OP_DRV_OUT,
+}
+
+impl Command {
+    /// Creates a [`Command`] from a raw `u32` value.
+    ///
+    /// # Safety
+    ///
+    /// The value must be a valid `req_op` operation code.
+    pub unsafe fn from_raw(value: u32) -> Self {
+        // SAFETY: The caller guarantees that the value is a valid operation
+        // code.
+        unsafe { core::mem::transmute(value) }
+    }
+}

-- 
2.51.2





* [PATCH v2 49/83] block: rust: add a method to set the target sector of a request
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (47 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 48/83] block: rust: add an abstraction for `bindings::req_op` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 50/83] block: rust: move gendisk vtable construction to separate function Andreas Hindborg
                   ` (33 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a `block::mq::Request::set_sector` method to allow setting the target
sector of a request.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 63e248970ab1..66ef2493c448 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -336,6 +336,13 @@ pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper<T> {
     pub fn data_ref(&self) -> &T::RequestData {
         &self.wrapper_ref().data
     }
+
+    /// Set the target sector for the request.
+    #[inline(always)]
+    pub fn set_sector(self: Pin<&mut Self>, sector: u64) {
+        // SAFETY: By type invariant of `Self`, `self.0` is valid and live.
+        unsafe { (*self.0 .0.get()).__sector = sector }
+    }
 }
 
 /// A wrapper around data stored in the private area of the C [`struct request`].

-- 
2.51.2





* [PATCH v2 50/83] block: rust: move gendisk vtable construction to separate function
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (48 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 49/83] block: rust: add a method to set the target sector of a request Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 51/83] block: rust: add zoned block device support Andreas Hindborg
                   ` (32 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

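Refactor the `GenDiskBuilder::build` method to move the `gendisk`
vtable construction into a separate helper function, `build_vtable`.
Make `GenDiskBuilder` generic over the `Operations` implementation `T`
(previously only `build` was generic) so that the vtable can be an
associated constant. This prepares for adding zoned block device
support, which requires conditional vtable setup.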

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
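To illustrate why the vtable moves onto the generic type, here is a
user-space sketch of the pattern: an associated constant can consult
constants of `T`, so an entry can be filled in or left as `None` per
driver at compile time. The `Operations` trait, `DeviceOps`, and `Builder`
below are simplified stand-ins, not the kernel types, and the
`HAS_REPORT_ZONES` constant is written by hand where the kernel would
generate it with `#[vtable]`.

```rust
use std::marker::PhantomData;

/// Stand-in for `Operations` with one optional callback.
trait Operations {
    const HAS_REPORT_ZONES: bool = false;

    fn report_zones() -> u32 {
        0
    }
}

/// Stand-in for `bindings::block_device_operations`.
struct DeviceOps {
    report_zones: Option<fn() -> u32>,
}

struct Builder<T>(PhantomData<T>);

impl<T: Operations> Builder<T> {
    /// Evaluated at compile time, once per `T`.
    const VTABLE: DeviceOps = DeviceOps {
        report_zones: if T::HAS_REPORT_ZONES {
            Some(T::report_zones)
        } else {
            None
        },
    };

    /// `&Self::VTABLE` is promoted to a `'static` reference.
    const fn build_vtable() -> &'static DeviceOps {
        &Self::VTABLE
    }
}

/// A driver that does not implement the optional callback.
struct Plain;
impl Operations for Plain {}

/// A driver that does.
struct Zoned;
impl Operations for Zoned {
    const HAS_REPORT_ZONES: bool = true;

    fn report_zones() -> u32 {
        4
    }
}
```

With a function-local `const TABLE` inside a generic method the same is
possible, but the associated constant keeps the table next to the other
per-`T` items and lets other code in the crate reach it through
`build_vtable`.
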
 drivers/block/rnull/configfs.rs  |  5 ++-
 rust/kernel/block/mq/gen_disk.rs | 67 +++++++++++++++++++++++-----------------
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 2dfc87dff66a..8fa16dbc2a75 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -290,7 +290,10 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     }
 }
 
-configfs_simple_field!(DeviceConfig, 1, block_size, u32, check GenDiskBuilder::validate_block_size);
+configfs_simple_field!(DeviceConfig, 1,
+                       block_size, u32,
+                       check GenDiskBuilder::<NullBlkDevice>::validate_block_size
+);
 configfs_simple_bool_field!(DeviceConfig, 2, rotational);
 configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
 configfs_simple_field!(DeviceConfig, 4, irq_mode, IRQMode);
diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 49ce5ac4774d..79a67b545eca 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -34,20 +34,24 @@
         ScopeGuard, //
     },
 };
-use core::ptr::NonNull;
+use core::{
+    marker::PhantomData,
+    ptr::NonNull, //
+};
 
 /// A builder for [`GenDisk`].
 ///
 /// Use this struct to configure and add new [`GenDisk`] to the VFS.
-pub struct GenDiskBuilder {
+pub struct GenDiskBuilder<T> {
     rotational: bool,
     logical_block_size: u32,
     physical_block_size: u32,
     capacity_sectors: u64,
     max_hw_discard_sectors: u32,
+    _p: PhantomData<T>,
 }
 
-impl Default for GenDiskBuilder {
+impl<T> Default for GenDiskBuilder<T> {
     fn default() -> Self {
         Self {
             rotational: false,
@@ -55,11 +59,12 @@ fn default() -> Self {
             physical_block_size: bindings::PAGE_SIZE as u32,
             capacity_sectors: 0,
             max_hw_discard_sectors: 0,
+            _p: PhantomData,
         }
     }
 }
 
-impl GenDiskBuilder {
+impl<T: Operations> GenDiskBuilder<T> {
     /// Create a new instance.
     pub fn new() -> Self {
         Self::default()
@@ -126,7 +131,7 @@ pub fn max_hw_discard_sectors(mut self, max_hw_discard_sectors: u32) -> Self {
     }
 
     /// Build a new `GenDisk` and add it to the VFS.
-    pub fn build<T: Operations>(
+    pub fn build(
         self,
         name: fmt::Arguments<'_>,
         tagset: Arc<TagSet<T>>,
@@ -157,30 +162,8 @@ pub fn build<T: Operations>(
             )
         })?;
 
-        const TABLE: bindings::block_device_operations = bindings::block_device_operations {
-            submit_bio: None,
-            open: None,
-            release: None,
-            ioctl: None,
-            compat_ioctl: None,
-            check_events: None,
-            unlock_native_capacity: None,
-            getgeo: None,
-            set_read_only: None,
-            swap_slot_free_notify: None,
-            report_zones: None,
-            devnode: None,
-            alternative_gpt_sector: None,
-            get_unique_id: None,
-            // TODO: Set to `THIS_MODULE`.
-            owner: core::ptr::null_mut(),
-            pr_ops: core::ptr::null_mut(),
-            free_disk: None,
-            poll_bio: None,
-        };
-
         // SAFETY: `gendisk` is a valid pointer as we initialized it above
-        unsafe { (*gendisk).fops = &TABLE };
+        unsafe { (*gendisk).fops = Self::build_vtable() };
 
         let mut writer = NullTerminatedFormatter::new(
             // SAFETY: `gendisk` points to a valid and initialized instance. We
@@ -233,6 +216,34 @@ pub fn build<T: Operations>(
 
         Ok(disk.into())
     }
+
+    const VTABLE: bindings::block_device_operations = bindings::block_device_operations {
+        submit_bio: None,
+        open: None,
+        release: None,
+        ioctl: None,
+        compat_ioctl: None,
+        check_events: None,
+        unlock_native_capacity: None,
+        getgeo: None,
+        set_read_only: None,
+        swap_slot_free_notify: None,
+        report_zones: None,
+        devnode: None,
+        alternative_gpt_sector: None,
+        get_unique_id: None,
+        // TODO: Set to THIS_MODULE. Waiting for const_refs_to_static feature to
+        // be merged (unstable in rustc 1.78 which is staged for linux 6.10)
+        // <https://github.com/rust-lang/rust/issues/119618>
+        owner: core::ptr::null_mut(),
+        pr_ops: core::ptr::null_mut(),
+        free_disk: None,
+        poll_bio: None,
+    };
+
+    pub(crate) const fn build_vtable() -> &'static bindings::block_device_operations {
+        &Self::VTABLE
+    }
 }
 
 /// A generic block device.

-- 
2.51.2





* [PATCH v2 51/83] block: rust: add zoned block device support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (49 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 50/83] block: rust: move gendisk vtable construction to separate function Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 52/83] block: rust: add `TagSet::flags` Andreas Hindborg
                   ` (31 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for zoned block devices to the Rust block layer bindings.
This includes the `report_zones` callback in `Operations` and methods
in `GenDiskBuilder` to configure zoned device parameters.

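Drivers can mark a disk as zoned and configure the zone size and
maximum zone append size. The `report_zones` callback is invoked by
the block layer to query zone information; building a zoned disk fails
with `EINVAL` if the driver does not implement it. The builder methods
are only available when `CONFIG_BLK_DEV_ZONED` is enabled.

While at it, OR `BLK_FEAT_ROTATIONAL` into `lim.features` instead of
assigning it, so that feature flags can be combined.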

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
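A user-space sketch of the limit setup that `build` performs for a zoned
disk, to show the flag accumulation and the `report_zones` check in
isolation. The bit values of `FEAT_ROTATIONAL` and `FEAT_ZONED`, the
`Einval` type, and the `limits` method are hypothetical; the real code
uses `bindings::BLK_FEAT_*`, `error::code::EINVAL`, and fills a
`queue_limits` inside `build`.

```rust
/// Hypothetical bit values; the real flags come from `bindings::BLK_FEAT_*`.
const FEAT_ROTATIONAL: u32 = 1 << 0;
const FEAT_ZONED: u32 = 1 << 1;

/// The subset of `queue_limits` touched by the zoned setup.
#[derive(Debug, Default, PartialEq)]
struct Limits {
    features: u32,
    chunk_sectors: u32,
    max_hw_zone_append_sectors: u32,
}

/// Stand-in for `EINVAL`.
#[derive(Debug, PartialEq)]
struct Einval;

#[derive(Default)]
struct Builder {
    rotational: bool,
    zoned: bool,
    zone_size_sectors: u32,
    zone_append_max_sectors: u32,
}

impl Builder {
    fn rotational(mut self, enable: bool) -> Self {
        self.rotational = enable;
        self
    }

    fn zoned(mut self, enable: bool) -> Self {
        self.zoned = enable;
        self
    }

    fn zone_size(mut self, sectors: u32) -> Self {
        self.zone_size_sectors = sectors;
        self
    }

    fn zone_append_max(mut self, sectors: u32) -> Self {
        self.zone_append_max_sectors = sectors;
        self
    }

    /// Compute the limits. `has_report_zones` models `T::HAS_REPORT_ZONES`.
    fn limits(&self, has_report_zones: bool) -> Result<Limits, Einval> {
        let mut lim = Limits::default();
        if self.rotational {
            // `|=` rather than `=`, so later flags do not clobber this one.
            lim.features |= FEAT_ROTATIONAL;
        }
        if self.zoned {
            if !has_report_zones {
                return Err(Einval);
            }
            lim.features |= FEAT_ZONED;
            lim.chunk_sectors = self.zone_size_sectors;
            lim.max_hw_zone_append_sectors = self.zone_append_max_sectors;
        }
        Ok(lim)
    }
}
```

In this model a rotational, zoned disk ends up with both feature bits set,
and requesting a zoned disk without a `report_zones` implementation is
rejected before any limits are returned.
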
 rust/bindings/bindings_helper.h    |  1 +
 rust/kernel/block/mq/gen_disk.rs   | 95 +++++++++++++++++++++++++++++++++-----
 rust/kernel/block/mq/operations.rs | 61 +++++++++++++++++++++++-
 3 files changed, 145 insertions(+), 12 deletions(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index eaf05d60dda9..2a69c17bf271 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -139,6 +139,7 @@ const blk_status_t RUST_CONST_HELPER_BLK_STS_ZONE_ACTIVE_RESOURCE = BLK_STS_ZONE
 const blk_status_t RUST_CONST_HELPER_BLK_STS_OFFLINE = BLK_STS_OFFLINE;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_DURATION_LIMIT = BLK_STS_DURATION_LIMIT;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_INVAL = BLK_STS_INVAL;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ZONED = BLK_FEAT_ZONED;
 const fop_flags_t RUST_CONST_HELPER_FOP_UNSIGNED_OFFSET = FOP_UNSIGNED_OFFSET;
 
 const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 79a67b545eca..eedba691e167 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -8,6 +8,7 @@
 use crate::{
     bindings,
     block::mq::{
+        operations::OperationsVTable,
         Operations,
         RequestQueue,
         TagSet, //
@@ -48,6 +49,12 @@ pub struct GenDiskBuilder<T> {
     physical_block_size: u32,
     capacity_sectors: u64,
     max_hw_discard_sectors: u32,
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    zoned: bool,
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    zone_size_sectors: u32,
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    zone_append_max_sectors: u32,
     _p: PhantomData<T>,
 }
 
@@ -59,6 +66,12 @@ fn default() -> Self {
             physical_block_size: bindings::PAGE_SIZE as u32,
             capacity_sectors: 0,
             max_hw_discard_sectors: 0,
+            #[cfg(CONFIG_BLK_DEV_ZONED)]
+            zoned: false,
+            #[cfg(CONFIG_BLK_DEV_ZONED)]
+            zone_size_sectors: 0,
+            #[cfg(CONFIG_BLK_DEV_ZONED)]
+            zone_append_max_sectors: 0,
             _p: PhantomData,
         }
     }
@@ -130,6 +143,27 @@ pub fn max_hw_discard_sectors(mut self, max_hw_discard_sectors: u32) -> Self {
         self
     }
 
+    /// Mark this device as a zoned block device.
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    pub fn zoned(mut self, enable: bool) -> Self {
+        self.zoned = enable;
+        self
+    }
+
+    /// Set the zone size of this block device.
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    pub fn zone_size(mut self, sectors: u32) -> Self {
+        self.zone_size_sectors = sectors;
+        self
+    }
+
+    /// Set the max zone append size for this block device.
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    pub fn zone_append_max(mut self, sectors: u32) -> Self {
+        self.zone_append_max_sectors = sectors;
+        self
+    }
+
     /// Build a new `GenDisk` and add it to the VFS.
     pub fn build(
         self,
@@ -149,7 +183,18 @@ pub fn build(
         lim.physical_block_size = self.physical_block_size;
         lim.max_hw_discard_sectors = self.max_hw_discard_sectors;
         if self.rotational {
-            lim.features = bindings::BLK_FEAT_ROTATIONAL;
+            lim.features |= bindings::BLK_FEAT_ROTATIONAL;
+        }
+
+        #[cfg(CONFIG_BLK_DEV_ZONED)]
+        if self.zoned {
+            if !T::HAS_REPORT_ZONES {
+                return Err(error::code::EINVAL);
+            }
+
+            lim.features |= bindings::BLK_FEAT_ZONED;
+            lim.chunk_sectors = self.zone_size_sectors;
+            lim.max_hw_zone_append_sectors = self.zone_append_max_sectors;
         }
 
         // SAFETY: `tagset.raw_tag_set()` points to a valid and initialized tag set
@@ -179,14 +224,6 @@ pub fn build(
         // operation, so we will not race.
         unsafe { bindings::set_capacity(gendisk, self.capacity_sectors) };
 
-        crate::error::to_result(
-            // SAFETY: `gendisk` points to a valid and initialized instance of
-            // `struct gendisk`.
-            unsafe {
-                bindings::device_add_disk(core::ptr::null_mut(), gendisk, core::ptr::null_mut())
-            },
-        )?;
-
         recover_data.dismiss();
 
         // INVARIANT: `gendisk` was initialized above.
@@ -214,7 +251,27 @@ pub fn build(
             GFP_KERNEL,
         )?;
 
-        Ok(disk.into())
+        let disk: Arc<_> = disk.into();
+
+        // SAFETY: `disk.gendisk` is valid for write as we initialized it above. We have exclusive
+        // access.
+        unsafe { (*disk.gendisk).private_data = Arc::as_ptr(&disk).cast_mut().cast() };
+
+        #[cfg(CONFIG_BLK_DEV_ZONED)]
+        if self.zoned {
+            // SAFETY: `disk.gendisk` is valid as we initialized it above. We have exclusive access.
+            unsafe { bindings::blk_revalidate_disk_zones(gendisk) };
+        }
+
+        crate::error::to_result(
+            // SAFETY: `gendisk` points to a valid and initialized instance of
+            // `struct gendisk`.
+            unsafe {
+                bindings::device_add_disk(core::ptr::null_mut(), gendisk, core::ptr::null_mut())
+            },
+        )?;
+
+        Ok(disk)
     }
 
     const VTABLE: bindings::block_device_operations = bindings::block_device_operations {
@@ -228,7 +285,11 @@ pub fn build(
         getgeo: None,
         set_read_only: None,
         swap_slot_free_notify: None,
-        report_zones: None,
+        report_zones: if T::HAS_REPORT_ZONES {
+            Some(OperationsVTable::<T>::report_zones_callback)
+        } else {
+            None
+        },
         devnode: None,
         alternative_gpt_sector: None,
         get_unique_id: None,
@@ -327,6 +388,18 @@ fn drop(&mut self) {
 /// `self.0` is valid for use as a reference.
 pub struct GenDiskRef<T: Operations>(NonNull<GenDisk<T>>);
 
+impl<T: Operations> GenDiskRef<T> {
+    /// Create a `GenDiskRef` from a pointer to a `GenDisk`.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must be valid for use as a `GenDisk` reference for the lifetime of the returned
+    /// `GenDiskRef`.
+    pub(crate) unsafe fn from_ptr(ptr: NonNull<GenDisk<T>>) -> GenDiskRef<T> {
+        Self(ptr)
+    }
+}
+
 // SAFETY: It is safe to transfer ownership of `GenDiskRef` across thread boundaries.
 unsafe impl<T: Operations> Send for GenDiskRef<T> {}
 
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index b9a2bf6592b3..71d4192d627f 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -9,6 +9,7 @@
     block::{
         error::BlkResult,
         mq::{
+            gen_disk::GenDiskRef,
             request::RequestDataWrapper,
             IdleRequest,
             Request, //
@@ -16,6 +17,7 @@
     },
     error::{
         from_result,
+        to_result,
         Result, //
     },
     prelude::*,
@@ -29,7 +31,10 @@
         Owned, //
     },
 };
-use core::marker::PhantomData;
+use core::{
+    marker::PhantomData,
+    ptr::NonNull, //
+};
 use pin_init::PinInit;
 
 type ForeignBorrowed<'a, T> = <T as ForeignOwnable>::Borrowed<'a>;
@@ -107,6 +112,20 @@ fn init_hctx(
     fn poll(_hw_data: ForeignBorrowed<'_, Self::HwData>) -> bool {
         build_error!(crate::error::VTABLE_DEFAULT_ERROR)
     }
+
+    /// Called by the kernel to get a zone report from the driver.
+    ///
+    /// The driver must call `callback` once for each zone on `disk` and populate the first argument
+    /// with a zone descriptor and the second argument with the zone index.
+    // TODO: We cannot gate this on CONFIG_BLK_DEV_ZONED due to limitations of the `vtable` macro.
+    fn report_zones(
+        _disk: &GenDiskRef<Self>,
+        _sector: u64,
+        _nr_zones: u32,
+        _callback: impl Fn(&bindings::blk_zone, u32) -> Result,
+    ) -> Result<u32> {
+        Err(ENOTSUPP)
+    }
 }
 
 /// A vtable for blk-mq to interact with a block device driver.
@@ -359,6 +378,46 @@ impl<T: Operations> OperationsVTable<T> {
         unsafe { core::ptr::drop_in_place(pdu) };
     }
 
+    /// This function is a callback hook for the C kernel. A pointer to this function is
+    /// installed in the `block_device_operations` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// - This function may only be called by the block layer C infrastructure.
+    /// - `disk_ptr` must be a pointer to a gendisk initialized by `GenDisk::build`.
+    pub(crate) unsafe extern "C" fn report_zones_callback(
+        disk_ptr: *mut bindings::gendisk,
+        sector: u64,
+        nr_zones: u32,
+        args: *mut bindings::blk_report_zones_args,
+    ) -> i32 {
+        // SAFETY: As `disk_ptr` is a gendisk initialized by `GenDisk::build`, `private_data` is not
+        // null.
+        let disk_ref_ptr = unsafe { NonNull::new_unchecked((*disk_ptr).private_data.cast()) };
+
+        // SAFETY: `disk_ptr.private_data` is a pointer to the `GenDisk` owner of `disk_ptr` that we
+        // installed when we initialized `disk_ptr`. It is valid for use as a reference for the
+        // duration of this call.
+        let disk = unsafe { GenDiskRef::from_ptr(disk_ref_ptr) };
+
+        from_result(|| {
+            T::report_zones(&disk, sector, nr_zones, |zone, idx| -> Result {
+                to_result(
+                    // SAFETY: `disk_ptr` is valid by function safety requirements.
+                    unsafe {
+                        bindings::disk_report_zone(
+                            disk_ptr,
+                            core::ptr::from_ref(zone).cast_mut(),
+                            idx,
+                            args,
+                        )
+                    },
+                )
+            })
+            .and_then(|v: u32| -> Result<_> { Ok(v.try_into()?) })
+        })
+    }
+
     const VTABLE: bindings::blk_mq_ops = bindings::blk_mq_ops {
         queue_rq: Some(Self::queue_rq_callback),
         queue_rqs: None,

-- 
2.51.2
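
For readers following the `report_zones` contract described in the doc
comment above, here is a minimal userspace sketch of the callback pattern.
It is not kernel code: `Zone` and `Disk` are hypothetical stand-ins for
`bindings::blk_zone` and the driver state, errors are plain `i32` values,
and the index passed to the callback is assumed to be relative to the
start of the report.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for `bindings::blk_zone`; only two fields are modelled.
#[derive(Debug, Clone, PartialEq)]
struct Zone {
    start: u64, // first sector of the zone
    len: u64,   // zone length in sectors
}

/// Hypothetical driver state: equally sized zones laid out back to back.
struct Disk {
    zone_size_sectors: u64,
    nr_zones: u32,
}

impl Disk {
    /// Report up to `nr_zones` zones, starting at the zone containing `sector`.
    ///
    /// Calls `callback` once per zone and returns the number of zones reported.
    /// An error from `callback` stops the report and is passed to the caller.
    fn report_zones(
        &self,
        sector: u64,
        nr_zones: u32,
        callback: impl Fn(&Zone, u32) -> Result<(), i32>,
    ) -> Result<u32, i32> {
        let first = sector / self.zone_size_sectors;
        let available = u64::from(self.nr_zones).saturating_sub(first);
        // `count` is bounded by `nr_zones`, so it fits in a `u32`.
        let count = available.min(u64::from(nr_zones)) as u32;

        for idx in 0..count {
            let zone = Zone {
                start: (first + u64::from(idx)) * self.zone_size_sectors,
                len: self.zone_size_sectors,
            };
            callback(&zone, idx)?;
        }

        Ok(count)
    }
}

fn main() {
    let disk = Disk { zone_size_sectors: 8, nr_zones: 4 };
    let seen = RefCell::new(Vec::new());

    // Ask for more zones than remain after sector 8; the report is truncated.
    let reported = disk
        .report_zones(8, 10, |zone, idx| {
            seen.borrow_mut().push((idx, zone.start, zone.len));
            Ok(())
        })
        .unwrap();

    println!("reported {} zones: {:?}", reported, seen.borrow());
}
```

The callback is taken as `impl Fn`, matching the trait method in the patch,
so the sketch collects results through a `RefCell` rather than a mutable
capture.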




* [PATCH v2 52/83] block: rust: add `TagSet::flags`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (50 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 51/83] block: rust: add zoned block device support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 53/83] block: rnull: add zoned storage support Andreas Hindborg
                   ` (30 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a way for block device drivers to query the flags that a `TagSet`
was configured with. This is needed so drivers can inspect properties
such as whether the tag set uses blocking queues.
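
As a rough illustration of the intended use, the sketch below models a
flags value in plain userspace Rust and checks it for the blocking bit.
The `Flag` and `Flags` types here are hypothetical stand-ins, and the bit
values are made up for the example; the real values come from the C
`BLK_MQ_F_*` constants.

```rust
use std::convert::TryFrom;

/// Hypothetical flag bits, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
enum Flag {
    Blocking = 1 << 0,
    NoSched = 1 << 1,
}

/// Hypothetical set of `Flag` values backed by a raw `u32`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Flags(u32);

impl Flags {
    /// Mask of every bit this sketch knows about.
    const ALL: u32 = (Flag::Blocking as u32) | (Flag::NoSched as u32);

    /// Returns `true` if `flag` is set.
    fn contains(self, flag: Flag) -> bool {
        (self.0 & (flag as u32)) != 0
    }
}

impl TryFrom<u32> for Flags {
    type Error = u32;

    /// Rejects raw values that carry bits outside `Flags::ALL`.
    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        if (raw & !Self::ALL) == 0 {
            Ok(Flags(raw))
        } else {
            Err(raw)
        }
    }
}

fn main() {
    let raw = Flag::Blocking as u32;
    let flags = Flags::try_from(raw).expect("raw value holds only known flag bits");

    println!("blocking: {}", flags.contains(Flag::Blocking));
    println!("no_sched: {}", flags.contains(Flag::NoSched));
}
```

A driver that sleeps while handling a request could use such a check to
refuse a tag set that was not configured with blocking queues.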

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/tag_set.rs | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index 5359e60fb5a5..157c47f64334 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -107,6 +107,15 @@ pub fn new(
     pub(crate) fn raw_tag_set(&self) -> *mut bindings::blk_mq_tag_set {
         self.inner.get()
     }
+
+    /// Return the [`Flags`] that this tag set was configured with.
+    pub fn flags(&self) -> Flags {
+        let this = self.raw_tag_set();
+        // SAFETY: By type invariant, `this` points to a valid and initialized
+        // `blk_mq_tag_set`.
+        let flags_raw = unsafe { (*this).flags };
+        Flags::try_from(flags_raw).expect("Expected valid flags from C struct")
+    }
 }
 
 #[pinned_drop]

-- 
2.51.2




* [PATCH v2 53/83] block: rnull: add zoned storage support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (51 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 52/83] block: rust: add `TagSet::flags` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 54/83] block: rust: add `map_queues` support Andreas Hindborg
                   ` (29 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add zoned block device emulation to rnull. When enabled via the `zoned`
configfs attribute, the driver emulates a zoned storage device with
configurable zone size and zone count.

The implementation supports zone management operations including zone
reset, zone open, zone close, and zone finish. Zone write pointer
tracking is maintained for sequential write required zones.
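
To make the write pointer rule concrete, here is a simplified userspace
sketch of a single sequential write required zone. It is not the code in
zoned.rs: the type names are hypothetical, implicit and explicit open are
not distinguished, and the open/active zone limits are not modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ZoneState {
    Empty,
    Open,
    Closed,
    Full,
}

#[derive(Debug, PartialEq, Eq)]
enum ZoneError {
    /// The write does not start at the current write pointer.
    NotAtWritePointer,
    /// The write would run past the zone capacity, or the zone is full.
    NoSpace,
    /// The operation is not valid in the current zone state.
    InvalidState,
}

/// Hypothetical model of one sequential write required zone (units: sectors).
struct Zone {
    start: u64,
    capacity: u64,
    wp: u64,
    state: ZoneState,
}

impl Zone {
    fn new(start: u64, capacity: u64) -> Self {
        Zone { start, capacity, wp: start, state: ZoneState::Empty }
    }

    /// Writes must land exactly on the write pointer and fit in the zone.
    fn write(&mut self, sector: u64, nr_sectors: u64) -> Result<(), ZoneError> {
        let end = self.start + self.capacity;
        if self.state == ZoneState::Full {
            return Err(ZoneError::NoSpace);
        }
        if sector != self.wp {
            return Err(ZoneError::NotAtWritePointer);
        }
        if self.wp + nr_sectors > end {
            return Err(ZoneError::NoSpace);
        }
        self.wp += nr_sectors;
        self.state = if self.wp == end { ZoneState::Full } else { ZoneState::Open };
        Ok(())
    }

    fn open(&mut self) -> Result<(), ZoneError> {
        if self.state == ZoneState::Full {
            return Err(ZoneError::InvalidState);
        }
        self.state = ZoneState::Open;
        Ok(())
    }

    /// Closing an open zone with no data written returns it to `Empty`.
    fn close(&mut self) {
        if self.state == ZoneState::Open {
            self.state = if self.wp == self.start {
                ZoneState::Empty
            } else {
                ZoneState::Closed
            };
        }
    }

    /// Finish moves the write pointer to the end of the zone.
    fn finish(&mut self) {
        self.wp = self.start + self.capacity;
        self.state = ZoneState::Full;
    }

    /// Reset rewinds the write pointer and discards the zone contents.
    fn reset(&mut self) {
        self.wp = self.start;
        self.state = ZoneState::Empty;
    }
}

fn main() {
    let mut zone = Zone::new(0, 8);
    zone.write(0, 4).unwrap();
    // A second write at sector 0 is rejected: the write pointer is now at 4.
    println!("rewrite at 0: {:?}", zone.write(0, 4));
    zone.write(4, 4).unwrap();
    println!("state after filling: {:?}", zone.state);
    zone.reset();
    println!("state after reset: {:?}, wp = {}", zone.state, zone.wp);
}
```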

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs          |  67 +++-
 drivers/block/rnull/disk_storage.rs      |  34 +-
 drivers/block/rnull/disk_storage/page.rs |   4 +-
 drivers/block/rnull/rnull.rs             | 243 +++++++----
 drivers/block/rnull/util.rs              |  65 +++
 drivers/block/rnull/zoned.rs             | 663 +++++++++++++++++++++++++++++++
 6 files changed, 973 insertions(+), 103 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 8fa16dbc2a75..f866595a263c 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -80,7 +80,8 @@ impl AttributeOperations<0> for Config {
         let mut writer = kernel::str::Formatter::new(page);
         writer.write_str(
             "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
-             submit_queues,use_per_node_hctx,discard,blocking,shared_tags\n",
+             submit_queues,use_per_node_hctx,discard,blocking,shared_tags,\
+             zoned,zone_size,zone_capacity\n",
         )?;
         Ok(writer.bytes_written())
     }
@@ -118,7 +119,14 @@ fn make_group(
                 mbps: 16,
                 blocking: 17,
                 shared_tags: 18,
-                hw_queue_depth: 19
+                hw_queue_depth: 19,
+                zoned: 20,
+                zone_size: 21,
+                zone_capacity: 22,
+                zone_nr_conv: 23,
+                zone_max_open: 24,
+                zone_max_active: 25,
+                zone_append_max_sectors: 26,
             ],
         };
 
@@ -145,16 +153,20 @@ fn make_group(
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
-                    disk_storage: Arc::pin_init(
-                        DiskStorage::new(0, block_size as usize),
-                        GFP_KERNEL
-                    )?,
+                    disk_storage: Arc::pin_init(DiskStorage::new(0, block_size), GFP_KERNEL)?,
                     cache_size_mib: 0,
                     mbps: 0,
                     blocking: false,
                     shared_tags: false,
                     shared_tag_set: self.shared_tag_set.clone(),
                     hw_queue_depth: 64,
+                    zoned: false,
+                    zone_size_mib: 256,
+                    zone_capacity_mib: 0,
+                    zone_nr_conv: 0,
+                    zone_max_open: 0,
+                    zone_max_active: 0,
+                    zone_append_max_sectors: u32::MAX,
                 }),
             }),
             core::iter::empty(),
@@ -234,6 +246,13 @@ struct DeviceConfigInner {
     shared_tags: bool,
     shared_tag_set: Arc<TagSet<NullBlkDevice>>,
     hw_queue_depth: u32,
+    zoned: bool,
+    zone_size_mib: u32,
+    zone_capacity_mib: u32,
+    zone_nr_conv: u32,
+    zone_max_open: u32,
+    zone_max_active: u32,
+    zone_append_max_sectors: u32,
 }
 
 #[vtable]
@@ -257,11 +276,24 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         let mut guard = this.data.lock();
 
         if !guard.powered && power_op {
+            // We protect zone state with a mutex, so we require blocking queues for zone emulation.
+            if guard.shared_tags && guard.zoned {
+                if !guard
+                    .shared_tag_set
+                    .flags()
+                    .contains(kernel::block::mq::tag_set::Flag::Blocking)
+                {
+                    return Err(EINVAL);
+                }
+            } else if guard.zoned && !guard.blocking {
+                return Err(EINVAL);
+            }
+
             guard.disk = Some(NullBlkDevice::new(crate::NullBlkOptions {
                 name: &guard.name,
-                block_size: guard.block_size,
+                block_size_bytes: guard.block_size,
                 rotational: guard.rotational,
-                capacity_mib: guard.capacity_mib,
+                device_capacity_mib: guard.capacity_mib,
                 irq_mode: guard.irq_mode,
                 completion_time: guard.completion_time,
                 discard: guard.discard,
@@ -279,6 +311,13 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                     no_sched: guard.no_sched,
                     hw_queue_depth: guard.hw_queue_depth,
                 },
+                zoned: guard.zoned,
+                zone_size_mib: guard.zone_size_mib,
+                zone_capacity_mib: guard.zone_capacity_mib,
+                zone_nr_conv: guard.zone_nr_conv,
+                zone_max_open: guard.zone_max_open,
+                zone_max_active: guard.zone_max_active,
+                zone_append_max_sectors: guard.zone_append_max_sectors,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -442,10 +481,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     store: |this, page| store_with_power_check(this, page, |data, page| {
         let text = core::str::from_utf8(page)?.trim();
         let value = text.parse::<u64>().map_err(|_| EINVAL)?;
-        data.disk_storage = Arc::pin_init(
-            DiskStorage::new(value, data.block_size as usize),
-            GFP_KERNEL
-        )?;
+        data.disk_storage = Arc::pin_init(DiskStorage::new(value, data.block_size), GFP_KERNEL)?;
         data.cache_size_mib = value;
         Ok(())
     })
@@ -455,3 +491,10 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_bool_field!(DeviceConfig, 17, blocking);
 configfs_simple_bool_field!(DeviceConfig, 18, shared_tags);
 configfs_simple_field!(DeviceConfig, 19, hw_queue_depth, u32);
+configfs_simple_bool_field!(DeviceConfig, 20, zoned);
+configfs_simple_field!(DeviceConfig, 21, zone_size_mib, u32);
+configfs_simple_field!(DeviceConfig, 22, zone_capacity_mib, u32);
+configfs_simple_field!(DeviceConfig, 23, zone_nr_conv, u32);
+configfs_simple_field!(DeviceConfig, 24, zone_max_open, u32);
+configfs_simple_field!(DeviceConfig, 25, zone_max_active, u32);
+configfs_simple_field!(DeviceConfig, 26, zone_append_max_sectors, u32);
diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
index b8fef411fffe..82de1f656f68 100644
--- a/drivers/block/rnull/disk_storage.rs
+++ b/drivers/block/rnull/disk_storage.rs
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 use super::HwQueueContext;
+use crate::util::*;
 use core::pin::Pin;
 use kernel::{
     block,
@@ -9,8 +10,12 @@
     page::PAGE_SIZE,
     prelude::*,
     sync::{
-        atomic::{ordering, Atomic},
-        SpinLock, SpinLockGuard,
+        atomic::{
+            ordering,
+            Atomic, //
+        },
+        SpinLock,
+        SpinLockGuard, //
     },
     uapi::PAGE_SECTORS,
     xarray::{
@@ -31,11 +36,11 @@ pub(crate) struct DiskStorage {
     cache_size: u64,
     cache_size_used: Atomic<u64>,
     next_flush_sector: Atomic<u64>,
-    block_size: usize,
+    block_size: u32,
 }
 
 impl DiskStorage {
-    pub(crate) fn new(cache_size: u64, block_size: usize) -> impl PinInit<Self, Error> {
+    pub(crate) fn new(cache_size: u64, block_size: u32) -> impl PinInit<Self, Error> {
         try_pin_init!( Self {
             // TODO: Get rid of the box
             // https://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git/commit/?h=locking&id=a5d84cafb3e253a11d2e078902c5b090be2f4227
@@ -59,6 +64,27 @@ pub(crate) fn access<'a, 'b, 'c>(
     pub(crate) fn lock(&self) -> SpinLockGuard<'_, Pin<KBox<TreeContainer>>> {
         self.trees.lock()
     }
+
+    pub(crate) fn discard(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        mut sector: u64,
+        sectors: u32,
+    ) {
+        let mut tree_guard = self.lock();
+        let mut hw_data_guard = hw_data.lock();
+
+        let mut access = self.access(&mut tree_guard, &mut hw_data_guard, None);
+
+        let mut remaining_bytes = sectors_to_bytes(sectors);
+
+        while remaining_bytes > 0 {
+            access.free_sector(sector);
+            let processed = remaining_bytes.min(self.block_size);
+            sector += Into::<u64>::into(bytes_to_sectors(processed));
+            remaining_bytes -= processed;
+        }
+    }
 }
 
 pub(crate) struct DiskStorageAccess<'a, 'b, 'c> {
diff --git a/drivers/block/rnull/disk_storage/page.rs b/drivers/block/rnull/disk_storage/page.rs
index bc78973ad5d4..88dc9a2476bd 100644
--- a/drivers/block/rnull/disk_storage/page.rs
+++ b/drivers/block/rnull/disk_storage/page.rs
@@ -20,11 +20,11 @@
 pub(crate) struct NullBlockPage {
     page: Owned<SafePage>,
     status: u64,
-    block_size: usize,
+    block_size: u32,
 }
 
 impl NullBlockPage {
-    pub(crate) fn new(block_size: usize) -> Result<KBox<Self>> {
+    pub(crate) fn new(block_size: u32) -> Result<KBox<Self>> {
         memalloc_scope!(let _noio: NoIo);
         Ok(KBox::new(
             Self {
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 5ec17a2674b7..6fb307e33263 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -2,8 +2,13 @@
 
 //! This is a Rust implementation of the C null block driver.
 
+#![recursion_limit = "256"]
+
 mod configfs;
 mod disk_storage;
+mod util;
+#[cfg(CONFIG_BLK_DEV_ZONED)]
+mod zoned;
 
 use configfs::IRQMode;
 use disk_storage::{
@@ -77,6 +82,7 @@
     },
     xarray::XArraySheaf, //
 };
+use util::*;
 
 module! {
     type: NullBlkModule,
@@ -151,6 +157,35 @@
             default: 64,
             description:  "Queue depth for each hardware queue. Default: 64",
         },
+        zoned: bool {
+            default: false,
+            description: "Make the device a host-managed zoned block device.",
+        },
+        zone_size: u32 {
+            default: 256,
+            description:
+            "Zone size in MB when block device is zoned. Must be a power of two. Default: 256",
+        },
+        zone_capacity: u32 {
+            default: 0,
+            description: "Zone capacity in MB when block device is zoned. Can be less than or equal to zone size. Default: Zone size",
+        },
+        zone_nr_conv: u32 {
+            default: 0,
+            description: "Number of conventional zones when block device is zoned. Default: 0",
+        },
+        zone_max_open: u32 {
+            default: 0,
+            description: "Maximum number of open zones when block device is zoned. Default: 0 (no limit)",
+        },
+        zone_max_active: u32 {
+            default: 0,
+            description: "Maximum number of active zones when block device is zoned. Default: 0 (no limit)",
+        },
+        zone_append_max_sectors: u32 {
+            default: 0,
+            description: "Maximum size of a zone append command (in 512B sectors). Specify 0 for no zone append.",
+        },
     },
 }
 
@@ -195,16 +230,16 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                 let block_size = module_parameters::bs.value();
                 let disk = NullBlkDevice::new(NullBlkOptions {
                     name: &name,
-                    block_size,
+                    block_size_bytes: block_size,
                     rotational: module_parameters::rotational.value(),
-                    capacity_mib: module_parameters::gb.value() * 1024,
+                    device_capacity_mib: module_parameters::gb.value() * 1024,
                     irq_mode: module_parameters::irqmode.value().try_into()?,
                     completion_time: Delta::from_nanos(completion_time),
                     discard: module_parameters::discard.value(),
                     bad_blocks: Arc::pin_init(BadBlocks::new(false), GFP_KERNEL)?,
                     bad_blocks_once: false,
                     bad_blocks_partial_io: false,
-                    storage: Arc::pin_init(DiskStorage::new(0, block_size as usize), GFP_KERNEL)?,
+                    storage: Arc::pin_init(DiskStorage::new(0, block_size), GFP_KERNEL)?,
                     bandwidth_limit: u64::from(module_parameters::mbps.value()) * 2u64.pow(20),
                     shared_tag_set: module_parameters::shared_tags
                         .value()
@@ -217,6 +252,13 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         no_sched,
                         hw_queue_depth,
                     },
+                    zoned: module_parameters::zoned.value(),
+                    zone_size_mib: module_parameters::zone_size.value(),
+                    zone_capacity_mib: module_parameters::zone_capacity.value(),
+                    zone_nr_conv: module_parameters::zone_nr_conv.value(),
+                    zone_max_open: module_parameters::zone_max_open.value(),
+                    zone_max_active: module_parameters::zone_max_active.value(),
+                    zone_append_max_sectors: module_parameters::zone_append_max_sectors.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -231,9 +273,9 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
 
 struct NullBlkOptions<'a> {
     name: &'a CStr,
-    block_size: u32,
+    block_size_bytes: u32,
     rotational: bool,
-    capacity_mib: u64,
+    device_capacity_mib: u64,
     irq_mode: IRQMode,
     completion_time: Delta,
     discard: bool,
@@ -244,6 +286,19 @@ struct NullBlkOptions<'a> {
     bandwidth_limit: u64,
     shared_tag_set: Option<Arc<TagSet<NullBlkDevice>>>,
     tag_set: TagSetOptions,
+    zoned: bool,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_size_mib: u32,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_capacity_mib: u32,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_nr_conv: u32,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_max_open: u32,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_max_active: u32,
+    #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
+    zone_append_max_sectors: u32,
 }
 
 #[pin_data]
@@ -252,7 +307,7 @@ struct NullBlkDevice {
     irq_mode: IRQMode,
     completion_time: Delta,
     memory_backed: bool,
-    block_size: usize,
+    block_size_bytes: u32,
     bad_blocks: Arc<BadBlocks>,
     bad_blocks_once: bool,
     bad_blocks_partial_io: bool,
@@ -263,6 +318,9 @@ struct NullBlkDevice {
     #[pin]
     bandwidth_timer_handle: SpinLock<Option<ArcHrTimerHandle<Self>>>,
     disk: SetOnce<Arc<Revocable<GenDiskRef<Self>>>>,
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    #[pin]
+    zoned: zoned::ZoneOptions,
 }
 
 struct TagSetOptions {
@@ -314,9 +372,9 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
     fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
         let NullBlkOptions {
             name,
-            block_size,
+            block_size_bytes,
             rotational,
-            capacity_mib,
+            device_capacity_mib,
             irq_mode,
             completion_time,
             discard,
@@ -327,6 +385,19 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             bandwidth_limit,
             shared_tag_set,
             tag_set,
+            zoned,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_size_mib,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_capacity_mib,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_nr_conv,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_max_open,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_max_active,
+            #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
+            zone_append_max_sectors,
         } = options;
 
         let memory_backed = tag_set.memory_backed;
@@ -337,10 +408,10 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             Self::build_tag_set(tag_set)?
         };
 
-        let capacity_sectors = capacity_mib << (20 - block::SECTOR_SHIFT);
+        let device_capacity_sectors = mib_to_sectors(device_capacity_mib);
 
         // Prevent overflow in usize/u64 casts
-        if usize::BITS == 32 && capacity_sectors > u32::MAX.into() {
+        if usize::BITS == 32 && device_capacity_sectors > u32::MAX.into() {
             return Err(code::EINVAL);
         }
 
@@ -350,7 +421,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
                 irq_mode,
                 completion_time,
                 memory_backed,
-                block_size: block_size as usize,
+                block_size_bytes,
                 bad_blocks,
                 bad_blocks_once,
                 bad_blocks_partial_io,
@@ -359,17 +430,42 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
                 bandwidth_bytes: Atomic::new(0),
                 bandwidth_timer_handle <- new_spinlock!(None),
                 disk: SetOnce::new(),
+                #[cfg(CONFIG_BLK_DEV_ZONED)]
+                zoned <- zoned::ZoneOptions::new(zoned::ZoneOptionsArgs {
+                    enable: zoned,
+                    device_capacity_mib,
+                    block_size_bytes: *block_size_bytes,
+                    zone_size_mib,
+                    zone_capacity_mib,
+                    zone_nr_conv,
+                    zone_max_open,
+                    zone_max_active,
+                    zone_append_max_sectors,
+                })?,
             }),
             GFP_KERNEL,
         )?;
 
         let mut builder = gen_disk::GenDiskBuilder::new()
-            .capacity_sectors(capacity_sectors)
-            .logical_block_size(block_size)?
-            .physical_block_size(block_size)?
+            .capacity_sectors(device_capacity_sectors)
+            .logical_block_size(block_size_bytes)?
+            .physical_block_size(block_size_bytes)?
             .rotational(rotational);
 
-        if memory_backed && discard {
+        #[cfg(CONFIG_BLK_DEV_ZONED)]
+        {
+            builder = builder
+                .zoned(zoned)
+                .zone_size(queue_data.zoned.size_sectors)
+                .zone_append_max(zone_append_max_sectors);
+        }
+
+        if !cfg!(CONFIG_BLK_DEV_ZONED) && zoned {
+            return Err(ENOTSUPP);
+        }
+
+        // TODO: Warn on invalid discard configuration (zoned, memory)
+        if memory_backed && discard && !zoned {
             builder = builder
                 // Max IO size is u32::MAX bytes
                 .max_hw_discard_sectors(ffi::c_uint::MAX >> block::SECTOR_SHIFT);
@@ -393,7 +489,7 @@ fn sheaf_size() -> usize {
     fn preload<'b, 'c>(
         tree_guard: &'b mut SpinLockGuard<'c, Pin<KBox<TreeContainer>>>,
         hw_data_guard: &'b mut SpinLockGuard<'c, HwQueueContext>,
-        block_size: usize,
+        block_size_bytes: u32,
         sheaf: &'b mut Option<XArraySheaf<'c>>,
     ) -> Result {
         match sheaf {
@@ -418,10 +514,9 @@ fn preload<'b, 'c>(
 
         // Another thread may get the lock after we allocate. If this happens, retry.
         while hw_data_guard.page.is_none() {
-            hw_data_guard.page =
-                Some(tree_guard.do_unlocked(|| {
-                    hw_data_guard.do_unlocked(|| NullBlockPage::new(block_size))
-                })?);
+            hw_data_guard.page = Some(tree_guard.do_unlocked(|| {
+                hw_data_guard.do_unlocked(|| NullBlockPage::new(block_size_bytes))
+            })?);
         }
 
         Ok(())
@@ -438,7 +533,7 @@ fn write<'a, 'b, 'c>(
         let mut sheaf: Option<XArraySheaf<'_>> = None;
 
         while !segment.is_empty() {
-            Self::preload(tree_guard, hw_data_guard, self.block_size, &mut sheaf)?;
+            Self::preload(tree_guard, hw_data_guard, self.block_size_bytes, &mut sheaf)?;
 
             let mut access = self.storage.access(tree_guard, hw_data_guard, sheaf);
 
@@ -491,48 +586,23 @@ fn read<'a, 'b, 'c>(
                         >> block::SECTOR_SHIFT;
                 }
                 // CAST: Casting from `usize` to `u64` never overflows.
-                None => sector += segment.zero_page() as u64 >> block::SECTOR_SHIFT,
+                None => sector += bytes_to_sectors(segment.zero_page() as u64),
             }
         }
 
         Ok(())
     }
 
-    fn discard(
-        &self,
-        hw_data: &Pin<&SpinLock<HwQueueContext>>,
-        mut sector: u64,
-        sectors: u32,
-    ) -> Result {
-        let mut tree_guard = self.storage.lock();
-        let mut hw_data_guard = hw_data.lock();
-
-        let mut access = self
-            .storage
-            .access(&mut tree_guard, &mut hw_data_guard, None);
-
-        let mut remaining_bytes = (sectors as usize) << SECTOR_SHIFT;
-
-        while remaining_bytes > 0 {
-            access.free_sector(sector);
-            let processed = remaining_bytes.min(self.block_size);
-            sector += (processed >> SECTOR_SHIFT) as u64;
-            remaining_bytes -= processed;
-        }
-
-        Ok(())
-    }
-
     #[inline(never)]
     fn transfer(
         &self,
         hw_data: &Pin<&SpinLock<HwQueueContext>>,
         rq: &mut Owned<mq::Request<Self>>,
+        command: mq::Command,
         max_sectors: u32,
     ) -> Result {
         let mut sector = rq.sector();
         let max_end_sector = sector + <u32 as Into<u64>>::into(max_sectors);
-        let command = rq.command();
 
         // TODO: Use `PerCpu` to get rid of this lock
         let mut hw_data_guard = hw_data.lock();
@@ -566,6 +636,27 @@ fn transfer(
         Ok(())
     }
 
+    fn handle_regular_command(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        rq: &mut Owned<mq::Request<Self>>,
+    ) -> Result {
+        let mut sectors = rq.sectors();
+
+        self.handle_bad_blocks(rq, &mut sectors)?;
+
+        if self.memory_backed {
+            memalloc_scope!(let _noio: NoIo);
+            if rq.command() == mq::Command::Discard {
+                self.storage.discard(hw_data, rq.sector(), sectors);
+            } else {
+                self.transfer(hw_data, rq, rq.command(), sectors)?;
+            }
+        }
+
+        Ok(())
+    }
+
     fn handle_bad_blocks(&self, rq: &mut Owned<mq::Request<Self>>, sectors: &mut u32) -> Result {
         if self.bad_blocks.enabled() {
             let start = rq.sector();
@@ -581,7 +672,7 @@ fn handle_bad_blocks(&self, rq: &mut Owned<mq::Request<Self>>, sectors: &mut u32
                     }
 
                     if self.bad_blocks_partial_io {
-                        let block_size_sectors = (self.block_size >> SECTOR_SHIFT) as u64;
+                        let block_size_sectors = u64::from(bytes_to_sectors(self.block_size_bytes));
                         range.start = align_down(range.start, block_size_sectors);
                         if start < range.start {
                             *sectors = (range.start - start) as u32;
@@ -666,30 +757,6 @@ impl HasHrTimer<Self> for Pdu {
     }
 }
 
-fn is_power_of_two<T>(value: T) -> bool
-where
-    T: core::ops::Sub<T, Output = T>,
-    T: core::ops::BitAnd<Output = T>,
-    T: core::cmp::PartialOrd<T>,
-    T: Copy,
-    T: From<u8>,
-{
-    (value > 0u8.into()) && (value & (value - 1u8.into())) == 0u8.into()
-}
-
-fn align_down<T>(value: T, to: T) -> T
-where
-    T: core::ops::Sub<T, Output = T>,
-    T: core::ops::Not<Output = T>,
-    T: core::ops::BitAnd<Output = T>,
-    T: core::cmp::PartialOrd<T>,
-    T: Copy,
-    T: From<u8>,
-{
-    debug_assert!(is_power_of_two(to));
-    value & !(to - 1u8.into())
-}
-
 #[vtable]
 impl Operations for NullBlkDevice {
     type QueueData = Arc<Self>;
@@ -711,8 +778,6 @@ fn queue_rq(
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
     ) -> BlkResult {
-        let mut sectors = rq.sectors();
-
         if this.bandwidth_limit != 0 {
             if !this.bandwidth_timer.active() {
                 drop(this.bandwidth_timer_handle.lock().take());
@@ -738,18 +803,16 @@ fn queue_rq(
 
         let mut rq = rq.start();
 
-        use core::ops::Deref;
-        Self::handle_bad_blocks(this.deref(), &mut rq, &mut sectors)?;
-
-        if this.memory_backed {
-            memalloc_scope!(let _noio: NoIo);
-            if rq.command() == mq::Command::Discard {
-                this.discard(&hw_data, rq.sector(), sectors)?;
-            } else {
-                this.transfer(&hw_data, &mut rq, sectors)?;
-            }
+        #[cfg(CONFIG_BLK_DEV_ZONED)]
+        if this.zoned.enabled {
+            this.handle_zoned_command(&hw_data, &mut rq)?;
+        } else {
+            this.handle_regular_command(&hw_data, &mut rq)?;
         }
 
+        #[cfg(not(CONFIG_BLK_DEV_ZONED))]
+        this.handle_regular_command(&hw_data, &mut rq)?;
+
         match this.irq_mode {
             IRQMode::None => Self::end_request(rq),
             IRQMode::Soft => mq::Request::complete(rq.into()),
@@ -775,4 +838,14 @@ fn complete(rq: ARef<mq::Request<Self>>) {
                 .expect("Failed to complete request"),
         )
     }
+
+    #[cfg(CONFIG_BLK_DEV_ZONED)]
+    fn report_zones(
+        disk: &GenDiskRef<Self>,
+        sector: u64,
+        nr_zones: u32,
+        callback: impl Fn(&bindings::blk_zone, u32) -> Result,
+    ) -> Result<u32> {
+        Self::report_zones_internal(disk, sector, nr_zones, callback)
+    }
 }
diff --git a/drivers/block/rnull/util.rs b/drivers/block/rnull/util.rs
new file mode 100644
index 000000000000..044926c8e284
--- /dev/null
+++ b/drivers/block/rnull/util.rs
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Return true if `value` is a power of two.
+pub(crate) fn is_power_of_two<T>(value: T) -> bool
+where
+    T: core::ops::Sub<T, Output = T>,
+    T: core::ops::BitAnd<Output = T>,
+    T: core::cmp::PartialOrd<T>,
+    T: Copy,
+    T: From<u8>,
+{
+    (value > 0u8.into()) && (value & (value - 1u8.into())) == 0u8.into()
+}
+
+// Round `value` down to the nearest multiple of `to`, which must be a power
+// of two.
+pub(crate) fn align_down<T>(value: T, to: T) -> T
+where
+    T: core::ops::Sub<T, Output = T>,
+    T: core::ops::Not<Output = T>,
+    T: core::ops::BitAnd<Output = T>,
+    T: core::cmp::PartialOrd<T>,
+    T: Copy,
+    T: From<u8>,
+{
+    debug_assert!(is_power_of_two(to));
+    value & !(to - 1u8.into())
+}
+
+// Round `value` up to the next multiple of `to`, which must be a power of two.
+#[cfg(CONFIG_BLK_DEV_ZONED)]
+pub(crate) fn align_up<T>(value: T, to: T) -> T
+where
+    T: core::ops::Sub<T, Output = T>,
+    T: core::ops::Add<T, Output = T>,
+    T: core::ops::BitAnd<Output = T>,
+    T: core::ops::BitOr<Output = T>,
+    T: core::cmp::PartialOrd<T>,
+    T: Copy,
+    T: From<u8>,
+{
+    debug_assert!(is_power_of_two(to));
+    ((value - 1u8.into()) | (to - 1u8.into())) + 1u8.into()
+}
+
+pub(crate) fn mib_to_sectors<T>(mib: T) -> T
+where
+    T: core::ops::Shl<u32, Output = T>,
+{
+    mib << (20 - kernel::block::SECTOR_SHIFT)
+}
+
+pub(crate) fn sectors_to_bytes<T>(sectors: T) -> T
+where
+    T: core::ops::Shl<u32, Output = T>,
+{
+    sectors << kernel::block::SECTOR_SHIFT
+}
+
+pub(crate) fn bytes_to_sectors<T>(bytes: T) -> T
+where
+    T: core::ops::Shr<u32, Output = T>,
+{
+    bytes >> kernel::block::SECTOR_SHIFT
+}
diff --git a/drivers/block/rnull/zoned.rs b/drivers/block/rnull/zoned.rs
new file mode 100644
index 000000000000..808449cc49e1
--- /dev/null
+++ b/drivers/block/rnull/zoned.rs
@@ -0,0 +1,663 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::{
+    util::*,
+    HwQueueContext, //
+};
+use kernel::{
+    bindings,
+    block::mq::{
+        self,
+        gen_disk::GenDiskRef, //
+    },
+    memalloc_scope,
+    new_mutex,
+    new_spinlock,
+    prelude::*,
+    sync::Mutex,
+    sync::SpinLock,
+    types::Owned, //
+};
+
+pub(crate) struct ZoneOptionsArgs {
+    pub(crate) enable: bool,
+    pub(crate) device_capacity_mib: u64,
+    pub(crate) block_size_bytes: u32,
+    pub(crate) zone_size_mib: u32,
+    pub(crate) zone_capacity_mib: u32,
+    pub(crate) zone_nr_conv: u32,
+    pub(crate) zone_max_open: u32,
+    pub(crate) zone_max_active: u32,
+    pub(crate) zone_append_max_sectors: u32,
+}
+
+#[pin_data]
+pub(crate) struct ZoneOptions {
+    pub(crate) enabled: bool,
+    zones: Pin<KBox<[Mutex<ZoneDescriptor>]>>,
+    conventional_count: u32,
+    pub(crate) size_sectors: u32,
+    append_max_sectors: u32,
+    max_open: u32,
+    max_active: u32,
+    #[pin]
+    accounting: SpinLock<ZoneAccounting>,
+}
+
+impl ZoneOptions {
+    pub(crate) fn new(args: ZoneOptionsArgs) -> Result<impl PinInit<Self, Error>> {
+        let ZoneOptionsArgs {
+            enable,
+            device_capacity_mib,
+            block_size_bytes,
+            zone_size_mib,
+            zone_capacity_mib,
+            mut zone_nr_conv,
+            mut zone_max_open,
+            mut zone_max_active,
+            zone_append_max_sectors,
+        } = args;
+
+        if !is_power_of_two(zone_size_mib) {
+            return Err(EINVAL);
+        }
+
+        if zone_capacity_mib > zone_size_mib {
+            return Err(EINVAL);
+        }
+
+        let zone_size_sectors = mib_to_sectors(zone_size_mib);
+        let device_capacity_sectors = mib_to_sectors(device_capacity_mib);
+        let zone_capacity_sectors = mib_to_sectors(zone_capacity_mib);
+        let zone_count: u32 = (align_up(device_capacity_sectors, zone_size_sectors.into())
+            >> zone_size_sectors.ilog2())
+        .try_into()?;
+
+        if zone_nr_conv >= zone_count {
+            zone_nr_conv = zone_count - 1;
+            pr_info!("changed the number of conventional zones to {zone_nr_conv}\n");
+        }
+
+        let zone_append_max_sectors =
+            align_down(zone_append_max_sectors, bytes_to_sectors(block_size_bytes))
+                .min(zone_capacity_sectors);
+
+        let seq_zone_count = zone_count - zone_nr_conv;
+
+        if zone_max_active >= seq_zone_count {
+            zone_max_active = 0;
+            pr_info!("zone_max_active limit disabled, limit >= zone count\n");
+        }
+
+        if zone_max_active != 0 && zone_max_open > zone_max_active {
+            zone_max_open = zone_max_active;
+            pr_info!("changed the maximum number of open zones to {zone_max_open}\n");
+        } else if zone_max_open >= seq_zone_count {
+            zone_max_open = 0;
+            pr_info!("zone_max_open limit disabled, limit >= zone count\n");
+        }
+
+        Ok(try_pin_init!(Self {
+            enabled: enable,
+            zones: init_zone_descriptors(
+                zone_size_sectors,
+                zone_capacity_sectors,
+                zone_count,
+                zone_nr_conv,
+            )?,
+            size_sectors: zone_size_sectors,
+            append_max_sectors: zone_append_max_sectors,
+            max_open: zone_max_open,
+            max_active: zone_max_active,
+            accounting <- new_spinlock!(ZoneAccounting {
+                implicit_open: 0,
+                explicit_open: 0,
+                closed: 0,
+                start_zone: zone_nr_conv,
+            }),
+            conventional_count: zone_nr_conv,
+        }))
+    }
+}
+
+struct ZoneAccounting {
+    implicit_open: u32,
+    explicit_open: u32,
+    closed: u32,
+    start_zone: u32,
+}
+
+pub(crate) fn init_zone_descriptors(
+    zone_size_sectors: u32,
+    zone_capacity_sectors: u32,
+    zone_count: u32,
+    zone_nr_conv: u32,
+) -> Result<Pin<KBox<[Mutex<ZoneDescriptor>]>>> {
+    let zone_capacity_sectors = if zone_capacity_sectors == 0 {
+        zone_size_sectors
+    } else {
+        zone_capacity_sectors
+    };
+
+    KBox::pin_slice(
+        |i| {
+            let sector = i as u64 * Into::<u64>::into(zone_size_sectors);
+            new_mutex!(
+                if i < zone_nr_conv.try_into().expect("Fewer than 2^32 zones") {
+                    ZoneDescriptor {
+                        start_sector: sector,
+                        size_sectors: zone_size_sectors,
+                        capacity_sectors: zone_size_sectors,
+                        kind: ZoneType::Conventional,
+                        write_pointer: sector + Into::<u64>::into(zone_size_sectors),
+                        condition: ZoneCondition::NoWritePointer,
+                    }
+                } else {
+                    ZoneDescriptor {
+                        start_sector: sector,
+                        size_sectors: zone_size_sectors,
+                        capacity_sectors: zone_capacity_sectors,
+                        kind: ZoneType::SequentialWriteRequired,
+                        write_pointer: sector,
+                        condition: ZoneCondition::Empty,
+                    }
+                }
+            )
+        },
+        zone_count as usize,
+        GFP_KERNEL,
+    )
+}
+
+impl super::NullBlkDevice {
+    pub(crate) fn handle_zoned_command(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        rq: &mut Owned<mq::Request<Self>>,
+    ) -> Result {
+        use mq::Command::*;
+        match rq.command() {
+            ZoneAppend | Write => self.zoned_write(hw_data, rq)?,
+            ZoneReset | ZoneResetAll | ZoneOpen | ZoneClose | ZoneFinish => {
+                self.zone_management(hw_data, rq)?
+            }
+            _ => self.zoned_read(hw_data, rq)?,
+        }
+
+        Ok(())
+    }
+
+    fn zone_management(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        rq: &mut Owned<mq::Request<Self>>,
+    ) -> Result {
+        if rq.command() == mq::Command::ZoneResetAll {
+            for zone in self.zoned.zones_iter() {
+                let mut zone = zone.lock();
+                use ZoneCondition::*;
+                match zone.condition {
+                    Empty | ReadOnly | Offline => continue,
+                    _ => self.zoned.reset_zone(&self.storage, hw_data, &mut zone)?,
+                }
+            }
+
+            return Ok(());
+        }
+
+        let zone = self.zoned.zone(rq.sector())?;
+        let mut zone = zone.lock();
+
+        if zone.condition == ZoneCondition::ReadOnly || zone.condition == ZoneCondition::Offline {
+            return Err(EIO);
+        }
+
+        use mq::Command::*;
+        match rq.command() {
+            ZoneOpen => self.zoned.open_zone(&mut zone, rq.sector()),
+            ZoneClose => self.zoned.close_zone(&mut zone),
+            ZoneReset => self.zoned.reset_zone(&self.storage, hw_data, &mut zone),
+            ZoneFinish => self.zoned.finish_zone(&mut zone, rq.sector()),
+            _ => Err(EIO),
+        }
+    }
+
+    fn zoned_read(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        rq: &mut Owned<mq::Request<Self>>,
+    ) -> Result {
+        let zone = self.zoned.zone(rq.sector())?;
+        let zone = zone.lock();
+        if zone.condition == ZoneCondition::Offline {
+            return Err(EINVAL);
+        }
+
+        zone.check_bounds_read(rq.sector(), rq.sectors())?;
+
+        self.handle_regular_command(hw_data, rq)
+    }
+
+    fn zoned_write(
+        &self,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        rq: &mut Owned<mq::Request<Self>>,
+    ) -> Result {
+        let zone = self.zoned.zone(rq.sector())?;
+        let mut zone = zone.lock();
+        let append: bool = rq.command() == mq::Command::ZoneAppend;
+
+        if zone.kind == ZoneType::Conventional {
+            if append {
+                return Err(EINVAL);
+            }
+
+            // NOTE: C driver does not check bounds on write.
+            zone.check_bounds_write(rq.sector(), rq.sectors())?;
+
+            let mut sectors = rq.sectors();
+            self.handle_bad_blocks(rq, &mut sectors)?;
+            return self.transfer(hw_data, rq, rq.command(), sectors);
+        }
+
+        // Check zoned write fits within zone
+        if zone.write_pointer + Into::<u64>::into(rq.sectors())
+            > zone.start_sector + Into::<u64>::into(zone.capacity_sectors)
+        {
+            return Err(EINVAL);
+        }
+
+        if append {
+            if self.zoned.append_max_sectors == 0 {
+                return Err(EINVAL);
+            }
+            rq.as_pin_mut().set_sector(zone.write_pointer);
+        }
+
+        // Check write pointer alignment
+        if !append && rq.sector() != zone.write_pointer {
+            return Err(EINVAL);
+        }
+
+        if zone.condition == ZoneCondition::Closed || zone.condition == ZoneCondition::Empty {
+            if self.zoned.use_accounting() {
+                let mut accounting = self.zoned.accounting.lock();
+                self.zoned
+                    .check_zone_resources(&mut accounting, &mut zone, rq.sector())?;
+
+                if zone.condition == ZoneCondition::Closed {
+                    accounting.closed -= 1;
+                    accounting.implicit_open += 1;
+                } else if zone.condition == ZoneCondition::Empty {
+                    accounting.implicit_open += 1;
+                }
+            }
+
+            zone.condition = ZoneCondition::ImplicitOpen;
+        }
+
+        let mut sectors = rq.sectors();
+        self.handle_bad_blocks(rq, &mut sectors)?;
+
+        if self.memory_backed {
+            memalloc_scope!(let _noio: NoIo);
+            self.transfer(hw_data, rq, mq::Command::Write, sectors)?;
+        }
+
+        zone.write_pointer += Into::<u64>::into(sectors);
+        if zone.write_pointer == zone.start_sector + Into::<u64>::into(zone.capacity_sectors) {
+            if self.zoned.use_accounting() {
+                let mut accounting = self.zoned.accounting.lock();
+
+                if zone.condition == ZoneCondition::ExplicitOpen {
+                    accounting.explicit_open -= 1;
+                } else if zone.condition == ZoneCondition::ImplicitOpen {
+                    accounting.implicit_open -= 1;
+                }
+            }
+
+            zone.condition = ZoneCondition::Full;
+        }
+
+        Ok(())
+    }
+
+    pub(crate) fn report_zones_internal(
+        disk: &GenDiskRef<Self>,
+        sector: u64,
+        nr_zones: u32,
+        callback: impl Fn(&bindings::blk_zone, u32) -> Result,
+    ) -> Result<u32> {
+        let device = disk.queue_data();
+        let first_zone = sector >> device.zoned.size_sectors.ilog2();
+
+        let mut count = 0;
+
+        for (i, zone) in device
+            .zoned
+            .zones
+            .split_at(first_zone as usize)
+            .1
+            .iter()
+            .take(nr_zones as usize)
+            .enumerate()
+        {
+            let zone = zone.lock();
+            let descriptor = bindings::blk_zone {
+                start: zone.start_sector,
+                len: zone.size_sectors.into(),
+                wp: zone.write_pointer,
+                capacity: zone.capacity_sectors.into(),
+                type_: zone.kind as u8,
+                cond: zone.condition as u8,
+                ..bindings::blk_zone::zeroed()
+            };
+            drop(zone);
+            callback(&descriptor, i as u32)?;
+
+            count += 1;
+        }
+
+        Ok(count)
+    }
+}
+
+impl ZoneOptions {
+    fn zone_no(&self, sector: u64) -> usize {
+        (sector >> self.size_sectors.ilog2()) as usize
+    }
+
+    fn zone(&self, sector: u64) -> Result<&Mutex<ZoneDescriptor>> {
+        self.zones.get(self.zone_no(sector)).ok_or(EINVAL)
+    }
+
+    fn zones_iter(&self) -> impl Iterator<Item = &Mutex<ZoneDescriptor>> {
+        self.zones.iter()
+    }
+
+    fn use_accounting(&self) -> bool {
+        self.max_active != 0 || self.max_open != 0
+    }
+
+    fn try_close_implicit_open_zone(&self, accounting: &mut ZoneAccounting, sector: u64) -> Result {
+        let skip = self.zone_no(sector) as u32;
+
+        let it = Iterator::chain(
+            self.zones[(accounting.start_zone as usize)..]
+                .iter()
+                .enumerate()
+                .map(|(i, z)| (i + accounting.start_zone as usize, z)),
+            self.zones[(self.conventional_count as usize)..(accounting.start_zone as usize)]
+                .iter()
+                .enumerate()
+                .map(|(i, z)| (i + self.conventional_count as usize, z)),
+        )
+        .filter(|(i, _)| *i != skip as usize);
+
+        for (index, zone) in it {
+            let mut zone = zone.lock();
+            if zone.condition == ZoneCondition::ImplicitOpen {
+                accounting.implicit_open -= 1;
+
+                let index_u32: u32 = index.try_into()?;
+                let next_zone: u32 = index_u32 + 1;
+                accounting.start_zone = if next_zone == self.zones.len().try_into()? {
+                    self.conventional_count
+                } else {
+                    next_zone
+                };
+
+                if zone.write_pointer == zone.start_sector {
+                    zone.condition = ZoneCondition::Empty;
+                } else {
+                    zone.condition = ZoneCondition::Closed;
+                    accounting.closed += 1;
+                }
+                return Ok(());
+            }
+        }
+
+        Err(EINVAL)
+    }
+
+    fn open_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
+        if zone.kind == ZoneType::Conventional {
+            return Err(EINVAL);
+        }
+
+        use ZoneCondition::*;
+        match zone.condition {
+            ExplicitOpen => return Ok(()),
+            Empty | ImplicitOpen | Closed => (),
+            _ => return Err(EIO),
+        }
+
+        if self.use_accounting() {
+            let mut accounting = self.accounting.lock();
+            match zone.condition {
+                Empty => {
+                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                }
+                ImplicitOpen => {
+                    accounting.implicit_open -= 1;
+                }
+                Closed => {
+                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    accounting.closed -= 1;
+                }
+                _ => (),
+            }
+
+            accounting.explicit_open += 1;
+        }
+
+        zone.condition = ExplicitOpen;
+        Ok(())
+    }
+
+    fn check_zone_resources(
+        &self,
+        accounting: &mut ZoneAccounting,
+        zone: &mut ZoneDescriptor,
+        sector: u64,
+    ) -> Result {
+        match zone.condition {
+            ZoneCondition::Empty => {
+                self.check_active_zones(accounting)?;
+                self.check_open_zones(accounting, sector)
+            }
+            ZoneCondition::Closed => self.check_open_zones(accounting, sector),
+            _ => Err(EIO),
+        }
+    }
+
+    fn check_open_zones(&self, accounting: &mut ZoneAccounting, sector: u64) -> Result {
+        if self.max_open == 0 {
+            return Ok(());
+        }
+
+        if self.max_open > accounting.explicit_open + accounting.implicit_open {
+            return Ok(());
+        }
+
+        if accounting.implicit_open > 0 {
+            self.check_active_zones(accounting)?;
+            return self.try_close_implicit_open_zone(accounting, sector);
+        }
+
+        Err(EBUSY)
+    }
+
+    fn check_active_zones(&self, accounting: &mut ZoneAccounting) -> Result {
+        if self.max_active == 0 {
+            return Ok(());
+        }
+
+        if self.max_active > accounting.implicit_open + accounting.explicit_open + accounting.closed
+        {
+            return Ok(());
+        }
+
+        Err(EBUSY)
+    }
+
+    fn close_zone(&self, zone: &mut ZoneDescriptor) -> Result {
+        if zone.kind == ZoneType::Conventional {
+            return Err(EINVAL);
+        }
+
+        use ZoneCondition::*;
+        match zone.condition {
+            Closed => return Ok(()),
+            ImplicitOpen | ExplicitOpen => (),
+            _ => return Err(EIO),
+        }
+
+        if self.use_accounting() {
+            let mut accounting = self.accounting.lock();
+            match zone.condition {
+                ImplicitOpen => accounting.implicit_open -= 1,
+                ExplicitOpen => accounting.explicit_open -= 1,
+                _ => (),
+            }
+
+            if zone.write_pointer > zone.start_sector {
+                accounting.closed += 1;
+            }
+        }
+
+        if zone.write_pointer == zone.start_sector {
+            zone.condition = Empty;
+        } else {
+            zone.condition = Closed;
+        }
+
+        Ok(())
+    }
+
+    fn finish_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
+        if zone.kind == ZoneType::Conventional {
+            return Err(EINVAL);
+        }
+
+        if self.use_accounting() {
+            let mut accounting = self.accounting.lock();
+
+            use ZoneCondition::*;
+            match zone.condition {
+                Full => return Ok(()),
+                Empty => {
+                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                }
+                ImplicitOpen => accounting.implicit_open -= 1,
+                ExplicitOpen => accounting.explicit_open -= 1,
+                Closed => {
+                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    accounting.closed -= 1;
+                }
+                _ => return Err(EIO),
+            }
+        }
+
+        zone.condition = ZoneCondition::Full;
+        zone.write_pointer = zone.start_sector + Into::<u64>::into(zone.size_sectors);
+
+        Ok(())
+    }
+
+    fn reset_zone(
+        &self,
+        storage: &crate::disk_storage::DiskStorage,
+        hw_data: &Pin<&SpinLock<HwQueueContext>>,
+        zone: &mut ZoneDescriptor,
+    ) -> Result {
+        if zone.kind == ZoneType::Conventional {
+            return Err(EINVAL);
+        }
+
+        if self.use_accounting() {
+            let mut accounting = self.accounting.lock();
+
+            use ZoneCondition::*;
+            match zone.condition {
+                ImplicitOpen => accounting.implicit_open -= 1,
+                ExplicitOpen => accounting.explicit_open -= 1,
+                Closed => accounting.closed -= 1,
+                Empty | Full => (),
+                _ => return Err(EIO),
+            }
+        }
+
+        zone.condition = ZoneCondition::Empty;
+        zone.write_pointer = zone.start_sector;
+
+        storage.discard(hw_data, zone.start_sector, zone.size_sectors);
+
+        Ok(())
+    }
+}
+
+pub(crate) struct ZoneDescriptor {
+    start_sector: u64,
+    size_sectors: u32,
+    kind: ZoneType,
+    capacity_sectors: u32,
+    write_pointer: u64,
+    condition: ZoneCondition,
+}
+
+impl ZoneDescriptor {
+    fn check_bounds_write(&self, sector: u64, sectors: u32) -> Result {
+        if sector + Into::<u64>::into(sectors)
+            > self.start_sector + Into::<u64>::into(self.capacity_sectors)
+        {
+            Err(EIO)
+        } else {
+            Ok(())
+        }
+    }
+
+    fn check_bounds_read(&self, sector: u64, sectors: u32) -> Result {
+        if sector + Into::<u64>::into(sectors) > self.write_pointer {
+            Err(EIO)
+        } else {
+            Ok(())
+        }
+    }
+}
+
+#[derive(Copy, Clone, PartialEq, Eq, Debug)]
+#[repr(u32)]
+enum ZoneType {
+    Conventional = bindings::blk_zone_type_BLK_ZONE_TYPE_CONVENTIONAL,
+    SequentialWriteRequired = bindings::blk_zone_type_BLK_ZONE_TYPE_SEQWRITE_REQ,
+    #[expect(dead_code)]
+    SequentialWritePreferred = bindings::blk_zone_type_BLK_ZONE_TYPE_SEQWRITE_PREF,
+}
+
+impl ZoneType {
+    #[expect(dead_code)]
+    fn as_raw(self) -> u32 {
+        self as u32
+    }
+}
+
+#[derive(Copy, Clone, PartialEq, Eq, Debug)]
+#[repr(u32)]
+enum ZoneCondition {
+    NoWritePointer = bindings::blk_zone_cond_BLK_ZONE_COND_NOT_WP,
+    Empty = bindings::blk_zone_cond_BLK_ZONE_COND_EMPTY,
+    ImplicitOpen = bindings::blk_zone_cond_BLK_ZONE_COND_IMP_OPEN,
+    ExplicitOpen = bindings::blk_zone_cond_BLK_ZONE_COND_EXP_OPEN,
+    Closed = bindings::blk_zone_cond_BLK_ZONE_COND_CLOSED,
+    Full = bindings::blk_zone_cond_BLK_ZONE_COND_FULL,
+    ReadOnly = bindings::blk_zone_cond_BLK_ZONE_COND_READONLY,
+    Offline = bindings::blk_zone_cond_BLK_ZONE_COND_OFFLINE,
+}
+
+impl ZoneCondition {
+    #[expect(dead_code)]
+    fn as_raw(self) -> u32 {
+        self as u32
+    }
+}

-- 
2.51.2
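
For readers following the arithmetic in `util.rs` and `ZoneOptions::new`, here is a
minimal userspace sketch of the helpers, not the kernel code itself. It assumes
512-byte sectors (a sector shift of 9) and that converting bytes to sectors is a
right shift. The `zone_count` function is a hypothetical helper that mirrors the
zone count expression in `ZoneOptions::new`; it is specialised to `u64` instead of
being generic.

```rust
// Userspace model only. Assumption: 512-byte sectors, so the shift is 9.
const SECTOR_SHIFT: u32 = 9;

/// Round `value` down to a multiple of `to`, which must be a power of two.
fn align_down(value: u64, to: u64) -> u64 {
    debug_assert!(to.is_power_of_two());
    value & !(to - 1)
}

/// Round `value` up to a multiple of `to`, which must be a power of two.
/// `value` must be non-zero, otherwise `value - 1` underflows.
fn align_up(value: u64, to: u64) -> u64 {
    debug_assert!(to.is_power_of_two());
    ((value - 1) | (to - 1)) + 1
}

/// 1 MiB is 2^20 bytes, so it holds 2^(20 - SECTOR_SHIFT) sectors.
fn mib_to_sectors(mib: u64) -> u64 {
    mib << (20 - SECTOR_SHIFT)
}

/// Bytes to sectors is a division by the sector size, so a right shift.
fn bytes_to_sectors(bytes: u64) -> u64 {
    bytes >> SECTOR_SHIFT
}

/// Hypothetical helper: number of zones needed to cover the device. A
/// trailing partial zone counts as a full zone because of the round-up.
fn zone_count(device_capacity_mib: u64, zone_size_mib: u64) -> u64 {
    let zone_size_sectors = mib_to_sectors(zone_size_mib);
    let capacity_sectors = mib_to_sectors(device_capacity_mib);
    align_up(capacity_sectors, zone_size_sectors) >> zone_size_sectors.ilog2()
}

fn main() {
    // A 250 MiB device with 64 MiB zones needs three full zones plus one
    // partial zone.
    println!("zones: {}", zone_count(250, 64));
    // A 4096-byte block spans eight 512-byte sectors.
    println!("sectors per block: {}", bytes_to_sectors(4096));
    // Align a sector count down to the block size in sectors.
    println!("aligned: {}", align_down(100, bytes_to_sectors(4096)));
}
```

Under these assumptions a zone append limit of 100 sectors with a 4096-byte block
size aligns down to 96 sectors, which is the kind of adjustment applied to
`zone_append_max_sectors`.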





* [PATCH v2 54/83] block: rust: add `map_queues` support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (52 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 53/83] block: rnull: add zoned storage support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 55/83] block: rust: add an abstraction for `struct blk_mq_queue_map` Andreas Hindborg
                   ` (28 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux, Andreas Hindborg

From: Andreas Hindborg <a.hindborg@samsung.com>

Add support for the `map_queues` callback to the Rust block layer
bindings. This callback allows drivers to customize the mapping between
CPUs and hardware queues.

The callback receives a shared reference to the `TagSet`, and drivers
can use the `TagSet::update_maps` method to configure the mappings for
each queue type.
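
As an illustration of what such a mapping looks like, here is a minimal
userspace sketch, not the kernel API. It assumes a simple round-robin policy,
which is only one possible choice and is not claimed to be what blk-mq or this
series uses. `round_robin_map` is a hypothetical helper; the vector it returns
stands in for the per-CPU table of hardware queue indices.

```rust
/// Userspace model: entry `cpu` of the result holds the hardware queue
/// index that `cpu` submits to. Hypothetical helper, not a kernel binding.
fn round_robin_map(nr_cpus: usize, nr_queues: usize) -> Vec<usize> {
    assert!(nr_queues > 0, "need at least one hardware queue");
    (0..nr_cpus).map(|cpu| cpu % nr_queues).collect()
}

fn main() {
    // Eight CPUs sharing three hardware queues.
    let map = round_robin_map(8, 3);
    for (cpu, queue) in map.iter().enumerate() {
        println!("cpu {cpu} -> hw queue {queue}");
    }
}
```

With more queues than CPUs, some queues in this model simply receive no CPU.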

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/operations.rs | 29 +++++++++++++++++++++++++++--
 rust/kernel/block/mq/tag_set.rs    | 13 +++++++++++++
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 71d4192d627f..8a418bf0f3ba 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -12,7 +12,8 @@
             gen_disk::GenDiskRef,
             request::RequestDataWrapper,
             IdleRequest,
-            Request, //
+            Request,
+            TagSet, //
         },
     },
     error::{
@@ -126,6 +127,11 @@ fn report_zones(
     ) -> Result<u32> {
         Err(ENOTSUPP)
     }
+
+    /// Called by the kernel to map submission queues to CPU cores.
+    fn map_queues(_tag_set: &TagSet<Self>) {
+        build_error!(crate::error::VTABLE_DEFAULT_ERROR)
+    }
 }
 
 /// A vtable for blk-mq to interact with a block device driver.
@@ -418,6 +424,21 @@ impl<T: Operations> OperationsVTable<T> {
         })
     }
 
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure. `tag_set`
+    /// must be a pointer to a valid and initialized `TagSet<T>`. The pointee
+    /// must be valid for use as a reference for at least the duration of this call.
+    unsafe extern "C" fn map_queues_callback(tag_set: *mut bindings::blk_mq_tag_set) {
+        // SAFETY: The safety requirements of this function satisfy the
+        // requirements of `TagSet::from_ptr`.
+        let tag_set = unsafe { TagSet::from_ptr(tag_set) };
+        T::map_queues(tag_set);
+    }
+
     const VTABLE: bindings::blk_mq_ops = bindings::blk_mq_ops {
         queue_rq: Some(Self::queue_rq_callback),
         queue_rqs: None,
@@ -439,7 +460,11 @@ impl<T: Operations> OperationsVTable<T> {
         exit_request: Some(Self::exit_request_callback),
         cleanup_rq: None,
         busy: None,
-        map_queues: None,
+        map_queues: if T::HAS_MAP_QUEUES {
+            Some(Self::map_queues_callback)
+        } else {
+            None
+        },
         #[cfg(CONFIG_BLK_DEBUG_FS)]
         show_rq: None,
     };
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index 157c47f64334..d3e93ad98b6e 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -116,6 +116,19 @@ pub fn flags(&self) -> Flags {
         let flags_raw = unsafe { (*this).flags };
         Flags::try_from(flags_raw).expect("Expected valid flags from C struct")
     }
+
+    /// Create a `TagSet<T>` from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must be a pointer to a valid and initialized `TagSet<T>`. There
+    /// may be no other mutable references to the tag set. The pointee must be
+    /// live and valid at least for the duration of the returned lifetime `'a`.
+    pub(crate) unsafe fn from_ptr<'a>(ptr: *mut bindings::blk_mq_tag_set) -> &'a Self {
+        // SAFETY: By the safety requirements of this function, `ptr` is valid
+        // for use as a reference for the duration of `'a`.
+        unsafe { &*(ptr.cast::<Self>()) }
+    }
 }
 
 #[pinned_drop]

-- 
2.51.2





* [PATCH v2 55/83] block: rust: add an abstraction for `struct blk_mq_queue_map`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (53 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 54/83] block: rust: add `map_queues` support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 56/83] block: rust: add polled completion support Andreas Hindborg
                   ` (27 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `QueueMap` and `QueueType` types as Rust abstractions for CPU
to hardware queue mappings. The `QueueMap` type wraps `struct
blk_mq_queue_map` and provides methods to set up the mapping between
CPUs and hardware queues.

`QueueType` represents the different queue types: default, read, and
poll queues.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
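
Note for readers: the commit message above describes a per-type CPU to
hardware queue mapping. The following is a minimal user-space sketch of the
bookkeeping only, not the kernel abstraction itself. `QueueMapModel` and
`layout_maps` are hypothetical names invented for illustration, the
discriminants are assumed to follow the order of the C `enum hctx_type`, and
no call to `blk_mq_map_queues` is modeled.

```rust
use std::convert::TryFrom;

/// User-space stand-in for `QueueType`. The discriminants are assumed to
/// follow the order of the C `enum hctx_type`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum QueueType {
    Default = 0,
    Read = 1,
    Poll = 2,
}

impl TryFrom<u32> for QueueType {
    type Error = &'static str;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(QueueType::Default),
            1 => Ok(QueueType::Read),
            2 => Ok(QueueType::Poll),
            _ => Err("unknown queue type"),
        }
    }
}

/// Hypothetical plain-data model of one map entry: how many hardware queues
/// the type owns and the index of the first one.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq)]
pub struct QueueMapModel {
    pub nr_queues: u32,
    pub queue_offset: u32,
}

/// Lay out `nr_maps` maps back to back, so that each queue type gets a
/// distinct, contiguous range of hardware queues.
pub fn layout_maps(
    nr_maps: u32,
    count_for: impl Fn(QueueType) -> u32,
) -> Result<Vec<(QueueType, QueueMapModel)>, &'static str> {
    let mut offset = 0;
    let mut out = Vec::new();
    for i in 0..nr_maps {
        // Fails for indices that do not name a known queue type.
        let kind = QueueType::try_from(i)?;
        let nr_queues = count_for(kind);
        out.push((kind, QueueMapModel { nr_queues, queue_offset: offset }));
        offset += nr_queues;
    }
    Ok(out)
}
```

A caller would pass the number of maps and a closure returning the queue
count per type. A type with zero queues still gets an entry, but it does not
advance the offset.
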
 rust/kernel/block/mq.rs            |  5 +-
 rust/kernel/block/mq/operations.rs | 10 ++--
 rust/kernel/block/mq/tag_set.rs    | 96 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 5bf2cf2736a5..e9bea19d684b 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -138,4 +138,7 @@
     RequestTimerHandle, //
 };
 pub use request_queue::RequestQueue;
-pub use tag_set::TagSet;
+pub use tag_set::{
+    QueueType,
+    TagSet, //
+};
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 8a418bf0f3ba..06faf5647aaa 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -128,8 +128,8 @@ fn report_zones(
         Err(ENOTSUPP)
     }
 
-    /// Called by the kernel to map submission queues to CPU cores.
-    fn map_queues(_tag_set: &TagSet<Self>) {
+    /// Called by the kernel to map hardware queues to CPU cores.
+    fn map_queues(_tag_set: Pin<&mut TagSet<Self>>) {
         build_error!(crate::error::VTABLE_DEFAULT_ERROR)
     }
 }
@@ -433,9 +433,9 @@ impl<T: Operations> OperationsVTable<T> {
     /// must be a pointer to a valid and initialized `TagSet<T>`. The pointee
     /// must be valid for use as a reference for at least the duration of this call.
     unsafe extern "C" fn map_queues_callback(tag_set: *mut bindings::blk_mq_tag_set) {
-        // SAFETY: The safety requirements of this function satisfy the
-        // requirements of `TagSet::from_ptr`.
-        let tag_set = unsafe { TagSet::from_ptr(tag_set) };
+        // SAFETY: By C API contract `tag_set` is the tag set registered with the `GenDisk` created
+        // by `GenDiskBuilder`.
+        let tag_set = unsafe { TagSet::from_ptr_mut(tag_set) };
         T::map_queues(tag_set);
     }
 
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index d3e93ad98b6e..e62dfd267fd9 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -124,11 +124,46 @@ pub fn flags(&self) -> Flags {
     /// `ptr` must be a pointer to a valid and initialized `TagSet<T>`. There
     /// may be no other mutable references to the tag set. The pointee must be
     /// live and valid at least for the duration of the returned lifetime `'a`.
+    #[expect(dead_code)]
     pub(crate) unsafe fn from_ptr<'a>(ptr: *mut bindings::blk_mq_tag_set) -> &'a Self {
         // SAFETY: By the safety requirements of this function, `ptr` is valid
         // for use as a reference for the duration of `'a`.
         unsafe { &*(ptr.cast::<Self>()) }
     }
+
+    /// Create a pinned mutable `TagSet<T>` reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must be a pointer to a valid and initialized `TagSet<T>`. There
+    /// may be no other references to the tag set while `'a` is live. The pointee
+    /// must be live and valid at least for the duration of the returned lifetime `'a`.
+    pub(crate) unsafe fn from_ptr_mut<'a>(ptr: *mut bindings::blk_mq_tag_set) -> Pin<&'a mut Self> {
+        // SAFETY: By function safety requirements, `ptr` is valid for use as a mutable reference.
+        let mref = unsafe { &mut *(ptr.cast::<Self>()) };
+
+        // SAFETY: We never move out of `mref`.
+        unsafe { Pin::new_unchecked(mref) }
+    }
+
+    /// Helper function to invoke a closure for each supported hardware queue type.
+    ///
+    /// This function invokes `cb` for each variant of [`QueueType`] that this [`TagSet`] supports.
+    /// This is helpful for setting up CPU to hardware queue maps in the [`Operations::map_queues`]
+    /// callback.
+    pub fn update_maps(self: Pin<&mut Self>, mut cb: impl FnMut(QueueMap)) -> Result {
+        // SAFETY: By type invariant, `self.inner` is valid.
+        let nr_maps = unsafe { (*self.inner.get()).nr_maps };
+        for i in 0..nr_maps {
+            cb(QueueMap {
+                // SAFETY: By type invariant, `self.inner` is valid.
+                map: unsafe { &raw mut (*self.inner.get()).map[i as usize] },
+                kind: i.try_into()?,
+            });
+        }
+
+        Ok(())
+    }
 }
 
 #[pinned_drop]
@@ -164,3 +199,64 @@ unsafe impl<T> Send for TagSet<T>
     T::TagSetData: Send,
 {
 }
+
+/// A [`TagSet`] CPU to hardware queue mapping.
+///
+/// # Invariants
+///
+/// - `self.map` points to a valid `blk_mq_queue_map`
+pub struct QueueMap {
+    map: *mut bindings::blk_mq_queue_map,
+    kind: QueueType,
+}
+
+impl QueueMap {
+    /// Set the number of queues for this mapping kind.
+    pub fn set_queue_count(&mut self, nr_queues: u32) {
+        // SAFETY: By type invariant, `self.map` is valid.
+        unsafe { (*self.map).nr_queues = nr_queues }
+    }
+
+    /// Set the first hardware queue to map this queue kind onto. Used by the PCIe NVMe driver to
+    /// map each hardware queue type ([`QueueType`]) onto a distinct set of hardware queues.
+    pub fn set_offset(&mut self, offset: u32) {
+        // SAFETY: By type invariant, `self.map` is valid.
+        unsafe { (*self.map).queue_offset = offset }
+    }
+
+    /// Effectuate the mapping described by [`Self`].
+    pub fn map_queues(&self) {
+        // SAFETY: By type invariant, `self.map` is valid.
+        unsafe { bindings::blk_mq_map_queues(self.map) }
+    }
+
+    /// Return the kind of this queue mapping.
+    pub fn kind(&self) -> QueueType {
+        self.kind
+    }
+}
+
+/// Type of hardware queue.
+#[derive(Copy, Clone, Debug, PartialEq, Eq)]
+#[repr(u32)]
+pub enum QueueType {
+    /// All I/O not otherwise accounted for.
+    Default = bindings::hctx_type_HCTX_TYPE_DEFAULT,
+    /// Just for READ I/O.
+    Read = bindings::hctx_type_HCTX_TYPE_READ,
+    /// Polled I/O of any kind.
+    Poll = bindings::hctx_type_HCTX_TYPE_POLL,
+}
+
+impl TryFrom<u32> for QueueType {
+    type Error = kernel::error::Error;
+
+    fn try_from(value: u32) -> core::result::Result<Self, Self::Error> {
+        match value {
+            bindings::hctx_type_HCTX_TYPE_DEFAULT => Ok(QueueType::Default),
+            bindings::hctx_type_HCTX_TYPE_READ => Ok(QueueType::Read),
+            bindings::hctx_type_HCTX_TYPE_POLL => Ok(QueueType::Poll),
+            _ => Err(kernel::error::code::EINVAL),
+        }
+    }
+}

-- 
2.51.2





* [PATCH v2 56/83] block: rust: add polled completion support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (54 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 55/83] block: rust: add an abstraction for `struct blk_mq_queue_map` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 57/83] block: rust: add accessors to `TagSet` Andreas Hindborg
                   ` (26 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for polled I/O completion to the Rust block layer bindings.
This includes the `poll` callback in `Operations` and the
`IoCompletionBatch` type for batched completions.

The `poll` callback is invoked by the block layer when polling for
completed requests on poll queues. Drivers can use `IoCompletionBatch`
to batch multiple completions efficiently.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
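
Note for readers: the commit message above describes batched completion
during polling. The following is a rough user-space sketch of the ownership
hand-off and of the return value conversion, under stated assumptions.
`Request`, `CompletionBatch`, `poll_once` and `to_c_int` are hypothetical
stand-ins. The capacity limit exists only to give the sketch a refusal
condition; the C `blk_mq_add_to_batch` declines requests for other reasons.
Errors are modeled as already-negative errno values.

```rust
/// Stand-in for an owned request. It is not `Clone`, so ownership must be
/// handed over or handed back explicitly.
#[derive(Debug, PartialEq, Eq)]
pub struct Request {
    pub tag: u32,
}

/// Hypothetical batch with a fixed capacity (an assumption of this sketch).
pub struct CompletionBatch {
    slots: Vec<Request>,
    capacity: usize,
}

impl CompletionBatch {
    pub fn new(capacity: usize) -> Self {
        CompletionBatch { slots: Vec::new(), capacity }
    }

    /// On success the batch takes ownership. On refusal the request is
    /// returned to the caller, who must complete it some other way.
    pub fn add_request(&mut self, rq: Request) -> Result<(), Request> {
        if self.slots.len() < self.capacity {
            self.slots.push(rq);
            Ok(())
        } else {
            Err(rq)
        }
    }

    pub fn len(&self) -> usize {
        self.slots.len()
    }
}

/// Drain `pending`, preferring the batch and falling back to `direct`
/// (a stand-in for completing the request immediately).
pub fn poll_once(
    pending: &mut Vec<Request>,
    batch: &mut CompletionBatch,
    direct: &mut Vec<Request>,
) -> Result<bool, i32> {
    let mut completed = false;
    for rq in pending.drain(..) {
        if let Err(rq) = batch.add_request(rq) {
            direct.push(rq);
        }
        completed = true;
    }
    Ok(completed)
}

/// Fold the poll result into a single integer: 1 or 0 for the boolean,
/// or the (negative) errno on error.
pub fn to_c_int(ret: Result<bool, i32>) -> i32 {
    match ret {
        Ok(done) => i32::from(done),
        Err(errno) => errno,
    }
}
```

Returning the request in the `Err` variant keeps the caller from losing a
request that the batch refused, which is the same shape `add_request` uses in
this patch.
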
 drivers/block/rnull/rnull.rs       |   1 +
 rust/helpers/blk.c                 |   7 +++
 rust/kernel/block/mq.rs            |   8 ++-
 rust/kernel/block/mq/operations.rs | 104 +++++++++++++++++++++++++++++++++++--
 rust/kernel/block/mq/request.rs    |   5 ++
 5 files changed, 120 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 6fb307e33263..076493f92516 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -777,6 +777,7 @@ fn queue_rq(
         this: ArcBorrow<'_, Self>,
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
+        _is_poll: bool,
     ) -> BlkResult {
         if this.bandwidth_limit != 0 {
             if !this.bandwidth_timer.active() {
diff --git a/rust/helpers/blk.c b/rust/helpers/blk.c
index 6a70e1306a3a..500e3c6fd951 100644
--- a/rust/helpers/blk.c
+++ b/rust/helpers/blk.c
@@ -20,3 +20,10 @@ __rust_helper void rust_helper_bio_advance_iter_single(const struct bio *bio,
 {
 	bio_advance_iter_single(bio, iter, bytes);
 }
+
+bool rust_helper_blk_mq_add_to_batch(struct request *req,
+				     struct io_comp_batch *iob, bool is_error,
+				     void (*complete)(struct io_comp_batch *))
+{
+	return blk_mq_add_to_batch(req, iob, is_error, complete);
+}
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index e9bea19d684b..23bf95136bc1 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -89,7 +89,8 @@
 //!         _hw_data: (),
 //!         _queue_data: (),
 //!         rq: Owned<IdleRequest<Self>>,
-//!         _is_last: bool
+//!         _is_last: bool,
+//!         is_poll: bool
 //!     ) -> BlkResult {
 //!         rq.start().end_ok();
 //!         Ok(())
@@ -130,7 +131,10 @@
 mod request_queue;
 pub mod tag_set;
 
-pub use operations::Operations;
+pub use operations::{
+    IoCompletionBatch,
+    Operations, //
+};
 pub use request::{
     Command,
     IdleRequest,
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 06faf5647aaa..1be4695ca944 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -91,6 +91,7 @@ fn queue_rq(
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
         rq: Owned<IdleRequest<Self>>,
         is_last: bool,
+        is_poll: bool,
     ) -> BlkResult;
 
     /// Called by the kernel to indicate that queued requests should be submitted.
@@ -110,7 +111,15 @@ fn init_hctx(
 
     /// Called by the kernel to poll the device for completed requests. Only
     /// used for poll queues.
-    fn poll(_hw_data: ForeignBorrowed<'_, Self::HwData>) -> bool {
+    ///
+    /// Should return `Ok(true)` if any requests were completed during the call,
+    /// `Ok(false)` if no requests were completed, and `Err(e)` to signal an
+    /// error condition.
+    fn poll(
+        _hw_data: ForeignBorrowed<'_, Self::HwData>,
+        _queue_data: ForeignBorrowed<'_, Self::QueueData>,
+        _batch: &mut IoCompletionBatch<Self>,
+    ) -> Result<bool> {
         build_error!(crate::error::VTABLE_DEFAULT_ERROR)
     }
 
@@ -194,6 +203,11 @@ impl<T: Operations> OperationsVTable<T> {
         // `into_foreign` in `Self::init_hctx_callback`.
         let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
 
+        let is_poll = u32::from(
+            // SAFETY: `hctx` is valid as required by this function.
+            unsafe { (*hctx).type_ },
+        ) == bindings::hctx_type_HCTX_TYPE_POLL;
+
         // SAFETY: `hctx` is valid as required by this function.
         let queue_data = unsafe { (*(*hctx).queue).queuedata };
 
@@ -210,6 +224,7 @@ impl<T: Operations> OperationsVTable<T> {
             // SAFETY: `bd` is valid as required by the safety requirement for
             // this function.
             unsafe { (*bd).last },
+            is_poll,
         );
 
         if let Err(e) = ret {
@@ -268,13 +283,32 @@ impl<T: Operations> OperationsVTable<T> {
     /// previously initialized by a call to `init_hctx_callback`.
     unsafe extern "C" fn poll_callback(
         hctx: *mut bindings::blk_mq_hw_ctx,
-        _iob: *mut bindings::io_comp_batch,
+        iob: *mut bindings::io_comp_batch,
     ) -> crate::ffi::c_int {
         // SAFETY: By function safety requirement, `hctx` was initialized by
         // `init_hctx_callback` and thus `driver_data` came from a call to
         // `into_foreign`.
         let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
-        T::poll(hw_data).into()
+
+        // SAFETY: `hctx` is valid as required by this function.
+        let queue_data = unsafe { (*(*hctx).queue).queuedata };
+
+        // SAFETY: `queue.queuedata` was created by `GenDiskBuilder::build` with
+        // a call to `ForeignOwnable::into_foreign` to create `queuedata`.
+        // `ForeignOwnable::from_foreign` is only called when the tagset is
+        // dropped, which happens after we are dropped.
+        let queue_data = unsafe { T::QueueData::borrow(queue_data) };
+
+        let mut batch = IoCompletionBatch {
+            inner: iob,
+            _p: PhantomData,
+        };
+
+        let ret = T::poll(hw_data, queue_data, &mut batch);
+        match ret {
+            Ok(val) => val.into(),
+            Err(e) => e.to_errno(),
+        }
     }
 
     /// This function is called by the C kernel. A pointer to this function is
@@ -473,3 +507,67 @@ pub(crate) const fn build() -> &'static bindings::blk_mq_ops {
         &Self::VTABLE
     }
 }
+
+/// A batch of I/O completions for polled I/O.
+///
+/// This struct wraps the C `struct io_comp_batch` and is used to batch
+/// multiple request completions together for improved efficiency during polled
+/// I/O operations.
+///
+/// When the kernel polls for completed requests via [`Operations::poll`], it
+/// passes an `IoCompletionBatch` to collect completed requests. The driver can
+/// add completed requests to the batch using [`add_request`], allowing the
+/// kernel to process multiple completions together rather than one at a time.
+///
+/// # Invariants
+///
+/// - `inner` must point to a valid `io_comp_batch`.
+///
+/// [`add_request`]: IoCompletionBatch::add_request
+#[repr(transparent)]
+pub struct IoCompletionBatch<T> {
+    inner: *mut bindings::io_comp_batch,
+    _p: PhantomData<T>,
+}
+
+impl<T: Operations> IoCompletionBatch<T> {
+    /// Attempt to add a completed request to this batch.
+    ///
+    /// This method tries to add `rq` to the batch for deferred completion. If
+    /// the request is successfully added, ownership is transferred to the batch
+    /// and the request will be completed later when the batch is processed.
+    ///
+    /// # Arguments
+    ///
+    /// - `rq`: The completed request to add to the batch.
+    /// - `error`: Set to `true` if the request completed with an error.
+    ///
+    /// # Return
+    ///
+    /// When this method returns `Err`, the caller is responsible for completing
+    /// the request through other means, such as calling
+    /// [`Request::complete`](super::Request::complete).
+    pub fn add_request(
+        &mut self,
+        rq: Owned<Request<T>>,
+        error: bool,
+    ) -> Result<(), Owned<Request<T>>> {
+        // SAFETY: By type invariant, `self.inner` is a valid `io_comp_batch`.
+        let ret = unsafe {
+            bindings::blk_mq_add_to_batch(
+                rq.as_raw(),
+                self.inner,
+                error,
+                Some(bindings::blk_mq_end_request_batch),
+            )
+        };
+
+        match ret {
+            true => {
+                core::mem::forget(rq);
+                Ok(())
+            }
+            false => Err(rq),
+        }
+    }
+}
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 66ef2493c448..dbe657a80324 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -156,6 +156,11 @@ pub fn queue(&self) -> &RequestQueue<T> {
         // SAFETY: By type invariant, self.0 is guaranteed to be valid.
         unsafe { RequestQueue::from_raw((*self.0.get()).q) }
     }
+
+    /// Return a raw pointer to the underlying C structure.
+    pub fn as_raw(&self) -> *mut bindings::request {
+        self.0.get()
+    }
 }
 
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.

-- 
2.51.2





* [PATCH v2 57/83] block: rust: add accessors to `TagSet`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (55 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 56/83] block: rust: add polled completion support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 58/83] block: rnull: add polled completion support Andreas Hindborg
                   ` (25 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add `hw_queue_count()` to query the number of hardware queues and
`data()` to borrow the private tag set data associated with a `TagSet`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/tag_set.rs | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index e62dfd267fd9..858c1b952b00 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -4,8 +4,6 @@
 //!
 //! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
 
-use core::pin::Pin;
-
 use crate::{
     alloc::NumaNode,
     bindings,
@@ -26,7 +24,8 @@
 };
 use core::{
     convert::TryInto,
-    marker::PhantomData, //
+    marker::PhantomData,
+    pin::Pin, //
 };
 use pin_init::{
     pin_data,
@@ -164,6 +163,22 @@ pub fn update_maps(self: Pin<&mut Self>, mut cb: impl FnMut(QueueMap)) -> Result
 
         Ok(())
     }
+
+    /// Return the number of hardware queues for this tag set.
+    pub fn hw_queue_count(&self) -> u32 {
+        // SAFETY: By type invariant, `self.inner` is valid.
+        unsafe { (*self.inner.get()).nr_hw_queues }
+    }
+
+    /// Borrow the [`T::TagSetData`] associated with this tag set.
+    pub fn data(&self) -> <T::TagSetData as ForeignOwnable>::Borrowed<'_> {
+        // SAFETY: By type invariant, `self.inner` is valid.
+        let ptr = unsafe { (*self.inner.get()).driver_data };
+
+        // SAFETY: `ptr` was created by `into_foreign` during initialization and the target is not
+        // converted back with `from_foreign` while `&self` is live.
+        unsafe { T::TagSetData::borrow(ptr) }
+    }
 }
 
 #[pinned_drop]

-- 
2.51.2





* [PATCH v2 58/83] block: rnull: add polled completion support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (56 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 57/83] block: rust: add accessors to `TagSet` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 59/83] block: rnull: add REQ_OP_FLUSH support Andreas Hindborg
                   ` (24 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for polled I/O completion in rnull. This feature requires
configuring poll queues via the `poll_queues` attribute.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
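
Note for readers: a simplified user-space sketch of the submit and poll
split introduced here, assuming that `push_head` followed by `pop_tail` on
the ring buffer yields first-in, first-out order. `HwQueue` and `Submit` are
hypothetical names, requests are reduced to bare tags, and a
`Mutex<VecDeque<_>>` stands in for the spinlock-protected ring buffer.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Outcome of a submission in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Submit {
    /// Non-poll request, completed on the submission path.
    CompletedInline,
    /// Poll request, parked until the next poll.
    ParkedForPoll,
}

/// Hypothetical per-hardware-queue state with a bounded poll queue.
pub struct HwQueue {
    poll_queue: Mutex<VecDeque<u32>>,
    depth: usize,
}

impl HwQueue {
    /// Reserve the space up front, as a ring buffer would.
    pub fn new(depth: usize) -> Self {
        HwQueue {
            poll_queue: Mutex::new(VecDeque::with_capacity(depth)),
            depth,
        }
    }

    /// Park poll requests; hand the tag back in `Err` when the queue is full.
    pub fn queue_rq(&self, tag: u32, is_poll: bool) -> Result<Submit, u32> {
        if !is_poll {
            return Ok(Submit::CompletedInline);
        }
        let mut queue = self.poll_queue.lock().unwrap();
        if queue.len() == self.depth {
            return Err(tag);
        }
        queue.push_front(tag);
        Ok(Submit::ParkedForPoll)
    }

    /// Drain every parked request in submission order. The lock is held
    /// until the queue is empty.
    pub fn poll(&self) -> Vec<u32> {
        let mut queue = self.poll_queue.lock().unwrap();
        let mut done = Vec::new();
        while let Some(tag) = queue.pop_back() {
            done.push(tag);
        }
        done
    }
}
```

Sizing the queue to the hardware queue depth means a submission should only
be refused if more requests are in flight than tags exist, which is why the
sketch treats a full queue as an error rather than blocking.
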
 drivers/block/rnull/configfs.rs |  19 +++++-
 drivers/block/rnull/rnull.rs    | 133 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 139 insertions(+), 13 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index f866595a263c..0637c1e0ab22 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -81,7 +81,7 @@ impl AttributeOperations<0> for Config {
         writer.write_str(
             "blocksize,size,rotational,irqmode,completion_nsec,memory_backed,\
              submit_queues,use_per_node_hctx,discard,blocking,shared_tags,\
-             zoned,zone_size,zone_capacity\n",
+             zoned,zone_size,zone_capacity,poll_queues\n",
         )?;
         Ok(writer.bytes_written())
     }
@@ -127,6 +127,7 @@ fn make_group(
                 zone_max_open: 24,
                 zone_max_active: 25,
                 zone_append_max_sectors: 26,
+                poll_queues: 27,
             ],
         };
 
@@ -167,6 +168,7 @@ fn make_group(
                     zone_max_open: 0,
                     zone_max_active: 0,
                     zone_append_max_sectors: u32::MAX,
+                    poll_queues: 0,
                 }),
             }),
             core::iter::empty(),
@@ -253,6 +255,7 @@ struct DeviceConfigInner {
     zone_max_open: u32,
     zone_max_active: u32,
     zone_append_max_sectors: u32,
+    poll_queues: u32,
 }
 
 #[vtable]
@@ -305,6 +308,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 shared_tag_set: guard.shared_tags.then(|| guard.shared_tag_set.clone()),
                 tag_set: crate::TagSetOptions {
                     submit_queues: guard.submit_queues,
+                    poll_queues: guard.poll_queues,
                     home_node: guard.home_node,
                     blocking: guard.blocking,
                     memory_backed: guard.memory_backed,
@@ -498,3 +502,16 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_field!(DeviceConfig, 24, zone_max_open, u32);
 configfs_simple_field!(DeviceConfig, 25, zone_max_active, u32);
 configfs_simple_field!(DeviceConfig, 26, zone_append_max_sectors, u32);
+configfs_simple_field!(
+    DeviceConfig,
+    27,
+    poll_queues,
+    u32,
+    check(|value| {
+        if value > kernel::cpu::num_possible_cpus() {
+            Err(kernel::error::code::EINVAL)
+        } else {
+            Ok(())
+        }
+    })
+);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 076493f92516..edb4ef53d6ad 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -33,6 +33,7 @@
                 GenDisk,
                 GenDiskRef, //
             },
+            IoCompletionBatch,
             Operations,
             TagSet, //
         },
@@ -186,6 +187,10 @@
             default: 0,
             description: "Maximum size of a zone append command (in 512B sectors). Specify 0 for no zone append.",
         },
+        poll_queues: u32 {
+            default: 0,
+            description: "Number of IOPOLL submission queues.",
+        },
     },
 }
 
@@ -207,6 +212,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             } else {
                 module_parameters::submit_queues.value()
             };
+            let poll_queues = module_parameters::poll_queues.value();
             let home_node = module_parameters::home_node.value();
             let blocking = module_parameters::blocking.value();
             let memory_backed = module_parameters::memory_backed.value();
@@ -215,6 +221,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
 
             let shared_tag_set = NullBlkDevice::build_tag_set(TagSetOptions {
                 submit_queues,
+                poll_queues,
                 home_node,
                 blocking,
                 memory_backed,
@@ -246,6 +253,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         .then(|| shared_tag_set.clone()),
                     tag_set: TagSetOptions {
                         submit_queues,
+                        poll_queues,
                         home_node,
                         blocking,
                         memory_backed,
@@ -325,6 +333,7 @@ struct NullBlkDevice {
 
 struct TagSetOptions {
     submit_queues: u32,
+    poll_queues: u32,
     home_node: i32,
     blocking: bool,
     memory_backed: bool,
@@ -338,6 +347,7 @@ impl NullBlkDevice {
     fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
         let TagSetOptions {
             submit_queues,
+            poll_queues,
             home_node,
             blocking,
             memory_backed,
@@ -364,7 +374,21 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
         }
 
         Arc::pin_init(
-            TagSet::new(submit_queues, (), hw_queue_depth, 1, numa_node, flags),
+            TagSet::new(
+                submit_queues + poll_queues,
+                KBox::new(
+                    NullBlkTagsetData {
+                        queue_depth: hw_queue_depth,
+                        submit_queue_count: submit_queues,
+                        poll_queue_count: poll_queues,
+                    },
+                    GFP_KERNEL,
+                )?,
+                hw_queue_depth,
+                if poll_queues == 0 { 1 } else { 3 },
+                numa_node,
+                flags,
+            ),
             GFP_KERNEL,
         )
     }
@@ -729,6 +753,7 @@ fn run(
 
 struct HwQueueContext {
     page: Option<KBox<disk_storage::NullBlockPage>>,
+    poll_queue: kernel::alloc::ringbuffer::KRingBuffer<Owned<mq::Request<NullBlkDevice>>>,
 }
 
 #[pin_data]
@@ -757,11 +782,17 @@ impl HasHrTimer<Self> for Pdu {
     }
 }
 
+struct NullBlkTagsetData {
+    queue_depth: u32,
+    submit_queue_count: u32,
+    poll_queue_count: u32,
+}
+
 #[vtable]
 impl Operations for NullBlkDevice {
     type QueueData = Arc<Self>;
     type RequestData = Pdu;
-    type TagSetData = ();
+    type TagSetData = KBox<NullBlkTagsetData>;
     type HwData = Pin<KBox<SpinLock<HwQueueContext>>>;
 
     fn new_request_data() -> impl PinInit<Self::RequestData> {
@@ -777,7 +808,7 @@ fn queue_rq(
         this: ArcBorrow<'_, Self>,
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
-        _is_poll: bool,
+        is_poll: bool,
     ) -> BlkResult {
         if this.bandwidth_limit != 0 {
             if !this.bandwidth_timer.active() {
@@ -814,13 +845,29 @@ fn queue_rq(
         #[cfg(not(CONFIG_BLK_DEV_ZONED))]
         this.handle_regular_command(&hw_data, &mut rq)?;
 
-        match this.irq_mode {
-            IRQMode::None => Self::end_request(rq),
-            IRQMode::Soft => mq::Request::complete(rq.into()),
-            IRQMode::Timer => {
-                OwnableRefCounted::into_shared(rq)
-                    .start(this.completion_time)
-                    .dismiss();
+        if is_poll {
+            // NOTE: We lack the ability to insert `Owned<Request>` into a
+            // `kernel::list::List`, so we use a `RingBuffer` instead. The
+            // drawback of this is that we have to allocate the space for the
+            // ring buffer during drive initialization, and we have to hold the
+            // lock protecting the list until we have processed all the requests
+            // in the list. Change to a linked list when the kernel gets this
+            // ability.
+
+            // NOTE: We are processing requests during submit rather than during
+            // poll. This differs from the C driver, which processes requests
+            // during poll.
+
+            hw_data.lock().poll_queue.push_head(rq)?;
+        } else {
+            match this.irq_mode {
+                IRQMode::None => Self::end_request(rq),
+                IRQMode::Soft => mq::Request::complete(rq.into()),
+                IRQMode::Timer => {
+                    OwnableRefCounted::into_shared(rq)
+                        .start(this.completion_time)
+                        .dismiss();
+                }
             }
         }
         Ok(())
@@ -828,8 +875,40 @@ fn queue_rq(
 
     fn commit_rqs(_hw_data: Pin<&SpinLock<HwQueueContext>>, _queue_data: ArcBorrow<'_, Self>) {}
 
-    fn init_hctx(_tagset_data: (), _hctx_idx: u32) -> Result<Self::HwData> {
-        KBox::pin_init(new_spinlock!(HwQueueContext { page: None }), GFP_KERNEL)
+    fn poll(
+        hw_data: Pin<&SpinLock<HwQueueContext>>,
+        _this: ArcBorrow<'_, Self>,
+        batch: &mut IoCompletionBatch<Self>,
+    ) -> Result<bool> {
+        let mut guard = hw_data.lock();
+        let mut completed = false;
+
+        while let Some(rq) = guard.poll_queue.pop_tail() {
+            let status = rq.data_ref().error.load(ordering::Relaxed);
+            rq.data_ref().error.store(0, ordering::Relaxed);
+
+            // TODO: check error handling via status
+            if let Err(rq) = batch.add_request(rq, status != 0) {
+                Self::end_request(rq);
+            }
+
+            completed = true;
+        }
+
+        Ok(completed)
+    }
+
+    fn init_hctx(tagset_data: &NullBlkTagsetData, _hctx_idx: u32) -> Result<Self::HwData> {
+        KBox::pin_init(
+            new_spinlock!(HwQueueContext {
+                page: None,
+                poll_queue: kernel::alloc::ringbuffer::KRingBuffer::new(
+                    tagset_data.queue_depth.try_into()?,
+                    GFP_KERNEL,
+                )?,
+            }),
+            GFP_KERNEL,
+        )
     }
 
     fn complete(rq: ARef<mq::Request<Self>>) {
@@ -849,4 +928,34 @@ fn report_zones(
     ) -> Result<u32> {
         Self::report_zones_internal(disk, sector, nr_zones, callback)
     }
+
+    fn map_queues(tag_set: Pin<&mut TagSet<Self>>) {
+        let mut submit_queue_count = tag_set.data().submit_queue_count;
+        let mut poll_queue_count = tag_set.data().poll_queue_count;
+
+        if tag_set.hw_queue_count() != submit_queue_count + poll_queue_count {
+            pr_warn!(
+                "tag set has unexpected hardware queue count: {}\n",
+                tag_set.hw_queue_count()
+            );
+            submit_queue_count = 1;
+            poll_queue_count = 0;
+        }
+
+        let mut offset = 0;
+        tag_set
+            .update_maps(|mut qmap| {
+                use mq::QueueType::*;
+                let queue_count = match qmap.kind() {
+                    Default => submit_queue_count,
+                    Read => 0,
+                    Poll => poll_queue_count,
+                };
+                qmap.set_queue_count(queue_count);
+                qmap.set_offset(offset);
+                offset += queue_count;
+                qmap.map_queues();
+            })
+            .unwrap()
+    }
 }

-- 
2.51.2





* [PATCH v2 59/83] block: rnull: add REQ_OP_FLUSH support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (57 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 58/83] block: rnull: add polled completion support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 60/83] block: rust: add request flags abstraction Andreas Hindborg
                   ` (23 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for handling flush requests in rnull. When memory backing
and write cache are enabled, flush requests trigger a cache flush
operation that writes all dirty cache pages to the backing store.
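
For illustration only, the flush behaviour described above can be modelled
in user space roughly as follows. This is a minimal sketch and not the
driver code: `ToyStorage`, its `BTreeMap` based cache and backing store, and
the `write_cached`, `dirty_pages` and `read_disk` helpers are hypothetical
stand-ins for the XArray based `DiskStorage`.

```rust
use std::collections::BTreeMap;

/// Hypothetical user-space model of a cached store (not the rnull types).
#[derive(Default)]
pub struct ToyStorage {
    /// Dirty pages that have not reached the backing store yet.
    cache: BTreeMap<u64, Vec<u8>>,
    /// Pages that have been written back.
    disk: BTreeMap<u64, Vec<u8>>,
}

impl ToyStorage {
    /// Buffer a write in the cache.
    pub fn write_cached(&mut self, index: u64, data: Vec<u8>) {
        self.cache.insert(index, data);
    }

    /// Move one dirty page to the backing store.
    ///
    /// Returns `None` once the cache is empty, which ends the flush loop.
    fn extract_cache_page(&mut self) -> Option<u64> {
        let (index, data) = self.cache.pop_first()?;
        self.disk.insert(index, data);
        Some(index)
    }

    /// Write back every dirty page and return how many were written back.
    pub fn flush(&mut self) -> usize {
        let mut flushed = 0;
        while self.extract_cache_page().is_some() {
            flushed += 1;
        }
        flushed
    }

    /// Number of pages still waiting in the cache.
    pub fn dirty_pages(&self) -> usize {
        self.cache.len()
    }

    /// Read a page from the backing store, if present.
    pub fn read_disk(&self, index: u64) -> Option<&[u8]> {
        self.disk.get(&index).map(|page| page.as_slice())
    }
}
```

The point of the sketch is the loop shape: extraction reports "nothing left"
as `None` instead of panicking, so flushing an empty cache is a no-op.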

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/disk_storage.rs | 45 +++++++++++++++++++++++++++++++------
 drivers/block/rnull/rnull.rs        | 31 +++++++++++++++++--------
 2 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
index 82de1f656f68..7667830bd616 100644
--- a/drivers/block/rnull/disk_storage.rs
+++ b/drivers/block/rnull/disk_storage.rs
@@ -85,6 +85,13 @@ pub(crate) fn discard(
             remaining_bytes -= processed;
         }
     }
+
+    pub(crate) fn flush(&self, hw_data: &Pin<&SpinLock<HwQueueContext>>) -> Result {
+        let mut tree_guard = self.lock();
+        let mut hw_data_guard = hw_data.lock();
+        let mut access = self.access(&mut tree_guard, &mut hw_data_guard, None);
+        access.flush()
+    }
 }
 
 pub(crate) struct DiskStorageAccess<'a, 'b, 'c> {
@@ -120,18 +127,32 @@ fn to_sector(index: usize) -> u64 {
         (index << block::PAGE_SECTORS_SHIFT) as u64
     }
 
+    fn extract_cache_page(&mut self) -> Result<Option<KBox<NullBlockPage>>> {
+        Self::extract_cache_page_inner(
+            &mut self.cache_guard,
+            &mut self.disk_guard,
+            self.disk_storage,
+            self.hw_data_guard,
+            self.sheaf.as_mut(),
+        )
+    }
+
     fn extract_cache_page_inner<'g>(
         cache_guard: &mut xarray::Guard<'g, TreeNode>,
         disk_guard: &mut xarray::Guard<'g, TreeNode>,
         disk_storage: &DiskStorage,
         hw_data: &mut HwQueueContext,
         sheaf: Option<&mut XArraySheaf<'_>>,
-    ) -> Result<KBox<NullBlockPage>> {
-        let cache_entry = cache_guard
-            .find_next_entry_circular(
-                disk_storage.next_flush_sector.load(ordering::Relaxed) as usize
-            )
-            .expect("Expected to find a page in the cache");
+    ) -> Result<Option<KBox<NullBlockPage>>> {
+        let cache_entry = cache_guard.find_next_entry_circular(
+            disk_storage.next_flush_sector.load(ordering::Relaxed) as usize,
+        );
+
+        let cache_entry = if let Some(entry) = cache_entry {
+            entry
+        } else {
+            return Ok(None);
+        };
 
         let index = cache_entry.index();
 
@@ -172,7 +193,16 @@ fn extract_cache_page_inner<'g>(
             }
         };
 
-        Ok(page)
+        Ok(Some(page))
+    }
+
+    fn flush(&mut self) -> Result {
+        if self.disk_storage.cache_size > 0 {
+            while let Some(page) = self.extract_cache_page()? {
+                drop(page);
+            }
+        }
+        Ok(())
     }
 
     fn get_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
@@ -197,6 +227,7 @@ fn get_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
                         self.hw_data_guard,
                         self.sheaf.as_mut(),
                     )?
+                    .expect("Expected to find a page in the cache")
                 };
                 let xarray::Entry::Vacant(vacant_entry) = cache_guard.entry(index) else {
                     unreachable!("slot was vacant and we hold the lock")
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index edb4ef53d6ad..0695cbd07f1d 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -719,6 +719,18 @@ fn end_request(rq: Owned<mq::Request<Self>>) {
             _ => rq.end(bindings::BLK_STS_IOERR),
         }
     }
+
+    fn complete_request(&self, rq: Owned<mq::Request<Self>>) {
+        match self.irq_mode {
+            IRQMode::None => Self::end_request(rq),
+            IRQMode::Soft => mq::Request::complete(rq.into()),
+            IRQMode::Timer => {
+                OwnableRefCounted::into_shared(rq)
+                    .start(self.completion_time)
+                    .dismiss();
+            }
+        }
+    }
 }
 
 impl_has_hr_timer! {
@@ -835,6 +847,15 @@ fn queue_rq(
 
         let mut rq = rq.start();
 
+        if rq.command() == mq::Command::Flush {
+            if this.memory_backed {
+                this.storage.flush(&hw_data)?;
+            }
+            this.complete_request(rq);
+
+            return Ok(());
+        }
+
         #[cfg(CONFIG_BLK_DEV_ZONED)]
         if this.zoned.enabled {
             this.handle_zoned_command(&hw_data, &mut rq)?;
@@ -860,15 +881,7 @@ fn queue_rq(
 
             hw_data.lock().poll_queue.push_head(rq)?;
         } else {
-            match this.irq_mode {
-                IRQMode::None => Self::end_request(rq),
-                IRQMode::Soft => mq::Request::complete(rq.into()),
-                IRQMode::Timer => {
-                    OwnableRefCounted::into_shared(rq)
-                        .start(this.completion_time)
-                        .dismiss();
-                }
-            }
+            this.complete_request(rq);
         }
         Ok(())
     }

-- 
2.51.2





* [PATCH v2 60/83] block: rust: add request flags abstraction
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (58 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 59/83] block: rnull: add REQ_OP_FLUSH support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 61/83] block: rust: add abstraction for block queue feature flags Andreas Hindborg
                   ` (22 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `Flag` enum and `Flags` type as Rust abstractions for the C
`REQ_*` request flags. These flags modify how block I/O requests are
processed, including sync behavior, priority hints, and integrity
settings.

Also add a `flags()` method to `Request` to retrieve the flags for a
given request.
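
As a rough illustration of what `flags()` has to do, the sketch below splits
a raw `cmd_flags` word into the operation code (low bits) and the flag bits
(everything above). It is a user-space sketch under stated assumptions:
`REQ_OP_BITS` is taken to be 8, the `TOY_REQ_*` bit positions are
illustrative and not read from the bindings, and `split_cmd_flags` and
`has_flag` are hypothetical helper names.

```rust
/// Number of low bits that hold the operation code (assumed to be 8 here).
pub const REQ_OP_BITS: u32 = 8;
/// Mask selecting only the operation code.
pub const REQ_OP_MASK: u32 = (1 << REQ_OP_BITS) - 1;

/// Illustrative flag bits; the real values come from `linux/blk_types.h`.
pub const TOY_REQ_SYNC: u32 = 1 << 11;
pub const TOY_REQ_FUA: u32 = 1 << 17;

/// Split a raw `cmd_flags` word into `(operation, flags)`.
pub fn split_cmd_flags(cmd_flags: u32) -> (u32, u32) {
    (cmd_flags & REQ_OP_MASK, cmd_flags & !REQ_OP_MASK)
}

/// Check whether `flag` is set in an already-masked flag word.
pub fn has_flag(flags: u32, flag: u32) -> bool {
    (flags & flag) != 0
}
```

Masking with `!REQ_OP_MASK` before converting keeps the operation code from
being misread as an unknown flag.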

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/bindings/bindings_helper.h      | 21 ++++++++++++
 rust/kernel/block/mq.rs              |  2 ++
 rust/kernel/block/mq/request.rs      | 12 +++++++
 rust/kernel/block/mq/request/flag.rs | 65 ++++++++++++++++++++++++++++++++++++
 4 files changed, 100 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 2a69c17bf271..7acda3ae9725 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -140,6 +140,27 @@ const blk_status_t RUST_CONST_HELPER_BLK_STS_OFFLINE = BLK_STS_OFFLINE;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_DURATION_LIMIT = BLK_STS_DURATION_LIMIT;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_INVAL = BLK_STS_INVAL;
 const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ZONED = BLK_FEAT_ZONED;
+const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_DEV = REQ_FAILFAST_DEV;
+const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_TRANSPORT = REQ_FAILFAST_TRANSPORT;
+const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_DRIVER = REQ_FAILFAST_DRIVER;
+const blk_opf_t RUST_CONST_HELPER_REQ_SYNC = REQ_SYNC;
+const blk_opf_t RUST_CONST_HELPER_REQ_META = REQ_META;
+const blk_opf_t RUST_CONST_HELPER_REQ_PRIO = REQ_PRIO;
+const blk_opf_t RUST_CONST_HELPER_REQ_NOMERGE = REQ_NOMERGE;
+const blk_opf_t RUST_CONST_HELPER_REQ_IDLE = REQ_IDLE;
+const blk_opf_t RUST_CONST_HELPER_REQ_INTEGRITY = REQ_INTEGRITY;
+const blk_opf_t RUST_CONST_HELPER_REQ_FUA = REQ_FUA;
+const blk_opf_t RUST_CONST_HELPER_REQ_PREFLUSH = REQ_PREFLUSH;
+const blk_opf_t RUST_CONST_HELPER_REQ_RAHEAD = REQ_RAHEAD;
+const blk_opf_t RUST_CONST_HELPER_REQ_BACKGROUND = REQ_BACKGROUND;
+const blk_opf_t RUST_CONST_HELPER_REQ_NOWAIT = REQ_NOWAIT;
+const blk_opf_t RUST_CONST_HELPER_REQ_POLLED = REQ_POLLED;
+const blk_opf_t RUST_CONST_HELPER_REQ_ALLOC_CACHE = REQ_ALLOC_CACHE;
+const blk_opf_t RUST_CONST_HELPER_REQ_SWAP = REQ_SWAP;
+const blk_opf_t RUST_CONST_HELPER_REQ_DRV = REQ_DRV;
+const blk_opf_t RUST_CONST_HELPER_REQ_FS_PRIVATE = REQ_FS_PRIVATE;
+const blk_opf_t RUST_CONST_HELPER_REQ_ATOMIC = REQ_ATOMIC;
+const blk_opf_t RUST_CONST_HELPER_REQ_NOUNMAP = REQ_NOUNMAP;
 const fop_flags_t RUST_CONST_HELPER_FOP_UNSIGNED_OFFSET = FOP_UNSIGNED_OFFSET;
 
 const xa_mark_t RUST_CONST_HELPER_XA_PRESENT = XA_PRESENT;
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 23bf95136bc1..9bad95d79230 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -137,6 +137,8 @@
 };
 pub use request::{
     Command,
+    Flag as RequestFlag,
+    Flags as RequestFlags,
     IdleRequest,
     Request,
     RequestTimerHandle, //
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index dbe657a80324..84f8b2c17f85 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -48,6 +48,12 @@
 mod command;
 pub use command::Command;
 
+mod flag;
+pub use flag::{
+    Flag,
+    Flags, //
+};
+
 /// A [`Request`] that a driver has not yet begun to process.
 ///
 /// A driver can convert an `IdleRequest` to a [`Request`] by calling [`IdleRequest::start`].
@@ -125,6 +131,12 @@ pub fn command(&self) -> Command {
         unsafe { Command::from_raw(self.command_raw()) }
     }
 
+    pub fn flags(&self) -> Flags {
+        // SAFETY: By C API contract and type invariant, `cmd_flags` is valid for read
+        let flags = unsafe { (*self.0.get()).cmd_flags & !((1 << bindings::REQ_OP_BITS) - 1) };
+        Flags::try_from(flags).expect("Request should have valid flags")
+    }
+
     /// Get the target sector for the request.
     #[inline(always)]
     pub fn sector(&self) -> u64 {
diff --git a/rust/kernel/block/mq/request/flag.rs b/rust/kernel/block/mq/request/flag.rs
new file mode 100644
index 000000000000..01f249269803
--- /dev/null
+++ b/rust/kernel/block/mq/request/flag.rs
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+use crate::{
+    bindings,
+    impl_flags, //
+};
+
+impl_flags! {
+    /// A set of request flags.
+    ///
+    /// This type wraps the C `REQ_*` flags and allows combining multiple flags
+    /// together. These flags modify how a block I/O request is processed.
+    #[derive(Debug, Clone, Default, Copy, PartialEq, Eq)]
+    pub struct Flags(u32);
+
+    /// Individual request flags for block I/O operations.
+    ///
+    /// These flags correspond to the C `REQ_*` defines in `linux/blk_types.h`
+    /// and are used to modify the behavior of block I/O requests.
+    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
+    pub enum Flag {
+        /// No driver retries on device errors.
+        FailfastDev = bindings::REQ_FAILFAST_DEV,
+        /// No driver retries on transport errors.
+        FailfastTransport = bindings::REQ_FAILFAST_TRANSPORT,
+        /// No driver retries on driver errors.
+        FailfastDriver = bindings::REQ_FAILFAST_DRIVER,
+        /// Request is synchronous (sync write or read).
+        Sync = bindings::REQ_SYNC,
+        /// Metadata I/O request.
+        Meta = bindings::REQ_META,
+        /// Boost priority in CFQ scheduler.
+        Priority = bindings::REQ_PRIO,
+        /// Don't merge this request with others.
+        NoMerge = bindings::REQ_NOMERGE,
+        /// Anticipate more I/O after this one.
+        Idle = bindings::REQ_IDLE,
+        /// I/O includes block integrity payload.
+        Integrity = bindings::REQ_INTEGRITY,
+        /// Forced unit access - data must be written to persistent storage
+        /// before command completion is signaled.
+        ForcedUnitAccess = bindings::REQ_FUA,
+        /// Request a cache flush before this operation.
+        Preflush = bindings::REQ_PREFLUSH,
+        /// Read ahead request, can fail anytime.
+        ReadAhead = bindings::REQ_RAHEAD,
+        /// Background I/O operation.
+        Background = bindings::REQ_BACKGROUND,
+        /// Don't wait if the request would block.
+        NoWait = bindings::REQ_NOWAIT,
+        /// Caller polls for completion using `bio_poll`.
+        Polled = bindings::REQ_POLLED,
+        /// Allocate I/O from cache if available.
+        AllocCache = bindings::REQ_ALLOC_CACHE,
+        /// Swap I/O operation.
+        Swap = bindings::REQ_SWAP,
+        /// Reserved for driver use.
+        Driver = bindings::REQ_DRV,
+        /// Reserved for file system (submitter) use.
+        FsPrivate = bindings::REQ_FS_PRIVATE,
+        /// Atomic write operation.
+        Atomic = bindings::REQ_ATOMIC,
+        /// Do not free blocks when zeroing (for write zeroes operations).
+        NoUnmap = bindings::REQ_NOUNMAP,
+    }
+}

-- 
2.51.2





* [PATCH v2 61/83] block: rust: add abstraction for block queue feature flags
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (59 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 60/83] block: rust: add request flags abstraction Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 62/83] block: rust: allow setting write cache and FUA flags for `GenDisk` Andreas Hindborg
                   ` (21 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `Feature` enum and `Features` type as Rust abstractions for the
C `blk_features_t` bitfield. These types wrap the `BLK_FEAT_*` flags
and allow drivers to describe block device capabilities such as write
cache support, FUA, rotational media, and DAX.
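
To show the shape of the enum plus set pairing, here is a small user-space
sketch. It does not use `impl_flags!`; `ToyFeature`, `ToyFeatures`, the bit
values and the `contains`/`bits` helpers are all hypothetical and only
mirror the idea of one type for a single flag and one for a combination.

```rust
use std::ops::{BitOr, BitOrAssign};

/// A single feature flag (illustrative bit values, not the kernel's).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum ToyFeature {
    WriteCache = 1 << 0,
    ForcedUnitAccess = 1 << 1,
    Rotational = 1 << 2,
}

/// A combination of zero or more `ToyFeature` flags.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct ToyFeatures(u32);

impl ToyFeatures {
    /// Returns `true` if `feature` is part of this set.
    pub fn contains(self, feature: ToyFeature) -> bool {
        (self.0 & (feature as u32)) != 0
    }

    /// Raw bit representation of the set.
    pub fn bits(self) -> u32 {
        self.0
    }
}

impl From<ToyFeature> for ToyFeatures {
    fn from(feature: ToyFeature) -> Self {
        Self(feature as u32)
    }
}

/// Combining two single flags yields a set.
impl BitOr for ToyFeature {
    type Output = ToyFeatures;

    fn bitor(self, rhs: Self) -> ToyFeatures {
        ToyFeatures((self as u32) | (rhs as u32))
    }
}

/// Adding a single flag to an existing set keeps the bits already present.
impl BitOrAssign<ToyFeature> for ToyFeatures {
    fn bitor_assign(&mut self, rhs: ToyFeature) {
        self.0 |= rhs as u32;
    }
}
```

Keeping the single flag and the set as distinct types means a caller cannot
pass an arbitrary integer where a known flag is expected.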

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/bindings/bindings_helper.h | 15 +++++++-
 rust/kernel/block/mq.rs         |  5 +++
 rust/kernel/block/mq/feature.rs | 76 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 7acda3ae9725..af0330b9e491 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -119,7 +119,6 @@ const gfp_t RUST_CONST_HELPER_GFP_NOWAIT = GFP_NOWAIT;
 const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
 const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
 const gfp_t RUST_CONST_HELPER___GFP_NOWARN = ___GFP_NOWARN;
-const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_OK = BLK_STS_OK;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_NOTSUPP = BLK_STS_NOTSUPP;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_TIMEOUT = BLK_STS_TIMEOUT;
@@ -139,7 +138,21 @@ const blk_status_t RUST_CONST_HELPER_BLK_STS_ZONE_ACTIVE_RESOURCE = BLK_STS_ZONE
 const blk_status_t RUST_CONST_HELPER_BLK_STS_OFFLINE = BLK_STS_OFFLINE;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_DURATION_LIMIT = BLK_STS_DURATION_LIMIT;
 const blk_status_t RUST_CONST_HELPER_BLK_STS_INVAL = BLK_STS_INVAL;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_WRITE_CACHE = BLK_FEAT_WRITE_CACHE;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_FUA = BLK_FEAT_FUA;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ADD_RANDOM = BLK_FEAT_ADD_RANDOM;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_IO_STAT = BLK_FEAT_IO_STAT;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_STABLE_WRITES = BLK_FEAT_STABLE_WRITES;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_SYNCHRONOUS = BLK_FEAT_SYNCHRONOUS;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_NOWAIT = BLK_FEAT_NOWAIT;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_DAX = BLK_FEAT_DAX;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_POLL = BLK_FEAT_POLL;
 const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ZONED = BLK_FEAT_ZONED;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_PCI_P2PDMA = BLK_FEAT_PCI_P2PDMA;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_SKIP_TAGSET_QUIESCE = BLK_FEAT_SKIP_TAGSET_QUIESCE;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE = BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
+const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ATOMIC_WRITES = BLK_FEAT_ATOMIC_WRITES;
 const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_DEV = REQ_FAILFAST_DEV;
 const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_TRANSPORT = REQ_FAILFAST_TRANSPORT;
 const blk_opf_t RUST_CONST_HELPER_REQ_FAILFAST_DRIVER = REQ_FAILFAST_DRIVER;
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 9bad95d79230..7c346be843e1 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -125,12 +125,17 @@
 //! # Ok::<(), kernel::error::Error>(())
 //! ```
 
+mod feature;
 pub mod gen_disk;
 mod operations;
 mod request;
 mod request_queue;
 pub mod tag_set;
 
+pub use feature::{
+    Feature,
+    Features, //
+};
 pub use operations::{
     IoCompletionBatch,
     Operations, //
diff --git a/rust/kernel/block/mq/feature.rs b/rust/kernel/block/mq/feature.rs
new file mode 100644
index 000000000000..015d7925d5f0
--- /dev/null
+++ b/rust/kernel/block/mq/feature.rs
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Block device feature flags.
+//!
+//! This module provides Rust abstractions for the C `blk_features_t` type and
+//! the associated `BLK_FEAT_*` flags defined in `include/linux/blkdev.h`.
+
+use crate::{
+    bindings,
+    impl_flags, //
+};
+
+impl_flags! {
+    /// A set of block device feature flags.
+    ///
+    /// This type wraps the C `blk_features_t` bitfield and represents a
+    /// combination of zero or more [`Feature`] flags. It is used to describe
+    /// the capabilities of a block device in [`struct queue_limits`].
+    ///
+    /// [`struct queue_limits`]: srctree/include/linux/blkdev.h
+    #[derive(Debug, Clone, Default, Copy, PartialEq, Eq)]
+    pub struct Features(u32);
+
+    /// A block device feature flag.
+    ///
+    /// Each variant corresponds to a `BLK_FEAT_*` constant defined in
+    /// `include/linux/blkdev.h`. These flags describe individual capabilities
+    /// or properties of a block device.
+    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
+    pub enum Feature {
+        /// Supports a volatile write cache.
+        WriteCache = bindings::BLK_FEAT_WRITE_CACHE,
+
+        /// Supports passing on the FUA bit.
+        ForcedUnitAccess = bindings::BLK_FEAT_FUA,
+
+        /// Rotational device (hard drive or floppy).
+        Rotational = bindings::BLK_FEAT_ROTATIONAL,
+
+        /// Contributes to the random number pool.
+        AddRandom = bindings::BLK_FEAT_ADD_RANDOM,
+
+        /// Enables disk/partitions I/O accounting.
+        IoStat = bindings::BLK_FEAT_IO_STAT,
+
+        /// Don't modify data until writeback is done.
+        StableWrites = bindings::BLK_FEAT_STABLE_WRITES,
+
+        /// Always completes in submit context.
+        Synchronous = bindings::BLK_FEAT_SYNCHRONOUS,
+
+        /// Supports REQ_NOWAIT.
+        Nowait = bindings::BLK_FEAT_NOWAIT,
+
+        /// Supports DAX.
+        Dax = bindings::BLK_FEAT_DAX,
+
+        /// Supports I/O polling.
+        Poll = bindings::BLK_FEAT_POLL,
+
+        /// Is a zoned device.
+        Zoned = bindings::BLK_FEAT_ZONED,
+
+        /// Supports PCI(e) p2p requests.
+        PciP2Pdma = bindings::BLK_FEAT_PCI_P2PDMA,
+
+        /// Skips this queue in `blk_mq_(un)quiesce_tagset`.
+        SkipTagsetQuiesce = bindings::BLK_FEAT_SKIP_TAGSET_QUIESCE,
+
+        /// Undocumented magic for bcache.
+        RaidPartialStripesExpensive = bindings::BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE,
+
+        /// Atomic writes enabled.
+        AtomicWrites = bindings::BLK_FEAT_ATOMIC_WRITES,
+    }
+}

-- 
2.51.2





* [PATCH v2 62/83] block: rust: allow setting write cache and FUA flags for `GenDisk`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (60 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 61/83] block: rust: add abstraction for block queue feature flags Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 63/83] block: rust: add `Segment::copy_to_page_limit` Andreas Hindborg
                   ` (20 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add methods to `GenDiskBuilder` for enabling the write cache and FUA
feature flags. These flags are set in the `queue_limits` structure
when building the disk.
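
A minimal user-space sketch of the builder pattern involved, assuming
made-up feature bit values: `ToyDiskBuilder`, `feature_bits` and the
`TOY_FEAT_*` constants are hypothetical and only show how boolean builder
options are OR-ed into one feature word at build time.

```rust
/// Illustrative feature bits; the real `BLK_FEAT_*` values live in
/// `include/linux/blkdev.h` and are not reproduced here.
pub const TOY_FEAT_WRITE_CACHE: u32 = 1 << 0;
pub const TOY_FEAT_FUA: u32 = 1 << 1;
pub const TOY_FEAT_ROTATIONAL: u32 = 1 << 2;

/// Hypothetical stand-in for a disk builder with boolean options.
#[derive(Debug, Default, Clone, Copy)]
pub struct ToyDiskBuilder {
    rotational: bool,
    write_cache: bool,
    forced_unit_access: bool,
}

impl ToyDiskBuilder {
    /// Declare that the device is rotational.
    pub fn rotational(mut self, enable: bool) -> Self {
        self.rotational = enable;
        self
    }

    /// Declare that the device has a volatile write cache.
    pub fn write_cache(mut self, enable: bool) -> Self {
        self.write_cache = enable;
        self
    }

    /// Declare that the device supports forced unit access.
    pub fn forced_unit_access(mut self, enable: bool) -> Self {
        self.forced_unit_access = enable;
        self
    }

    /// Accumulate the selected options into one feature word.
    ///
    /// Every option is OR-ed in so that no option clears another.
    pub fn feature_bits(&self) -> u32 {
        let mut features = 0;
        if self.rotational {
            features |= TOY_FEAT_ROTATIONAL;
        }
        if self.write_cache {
            features |= TOY_FEAT_WRITE_CACHE;
        }
        if self.forced_unit_access {
            features |= TOY_FEAT_FUA;
        }
        features
    }
}
```

In this sketch a caller would chain the setters, for example
`ToyDiskBuilder::default().write_cache(true).forced_unit_access(true)`, and
read the combined word once at the end.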

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/gen_disk.rs | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index eedba691e167..5367ca92b7aa 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -9,6 +9,7 @@
     bindings,
     block::mq::{
         operations::OperationsVTable,
+        Feature,
         Operations,
         RequestQueue,
         TagSet, //
@@ -55,6 +56,8 @@ pub struct GenDiskBuilder<T> {
     zone_size_sectors: u32,
     #[cfg(CONFIG_BLK_DEV_ZONED)]
     zone_append_max_sectors: u32,
+    write_cache: bool,
+    forced_unit_access: bool,
     _p: PhantomData<T>,
 }
 
@@ -72,6 +75,8 @@ fn default() -> Self {
             zone_size_sectors: 0,
             #[cfg(CONFIG_BLK_DEV_ZONED)]
             zone_append_max_sectors: 0,
+            write_cache: false,
+            forced_unit_access: false,
             _p: PhantomData,
         }
     }
@@ -164,6 +169,18 @@ pub fn zone_append_max(mut self, sectors: u32) -> Self {
         self
     }
 
+    /// Declare that this device supports forced unit access.
+    pub fn forced_unit_access(mut self, enable: bool) -> Self {
+        self.forced_unit_access = enable;
+        self
+    }
+
+    /// Declare that this device has a write-back cache.
+    pub fn write_cache(mut self, enable: bool) -> Self {
+        self.write_cache = enable;
+        self
+    }
+
     /// Build a new `GenDisk` and add it to the VFS.
     pub fn build(
         self,
@@ -183,7 +200,7 @@ pub fn build(
         lim.physical_block_size = self.physical_block_size;
         lim.max_hw_discard_sectors = self.max_hw_discard_sectors;
         if self.rotational {
-            lim.features |= bindings::BLK_FEAT_ROTATIONAL;
+            lim.features |= Feature::Rotational;
         }
 
         #[cfg(CONFIG_BLK_DEV_ZONED)]
@@ -192,11 +209,19 @@ pub fn build(
                 return Err(error::code::EINVAL);
             }
 
-            lim.features |= bindings::BLK_FEAT_ZONED;
+            lim.features |= Feature::Zoned;
             lim.chunk_sectors = self.zone_size_sectors;
             lim.max_hw_zone_append_sectors = self.zone_append_max_sectors;
         }
 
+        if self.write_cache {
+            lim.features |= Feature::WriteCache;
+        }
+
+        if self.forced_unit_access {
+            lim.features |= Feature::ForcedUnitAccess;
+        }
+
         // SAFETY: `tagset.raw_tag_set()` points to a valid and initialized tag set
         let gendisk = from_err_ptr(unsafe {
             bindings::__blk_mq_alloc_disk(

-- 
2.51.2





* [PATCH v2 63/83] block: rust: add `Segment::copy_to_page_limit`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (61 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 62/83] block: rust: allow setting write cache and FUA flags for `GenDisk` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 64/83] block: rnull: add fua support Andreas Hindborg
                   ` (19 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method to `block::mq::bio::Segment` to copy a bounded number of bytes
to a page.
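
The length calculation can be sketched in isolation as below. This is a
user-space illustration, assuming a 4096 byte page; `copy_len` is a
hypothetical free function that mirrors only the arithmetic, not the copy
itself, and it treats a `limit` of zero as "no limit".

```rust
/// Page size assumed for this sketch.
pub const PAGE_SIZE: usize = 4096;

/// Number of bytes a single bounded copy would transfer.
///
/// The result is capped by the distance to the next source page boundary,
/// the remaining segment length, the room left in the destination page and,
/// if `limit` is non-zero, by `limit`.
pub fn copy_len(src_offset: usize, seg_len: usize, dst_offset: usize, limit: usize) -> usize {
    debug_assert!(dst_offset <= PAGE_SIZE);

    let mut length = (PAGE_SIZE - src_offset % PAGE_SIZE)
        .min(seg_len)
        .min(PAGE_SIZE - dst_offset);

    // A limit of zero means the caller did not ask for an extra bound.
    if limit > 0 {
        length = length.min(limit);
    }

    length
}
```

For example, a segment starting 100 bytes into a page with no limit yields
`PAGE_SIZE - 100` bytes, while the same call with a limit of 512 yields 512.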

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/bio/vec.rs | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/block/bio/vec.rs b/rust/kernel/block/bio/vec.rs
index 61d83a07397f..82e89a1d17c3 100644
--- a/rust/kernel/block/bio/vec.rs
+++ b/rust/kernel/block/bio/vec.rs
@@ -102,13 +102,38 @@ pub fn truncate(&mut self, new_len: u32) {
     /// Returns the number of bytes copied.
     #[inline(always)]
     pub fn copy_to_page(&mut self, dst_page: Pin<&mut SafePage>, dst_offset: usize) -> usize {
+        self.copy_to_page_limit(dst_page, dst_offset, 0)
+    }
+
+    /// Copy data of this segment into `dst_page`.
+    ///
+    /// Copies at most `limit` bytes of data from the current offset to the next page boundary,
+    /// that is, at most `PAGE_SIZE - (self.offset() % PAGE_SIZE)` bytes. Data is placed at offset
+    /// `dst_offset` in the target page. This call will advance the offset and reduce the length
+    /// of `self`.
+    ///
+    /// If `limit` is zero it is ignored.
+    ///
+    /// Returns the number of bytes copied.
+    #[inline(always)]
+    pub fn copy_to_page_limit(
+        &mut self,
+        dst_page: Pin<&mut SafePage>,
+        dst_offset: usize,
+        limit: usize,
+    ) -> usize {
         // SAFETY: We are not moving out of `dst_page`.
         let dst_page = unsafe { Pin::into_inner_unchecked(dst_page) };
         let src_offset = self.offset() % PAGE_SIZE;
         debug_assert!(dst_offset <= PAGE_SIZE);
-        let length = (PAGE_SIZE - src_offset)
+        let mut length = (PAGE_SIZE - src_offset)
             .min(self.len() as usize)
             .min(PAGE_SIZE - dst_offset);
+
+        if limit > 0 {
+            length = length.min(limit);
+        }
+
         let page_idx = self.offset() / PAGE_SIZE;
 
         // SAFETY: self.bio_vec is valid and thus bv_page must be a valid

-- 
2.51.2





* [PATCH v2 64/83] block: rnull: add fua support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (62 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 63/83] block: rust: add `Segment::copy_to_page_limit` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 65/83] block: rust: add `GenDisk::tag_set` Andreas Hindborg
                   ` (18 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add Forced Unit Access (FUA) support to rnull. When enabled via the `fua`
configfs attribute, the driver advertises FUA capability and handles FUA
requests by bypassing the volatile cache in the write path.

FUA support requires memory backing and write cache to be enabled.
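
As a rough user-space picture of "bypassing the volatile cache", consider
the sketch below. It is not the driver's write path: `ToyDisk` and its
methods are hypothetical, and dropping an older cached copy on a FUA write
is an assumption of this sketch rather than something taken from the patch.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of a device with a volatile cache and a backing store.
#[derive(Default)]
pub struct ToyDisk {
    /// Volatile cache: contents are not yet persistent.
    cache: BTreeMap<u64, Vec<u8>>,
    /// Backing store: contents are considered persistent.
    disk: BTreeMap<u64, Vec<u8>>,
}

impl ToyDisk {
    /// Write one page. With `fua` set, the data goes straight to the
    /// backing store instead of being buffered in the cache.
    pub fn write(&mut self, index: u64, data: Vec<u8>, fua: bool) {
        if fua {
            // Assumption of this sketch: drop any older cached copy so a
            // later flush cannot overwrite the newer FUA data.
            self.cache.remove(&index);
            self.disk.insert(index, data);
        } else {
            self.cache.insert(index, data);
        }
    }

    /// Returns `true` if the page currently sits in the volatile cache.
    pub fn is_cached(&self, index: u64) -> bool {
        self.cache.contains_key(&index)
    }

    /// Read a page from the backing store, if present.
    pub fn on_disk(&self, index: u64) -> Option<&[u8]> {
        self.disk.get(&index).map(|page| page.as_slice())
    }
}
```

A normal write in this model is only visible in the cache until a flush,
whereas a FUA write is visible in the backing store as soon as it returns.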

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/configfs.rs          |  5 ++++
 drivers/block/rnull/disk_storage.rs      | 22 +++++++++++++----
 drivers/block/rnull/disk_storage/page.rs |  1 +
 drivers/block/rnull/rnull.rs             | 41 ++++++++++++++++++++++++++------
 4 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 0637c1e0ab22..8195d645ecc6 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -128,6 +128,7 @@ fn make_group(
                 zone_max_active: 25,
                 zone_append_max_sectors: 26,
                 poll_queues: 27,
+                fua: 28,
             ],
         };
 
@@ -169,6 +170,7 @@ fn make_group(
                     zone_max_active: 0,
                     zone_append_max_sectors: u32::MAX,
                     poll_queues: 0,
+                    fua: true,
                 }),
             }),
             core::iter::empty(),
@@ -256,6 +258,7 @@ struct DeviceConfigInner {
     zone_max_active: u32,
     zone_append_max_sectors: u32,
     poll_queues: u32,
+    fua: bool,
 }
 
 #[vtable]
@@ -322,6 +325,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 zone_max_open: guard.zone_max_open,
                 zone_max_active: guard.zone_max_active,
                 zone_append_max_sectors: guard.zone_append_max_sectors,
+                forced_unit_access: guard.fua,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -515,3 +519,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
         }
     })
 );
+configfs_simple_bool_field!(DeviceConfig, 28, fua);
diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
index 7667830bd616..4a9bf480221f 100644
--- a/drivers/block/rnull/disk_storage.rs
+++ b/drivers/block/rnull/disk_storage.rs
@@ -92,6 +92,10 @@ pub(crate) fn flush(&self, hw_data: &Pin<&SpinLock<HwQueueContext>>) -> Result {
         let mut access = self.access(&mut tree_guard, &mut hw_data_guard, None);
         access.flush()
     }
+
+    pub(crate) fn cache_enabled(&self) -> bool {
+        self.cache_size > 0
+    }
 }
 
 pub(crate) struct DiskStorageAccess<'a, 'b, 'c> {
@@ -205,7 +209,7 @@ fn flush(&mut self) -> Result {
         Ok(())
     }
 
-    fn get_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
+    fn get_or_alloc_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
         let index = Self::to_index(sector);
 
         match self.cache_guard.entry(index) {
@@ -239,6 +243,12 @@ fn get_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
         }
     }
 
+    pub(crate) fn get_cache_page(&mut self, sector: u64) -> Option<&mut NullBlockPage> {
+        let index = Self::to_index(sector);
+
+        self.cache_guard.get_mut(index)
+    }
+
     fn get_disk_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
         let index = Self::to_index(sector);
 
@@ -256,9 +266,13 @@ fn get_disk_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
         Ok(page)
     }
 
-    pub(crate) fn get_write_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
-        let page = if self.disk_storage.cache_size > 0 {
-            self.get_cache_page(sector)?
+    pub(crate) fn get_write_page(
+        &mut self,
+        sector: u64,
+        bypass_cache: bool,
+    ) -> Result<&mut NullBlockPage> {
+        let page = if self.disk_storage.cache_size > 0 && !bypass_cache {
+            self.get_or_alloc_cache_page(sector)?
         } else {
             self.get_disk_page(sector)?
         };
diff --git a/drivers/block/rnull/disk_storage/page.rs b/drivers/block/rnull/disk_storage/page.rs
index 88dc9a2476bd..846269d31c63 100644
--- a/drivers/block/rnull/disk_storage/page.rs
+++ b/drivers/block/rnull/disk_storage/page.rs
@@ -15,6 +15,7 @@
     uapi::PAGE_SECTORS, //
 };
 
+// TODO: Use rust bitmap
 static_assert!((PAGE_SIZE >> SECTOR_SHIFT) <= 64);
 
 pub(crate) struct NullBlockPage {
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 0695cbd07f1d..c3126b923367 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -191,6 +191,10 @@
             default: 0,
             description: "Number of IOPOLL submission queues.",
         },
+        fua: bool {
+            default: true,
+            description: "Enable/disable FUA support when cache_size is used.",
+        },
     },
 }
 
@@ -267,6 +271,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     zone_max_open: module_parameters::zone_max_open.value(),
                     zone_max_active: module_parameters::zone_max_active.value(),
                     zone_append_max_sectors: module_parameters::zone_append_max_sectors.value(),
+                    forced_unit_access: module_parameters::fua.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -307,6 +312,7 @@ struct NullBlkOptions<'a> {
     zone_max_active: u32,
     #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
     zone_append_max_sectors: u32,
+    forced_unit_access: bool,
 }
 
 #[pin_data]
@@ -422,6 +428,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             zone_max_active,
             #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
             zone_append_max_sectors,
+            forced_unit_access,
         } = options;
 
         let memory_backed = tag_set.memory_backed;
@@ -439,9 +446,10 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             return Err(code::EINVAL);
         }
 
+        let s = storage.clone();
         let queue_data = Arc::try_pin_init(
             try_pin_init!(Self {
-                storage,
+                storage: s,
                 irq_mode,
                 completion_time,
                 memory_backed,
@@ -474,7 +482,9 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             .capacity_sectors(device_capacity_sectors)
             .logical_block_size(block_size_bytes)?
             .physical_block_size(block_size_bytes)?
-            .rotational(rotational);
+            .rotational(rotational)
+            .write_cache(storage.cache_enabled())
+            .forced_unit_access(forced_unit_access && storage.cache_enabled());
 
         #[cfg(CONFIG_BLK_DEV_ZONED)]
         {
@@ -553,6 +563,7 @@ fn write<'a, 'b, 'c>(
         hw_data_guard: &'b mut SpinLockGuard<'c, HwQueueContext>,
         mut sector: u64,
         mut segment: Segment<'_>,
+        bypass_cache: bool,
     ) -> Result {
         let mut sheaf: Option<XArraySheaf<'_>> = None;
 
@@ -561,7 +572,13 @@ fn write<'a, 'b, 'c>(
 
             let mut access = self.storage.access(tree_guard, hw_data_guard, sheaf);
 
-            let page = access.get_write_page(sector)?;
+            if bypass_cache {
+                if let Some(page) = access.get_cache_page(sector) {
+                    page.set_free(sector);
+                }
+            }
+
+            let page = access.get_write_page(sector, bypass_cache)?;
             page.set_occupied(sector);
 
             // CAST: Page offset always fits in 32 bits.
@@ -569,7 +586,11 @@ fn write<'a, 'b, 'c>(
                 ((sector & u64::from(block::PAGE_SECTOR_MASK)) << block::SECTOR_SHIFT) as usize;
 
             // CAST: Casting from `usize` to `u64` never overflows.
-            sector += segment.copy_to_page(page.page_mut().as_pin_mut(), page_offset) as u64
+            sector += segment.copy_to_page_limit(
+                page.page_mut().as_pin_mut(),
+                page_offset,
+                self.block_size_bytes.try_into()?,
+            ) as u64
                 >> block::SECTOR_SHIFT;
 
             sheaf = access.sheaf;
@@ -632,6 +653,8 @@ fn transfer(
         let mut hw_data_guard = hw_data.lock();
         let mut tree_guard = self.storage.lock();
 
+        let skip_cache = rq.flags().contains(mq::RequestFlag::ForcedUnitAccess);
+
         for bio in rq.bio_iter_mut() {
             let segment_iter = bio.segment_iter();
             for mut segment in segment_iter {
@@ -641,9 +664,13 @@ fn transfer(
                 let length_sectors_allowed = segment_length_sectors.min(max_remaining_sectors);
                 segment.truncate(length_sectors_allowed << SECTOR_SHIFT);
                 match command {
-                    mq::Command::Write => {
-                        self.write(&mut tree_guard, &mut hw_data_guard, sector, segment)?
-                    }
+                    mq::Command::Write => self.write(
+                        &mut tree_guard,
+                        &mut hw_data_guard,
+                        sector,
+                        segment,
+                        skip_cache,
+                    )?,
                     mq::Command::Read => {
                         self.read(&mut tree_guard, &mut hw_data_guard, sector, segment)?
                     }

-- 
2.51.2





* [PATCH v2 65/83] block: rust: add `GenDisk::tag_set`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (63 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 64/83] block: rnull: add fua support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 66/83] block: rust: add `TagSet::update_hw_queue_count` Andreas Hindborg
                   ` (17 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method to `GenDisk` to obtain a reference to the associated `TagSet`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/gen_disk.rs | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 5367ca92b7aa..a50ba7b605d7 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -257,7 +257,7 @@ pub fn build(
         // `__blk_mq_alloc_disk` above.
         let mut disk = UniqueArc::new(
             GenDisk {
-                _tagset: tagset,
+                tag_set: tagset,
                 gendisk,
                 backref: Arc::pin_init(
                     // INVARIANT: We break `GenDiskRef` invariant here, but we restore it below.
@@ -341,7 +341,7 @@ pub(crate) const fn build_vtable() -> &'static bindings::block_device_operations
 ///    `bindings::device_add_disk`.
 ///  - `self.gendisk.queue.queuedata` is initialized by a call to `ForeignOwnable::into_foreign`.
 pub struct GenDisk<T: Operations> {
-    _tagset: Arc<TagSet<T>>,
+    tag_set: Arc<TagSet<T>>,
     gendisk: *mut bindings::gendisk,
     backref: Arc<Revocable<GenDiskRef<T>>>,
 }
@@ -363,6 +363,11 @@ pub fn queue_data(&self) -> <T::QueueData as ForeignOwnable>::Borrowed<'_> {
         // SAFETY: By type invariant, self is a valid gendisk.
         unsafe { T::QueueData::borrow((*(*self.gendisk).queue).queuedata) }
     }
+
+    /// Get a reference to the `TagSet` used by this `GenDisk`.
+    pub fn tag_set(&self) -> &Arc<TagSet<T>> {
+        &self.tag_set
+    }
 }
 
 // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a

-- 
2.51.2





* [PATCH v2 66/83] block: rust: add `TagSet::update_hw_queue_count`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (64 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 65/83] block: rust: add `GenDisk::tag_set` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 67/83] block: rnull: add an option to change the number of hardware queues Andreas Hindborg
                   ` (16 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method to `TagSet` that allows changing the number of hardware queues
dynamically.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
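The "call, then read back" error detection used by `update_hw_queue_count`
can be sketched in userspace as follows. All names here (`ModelTagSet`,
`raw_update`, `OutOfMemory`, `max_hw_queues`) are hypothetical and exist only
for this note. The sketch assumes that the underlying update returns nothing
and leaves the old queue count in place when it cannot apply the new one.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a tag set. This is not the kernel `TagSet`.
struct ModelTagSet {
    nr_hw_queues: Cell<u32>,
    /// Assumed resource limit, used here to simulate an allocation failure.
    max_hw_queues: u32,
}

/// Hypothetical error type standing in for `ENOMEM`.
#[derive(Debug, PartialEq)]
struct OutOfMemory;

impl ModelTagSet {
    fn new(nr_hw_queues: u32, max_hw_queues: u32) -> Self {
        Self {
            nr_hw_queues: Cell::new(nr_hw_queues),
            max_hw_queues,
        }
    }

    fn hw_queue_count(&self) -> u32 {
        self.nr_hw_queues.get()
    }

    /// Models an update that reports nothing: on failure the old count is
    /// silently kept.
    fn raw_update(&self, nr_hw_queues: u32) {
        if nr_hw_queues >= 1 && nr_hw_queues <= self.max_hw_queues {
            self.nr_hw_queues.set(nr_hw_queues);
        }
    }

    /// Apply the update, then read the count back to find out whether the
    /// update took effect.
    fn update_hw_queue_count(&self, nr_hw_queues: u32) -> Result<(), OutOfMemory> {
        self.raw_update(nr_hw_queues);

        if self.hw_queue_count() == nr_hw_queues {
            Ok(())
        } else {
            Err(OutOfMemory)
        }
    }
}
```

One consequence of this design is that every mismatch is reported as the
same error, whatever the actual cause of the failed update was.
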
 rust/kernel/block/mq/tag_set.rs | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index 858c1b952b00..e89c76987b54 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -170,6 +170,20 @@ pub fn hw_queue_count(&self) -> u32 {
         unsafe { (*self.inner.get()).nr_hw_queues }
     }
 
+    /// Update the number of hardware queues for this tag set.
+    ///
+    /// This operation may fail if memory for tags cannot be allocated.
+    pub fn update_hw_queue_count(&self, nr_hw_queues: u32) -> Result {
+        // SAFETY: blk_mq_update_nr_hw_queues applies internal synchronization.
+        unsafe { bindings::blk_mq_update_nr_hw_queues(self.inner.get(), nr_hw_queues) }
+
+        if self.hw_queue_count() == nr_hw_queues {
+            Ok(())
+        } else {
+            Err(ENOMEM)
+        }
+    }
+
     /// Borrow the [`T::TagSetData`] associated with this tag set.
     pub fn data(&self) -> <T::TagSetData as ForeignOwnable>::Borrowed<'_> {
         // SAFETY: By type invariant, `self.inner` is valid.

-- 
2.51.2





* [PATCH v2 67/83] block: rnull: add an option to change the number of hardware queues
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (65 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 66/83] block: rust: add `TagSet::update_hw_queue_count` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 68/83] block: rust: add an abstraction for `struct rq_list` Andreas Hindborg
                   ` (15 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a feature to rnull that allows changing the number of simulated
hardware queues during device operation.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
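The store handlers in this patch follow an "update under lock, drop the
locks, apply, roll back on failure" pattern. A minimal userspace sketch of
that pattern, using `std::sync::Mutex` in place of the kernel mutex, could
look like this. `ModelQueueConfig`, `UpdateFailed`, `set_submit_queues`, and
the `apply` callback are hypothetical names for this note; `apply` stands in
for the call to `update_hw_queue_count`.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical model of the shared queue configuration.
struct ModelQueueConfig {
    submit_queues: u32,
    poll_queues: u32,
}

/// Hypothetical error type for a failed queue count update.
#[derive(Debug, PartialEq)]
struct UpdateFailed;

/// Store a new submit queue count and apply it through `apply`.
///
/// The lock is released before `apply` runs, because `apply` may call back
/// into code that takes the same lock (in the patch, `map_queues` locks the
/// queue configuration).
fn set_submit_queues(
    config: &Arc<Mutex<ModelQueueConfig>>,
    value: u32,
    apply: impl Fn(u32) -> Result<(), UpdateFailed>,
) -> Result<(), UpdateFailed> {
    let (old_submit_queues, total_queue_count) = {
        let mut guard = config.lock().unwrap();
        let old = guard.submit_queues;
        guard.submit_queues = value;
        (old, guard.submit_queues + guard.poll_queues)
    }; // The guard is dropped here.

    if let Err(e) = apply(total_queue_count) {
        // Roll back so the stored value matches the actual queue count.
        config.lock().unwrap().submit_queues = old_submit_queues;
        return Err(e);
    }

    Ok(())
}
```

Because the new value is published before `apply` runs, a callback that
reads the configuration sees the new total. The cost is a short window in
which another writer could observe or overwrite the not yet applied value.
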
 drivers/block/rnull/configfs.rs | 117 ++++++++++++++++++++++++++--------------
 drivers/block/rnull/rnull.rs    |  46 ++++++++++------
 2 files changed, 108 insertions(+), 55 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 8195d645ecc6..d9246b9150f4 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -148,7 +148,13 @@ fn make_group(
                     completion_time: time::Delta::ZERO,
                     name: name.try_into()?,
                     memory_backed: false,
-                    submit_queues: 1,
+                    queue_config: Arc::pin_init(
+                        new_mutex!(QueueConfig {
+                            submit_queues: 1,
+                            poll_queues: 0
+                        }),
+                        GFP_KERNEL
+                    )?,
                     home_node: bindings::NUMA_NO_NODE,
                     discard: false,
                     no_sched: false,
@@ -169,7 +175,6 @@ fn make_group(
                     zone_max_open: 0,
                     zone_max_active: 0,
                     zone_append_max_sectors: u32::MAX,
-                    poll_queues: 0,
                     fua: true,
                 }),
             }),
@@ -236,7 +241,7 @@ struct DeviceConfigInner {
     completion_time: time::Delta,
     disk: Option<Arc<GenDisk<NullBlkDevice>>>,
     memory_backed: bool,
-    submit_queues: u32,
+    queue_config: Arc<Mutex<QueueConfig>>,
     home_node: i32,
     discard: bool,
     no_sched: bool,
@@ -257,7 +262,6 @@ struct DeviceConfigInner {
     zone_max_open: u32,
     zone_max_active: u32,
     zone_append_max_sectors: u32,
-    poll_queues: u32,
     fua: bool,
 }
 
@@ -310,9 +314,8 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 bandwidth_limit: u64::from(guard.mbps) * 2u64.pow(20),
                 shared_tag_set: guard.shared_tags.then(|| guard.shared_tag_set.clone()),
                 tag_set: crate::TagSetOptions {
-                    submit_queues: guard.submit_queues,
-                    poll_queues: guard.poll_queues,
                     home_node: guard.home_node,
+                    queue_config: guard.queue_config.clone(),
                     blocking: guard.blocking,
                     memory_backed: guard.memory_backed,
                     no_sched: guard.no_sched,
@@ -337,9 +340,17 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     }
 }
 
-configfs_simple_field!(DeviceConfig, 1,
-                       block_size, u32,
-                       check GenDiskBuilder::<NullBlkDevice>::validate_block_size
+pub(crate) struct QueueConfig {
+    pub(crate) submit_queues: u32,
+    pub(crate) poll_queues: u32,
+}
+
+configfs_simple_field!(
+    DeviceConfig,
+    1,
+    block_size,
+    u32,
+    check GenDiskBuilder::<NullBlkDevice>::validate_block_size
 );
 configfs_simple_bool_field!(DeviceConfig, 2, rotational);
 configfs_simple_field!(DeviceConfig, 3, capacity_mib, u64);
@@ -363,38 +374,44 @@ fn from_str(s: &str) -> Result<Self> {
 
 configfs_simple_bool_field!(DeviceConfig, 6, memory_backed);
 
-#[vtable]
-impl configfs::AttributeOperations<7> for DeviceConfig {
-    type Data = DeviceConfig;
-
-    fn show(this: &DeviceConfig, page: &mut [u8; PAGE_SIZE]) -> Result<usize> {
-        let mut writer = kernel::str::Formatter::new(page);
-        writer.write_fmt(fmt!("{}\n", this.data.lock().submit_queues))?;
-        Ok(writer.bytes_written())
-    }
+configfs_attribute! {
+    DeviceConfig,
+    7,
+    show: |this, page| show_field(this.data.lock().queue_config.lock().submit_queues, page),
+    store: |this, page| {
+        let config_guard = this.data.lock();
+        let mut queue_config = config_guard.queue_config.lock();
 
-    fn store(this: &DeviceConfig, page: &[u8]) -> Result {
-        if this.data.lock().powered {
-            return Err(EBUSY);
+        let text = core::str::from_utf8(page)?.trim();
+        let value = text.parse().map_err(|_| EINVAL)?;
+        if value > kernel::cpu::num_possible_cpus() {
+            return Err(kernel::error::code::EINVAL);
         }
 
-        let text = core::str::from_utf8(page)?.trim();
-        let value = text
-            .parse::<u32>()
-            .map_err(|_| kernel::error::code::EINVAL)?;
+        let old_submit_queues = queue_config.submit_queues;
+        queue_config.submit_queues = value;
+        let total_queue_count = queue_config.submit_queues + queue_config.poll_queues;
+
+        let disk = config_guard.disk.clone();
+
+        drop(queue_config);
+        drop(config_guard);
 
-        if value == 0 || value > kernel::cpu::num_possible_cpus() {
-            return Err(kernel::error::code::EINVAL);
+        if let Some(disk) = &disk {
+            if let Err(e) = disk.tag_set().update_hw_queue_count(total_queue_count) {
+                this.data.lock().queue_config.lock().submit_queues = old_submit_queues;
+                return Err(e);
+            }
         }
 
-        this.data.lock().submit_queues = value;
         Ok(())
-    }
+    },
 }
 
 configfs_attribute!(DeviceConfig, 8,
     show: |this, page| show_field(
-        this.data.lock().submit_queues == kernel::numa::num_online_nodes(), page
+        this.data.lock().queue_config.lock().submit_queues == kernel::numa::num_online_nodes(),
+        page
     ),
     store: |this, page| store_with_power_check(this, page, |data, page| {
         let value = core::str::from_utf8(page)?
@@ -404,7 +421,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
             != 0;
 
         if value {
-            data.submit_queues = kernel::numa::num_online_nodes();
+            data.queue_config.lock().submit_queues = kernel::numa::num_online_nodes();
         }
         Ok(())
     })
@@ -506,17 +523,37 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_field!(DeviceConfig, 24, zone_max_open, u32);
 configfs_simple_field!(DeviceConfig, 25, zone_max_active, u32);
 configfs_simple_field!(DeviceConfig, 26, zone_append_max_sectors, u32);
-configfs_simple_field!(
+configfs_attribute! {
     DeviceConfig,
     27,
-    poll_queues,
-    u32,
-    check(|value| {
+    show: |this, page| show_field(this.data.lock().queue_config.lock().poll_queues, page),
+    store: |this, page| {
+        let config_guard = this.data.lock();
+        let mut queue_config = config_guard.queue_config.lock();
+
+        let text = core::str::from_utf8(page)?.trim();
+        let value = text.parse().map_err(|_| EINVAL)?;
         if value > kernel::cpu::num_possible_cpus() {
-            Err(kernel::error::code::EINVAL)
-        } else {
-            Ok(())
+            return Err(kernel::error::code::EINVAL);
         }
-    })
-);
+
+        let old_poll_queues = queue_config.poll_queues;
+        queue_config.poll_queues = value;
+        let total_queue_count = queue_config.submit_queues + queue_config.poll_queues;
+
+        let disk = config_guard.disk.clone();
+
+        drop(queue_config);
+        drop(config_guard);
+
+        if let Some(disk) = &disk {
+            if let Err(e) = disk.tag_set().update_hw_queue_count(total_queue_count) {
+                this.data.lock().queue_config.lock().poll_queues = old_poll_queues;
+                return Err(e);
+            }
+        }
+
+        Ok(())
+    },
+}
 configfs_simple_bool_field!(DeviceConfig, 28, fua);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index c3126b923367..6653db5c069b 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -10,7 +10,10 @@
 #[cfg(CONFIG_BLK_DEV_ZONED)]
 mod zoned;
 
-use configfs::IRQMode;
+use configfs::{
+    IRQMode,
+    QueueConfig, //
+};
 use disk_storage::{
     DiskStorage,
     NullBlockPage,
@@ -224,9 +227,14 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             let hw_queue_depth = module_parameters::hw_queue_depth.value();
 
             let shared_tag_set = NullBlkDevice::build_tag_set(TagSetOptions {
-                submit_queues,
-                poll_queues,
                 home_node,
+                queue_config: Arc::pin_init(
+                    new_mutex!(QueueConfig {
+                        submit_queues,
+                        poll_queues,
+                    }),
+                    GFP_KERNEL,
+                )?,
                 blocking,
                 memory_backed,
                 no_sched,
@@ -256,9 +264,14 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         .value()
                         .then(|| shared_tag_set.clone()),
                     tag_set: TagSetOptions {
-                        submit_queues,
-                        poll_queues,
                         home_node,
+                        queue_config: Arc::pin_init(
+                            new_mutex!(QueueConfig {
+                                submit_queues,
+                                poll_queues,
+                            }),
+                            GFP_KERNEL,
+                        )?,
                         blocking,
                         memory_backed,
                         no_sched,
@@ -338,9 +351,8 @@ struct NullBlkDevice {
 }
 
 struct TagSetOptions {
-    submit_queues: u32,
-    poll_queues: u32,
     home_node: i32,
+    queue_config: Arc<Mutex<QueueConfig>>,
     blocking: bool,
     memory_backed: bool,
     no_sched: bool,
@@ -352,9 +364,8 @@ impl NullBlkDevice {
 
     fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
         let TagSetOptions {
-            submit_queues,
-            poll_queues,
             home_node,
+            queue_config,
             blocking,
             memory_backed,
             no_sched,
@@ -379,14 +390,18 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
             flags |= mq::tag_set::Flag::NoDefaultScheduler;
         }
 
+        let queue_config_guard = queue_config.lock();
+        let submit_queues = queue_config_guard.submit_queues;
+        let poll_queues = queue_config_guard.poll_queues;
+        drop(queue_config_guard);
+
         Arc::pin_init(
             TagSet::new(
                 submit_queues + poll_queues,
                 KBox::new(
                     NullBlkTagsetData {
                         queue_depth: hw_queue_depth,
-                        submit_queue_count: submit_queues,
-                        poll_queue_count: poll_queues,
+                        queue_config,
                     },
                     GFP_KERNEL,
                 )?,
@@ -823,8 +838,7 @@ impl HasHrTimer<Self> for Pdu {
 
 struct NullBlkTagsetData {
     queue_depth: u32,
-    submit_queue_count: u32,
-    poll_queue_count: u32,
+    queue_config: Arc<Mutex<QueueConfig>>,
 }
 
 #[vtable]
@@ -970,8 +984,10 @@ fn report_zones(
     }
 
     fn map_queues(tag_set: Pin<&mut TagSet<Self>>) {
-        let mut submit_queue_count = tag_set.data().submit_queue_count;
-        let mut poll_queue_count = tag_set.data().poll_queue_count;
+        let queue_config = tag_set.data().queue_config.lock();
+        let mut submit_queue_count = queue_config.submit_queues;
+        let mut poll_queue_count = queue_config.poll_queues;
+        drop(queue_config);
 
         if tag_set.hw_queue_count() != submit_queue_count + poll_queue_count {
             pr_warn!(

-- 
2.51.2





* [PATCH v2 68/83] block: rust: add an abstraction for `struct rq_list`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (66 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 67/83] block: rnull: add an option to change the number of hardware queues Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 69/83] block: rust: add `queue_rqs` vtable hook Andreas Hindborg
                   ` (14 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add the `RequestList` type as a safe wrapper around the C `struct
rq_list`. This type provides methods to iterate over and manipulate
lists of block requests, which is needed for implementing the
`queue_rqs` callback.

The abstraction includes methods for popping requests from the list,
pushing requests onto its tail, checking if the list is empty, and
peeking at the head request.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
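The draining iterator on `&mut RequestList` can be illustrated with a small
userspace model. This is a sketch only: `ModelRequestList` is a hypothetical
type for this note, a `VecDeque<u32>` stands in for the C `struct rq_list`,
and plain integers stand in for owned requests.

```rust
use std::collections::VecDeque;

/// Hypothetical model of a request list. This is not the kernel type.
struct ModelRequestList {
    inner: VecDeque<u32>,
}

impl ModelRequestList {
    fn new() -> Self {
        Self {
            inner: VecDeque::new(),
        }
    }

    /// Check if the list is empty.
    fn empty(&self) -> bool {
        self.inner.is_empty()
    }

    /// Pop a request from the head of the list.
    fn pop(&mut self) -> Option<u32> {
        self.inner.pop_front()
    }

    /// Push a request on the tail of the list.
    fn push_tail(&mut self, rq: u32) {
        self.inner.push_back(rq);
    }

    /// Peek at the head of the list without removing it.
    fn peek(&self) -> Option<&u32> {
        self.inner.front()
    }
}

/// Iterating over a mutable reference drains the list: each call to `next`
/// removes the head and hands ownership to the caller.
impl Iterator for &mut ModelRequestList {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        self.pop()
    }
}
```

With this shape, `for rq in &mut list { ... }` consumes requests one at a
time, and anything left in the list after an early `break` remains available
to later `pop` calls.
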
 rust/helpers/blk.c                   |  26 ++++++++
 rust/kernel/block/mq.rs              |   2 +
 rust/kernel/block/mq/request_list.rs | 119 +++++++++++++++++++++++++++++++++++
 3 files changed, 147 insertions(+)

diff --git a/rust/helpers/blk.c b/rust/helpers/blk.c
index 500e3c6fd951..422289d617ae 100644
--- a/rust/helpers/blk.c
+++ b/rust/helpers/blk.c
@@ -27,3 +27,29 @@ bool rust_helper_blk_mq_add_to_batch(struct request *req,
 {
 	return blk_mq_add_to_batch(req, iob, is_error, complete);
 }
+
+__rust_helper struct request *rust_helper_rq_list_pop(struct rq_list *rl)
+{
+	return rq_list_pop(rl);
+}
+
+__rust_helper int rust_helper_rq_list_empty(const struct rq_list *rl)
+{
+	return rq_list_empty(rl);
+}
+
+__rust_helper void rust_helper_rq_list_add_tail(struct rq_list *rl,
+						struct request *rq)
+{
+	rq_list_add_tail(rl, rq);
+}
+
+__rust_helper void rust_helper_rq_list_init(struct rq_list *rl)
+{
+	rq_list_init(rl);
+}
+
+__rust_helper struct request *rust_helper_rq_list_peek(struct rq_list *rl)
+{
+	return rq_list_peek(rl);
+}
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 7c346be843e1..e8f0d03f2ff7 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -129,6 +129,7 @@
 pub mod gen_disk;
 mod operations;
 mod request;
+mod request_list;
 mod request_queue;
 pub mod tag_set;
 
@@ -148,6 +149,7 @@
     Request,
     RequestTimerHandle, //
 };
+pub use request_list::RequestList;
 pub use request_queue::RequestQueue;
 pub use tag_set::{
     QueueType,
diff --git a/rust/kernel/block/mq/request_list.rs b/rust/kernel/block/mq/request_list.rs
new file mode 100644
index 000000000000..82e6005126f7
--- /dev/null
+++ b/rust/kernel/block/mq/request_list.rs
@@ -0,0 +1,119 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::marker::PhantomData;
+
+use crate::{
+    owned::Owned,
+    types::Opaque, //
+};
+
+use super::{
+    IdleRequest,
+    Operations, //
+};
+
+/// A list of [`Request`].
+///
+/// # Invariants
+///
+/// - `self.inner` is always a valid list, meaning the `next` and `prev`
+///   pointers point to valid requests, or are both null.
+/// - All requests in the list are valid for use as `IdleRequest<T>`.
+#[repr(transparent)]
+pub struct RequestList<T: Operations> {
+    inner: Opaque<bindings::rq_list>,
+    _p: PhantomData<T>,
+}
+
+impl<T: Operations> RequestList<T> {
+    /// Create a new [`RequestList`].
+    pub fn new() -> Self {
+        let this = Self {
+            inner: Opaque::zeroed(),
+            _p: PhantomData,
+        };
+
+        // NOTE: We are actually good to go, but we call the C initializer for forward
+        // compatibility.
+        // SAFETY: `this.inner` is a valid allocation for use as `bindings::rq_list`.
+        unsafe { bindings::rq_list_init(this.inner.get()) }
+
+        // INVARIANT: `self.inner` was initialized above and is empty.
+        this
+    }
+
+    /// Create a mutable reference to a [`RequestList`] from a raw pointer.
+    ///
+    /// # Safety
+    /// - The list pointed to by `ptr` must satisfy the invariants of `Self`.
+    /// - The list pointed to by `ptr` must remain valid for use as a mutable reference for the
+    ///   duration of `'a`.
+    pub unsafe fn from_raw<'a>(ptr: *mut bindings::rq_list) -> &'a mut Self {
+        // SAFETY:
+        // - RequestList is transparent.
+        // - By function safety requirements, `ptr` is valid for use as a mutable reference.
+        unsafe { &mut (*ptr.cast()) }
+    }
+
+    /// Check if the list is empty.
+    pub fn empty(&self) -> bool {
+        // SAFETY: By type invariant, self.inner is valid.
+        let ret = unsafe { bindings::rq_list_empty(self.inner.get()) };
+        ret != 0
+    }
+
+    /// Pop a request from the list.
+    ///
+    /// Returns [`None`] if the list is empty.
+    pub fn pop(&mut self) -> Option<Owned<IdleRequest<T>>> {
+        // SAFETY: By type invariant `self.inner` is a valid list.
+        let ptr = unsafe { bindings::rq_list_pop(self.inner.get()) };
+
+        if !ptr.is_null() {
+            // SAFETY: If `rq_list_pop` returns a non-null pointer, it points to a valid request. By
+            // type invariant all requests in this list are valid for use as `IdleRequest`.
+            Some(unsafe { IdleRequest::from_raw(ptr) })
+        } else {
+            None
+        }
+    }
+
+    /// Push a request on the tail of the list.
+    pub fn push_tail(&mut self, rq: Owned<IdleRequest<T>>) {
+        let ptr = rq.as_raw();
+        core::mem::forget(rq);
+        // INVARIANT: rq is an `IdleRequest<T>`.
+        // SAFETY: By type invariant, `self.inner` is a valid list.
+        unsafe { bindings::rq_list_add_tail(self.inner.get(), ptr) };
+    }
+
+    /// Peek at the head of the list.
+    ///
+    /// Returns a null pointer if the list is empty.
+    pub fn peek_raw(&self) -> *mut bindings::request {
+        // SAFETY: By type invariant, `self.inner` is a valid list.
+        unsafe { bindings::rq_list_peek(self.inner.get()) }
+    }
+}
+
+impl<T: Operations> Default for RequestList<T> {
+    fn default() -> Self {
+        Self::new()
+    }
+}
+
+impl<T: Operations> Drop for RequestList<T> {
+    fn drop(&mut self) {
+        while let Some(rq) = self.pop() {
+            drop(rq)
+        }
+    }
+}
+
+impl<T: Operations> Iterator for &mut RequestList<T> {
+    type Item = Owned<IdleRequest<T>>;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        self.pop()
+    }
+}

-- 
2.51.2





* [PATCH v2 69/83] block: rust: add `queue_rqs` vtable hook
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (67 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 68/83] block: rust: add an abstraction for `struct rq_list` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 70/83] block: rnull: support queue_rqs Andreas Hindborg
                   ` (13 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add support for the `queue_rqs` callback to the Rust block layer
bindings. This callback allows drivers to receive multiple requests in
a single call, enabling batch processing optimizations.

The callback receives a `RequestList` containing the requests to be
processed. Drivers should remove successfully processed requests from
the list; any remaining requests will be requeued individually.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
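The contract described above (an optional batch hook, with anything the
driver leaves on the list falling back to the per-request path) can be
modelled in plain userspace Rust. This is only a rough sketch under stated
assumptions: `Ops`, `driver_queue_rq`, `driver_queue_rqs` and `dispatch` are
hypothetical names, a request is reduced to an integer id, and the `Option`
field stands in for the `T::HAS_QUEUE_RQS` check. It is not the kernel API.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a request: just an id.
type Request = u32;

/// Hypothetical, much simplified model of an operations table.
struct Ops {
    /// Mandatory per-request hook.
    queue_rq: fn(Request, &mut Vec<String>),
    /// Optional batch hook; `None` when the driver does not implement it.
    queue_rqs: Option<fn(&mut VecDeque<Request>, &mut Vec<String>)>,
}

/// Per-request path: record that the request was handled individually.
fn driver_queue_rq(rq: Request, log: &mut Vec<String>) {
    log.push(format!("single:{rq}"));
}

/// Batch path: this toy driver only accepts even ids in a batch and leaves
/// odd ids on the list for the caller to deal with.
fn driver_queue_rqs(list: &mut VecDeque<Request>, log: &mut Vec<String>) {
    let mut leftover = VecDeque::new();
    while let Some(rq) = list.pop_front() {
        if rq % 2 == 0 {
            log.push(format!("batch:{rq}"));
        } else {
            leftover.push_back(rq);
        }
    }
    *list = leftover;
}

/// Models the caller: offer the whole batch first (if the hook exists), then
/// queue whatever remains one request at a time.
fn dispatch(ops: &Ops, mut list: VecDeque<Request>) -> Vec<String> {
    let mut log = Vec::new();
    if let Some(batch) = ops.queue_rqs {
        batch(&mut list, &mut log);
    }
    for rq in list {
        (ops.queue_rq)(rq, &mut log);
    }
    log
}
```

With the batch hook installed, a batch of ids 1 to 4 is split between both
paths; with `queue_rqs: None`, every request takes the per-request path.
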
 rust/kernel/block/mq/operations.rs | 61 +++++++++++++++++++++++++++++++++++++-
 rust/kernel/block/mq/request.rs    | 26 ++++++++++++++++
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 1be4695ca944..505e7d2b2253 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -38,6 +38,8 @@
 };
 use pin_init::PinInit;
 
+use super::request_list::RequestList;
+
 type ForeignBorrowed<'a, T> = <T as ForeignOwnable>::Borrowed<'a>;
 
 /// Implement this trait to interface blk-mq as block devices.
@@ -94,6 +96,15 @@ fn queue_rq(
         is_poll: bool,
     ) -> BlkResult;
 
+    /// Called by the kernel to queue a list of requests with the driver.
+    fn queue_rqs(
+        _hw_data: ForeignBorrowed<'_, Self::HwData>,
+        _queue_data: ForeignBorrowed<'_, Self::QueueData>,
+        _requests: &mut RequestList<Self>,
+    ) {
+        build_error!(crate::error::VTABLE_DEFAULT_ERROR)
+    }
+
     /// Called by the kernel to indicate that queued requests should be submitted.
     fn commit_rqs(
         hw_data: ForeignBorrowed<'_, Self::HwData>,
@@ -234,6 +245,50 @@ impl<T: Operations> OperationsVTable<T> {
         }
     }
 
+    /// This function is called by the C kernel to queue a list of new requests.
+    ///
+    /// Driver is guaranteed that each request belongs to the same queue. If the
+    /// driver doesn't empty the `rqlist` completely, then the rest will be
+    /// queued individually by the block layer upon return.
+    ///
+    /// # Safety
+    ///
+    /// - `requests` must satisfy the safety requirements of `RequestList<T>`
+    /// - All requests in `requests` must belong to the same hardware context.
+    unsafe extern "C" fn queue_rqs_callback(requests: *mut bindings::rq_list) {
+        // SAFETY:
+        // - By the safety requirements of this function, `requests` is valid for use as a
+        // `RequestList`.
+        // - We have exclusive access to `requests` for the duration of this function.
+        let requests = unsafe { RequestList::from_raw(requests) };
+
+        let rq_ptr = requests.peek_raw();
+
+        if rq_ptr.is_null() {
+            return;
+        }
+
+        // SAFETY: By function safety requirements, rq_ptr is pointing to a
+        // valid request.
+        let hctx = unsafe { (*rq_ptr).mq_hctx };
+
+        // SAFETY: The safety requirements for this function ensure that `hctx`
+        // is valid and that `driver_data` was produced by a call to
+        // `into_foreign` in `Self::init_hctx_callback`.
+        let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
+
+        // SAFETY: `hctx` is valid as required by this function.
+        let queue_data = unsafe { (*(*hctx).queue).queuedata };
+
+        // SAFETY: `queue.queuedata` was created by `GenDiskBuilder::build` with
+        // a call to `ForeignOwnable::into_foreign` to create `queuedata`.
+        // `ForeignOwnable::from_foreign` is only called when the tagset is
+        // dropped, which happens after we are dropped.
+        let queue_data = unsafe { T::QueueData::borrow(queue_data) };
+
+        T::queue_rqs(hw_data, queue_data, requests);
+    }
+
     /// This function is called by the C kernel. A pointer to this function is
     /// installed in the `blk_mq_ops` vtable for the driver.
     ///
@@ -475,7 +530,11 @@ impl<T: Operations> OperationsVTable<T> {
 
     const VTABLE: bindings::blk_mq_ops = bindings::blk_mq_ops {
         queue_rq: Some(Self::queue_rq_callback),
-        queue_rqs: None,
+        queue_rqs: if T::HAS_QUEUE_RQS {
+            Some(Self::queue_rqs_callback)
+        } else {
+            None
+        },
         commit_rqs: Some(Self::commit_rqs_callback),
         get_budget: None,
         put_budget: None,
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 84f8b2c17f85..9c451583e75d 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -173,6 +173,32 @@ pub fn queue(&self) -> &RequestQueue<T> {
     pub fn as_raw(&self) -> *mut bindings::request {
         self.0.get()
     }
+
+    // Return a valid hctx pointer.
+    fn hctx_raw(&self) -> *mut bindings::blk_mq_hw_ctx {
+        // SAFETY: The request is guaranteed to be associated with a hardware
+        // context while we have access to it.
+        unsafe { (*self.0.get()).mq_hctx }
+    }
+
+    /// Get a reference to the [`T::HwData`] for the hardware context that this
+    /// request is associated with.
+    pub fn hw_data(&self) -> <T::HwData as ForeignOwnable>::Borrowed<'_> {
+        let hctx = self.hctx_raw();
+
+        // SAFETY: `hctx` is valid and `driver_data` was produced by a call to
+        // `into_foreign` in `Operations::init_hctx_callback`.
+        unsafe { T::HwData::borrow((*hctx).driver_data) }
+    }
+
+    pub fn is_poll(&self) -> bool {
+        let hctx = self.hctx_raw();
+
+        u32::from(
+            // SAFETY: `hctx_raw` returns a valid pointer.
+            unsafe { (*hctx).type_ },
+        ) == bindings::hctx_type_HCTX_TYPE_POLL
+    }
 }
 
 /// A wrapper around a blk-mq [`struct request`]. This represents an IO request.

-- 
2.51.2





* [PATCH v2 70/83] block: rnull: support queue_rqs
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (68 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 69/83] block: rust: add `queue_rqs` vtable hook Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 71/83] block: rust: remove the `is_poll` parameter from `queue_rq` Andreas Hindborg
                   ` (12 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Implement the `queue_rqs` callback for rnull, allowing the block layer
to submit multiple requests in a single call. This improves performance
by reducing per-request overhead and enabling batch processing.

The implementation processes requests from the list one at a time,
removing successfully processed requests from the list.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
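The requeue pattern described above can be sketched in userspace Rust: the
error type hands ownership of the rejected request back to the caller, which
collects rejects on a second list and swaps it into place. Everything here
is an assumption made for illustration: `Request`, `Device` and the simple
byte budget are hypothetical, and the budget check is simplified compared to
the `fetch_add` based accounting in the patch.

```rust
use std::collections::VecDeque;
use std::mem;

/// Hypothetical request: an id and a payload size.
#[derive(Debug, PartialEq)]
struct Request {
    id: u32,
    bytes: u64,
}

/// Error that returns the unprocessed request to the caller.
struct QueueRequestError {
    request: Request,
}

/// Hypothetical device with a byte budget (0 means unlimited).
struct Device {
    limit: u64,
    used: u64,
    completed: Vec<u32>,
}

impl Device {
    /// Process one request, or give it back if it would exceed the budget.
    fn queue_rq_internal(&mut self, rq: Request) -> Result<(), QueueRequestError> {
        if self.limit != 0 && self.used + rq.bytes > self.limit {
            return Err(QueueRequestError { request: rq });
        }
        self.used += rq.bytes;
        self.completed.push(rq.id);
        Ok(())
    }

    /// Process a batch; on return, `requests` holds only the rejected
    /// requests, in their original order.
    fn queue_rqs(&mut self, requests: &mut VecDeque<Request>) {
        let mut requeue = VecDeque::new();
        while let Some(request) = requests.pop_front() {
            if let Err(QueueRequestError { request }) = self.queue_rq_internal(request) {
                requeue.push_back(request);
            }
        }
        // The old list is empty at this point; replace it with the rejects.
        drop(mem::replace(requests, requeue));
    }
}
```

Carrying the request inside the error keeps ownership explicit: a rejected
request is never dropped by accident, and the caller decides what to do
with it.
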
 drivers/block/rnull/disk_storage.rs |  36 ++++----
 drivers/block/rnull/rnull.rs        | 180 +++++++++++++++++++++++-------------
 2 files changed, 133 insertions(+), 83 deletions(-)

diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
index 4a9bf480221f..6797b7996da3 100644
--- a/drivers/block/rnull/disk_storage.rs
+++ b/drivers/block/rnull/disk_storage.rs
@@ -86,7 +86,7 @@ pub(crate) fn discard(
         }
     }
 
-    pub(crate) fn flush(&self, hw_data: &Pin<&SpinLock<HwQueueContext>>) -> Result {
+    pub(crate) fn flush(&self, hw_data: &Pin<&SpinLock<HwQueueContext>>) {
         let mut tree_guard = self.lock();
         let mut hw_data_guard = hw_data.lock();
         let mut access = self.access(&mut tree_guard, &mut hw_data_guard, None);
@@ -131,7 +131,7 @@ fn to_sector(index: usize) -> u64 {
         (index << block::PAGE_SECTORS_SHIFT) as u64
     }
 
-    fn extract_cache_page(&mut self) -> Result<Option<KBox<NullBlockPage>>> {
+    fn extract_cache_page(&mut self) -> Option<KBox<NullBlockPage>> {
         Self::extract_cache_page_inner(
             &mut self.cache_guard,
             &mut self.disk_guard,
@@ -147,16 +147,10 @@ fn extract_cache_page_inner<'g>(
         disk_storage: &DiskStorage,
         hw_data: &mut HwQueueContext,
         sheaf: Option<&mut XArraySheaf<'_>>,
-    ) -> Result<Option<KBox<NullBlockPage>>> {
+    ) -> Option<KBox<NullBlockPage>> {
         let cache_entry = cache_guard.find_next_entry_circular(
             disk_storage.next_flush_sector.load(ordering::Relaxed) as usize,
-        );
-
-        let cache_entry = if let Some(entry) = cache_entry {
-            entry
-        } else {
-            return Ok(None);
-        };
+        )?;
 
         let index = cache_entry.index();
 
@@ -183,11 +177,14 @@ fn extract_cache_page_inner<'g>(
                     let mut src = cache_entry;
                     let mut offset = 0;
                     for _ in 0..PAGE_SECTORS {
-                        src.page_mut().as_pin_mut().copy_to_page(
-                            disk_entry.page_mut().as_pin_mut(),
-                            offset,
-                            block::SECTOR_SIZE as usize,
-                        )?;
+                        src.page_mut()
+                            .as_pin_mut()
+                            .copy_to_page(
+                                disk_entry.page_mut().as_pin_mut(),
+                                offset,
+                                block::SECTOR_SIZE as usize,
+                            )
+                            .expect("Write to succeed");
                         offset += block::SECTOR_SIZE as usize;
                     }
                     src.remove()
@@ -197,16 +194,15 @@ fn extract_cache_page_inner<'g>(
             }
         };
 
-        Ok(Some(page))
+        Some(page)
     }
 
-    fn flush(&mut self) -> Result {
+    fn flush(&mut self) {
         if self.disk_storage.cache_size > 0 {
-            while let Some(page) = self.extract_cache_page()? {
+            while let Some(page) = self.extract_cache_page() {
                 drop(page);
             }
         }
-        Ok(())
     }
 
     fn get_or_alloc_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage> {
@@ -230,7 +226,7 @@ fn get_or_alloc_cache_page(&mut self, sector: u64) -> Result<&mut NullBlockPage>
                         self.disk_storage,
                         self.hw_data_guard,
                         self.sheaf.as_mut(),
-                    )?
+                    )
                     .expect("Expected to find a page in the cache")
                 };
                 let xarray::Entry::Vacant(vacant_entry) = cache_guard.entry(index) else {
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 6653db5c069b..32af69bbf8f0 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -28,7 +28,7 @@
             BadBlocks, //
         },
         bio::Segment,
-        error::BlkResult,
+        error::{BlkError, BlkResult},
         mq::{
             self,
             gen_disk::{
@@ -36,8 +36,10 @@
                 GenDisk,
                 GenDiskRef, //
             },
+            IdleRequest,
             IoCompletionBatch,
             Operations,
+            RequestList,
             TagSet, //
         },
         SECTOR_SHIFT,
@@ -773,6 +775,104 @@ fn complete_request(&self, rq: Owned<mq::Request<Self>>) {
             }
         }
     }
+
+    #[inline(always)]
+    fn queue_rq_internal(
+        hw_data: Pin<&SpinLock<HwQueueContext>>,
+        this: ArcBorrow<'_, Self>,
+        rq: Owned<mq::IdleRequest<Self>>,
+        _is_last: bool,
+    ) -> Result<(), QueueRequestError> {
+        if this.bandwidth_limit != 0 {
+            if !this.bandwidth_timer.active() {
+                drop(this.bandwidth_timer_handle.lock().take());
+                let arc: Arc<_> = this.into();
+                *this.bandwidth_timer_handle.lock() =
+                    Some(arc.start(Self::BANDWIDTH_TIMER_INTERVAL));
+            }
+
+            if this
+                .bandwidth_bytes
+                .fetch_add(u64::from(rq.bytes()), ordering::Relaxed)
+                + u64::from(rq.bytes())
+                > this.bandwidth_limit
+            {
+                rq.queue().stop_hw_queues();
+                if this.bandwidth_bytes.load(ordering::Relaxed) <= this.bandwidth_limit {
+                    rq.queue().start_stopped_hw_queues_async();
+                }
+
+                return Err(QueueRequestError { request: rq });
+            }
+        }
+
+        let mut rq = rq.start();
+
+        if rq.command() == mq::Command::Flush {
+            if this.memory_backed {
+                this.storage.flush(&hw_data);
+            }
+            this.complete_request(rq);
+
+            return Ok(());
+        }
+
+        let status = (|| -> Result {
+            #[cfg(CONFIG_BLK_DEV_ZONED)]
+            if this.zoned.enabled {
+                this.handle_zoned_command(&hw_data, &mut rq)?;
+            } else {
+                this.handle_regular_command(&hw_data, &mut rq)?;
+            }
+
+            #[cfg(not(CONFIG_BLK_DEV_ZONED))]
+            this.handle_regular_command(&hw_data, &mut rq)?;
+
+            Ok(())
+        })();
+
+        if let Err(e) = status {
+            // Do not overwrite existing error. We do not care whether this write fails.
+            let _ = rq
+                .data_ref()
+                .error
+                .cmpxchg(0, e.to_errno(), ordering::Relaxed);
+        }
+
+        if rq.is_poll() {
+            // NOTE: We lack the ability to insert `Owned<Request>` into a
+            // `kernel::list::List`, so we use a `RingBuffer` instead. The
+            // drawback of this is that we have to allocate the space for the
+            // ring buffer during driver initialization, and we have to hold the
+            // lock protecting the list until we have processed all the requests
+            // in the list. Change to a linked list when the kernel gets this
+            // ability.
+
+            // NOTE: We are processing requests during submit rather than during
+            // poll. This is different from C driver. C driver does processing
+            // during poll.
+
+            hw_data
+                .lock()
+                .poll_queue
+                .push_head(rq)
+                .expect("Buffer is sized to hold all in flight requests");
+        } else {
+            this.complete_request(rq);
+        }
+
+        Ok(())
+    }
+}
+
+struct QueueRequestError {
+    request: Owned<IdleRequest<NullBlkDevice>>,
+}
+
+impl From<QueueRequestError> for BlkError {
+    fn from(_value: QueueRequestError) -> Self {
+        kernel::block::error::code::BLK_STS_IOERR
+    }
 }
 
 impl_has_hr_timer! {
@@ -814,7 +914,7 @@ struct HwQueueContext {
 struct Pdu {
     #[pin]
     timer: HrTimer<Self>,
-    error: Atomic<u32>,
+    error: Atomic<i32>,
 }
 
 impl HrTimerCallback for Pdu {
@@ -855,76 +955,31 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
         })
     }
 
-    #[inline(always)]
     fn queue_rq(
         hw_data: Pin<&SpinLock<HwQueueContext>>,
         this: ArcBorrow<'_, Self>,
         rq: Owned<mq::IdleRequest<Self>>,
-        _is_last: bool,
-        is_poll: bool,
+        is_last: bool,
+        _is_poll: bool,
     ) -> BlkResult {
-        if this.bandwidth_limit != 0 {
-            if !this.bandwidth_timer.active() {
-                drop(this.bandwidth_timer_handle.lock().take());
-                let arc: Arc<_> = this.into();
-                *this.bandwidth_timer_handle.lock() =
-                    Some(arc.start(Self::BANDWIDTH_TIMER_INTERVAL));
-            }
+        Ok(Self::queue_rq_internal(hw_data, this, rq, is_last)?)
+    }
 
-            if this
-                .bandwidth_bytes
-                .fetch_add(u64::from(rq.bytes()), ordering::Relaxed)
-                + u64::from(rq.bytes())
-                > this.bandwidth_limit
+    fn queue_rqs(
+        hw_data: Pin<&SpinLock<HwQueueContext>>,
+        this: ArcBorrow<'_, Self>,
+        requests: &mut RequestList<Self>,
+    ) {
+        let mut requeue = RequestList::new();
+        while let Some(request) = requests.pop() {
+            if let Err(QueueRequestError { request }) =
+                Self::queue_rq_internal(hw_data, this, request, false)
             {
-                rq.queue().stop_hw_queues();
-                if this.bandwidth_bytes.load(ordering::Relaxed) <= this.bandwidth_limit {
-                    rq.queue().start_stopped_hw_queues_async();
-                }
-
-                return Err(kernel::block::error::code::BLK_STS_DEV_RESOURCE);
+                requeue.push_tail(request);
             }
         }
 
-        let mut rq = rq.start();
-
-        if rq.command() == mq::Command::Flush {
-            if this.memory_backed {
-                this.storage.flush(&hw_data)?;
-            }
-            this.complete_request(rq);
-
-            return Ok(());
-        }
-
-        #[cfg(CONFIG_BLK_DEV_ZONED)]
-        if this.zoned.enabled {
-            this.handle_zoned_command(&hw_data, &mut rq)?;
-        } else {
-            this.handle_regular_command(&hw_data, &mut rq)?;
-        }
-
-        #[cfg(not(CONFIG_BLK_DEV_ZONED))]
-        this.handle_regular_command(&hw_data, &mut rq)?;
-
-        if is_poll {
-            // NOTE: We lack the ability to insert `Owned<Request>` into a
-            // `kernel::list::List`, so we use a `RingBuffer` instead. The
-            // drawback of this is that we have to allocate the space for the
-            // ring buffer during drive initialization, and we have to hold the
-            // lock protecting the list until we have processed all the requests
-            // in the list. Change to a linked list when the kernel gets this
-            // ability.
-
-            // NOTE: We are processing requests during submit rather than during
-            // poll. This is different from C driver. C driver does processing
-            // during poll.
-
-            hw_data.lock().poll_queue.push_head(rq)?;
-        } else {
-            this.complete_request(rq);
-        }
-        Ok(())
+        drop(core::mem::replace(requests, requeue));
     }
 
     fn commit_rqs(_hw_data: Pin<&SpinLock<HwQueueContext>>, _queue_data: ArcBorrow<'_, Self>) {}
@@ -941,7 +996,6 @@ fn poll(
             let status = rq.data_ref().error.load(ordering::Relaxed);
             rq.data_ref().error.store(0, ordering::Relaxed);
 
-            // TODO: check error handling via status
             if let Err(rq) = batch.add_request(rq, status != 0) {
                 Self::end_request(rq);
             }

-- 
2.51.2





* [PATCH v2 71/83] block: rust: remove the `is_poll` parameter from `queue_rq`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (69 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 70/83] block: rnull: support queue_rqs Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 72/83] block: rust: add a debug assert for refcounts Andreas Hindborg
                   ` (11 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

The information can now be obtained from `Request::is_poll`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/rnull.rs       | 1 -
 rust/kernel/block/mq.rs            | 1 -
 rust/kernel/block/mq/operations.rs | 7 -------
 3 files changed, 9 deletions(-)

diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 32af69bbf8f0..8e17b2b17a66 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -960,7 +960,6 @@ fn queue_rq(
         this: ArcBorrow<'_, Self>,
         rq: Owned<mq::IdleRequest<Self>>,
         is_last: bool,
-        _is_poll: bool,
     ) -> BlkResult {
         Ok(Self::queue_rq_internal(hw_data, this, rq, is_last)?)
     }
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index e8f0d03f2ff7..47e1f860c6ba 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -90,7 +90,6 @@
 //!         _queue_data: (),
 //!         rq: Owned<IdleRequest<Self>>,
 //!         _is_last: bool,
-//!         is_poll: bool
 //!     ) -> BlkResult {
 //!         rq.start().end_ok();
 //!         Ok(())
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index 505e7d2b2253..d28af9a5e006 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -93,7 +93,6 @@ fn queue_rq(
         queue_data: ForeignBorrowed<'_, Self::QueueData>,
         rq: Owned<IdleRequest<Self>>,
         is_last: bool,
-        is_poll: bool,
     ) -> BlkResult;
 
     /// Called by the kernel to queue a list of requests with the driver.
@@ -214,11 +213,6 @@ impl<T: Operations> OperationsVTable<T> {
         // `into_foreign` in `Self::init_hctx_callback`.
         let hw_data = unsafe { T::HwData::borrow((*hctx).driver_data) };
 
-        let is_poll = u32::from(
-            // SAFETY: `hctx` is valid as required by this function.
-            unsafe { (*hctx).type_ },
-        ) == bindings::hctx_type_HCTX_TYPE_POLL;
-
         // SAFETY: `hctx` is valid as required by this function.
         let queue_data = unsafe { (*(*hctx).queue).queuedata };
 
@@ -235,7 +229,6 @@ impl<T: Operations> OperationsVTable<T> {
             // SAFETY: `bd` is valid as required by the safety requirement for
             // this function.
             unsafe { (*bd).last },
-            is_poll,
         );
 
         if let Err(e) = ret {

-- 
2.51.2





* [PATCH v2 72/83] block: rust: add a debug assert for refcounts
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (70 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 71/83] block: rust: remove the `is_poll` parameter from `queue_rq` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 73/83] block: rust: add `TagSet::tag_to_rq` Andreas Hindborg
                   ` (10 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a debug assertion in `ARef<Request>::dismiss` to verify that the
request refcount is at least two when an `ARef<Request>` exists. This
helps catch reference counting bugs during development.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
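The invariant being asserted can be illustrated with a plain atomic counter.
This is a minimal userspace sketch, assuming a model in which one count
belongs to the driver's ownership of the request and each shared reference
adds one more; `release_shared` is a hypothetical helper, not a function
from the patch.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Release one shared reference and return the remaining count.
///
/// In debug builds, panic if the count is below two, since under the assumed
/// model a live shared reference implies a count of at least two.
fn release_shared(refcount: &AtomicU32) -> u32 {
    debug_assert!(
        refcount.load(Ordering::Relaxed) >= 2,
        "refcount must be at least two while a shared reference exists"
    );
    refcount.fetch_sub(1, Ordering::Release) - 1
}
```

Because `debug_assert!` compiles to nothing when debug assertions are
disabled, the check costs nothing in release builds.
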
 rust/kernel/block/mq/request.rs | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 9c451583e75d..05b167dfc6c6 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -619,9 +619,20 @@ impl<T> RequestTimerHandle<T>
     pub fn dismiss(mut self) {
         let inner = core::ptr::from_mut(&mut self.inner);
 
+        debug_assert!(
+            self.inner
+                .wrapper_ref()
+                .refcount()
+                .as_atomic()
+                .load(ordering::Relaxed)
+                >= 2,
+            "Request refcount must be at least two when an ARef<Request> exists"
+        );
+
         // SAFETY: `inner` is valid for reads and writes, is properly aligned and nonnull. We have
         // exclusive access to `inner` and we do not access `inner` after this call.
         unsafe { core::ptr::drop_in_place(inner) };
+
         core::mem::forget(self);
     }
 }

-- 
2.51.2





* [PATCH v2 73/83] block: rust: add `TagSet::tag_to_rq`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (71 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 72/83] block: rust: add a debug assert for refcounts Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 74/83] block: rust: add `Request::queue_index` Andreas Hindborg
                   ` (9 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a way for block device drivers to obtain a `Request` from a tag. This
is backed by the C `blk_mq_tag_to_rq` but with added checks to ensure
memory safety.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
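The added checks boil down to a bounds check on the queue id followed by an
increment-if-nonzero compare-exchange loop on the refcount. Below is a
userspace sketch of that technique with hypothetical types (`Slot`, a
`TagSet` backed by nested `Vec`s). One deliberate difference, made so the
sketch terminates: it returns `None` when the count is zero, whereas the
patch spins until the count becomes nonzero.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical request slot that only carries a refcount.
struct Slot {
    refcount: AtomicU32,
}

/// Hypothetical tag set: one vector of slots per hardware queue.
struct TagSet {
    queues: Vec<Vec<Slot>>,
}

impl TagSet {
    /// Try to take a shared reference to the slot at (`qid`, `tag`).
    ///
    /// Returns the new refcount on success, or `None` if the indices are out
    /// of range or the slot is not currently held (refcount zero).
    fn tag_to_rq(&self, qid: usize, tag: usize) -> Option<u32> {
        let slot = self.queues.get(qid)?.get(tag)?;
        loop {
            // Acquire pairs with a release store made by whoever last
            // changed the count.
            let prev = slot.refcount.load(Ordering::Acquire);
            if prev == 0 {
                // Assumption: give up instead of spinning as the patch does.
                return None;
            }
            // Only increment if the count is still the value we observed.
            if slot
                .refcount
                .compare_exchange(prev, prev + 1, Ordering::Relaxed, Ordering::Relaxed)
                .is_ok()
            {
                return Some(prev + 1);
            }
            // Lost a race with another update; reload and retry.
        }
    }
}
```

The compare-exchange guarantees the increment is applied only to a count
that was observed to be nonzero, so a reference is never created for a slot
that nobody holds.
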
 rust/helpers/blk.c              |  6 ++++
 rust/kernel/block/mq/tag_set.rs | 66 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/rust/helpers/blk.c b/rust/helpers/blk.c
index 422289d617ae..1f3e5c661096 100644
--- a/rust/helpers/blk.c
+++ b/rust/helpers/blk.c
@@ -53,3 +53,9 @@ __rust_helper struct request *rust_helper_rq_list_peek(struct rq_list *rl)
 {
 	return rq_list_peek(rl);
 }
+
+__rust_helper struct request *
+rust_helper_blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
+{
+	return blk_mq_tag_to_rq(tags, tag);
+}
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index e89c76987b54..66b6a30a9e66 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -6,7 +6,6 @@
 
 use crate::{
     alloc::NumaNode,
-    bindings,
     block::mq::{
         operations::OperationsVTable,
         request::RequestDataWrapper,
@@ -17,7 +16,9 @@
         Result, //
     },
     prelude::*,
+    sync::atomic::ordering,
     types::{
+        ARef,
         ForeignOwnable,
         Opaque, //
     },
@@ -39,6 +40,8 @@
     Flags, //
 };
 
+use super::Request;
+
 /// A wrapper for the C `struct blk_mq_tag_set`.
 ///
 /// `struct blk_mq_tag_set` contains a `struct list_head` and so must be pinned.
@@ -193,6 +196,67 @@ pub fn data(&self) -> <T::TagSetData as ForeignOwnable>::Borrowed<'_> {
         // converted back with `from_foreign` while `&self` is live.
         unsafe { T::TagSetData::borrow(ptr) }
     }
+
+    /// Obtain a shared reference to a request.
+    ///
+    /// This method will hang if the request is not owned by the driver, or if
+    /// the driver holds an [`Owned<Request>`] reference to the request.
+    pub fn tag_to_rq(&self, qid: u32, tag: u32) -> Option<ARef<Request<T>>> {
+        if qid >= self.hw_queue_count() {
+            kernel::pr_warn_once!("Invalid queue id: {qid}\n");
+            return None;
+        }
+
+        // SAFETY: We checked that `qid` is within bounds.
+        let tags = unsafe { *(*self.inner.get()).tags.add(qid as usize) };
+
+        // SAFETY: We checked `qid` for overflow above, so `tags` is valid.
+        let rq_ptr = unsafe { bindings::blk_mq_tag_to_rq(tags, tag) };
+        if rq_ptr.is_null() {
+            None
+        } else {
+            // SAFETY: if `rq_ptr` is not null, it is a valid request pointer.
+            let refcount_ptr = unsafe {
+                RequestDataWrapper::refcount_ptr(
+                    Request::wrapper_ptr(rq_ptr.cast::<Request<T>>()).as_ptr(),
+                )
+            };
+
+            // SAFETY: The refcount was initialized in `init_request_callback` and is never
+            // referenced mutably.
+            let refcount_ref = unsafe { &*refcount_ptr };
+
+            let atomic_ref = refcount_ref.as_atomic();
+
+            // It is possible for an interrupt to arrive faster than the last
+            // change to the refcount, so retry if the refcount is not what we
+            // think it should be.
+            loop {
+                // Load acquire to sync with store release of `Owned<Request>`
+                // being destroyed (prevent mutable access overlapping shared
+                // access).
+                let prev = atomic_ref.load(ordering::Acquire);
+
+                if prev >= 1 {
+                    // Store relaxed as no other operations need to happen strictly
+                    // before or after the increment.
+                    match atomic_ref.cmpxchg(prev, prev + 1, ordering::Relaxed) {
+                        Ok(_) => break,
+                        // NOTE: We cannot use the load part of a failed cmpxchg as it is always
+                        // relaxed.
+                        Err(_) => continue,
+                    }
+                } else {
+                    // We are probably waiting to observe a refcount increment.
+                    core::hint::spin_loop();
+                    continue;
+                };
+            }
+
+            // SAFETY: We checked above that `rq_ptr` is valid for use as an `ARef`.
+            Some(unsafe { Request::aref_from_raw(rq_ptr) })
+        }
+    }
 }
 
 #[pinned_drop]

-- 
2.51.2
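
The reference acquisition loop in `TagSet::tag_to_rq` above can be
illustrated outside the kernel. The following is a minimal user-space
sketch, not the kernel code: it uses `std::sync::atomic` in place of the
kernel atomics, and the function name `try_get_shared` and the `max_spins`
bound are assumptions made only so that the example terminates (the loop in
the patch has no bound).

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Try to take one more shared reference on `refcount`.
///
/// Returns `true` after incrementing a count that was at least 1, or
/// `false` if the count was still 0 after `max_spins` polls. The bound is
/// an assumption of this sketch.
pub fn try_get_shared(refcount: &AtomicU64, max_spins: usize) -> bool {
    let mut spins = 0;
    loop {
        // Acquire load, intended to pair with a release store made by a
        // previous exclusive owner.
        let prev = refcount.load(Ordering::Acquire);

        if prev >= 1 {
            // Relaxed exchange. On failure the loop starts over, so the
            // next observed value comes from the Acquire load above.
            if refcount
                .compare_exchange(prev, prev + 1, Ordering::Relaxed, Ordering::Relaxed)
                .is_ok()
            {
                return true;
            }
        } else {
            // The count is 0: wait for someone else to raise it.
            if spins == max_spins {
                return false;
            }
            spins += 1;
            std::hint::spin_loop();
        }
    }
}
```

Re-reading the value with an explicit acquire load on every iteration
mirrors the note in the patch that the value returned by a failed exchange
is only relaxed.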





* [PATCH v2 74/83] block: rust: add `Request::queue_index`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (72 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 73/83] block: rust: add `TagSet::tag_to_rq` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 75/83] block: rust: add `Request::requeue` Andreas Hindborg
                   ` (8 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method that returns the index of the hardware queue associated with
a request.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 05b167dfc6c6..54b5202567f8 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -191,6 +191,13 @@ pub fn hw_data(&self) -> <T::HwData as ForeignOwnable>::Borrowed<'_> {
         unsafe { T::HwData::borrow((*hctx).driver_data) }
     }
 
+    /// Get the queue index for the hardware queue associated with this request.
+    pub fn queue_index(&self) -> u32 {
+        // SAFETY: The request is guaranteed to be associated with a hardware
+        // context while we have access to it.
+        unsafe { (*self.hctx_raw()).queue_num }
+    }
+
     pub fn is_poll(&self) -> bool {
         let hctx = self.hctx_raw();
 

-- 
2.51.2





* [PATCH v2 75/83] block: rust: add `Request::requeue`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (73 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 74/83] block: rust: add `Request::queue_index` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 76/83] block: rust: add `request_timeout` hook Andreas Hindborg
                   ` (7 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a method on `Request` to requeue the request with the block layer.
Drivers can use this method to send a request back to the block layer
without processing the request.
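
The ownership hand-off can be sketched in plain Rust. This is an
illustrative user-space model, not the kernel API: `OwnedRequest` and
`RequeueList` are hypothetical stand-ins for `Owned<IdleRequest>` and the
block layer requeue list, and the `kicked` flag only records that a kick
was asked for.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the block layer requeue list.
#[derive(Default)]
pub struct RequeueList {
    pub pending: VecDeque<u32>,
    pub kicked: bool,
}

/// Hypothetical stand-in for an exclusively owned request, identified by
/// its tag.
pub struct OwnedRequest {
    pub tag: u32,
}

impl OwnedRequest {
    pub fn new(tag: u32) -> Self {
        Self { tag }
    }

    /// Hand the request back without processing it. Taking `self` by value
    /// means the caller cannot use the request after this call.
    pub fn requeue(self, list: &mut RequeueList, kick_requeue_list: bool) {
        list.pending.push_back(self.tag);
        if kick_requeue_list {
            // Only recorded here; the sketch does not process the list.
            list.kicked = true;
        }
    }
}
```

Because `requeue` consumes the request, any later use of the same value is
rejected at compile time, which is the property the by-value receiver in
the patch relies on.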

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/request.rs | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 54b5202567f8..bf6d58139ab4 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -100,6 +100,18 @@ pub(crate) unsafe fn from_raw(ptr: *mut bindings::request) -> Owned<Self> {
         // SAFETY: By function safety requirements, `ptr` is valid for use as an `IdleRequest`.
         unsafe { Owned::from_raw(NonNull::<Self>::new_unchecked(ptr.cast())) }
     }
+
+    /// Requeue this request at the block layer.
+    ///
+    /// If `kick_requeue_list` is true, this method will schedule processing of
+    /// the requeue list on a workqueue.
+    pub fn requeue(self: Owned<Self>, kick_requeue_list: bool) {
+        let ptr = self.0 .0.get();
+        core::mem::forget(self);
+
+        // SAFETY: By type invariant, the wrapped request is valid.
+        unsafe { bindings::blk_mq_requeue_request(ptr, kick_requeue_list) };
+    }
 }
 
 impl<T: Operations> Ownable for IdleRequest<T> {

-- 
2.51.2





* [PATCH v2 76/83] block: rust: add `request_timeout` hook
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (74 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 75/83] block: rust: add `Request::requeue` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 77/83] block: rnull: add fault injection support Andreas Hindborg
                   ` (6 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a hook for the request timeout feature. This allows the kernel to call
into a block device driver when it decides a request has timed out. Rust
block device drivers can now implement `Operations::request_timeout` to
respond to request timeouts.
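
The shape of the status type can be sketched in user space. This is a
minimal model under stated assumptions, not the code in this patch: the
names `TimeoutStatus` and `on_timeout` are hypothetical, the discriminants
0 and 1 are placeholders for the `bindings` constants, and the conversions
use `as` and `TryFrom` rather than the `transmute` used in the patch.

```rust
use std::convert::TryFrom;

/// Hypothetical mirror of the timeout status enum. The discriminants are
/// placeholders for the values taken from `bindings` in the patch.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TimeoutStatus {
    /// The request was completed during the call.
    Completed = 0,
    /// The request is still in flight; ask to be called again later.
    RetryLater = 1,
}

impl From<TimeoutStatus> for u32 {
    fn from(value: TimeoutStatus) -> Self {
        // A fieldless `repr(u32)` enum converts with a plain cast.
        value as u32
    }
}

impl TryFrom<u32> for TimeoutStatus {
    type Error = u32;

    /// Checked conversion: unknown values are returned as the error.
    fn try_from(value: u32) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Self::Completed),
            1 => Ok(Self::RetryLater),
            other => Err(other),
        }
    }
}

/// Hypothetical driver-side decision: report completion only if the
/// handler actually completed the request.
pub fn on_timeout(completed_in_handler: bool) -> TimeoutStatus {
    if completed_in_handler {
        TimeoutStatus::Completed
    } else {
        TimeoutStatus::RetryLater
    }
}
```

A checked conversion like this trades the `unsafe` contract of `from_raw`
for an explicit error path; whether that is preferable in the kernel is a
design choice this sketch does not settle.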

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block.rs               |  1 +
 rust/kernel/block/mq.rs            |  3 +-
 rust/kernel/block/mq/operations.rs | 78 +++++++++++++++++++++++++++++++++++++-
 rust/kernel/block/mq/tag_set.rs    |  1 -
 4 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
index b3578f28871a..23795dbe08c3 100644
--- a/rust/kernel/block.rs
+++ b/rust/kernel/block.rs
@@ -42,6 +42,7 @@ macro_rules! declare_err {
         declare_err!(BLK_STS_NOTSUPP, "Operation not supported.");
         declare_err!(BLK_STS_IOERR, "Generic IO error.");
         declare_err!(BLK_STS_DEV_RESOURCE, "Device resource busy. Retry later.");
+        declare_err!(BLK_STS_TIMEOUT, "Operation timed out.");
     }
 
     /// A wrapper around a 1 byte block layer error code.
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
index 47e1f860c6ba..a306181d88ce 100644
--- a/rust/kernel/block/mq.rs
+++ b/rust/kernel/block/mq.rs
@@ -138,7 +138,8 @@
 };
 pub use operations::{
     IoCompletionBatch,
-    Operations, //
+    Operations,
+    RequestTimeoutStatus, //
 };
 pub use request::{
     Command,
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
index d28af9a5e006..2b340675f976 100644
--- a/rust/kernel/block/mq/operations.rs
+++ b/rust/kernel/block/mq/operations.rs
@@ -151,6 +151,51 @@ fn report_zones(
     fn map_queues(_tag_set: Pin<&mut TagSet<Self>>) {
         build_error!(crate::error::VTABLE_DEFAULT_ERROR)
     }
+
+    /// Called by the kernel when a request has been queued with the driver for too long.
+    ///
+    /// We identify the request by `queue_id` and `tag` as we cannot pass
+    /// `Owned<Request>` or `ARef<Request>`. The driver may hold either of these
+    /// already.
+    ///
+    /// A driver can use [`TagSet::tag_to_rq`] to try to obtain a request reference.
+    ///
+    /// A driver must return [`RequestTimeoutStatus::Completed`] if the request
+    /// was completed during the call. Otherwise
+    /// [`RequestTimeoutStatus::RetryLater`] must be returned, and the kernel
+    /// will retry the call later.
+    fn request_timeout(_tag_set: &TagSet<Self>, _queue_id: u32, _tag: u32) -> RequestTimeoutStatus {
+        build_error!(crate::error::VTABLE_DEFAULT_ERROR)
+    }
+}
+
+/// Return value for [`Operations::request_timeout`].
+#[repr(u32)]
+pub enum RequestTimeoutStatus {
+    /// The request was completed.
+    Completed = bindings::blk_eh_timer_return_BLK_EH_DONE,
+
+    /// The request is still processing, retry later.
+    RetryLater = bindings::blk_eh_timer_return_BLK_EH_RESET_TIMER,
+}
+
+impl RequestTimeoutStatus {
+    /// Create a [`RequestTimeoutStatus`] from an integer.
+    ///
+    /// # Safety
+    ///
+    /// - `value` must be one of the enum values declared for [`bindings::blk_eh_timer_return`].
+    pub unsafe fn from_raw(value: u32) -> Self {
+        // SAFETY: By function safety requirements, value is usable as `Self`.
+        unsafe { core::mem::transmute(value) }
+    }
+}
+
+impl From<RequestTimeoutStatus> for u32 {
+    fn from(value: RequestTimeoutStatus) -> Self {
+        // SAFETY: All `RequestTimeoutStatus` representations are valid as `u32`.
+        unsafe { core::mem::transmute(value) }
+    }
 }
 
 /// A vtable for blk-mq to interact with a block device driver.
@@ -521,6 +566,33 @@ impl<T: Operations> OperationsVTable<T> {
         T::map_queues(tag_set);
     }
 
+    /// This function is called by the block layer when a request has been
+    /// queued with the driver for too long.
+    ///
+    /// # Safety
+    ///
+    /// - This function may only be called by blk-mq C infrastructure.
+    /// - `rq` must point to an initialized and valid `Request`.
+    unsafe extern "C" fn request_timeout_callback(
+        rq: *mut bindings::request,
+    ) -> bindings::blk_eh_timer_return {
+        // SAFETY: `rq` is valid and initialized.
+        let hctx = unsafe { (*rq).mq_hctx };
+        // SAFETY: `rq` is valid and initialized, so `hctx` is also valid and initialized.
+        let qid = unsafe { (*hctx).queue_num };
+        // SAFETY: `rq` is valid and initialized.
+        let tag = unsafe { (*rq).tag } as u32;
+        // SAFETY: `rq` is valid and initialized, so `hctx` is also valid and initialized.
+        let queue = unsafe { (*hctx).queue };
+        // SAFETY: `rq` is valid and initialized, so is `queue`.
+        let tag_set = unsafe { (*queue).tag_set };
+        // SAFETY: As `rq` is valid, so is `tag_set`. We never create mutable references to a
+        // `TagSet` without proper locking.
+        let tag_set: &TagSet<T> = unsafe { TagSet::from_ptr(tag_set) };
+
+        T::request_timeout(tag_set, qid, tag).into()
+    }
+
     const VTABLE: bindings::blk_mq_ops = bindings::blk_mq_ops {
         queue_rq: Some(Self::queue_rq_callback),
         queue_rqs: if T::HAS_QUEUE_RQS {
@@ -533,7 +605,11 @@ impl<T: Operations> OperationsVTable<T> {
         put_budget: None,
         set_rq_budget_token: None,
         get_rq_budget_token: None,
-        timeout: None,
+        timeout: if T::HAS_REQUEST_TIMEOUT {
+            Some(Self::request_timeout_callback)
+        } else {
+            None
+        },
         poll: if T::HAS_POLL {
             Some(Self::poll_callback)
         } else {
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
index 66b6a30a9e66..6d3882c01d9d 100644
--- a/rust/kernel/block/mq/tag_set.rs
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -126,7 +126,6 @@ pub fn flags(&self) -> Flags {
     /// `ptr` must be a pointer to a valid and initialized `TagSet<T>`. There
     /// may be no other mutable references to the tag set. The pointee must be
     /// live and valid at least for the duration of the returned lifetime `'a`.
-    #[expect(dead_code)]
     pub(crate) unsafe fn from_ptr<'a>(ptr: *mut bindings::blk_mq_tag_set) -> &'a Self {
         // SAFETY: By the safety requirements of this function, `ptr` is valid
         // for use as a reference for the duration of `'a`.

-- 
2.51.2





* [PATCH v2 77/83] block: rnull: add fault injection support
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (75 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 76/83] block: rust: add `request_timeout` hook Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 78/83] block: rust: add max_sectors option to `GenDiskBuilder` Andreas Hindborg
                   ` (5 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add fault injection support to rnull using the kernel fault injection
infrastructure. When enabled via `CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION`
(which depends on `FAULT_INJECTION_CONFIGFS`), users can inject failures
into I/O requests through the standard fault injection attributes.

The fault injection points (`timeout_inject`, `requeue_inject` and
`init_hctx_fault_inject`) are exposed as configfs default groups,
allowing per-device fault injection configuration.
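
The requeue injection path alternates between two behaviours each time the
fault point fires. The following user-space sketch models that selection
only. It is not the driver code: `EveryNth` is a hypothetical deterministic
replacement for `FaultConfig::should_fail`, and `Outcome` and `Device` are
names invented for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// What happened to a queued request in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// No fault fired; the request was processed.
    Processed,
    /// Fault fired; the request was handed back through the error return.
    HandedBack,
    /// Fault fired; the request was requeued.
    Requeued,
}

/// Hypothetical fault point that fires on every `interval`-th call.
pub struct EveryNth {
    interval: u64,
    calls: AtomicU64,
}

impl EveryNth {
    pub fn new(interval: u64) -> Self {
        Self { interval, calls: AtomicU64::new(0) }
    }

    pub fn should_fail(&self) -> bool {
        let n = self.calls.fetch_add(1, Ordering::Relaxed) + 1;
        // An interval of 0 disables the fault point.
        self.interval != 0 && n % self.interval == 0
    }
}

pub struct Device {
    requeue_inject: EveryNth,
    requeue_selector: AtomicU64,
}

impl Device {
    pub fn new(interval: u64) -> Self {
        Self {
            requeue_inject: EveryNth::new(interval),
            requeue_selector: AtomicU64::new(0),
        }
    }

    pub fn queue(&self) -> Outcome {
        if self.requeue_inject.should_fail() {
            // The low bit of a running counter picks the failure mode, so
            // consecutive injected faults alternate between the two.
            if self.requeue_selector.fetch_add(1, Ordering::Relaxed) & 1 == 0 {
                return Outcome::HandedBack;
            }
            return Outcome::Requeued;
        }
        Outcome::Processed
    }
}
```

With an interval of 2, every second request hits the fault point, and the
injected faults alternate between the error return and the requeue path.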

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 drivers/block/rnull/Kconfig     |  11 ++++
 drivers/block/rnull/configfs.rs |  57 ++++++++++++++++++-
 drivers/block/rnull/rnull.rs    | 121 +++++++++++++++++++++++++++++++++++++---
 3 files changed, 180 insertions(+), 9 deletions(-)

diff --git a/drivers/block/rnull/Kconfig b/drivers/block/rnull/Kconfig
index 7bc5b376c128..1ade5d8c1799 100644
--- a/drivers/block/rnull/Kconfig
+++ b/drivers/block/rnull/Kconfig
@@ -11,3 +11,14 @@ config BLK_DEV_RUST_NULL
 	  devices that can be configured via various configuration options.
 
 	  If unsure, say N.
+
+config BLK_DEV_RUST_NULL_FAULT_INJECTION
+	bool "Support fault injection for Rust Null test block driver"
+	depends on BLK_DEV_RUST_NULL && FAULT_INJECTION_CONFIGFS
+	help
+	  Enable fault injection support for the Rust null block driver. This
+	  allows injecting errors into block I/O operations for testing error
+	  handling paths and verifying system resilience. Fault injection is
+	  configured through configfs alongside the null block device settings.
+
+	  If unsure, say N.
diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index d9246b9150f4..eaa7617e5ffa 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -48,6 +48,9 @@
 
 mod macros;
 
+#[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+use kernel::fault_injection::FaultConfig;
+
 pub(crate) fn subsystem(
     shared_tag_set: Arc<TagSet<NullBlkDevice>>,
 ) -> impl PinInit<kernel::configfs::Subsystem<Config>, Error> {
@@ -132,10 +135,44 @@ fn make_group(
             ],
         };
 
+        use kernel::configfs::CDefaultGroup;
+
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        let mut default_groups: KVec<Arc<dyn CDefaultGroup>> = KVec::new();
+
+        #[cfg(not(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION))]
+        let default_groups: KVec<Arc<dyn CDefaultGroup>> = KVec::new();
+
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        let timeout_inject = Arc::pin_init(
+            kernel::fault_injection::FaultConfig::new(c"timeout_inject"),
+            GFP_KERNEL,
+        )?;
+
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        let requeue_inject = Arc::pin_init(
+            kernel::fault_injection::FaultConfig::new(c"requeue_inject"),
+            GFP_KERNEL,
+        )?;
+
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        let init_hctx_inject = Arc::pin_init(
+            kernel::fault_injection::FaultConfig::new(c"init_hctx_fault_inject"),
+            GFP_KERNEL,
+        )?;
+
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        {
+            default_groups.push(timeout_inject.clone(), GFP_KERNEL)?;
+            default_groups.push(requeue_inject.clone(), GFP_KERNEL)?;
+            default_groups.push(init_hctx_inject.clone(), GFP_KERNEL)?;
+        }
+
         let block_size = 4096;
         Ok(configfs::Group::new(
             name.try_into()?,
             item_type,
+            // default_groups,
             // TODO: cannot coerce new_mutex!() to impl PinInit<_, Error>, so put mutex inside
             try_pin_init!(DeviceConfig {
                 data <- new_mutex!(DeviceConfigInner {
@@ -176,9 +213,15 @@ fn make_group(
                     zone_max_active: 0,
                     zone_append_max_sectors: u32::MAX,
                     fua: true,
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    timeout_inject,
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    requeue_inject,
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    init_hctx_inject,
                 }),
             }),
-            core::iter::empty(),
+            default_groups,
         ))
     }
 }
@@ -263,6 +306,12 @@ struct DeviceConfigInner {
     zone_max_active: u32,
     zone_append_max_sectors: u32,
     fua: bool,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    timeout_inject: Arc<FaultConfig>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    requeue_inject: Arc<FaultConfig>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    init_hctx_inject: Arc<FaultConfig>,
 }
 
 #[vtable]
@@ -320,6 +369,8 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                     memory_backed: guard.memory_backed,
                     no_sched: guard.no_sched,
                     hw_queue_depth: guard.hw_queue_depth,
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    init_hctx_inject: guard.init_hctx_inject.clone(),
                 },
                 zoned: guard.zoned,
                 zone_size_mib: guard.zone_size_mib,
@@ -329,6 +380,10 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 zone_max_active: guard.zone_max_active,
                 zone_append_max_sectors: guard.zone_append_max_sectors,
                 forced_unit_access: guard.fua,
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                requeue_inject: guard.requeue_inject.clone(),
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                timeout_inject: guard.timeout_inject.clone(),
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 8e17b2b17a66..f909360ec70d 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -40,6 +40,7 @@
             IoCompletionBatch,
             Operations,
             RequestList,
+            RequestTimeoutStatus,
             TagSet, //
         },
         SECTOR_SHIFT,
@@ -90,6 +91,9 @@
 };
 use util::*;
 
+#[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+use kernel::fault_injection::FaultConfig;
+
 module! {
     type: NullBlkModule,
     name: "rnull_mod",
@@ -203,6 +207,8 @@
     },
 }
 
+// TODO: Fault inject via params - requires module_params string support.
+
 #[pin_data]
 struct NullBlkModule {
     #[pin]
@@ -241,6 +247,11 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                 memory_backed,
                 no_sched,
                 hw_queue_depth,
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                init_hctx_inject: Arc::pin_init(
+                    FaultConfig::new(c"init_hctx_fault_inject"),
+                    GFP_KERNEL,
+                )?,
             })?;
 
             let mut disks = KVec::new();
@@ -278,6 +289,11 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         memory_backed,
                         no_sched,
                         hw_queue_depth,
+                        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                        init_hctx_inject: Arc::pin_init(
+                            FaultConfig::new(c"init_hctx_fault_inject"),
+                            GFP_KERNEL,
+                        )?,
                     },
                     zoned: module_parameters::zoned.value(),
                     zone_size_mib: module_parameters::zone_size.value(),
@@ -287,6 +303,10 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     zone_max_active: module_parameters::zone_max_active.value(),
                     zone_append_max_sectors: module_parameters::zone_append_max_sectors.value(),
                     forced_unit_access: module_parameters::fua.value(),
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    requeue_inject: Arc::pin_init(FaultConfig::new(c"requeue_inject"), GFP_KERNEL)?,
+                    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                    timeout_inject: Arc::pin_init(FaultConfig::new(c"timeout_inject"), GFP_KERNEL)?,
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -328,6 +348,10 @@ struct NullBlkOptions<'a> {
     #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(dead_code))]
     zone_append_max_sectors: u32,
     forced_unit_access: bool,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    requeue_inject: Arc<FaultConfig>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    timeout_inject: Arc<FaultConfig>,
 }
 
 #[pin_data]
@@ -350,6 +374,12 @@ struct NullBlkDevice {
     #[cfg(CONFIG_BLK_DEV_ZONED)]
     #[pin]
     zoned: zoned::ZoneOptions,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    requeue_inject: Arc<FaultConfig>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    requeue_selector: kernel::sync::atomic::Atomic<u64>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    timeout_inject: Arc<FaultConfig>,
 }
 
 struct TagSetOptions {
@@ -359,6 +389,8 @@ struct TagSetOptions {
     memory_backed: bool,
     no_sched: bool,
     hw_queue_depth: u32,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    init_hctx_inject: Arc<FaultConfig>,
 }
 
 impl NullBlkDevice {
@@ -372,6 +404,8 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
             memory_backed,
             no_sched,
             hw_queue_depth,
+            #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+            init_hctx_inject,
         } = options;
 
         if home_node > kernel::numa::num_online_nodes().try_into()? {
@@ -404,6 +438,8 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
                     NullBlkTagsetData {
                         queue_depth: hw_queue_depth,
                         queue_config,
+                        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                        init_hctx_inject,
                     },
                     GFP_KERNEL,
                 )?,
@@ -446,6 +482,11 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             #[cfg_attr(not(CONFIG_BLK_DEV_ZONED), allow(unused_variables))]
             zone_append_max_sectors,
             forced_unit_access,
+
+            #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+            requeue_inject,
+            #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+            timeout_inject,
         } = options;
 
         let memory_backed = tag_set.memory_backed;
@@ -491,6 +532,12 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
                     zone_max_active,
                     zone_append_max_sectors,
                 })?,
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                requeue_inject,
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                requeue_selector: Atomic::new(0),
+                #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+                timeout_inject,
             }),
             GFP_KERNEL,
         )?;
@@ -733,7 +780,9 @@ fn handle_bad_blocks(&self, rq: &mut Owned<mq::Request<Self>>, sectors: &mut u32
                 badblocks::BlockStatus::None => {}
                 badblocks::BlockStatus::Acknowledged(mut range)
                 | badblocks::BlockStatus::Unacknowledged(mut range) => {
-                    rq.data_ref().error.store(1, ordering::Relaxed);
+                    rq.data_ref()
+                        .error
+                        .store(block::error::code::BLK_STS_IOERR.into(), ordering::Relaxed);
 
                     if self.bad_blocks_once {
                         self.bad_blocks.set_good(range.clone())?;
@@ -783,6 +832,22 @@ fn queue_rq_internal(
         rq: Owned<mq::IdleRequest<Self>>,
         _is_last: bool,
     ) -> Result<(), QueueRequestError> {
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        if rq.queue_data().requeue_inject.should_fail(1) {
+            if rq
+                .queue_data()
+                .requeue_selector
+                .fetch_add(1, ordering::Relaxed)
+                & 1
+                == 0
+            {
+                return Err(QueueRequestError { request: rq });
+            } else {
+                rq.requeue(true);
+                return Ok(());
+            }
+        }
+
         if this.bandwidth_limit != 0 {
             if !this.bandwidth_timer.active() {
                 drop(this.bandwidth_timer_handle.lock().take());
@@ -808,6 +873,12 @@ fn queue_rq_internal(
 
         let mut rq = rq.start();
 
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        if rq.queue_data().timeout_inject.should_fail(1) {
+            rq.data_ref().fake_timeout.store(1, ordering::Relaxed);
+            return Ok(());
+        }
+
         if rq.command() == mq::Command::Flush {
             if this.memory_backed {
                 this.storage.flush(&hw_data);
@@ -831,12 +902,13 @@ fn queue_rq_internal(
             Ok(())
         })();
 
-        if let Err(e) = status {
-            // Do not overwrite existing error. We do not care whether this write fails.
-            let _ = rq
-                .data_ref()
-                .error
-                .cmpxchg(0, e.to_errno(), ordering::Relaxed);
+        if status.is_err() {
+            // Do not overwrite existing error.
+            let _ = rq.data_ref().error.cmpxchg(
+                0,
+                kernel::block::error::code::BLK_STS_IOERR.into(),
+                ordering::Relaxed,
+            );
         }
 
         if rq.is_poll() {
@@ -914,7 +986,8 @@ struct HwQueueContext {
 struct Pdu {
     #[pin]
     timer: HrTimer<Self>,
-    error: Atomic<i32>,
+    error: Atomic<u32>,
+    fake_timeout: Atomic<u32>,
 }
 
 impl HrTimerCallback for Pdu {
@@ -939,6 +1012,8 @@ impl HasHrTimer<Self> for Pdu {
 struct NullBlkTagsetData {
     queue_depth: u32,
     queue_config: Arc<Mutex<QueueConfig>>,
+    #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+    init_hctx_inject: Arc<FaultConfig>,
 }
 
 #[vtable]
@@ -952,6 +1027,7 @@ fn new_request_data() -> impl PinInit<Self::RequestData> {
         pin_init!(Pdu {
             timer <- HrTimer::new(),
             error: Atomic::new(0),
+            fake_timeout: Atomic::new(0),
         })
     }
 
@@ -1006,6 +1082,11 @@ fn poll(
     }
 
     fn init_hctx(tagset_data: &NullBlkTagsetData, _hctx_idx: u32) -> Result<Self::HwData> {
+        #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
+        if tagset_data.init_hctx_inject.should_fail(1) {
+            return Err(EFAULT);
+        }
+
         KBox::pin_init(
             new_spinlock!(HwQueueContext {
                 page: None,
@@ -1067,4 +1148,28 @@ fn map_queues(tag_set: Pin<&mut TagSet<Self>>) {
             })
             .unwrap()
     }
+
+    fn request_timeout(tag_set: &TagSet<Self>, qid: u32, tag: u32) -> RequestTimeoutStatus {
+        if let Some(request) = tag_set.tag_to_rq(qid, tag) {
+            pr_info!("Request timed out\n");
+            // Only fail requests that are faking timeouts. Requests that time
+            // out due to memory pressure will be completed normally.
+            if request.data_ref().fake_timeout.load(ordering::Relaxed) != 0 {
+                request.data_ref().error.store(
+                    block::error::code::BLK_STS_TIMEOUT.into(),
+                    ordering::Relaxed,
+                );
+                request.data_ref().fake_timeout.store(0, ordering::Relaxed);
+
+                if let Ok(request) = OwnableRefCounted::try_from_shared(request) {
+                    Self::end_request(request);
+                    return RequestTimeoutStatus::Completed;
+                }
+                kernel::pr_warn_once!("Timed out request could not be completed\n");
+            }
+        } else {
+            kernel::pr_warn_once!("Could not look up timed out request\n");
+        }
+        RequestTimeoutStatus::RetryLater
+    }
 }

-- 
2.51.2





* [PATCH v2 78/83] block: rust: add max_sectors option to `GenDiskBuilder`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (76 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 77/83] block: rnull: add fault injection support Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 79/83] block: rnull: allow configuration of the maximum IO size Andreas Hindborg
                   ` (4 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Allow drivers to set the maximum I/O size when building a `GenDisk`.
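
The builder option can be illustrated with a small user-space sketch. It is
a simplified model, not `GenDiskBuilder` itself: `DiskBuilder`, `Limits`
and `sectors_to_bytes` are hypothetical names, and only the one field added
by this patch is modelled.

```rust
/// Hypothetical stand-in for the queue limits filled in by `build`.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Limits {
    pub max_sectors: u32,
}

/// Hypothetical, reduced model of the builder.
#[derive(Default)]
pub struct DiskBuilder {
    max_sectors: u32,
}

impl DiskBuilder {
    /// Maximum size of a command in 512 byte sectors.
    pub fn max_sectors(mut self, sectors: u32) -> Self {
        self.max_sectors = sectors;
        self
    }

    /// Copy the configured value into the limits, as `build` does in the
    /// patch for the real limits structure.
    pub fn build(self) -> Limits {
        Limits { max_sectors: self.max_sectors }
    }
}

/// Convert a sector count to bytes, using 512 byte sectors.
pub fn sectors_to_bytes(sectors: u32) -> u64 {
    u64::from(sectors) << 9
}
```

For example, a driver that passes 256 to `max_sectors` is asking for a
limit of 256 * 512 = 131072 bytes per command.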

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
 rust/kernel/block/mq/gen_disk.rs | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index a50ba7b605d7..6d760dafade5 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -58,6 +58,7 @@ pub struct GenDiskBuilder<T> {
     zone_append_max_sectors: u32,
     write_cache: bool,
     forced_unit_access: bool,
+    max_sectors: u32,
     _p: PhantomData<T>,
 }
 
@@ -77,6 +78,7 @@ fn default() -> Self {
             zone_append_max_sectors: 0,
             write_cache: false,
             forced_unit_access: false,
+            max_sectors: 0,
             _p: PhantomData,
         }
     }
@@ -181,6 +183,12 @@ pub fn write_cache(mut self, enable: bool) -> Self {
         self
     }
 
+    /// Set the maximum size of a command, in 512-byte sectors.
+    pub fn max_sectors(mut self, sectors: u32) -> Self {
+        self.max_sectors = sectors;
+        self
+    }
+
     /// Build a new `GenDisk` and add it to the VFS.
     pub fn build(
         self,
@@ -199,6 +207,7 @@ pub fn build(
         lim.logical_block_size = self.logical_block_size;
         lim.physical_block_size = self.physical_block_size;
         lim.max_hw_discard_sectors = self.max_hw_discard_sectors;
+        lim.max_sectors = self.max_sectors;
         if self.rotational {
             lim.features = Feature::Rotational.into();
         }

-- 
2.51.2





* [PATCH v2 79/83] block: rnull: allow configuration of the maximum IO size
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (77 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 78/83] block: rust: add max_sectors option to `GenDiskBuilder` Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:08 ` [PATCH v2 80/83] block: rust: add `virt_boundary_mask` option to `GenDiskBuilder` Andreas Hindborg
                   ` (3 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a module parameter and a configfs option for controlling the maximum
size of an IO for the emulated block device.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
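Since the value is given in 512-byte sectors, the corresponding byte size
is a simple multiplication. A small sketch of that arithmetic in plain
Rust; `max_io_bytes` and `SECTOR_SIZE` are names made up for this note and
are not part of the driver.

```rust
/// Size of one sector in bytes; the option is expressed in 512-byte sectors.
pub const SECTOR_SIZE: u64 = 512;

/// Convert a `max_sectors` value into the corresponding I/O size in bytes.
/// The widening to `u64` avoids overflow for large `u32` values.
pub fn max_io_bytes(max_sectors: u32) -> u64 {
    u64::from(max_sectors) * SECTOR_SIZE
}
```

With this helper, `max_io_bytes(256)` yields 131072, i.e. 128 KiB.
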
 drivers/block/rnull/configfs.rs |  5 +++++
 drivers/block/rnull/rnull.rs    | 10 +++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index eaa7617e5ffa..5ab217e43e2b 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -132,6 +132,7 @@ fn make_group(
                 zone_append_max_sectors: 26,
                 poll_queues: 27,
                 fua: 28,
+                max_sectors: 29,
             ],
         };
 
@@ -219,6 +220,7 @@ fn make_group(
                     requeue_inject,
                     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                     init_hctx_inject,
+                    max_sectors: 0,
                 }),
             }),
             default_groups,
@@ -312,6 +314,7 @@ struct DeviceConfigInner {
     requeue_inject: Arc<FaultConfig>,
     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
     init_hctx_inject: Arc<FaultConfig>,
+    max_sectors: u32,
 }
 
 #[vtable]
@@ -384,6 +387,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 requeue_inject: guard.requeue_inject.clone(),
                 #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                 timeout_inject: guard.timeout_inject.clone(),
+                max_sectors: guard.max_sectors,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -612,3 +616,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
     },
 }
 configfs_simple_bool_field!(DeviceConfig, 28, fua);
+configfs_simple_field!(DeviceConfig, 29, max_sectors, u32);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index f909360ec70d..15b8c365b9fa 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -204,6 +204,10 @@
             default: true,
             description: "Enable/disable FUA support when cache_size is used.",
         },
+        max_sectors: u32 {
+            default: 0,
+            description: "Maximum size of a command (in 512B sectors)",
+        },
     },
 }
 
@@ -307,6 +311,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     requeue_inject: Arc::pin_init(FaultConfig::new(c"requeue_inject"), GFP_KERNEL)?,
                     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                     timeout_inject: Arc::pin_init(FaultConfig::new(c"timeout_inject"), GFP_KERNEL)?,
+                    max_sectors: module_parameters::max_sectors.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -352,6 +357,7 @@ struct NullBlkOptions<'a> {
     requeue_inject: Arc<FaultConfig>,
     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
     timeout_inject: Arc<FaultConfig>,
+    max_sectors: u32,
 }
 
 #[pin_data]
@@ -487,6 +493,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             requeue_inject,
             #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
             timeout_inject,
+            max_sectors,
         } = options;
 
         let memory_backed = tag_set.memory_backed;
@@ -548,7 +555,8 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             .physical_block_size(block_size_bytes)?
             .rotational(rotational)
             .write_cache(storage.cache_enabled())
-            .forced_unit_access(forced_unit_access && storage.cache_enabled());
+            .forced_unit_access(forced_unit_access && storage.cache_enabled())
+            .max_sectors(max_sectors);
 
         #[cfg(CONFIG_BLK_DEV_ZONED)]
         {

-- 
2.51.2





* [PATCH v2 80/83] block: rust: add `virt_boundary_mask` option to `GenDiskBuilder`
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (78 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 79/83] block: rnull: allow configuration of the maximum IO size Andreas Hindborg
@ 2026-06-09 19:08 ` Andreas Hindborg
  2026-06-09 19:09 ` [PATCH v2 81/83] block: rnull: add `virt_boundary` option Andreas Hindborg
                   ` (2 subsequent siblings)
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:08 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Allow Rust device drivers to set the `virt_boundary_mask` property for
block devices.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
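The splitting rule described in the doc comment of this patch can be
sketched as a predicate over two addresses. This is a plain Rust
illustration under the assumption that `mask` is one less than a power of
two; `needs_split` is a hypothetical helper, not a block layer function.

```rust
/// Returns true if a request must be split between two adjacent segments.
///
/// `prev_end` is the address just past the previous segment and
/// `next_start` is the address where the next segment begins. A split is
/// needed when either address has any bit of `mask` set, that is, when it
/// is not aligned to `mask + 1` bytes.
pub fn needs_split(prev_end: usize, next_start: usize, mask: usize) -> bool {
    (prev_end & mask) != 0 || (next_start & mask) != 0
}
```

With a mask of 4095, two segments that end and start on 4 KiB boundaries
can stay in one request, while an unaligned end or start forces a split. A
mask of 0 never forces a split.
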
 rust/kernel/block/mq/gen_disk.rs | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
index 6d760dafade5..38057ebc0878 100644
--- a/rust/kernel/block/mq/gen_disk.rs
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -59,6 +59,7 @@ pub struct GenDiskBuilder<T> {
     write_cache: bool,
     forced_unit_access: bool,
     max_sectors: u32,
+    virt_boundary_mask: usize,
     _p: PhantomData<T>,
 }
 
@@ -79,6 +80,7 @@ fn default() -> Self {
             write_cache: false,
             forced_unit_access: false,
             max_sectors: 0,
+            virt_boundary_mask: 0,
             _p: PhantomData,
         }
     }
@@ -189,6 +191,15 @@ pub fn max_sectors(mut self, sectors: u32) -> Self {
         self
     }
 
+    /// Set the I/O segment memory alignment mask for the block device. I/O requests to this device
+    /// will be split between segments wherever either the memory address of the end of the previous
+    /// segment or the memory address of the beginning of the current segment is not aligned to
+    /// `virt_boundary_mask + 1` bytes.
+    pub fn virt_boundary_mask(mut self, mask: usize) -> Self {
+        self.virt_boundary_mask = mask;
+        self
+    }
+
     /// Build a new `GenDisk` and add it to the VFS.
     pub fn build(
         self,
@@ -208,6 +219,7 @@ pub fn build(
         lim.physical_block_size = self.physical_block_size;
         lim.max_hw_discard_sectors = self.max_hw_discard_sectors;
         lim.max_sectors = self.max_sectors;
+        lim.virt_boundary_mask = self.virt_boundary_mask;
         if self.rotational {
             lim.features = Feature::Rotational.into();
         }

-- 
2.51.2





* [PATCH v2 81/83] block: rnull: add `virt_boundary` option
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (79 preceding siblings ...)
  2026-06-09 19:08 ` [PATCH v2 80/83] block: rust: add `virt_boundary_mask` option to `GenDiskBuilder` Andreas Hindborg
@ 2026-06-09 19:09 ` Andreas Hindborg
  2026-06-09 19:09 ` [PATCH v2 82/83] block: rnull: add `shared_tag_bitmap` config option Andreas Hindborg
  2026-06-09 19:09 ` [PATCH v2 83/83] block: rnull: add zone offline and readonly configfs files Andreas Hindborg
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:09 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a configfs attribute and a module parameter that set the virtual
boundary mask of the rnull block device to `PAGE_SIZE - 1`. This allows
testing how drivers and filesystems handle devices that require
page-aligned IO buffers.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
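The option is a boolean that maps to a page-sized mask. A short sketch of
that mapping in plain Rust; the 4096-byte page size is an assumption for
this note (the kernel's `PAGE_SIZE` depends on architecture and
configuration), and both helpers are hypothetical.

```rust
/// Page size assumed for this sketch only.
pub const ASSUMED_PAGE_SIZE: usize = 4096;

/// Map the boolean option to a mask: page alignment when enabled,
/// no constraint (mask 0) otherwise.
pub fn boundary_mask(virt_boundary: bool) -> usize {
    if virt_boundary {
        ASSUMED_PAGE_SIZE - 1
    } else {
        0
    }
}

/// An address satisfies the constraint when no masked bit is set.
pub fn is_aligned(addr: usize, mask: usize) -> bool {
    addr & mask == 0
}
```

Under that assumption, enabling the option gives a mask of 0xfff, so 8192
passes the check and 8200 does not.
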
 drivers/block/rnull/configfs.rs |  5 +++++
 drivers/block/rnull/rnull.rs    | 17 ++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 5ab217e43e2b..3e054339226c 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -133,6 +133,7 @@ fn make_group(
                 poll_queues: 27,
                 fua: 28,
                 max_sectors: 29,
+                virt_boundary: 30,
             ],
         };
 
@@ -221,6 +222,7 @@ fn make_group(
                     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                     init_hctx_inject,
                     max_sectors: 0,
+                    virt_boundary: false,
                 }),
             }),
             default_groups,
@@ -315,6 +317,7 @@ struct DeviceConfigInner {
     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
     init_hctx_inject: Arc<FaultConfig>,
     max_sectors: u32,
+    virt_boundary: bool,
 }
 
 #[vtable]
@@ -388,6 +391,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                 #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                 timeout_inject: guard.timeout_inject.clone(),
                 max_sectors: guard.max_sectors,
+                virt_boundary: guard.virt_boundary,
             })?);
             guard.powered = true;
         } else if guard.powered && !power_op {
@@ -617,3 +621,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 }
 configfs_simple_bool_field!(DeviceConfig, 28, fua);
 configfs_simple_field!(DeviceConfig, 29, max_sectors, u32);
+configfs_simple_bool_field!(DeviceConfig, 30, virt_boundary);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 15b8c365b9fa..147dc8498c3a 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -28,7 +28,10 @@
             BadBlocks, //
         },
         bio::Segment,
-        error::{BlkError, BlkResult},
+        error::{
+            BlkError,
+            BlkResult, //
+        },
         mq::{
             self,
             gen_disk::{
@@ -54,6 +57,7 @@
     memalloc_scope,
     new_mutex,
     new_spinlock,
+    page::PAGE_SIZE,
     pr_info,
     prelude::*,
     revocable::Revocable,
@@ -208,6 +212,10 @@
             default: 0,
             description: "Maximum size of a command (in 512B sectors)",
         },
+        virt_boundary: bool {
+            default: false,
+            description: "Set alignment requirement for IO buffers to be page size.",
+        },
     },
 }
 
@@ -312,6 +320,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                     timeout_inject: Arc::pin_init(FaultConfig::new(c"timeout_inject"), GFP_KERNEL)?,
                     max_sectors: module_parameters::max_sectors.value(),
+                    virt_boundary: module_parameters::virt_boundary.value(),
                 })?;
                 disks.push(disk, GFP_KERNEL)?;
             }
@@ -358,6 +367,7 @@ struct NullBlkOptions<'a> {
     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
     timeout_inject: Arc<FaultConfig>,
     max_sectors: u32,
+    virt_boundary: bool,
 }
 
 #[pin_data]
@@ -494,6 +504,7 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
             timeout_inject,
             max_sectors,
+            virt_boundary,
         } = options;
 
         let memory_backed = tag_set.memory_backed;
@@ -558,6 +569,10 @@ fn new(options: NullBlkOptions<'_>) -> Result<Arc<GenDisk<Self>>> {
             .forced_unit_access(forced_unit_access && storage.cache_enabled())
             .max_sectors(max_sectors);
 
+        if virt_boundary {
+            builder = builder.virt_boundary_mask(PAGE_SIZE - 1);
+        }
+
         #[cfg(CONFIG_BLK_DEV_ZONED)]
         {
             builder = builder

-- 
2.51.2





* [PATCH v2 82/83] block: rnull: add `shared_tag_bitmap` config option
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (80 preceding siblings ...)
  2026-06-09 19:09 ` [PATCH v2 81/83] block: rnull: add `virt_boundary` option Andreas Hindborg
@ 2026-06-09 19:09 ` Andreas Hindborg
  2026-06-09 19:09 ` [PATCH v2 83/83] block: rnull: add zone offline and readonly configfs files Andreas Hindborg
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:09 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add a configfs attribute and module parameter to enable the
`BLK_MQ_F_TAG_HCTX_SHARED` flag for the rnull tag set. When enabled,
a tag bitmap is shared across all hardware queues.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
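As a rough way to see the effect of the flag, the toy model below compares
the number of tags available with and without sharing. It is a sketch in
plain Rust that ignores reserved and scheduler tags; `max_in_flight` is a
hypothetical helper, not a blk-mq function.

```rust
/// Upper bound on requests that can hold a driver tag at once in this
/// toy model.
///
/// Without sharing, every hardware queue has its own bitmap of
/// `queue_depth` tags. With sharing, one bitmap of `queue_depth` tags
/// serves all hardware queues.
pub fn max_in_flight(hw_queues: u32, queue_depth: u32, shared_tag_bitmap: bool) -> u64 {
    if shared_tag_bitmap {
        u64::from(queue_depth)
    } else {
        u64::from(hw_queues) * u64::from(queue_depth)
    }
}
```

In this model, four hardware queues with a depth of 64 allow 256 tagged
requests without sharing and 64 with sharing.
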
 drivers/block/rnull/configfs.rs       |  5 +++++
 drivers/block/rnull/rnull.rs          | 12 ++++++++++++
 rust/kernel/block/mq/tag_set/flags.rs |  4 ++++
 3 files changed, 21 insertions(+)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 3e054339226c..1bab38c55698 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -134,6 +134,7 @@ fn make_group(
                 fua: 28,
                 max_sectors: 29,
                 virt_boundary: 30,
+                shared_tag_bitmap: 31,
             ],
         };
 
@@ -223,6 +224,7 @@ fn make_group(
                     init_hctx_inject,
                     max_sectors: 0,
                     virt_boundary: false,
+                    shared_tag_bitmap: false,
                 }),
             }),
             default_groups,
@@ -318,6 +320,7 @@ struct DeviceConfigInner {
     init_hctx_inject: Arc<FaultConfig>,
     max_sectors: u32,
     virt_boundary: bool,
+    shared_tag_bitmap: bool,
 }
 
 #[vtable]
@@ -374,6 +377,7 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
                     blocking: guard.blocking,
                     memory_backed: guard.memory_backed,
                     no_sched: guard.no_sched,
+                    shared_tag_bitmap: guard.shared_tag_bitmap,
                     hw_queue_depth: guard.hw_queue_depth,
                     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                     init_hctx_inject: guard.init_hctx_inject.clone(),
@@ -622,3 +626,4 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_bool_field!(DeviceConfig, 28, fua);
 configfs_simple_field!(DeviceConfig, 29, max_sectors, u32);
 configfs_simple_bool_field!(DeviceConfig, 30, virt_boundary);
+configfs_simple_bool_field!(DeviceConfig, 31, shared_tag_bitmap);
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 147dc8498c3a..81f9e2d03f31 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -216,6 +216,10 @@
             default: false,
             description: "Set alignment requirement for IO buffers to be page size.",
         },
+        shared_tag_bitmap: bool {
+            default: false,
+            description: "Use a shared tag bitmap for all blk-mq submission queues.",
+        },
     },
 }
 
@@ -245,6 +249,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
             let memory_backed = module_parameters::memory_backed.value();
             let no_sched = module_parameters::no_sched.value();
             let hw_queue_depth = module_parameters::hw_queue_depth.value();
+            let shared_tag_bitmap = module_parameters::shared_tag_bitmap.value();
 
             let shared_tag_set = NullBlkDevice::build_tag_set(TagSetOptions {
                 home_node,
@@ -258,6 +263,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                 blocking,
                 memory_backed,
                 no_sched,
+                shared_tag_bitmap,
                 hw_queue_depth,
                 #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                 init_hctx_inject: Arc::pin_init(
@@ -300,6 +306,7 @@ fn init(_module: &'static ThisModule) -> impl PinInit<Self, Error> {
                         blocking,
                         memory_backed,
                         no_sched,
+                        shared_tag_bitmap,
                         hw_queue_depth,
                         #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
                         init_hctx_inject: Arc::pin_init(
@@ -404,6 +411,7 @@ struct TagSetOptions {
     blocking: bool,
     memory_backed: bool,
     no_sched: bool,
+    shared_tag_bitmap: bool,
     hw_queue_depth: u32,
     #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
     init_hctx_inject: Arc<FaultConfig>,
@@ -419,6 +427,7 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
             blocking,
             memory_backed,
             no_sched,
+            shared_tag_bitmap,
             hw_queue_depth,
             #[cfg(CONFIG_BLK_DEV_RUST_NULL_FAULT_INJECTION)]
             init_hctx_inject,
@@ -441,6 +450,9 @@ fn build_tag_set(options: TagSetOptions) -> Result<Arc<TagSet<Self>>> {
         if no_sched {
             flags |= mq::tag_set::Flag::NoDefaultScheduler;
         }
+        if shared_tag_bitmap {
+            flags |= mq::tag_set::Flag::TagHctxShared;
+        }
 
         let queue_config_guard = queue_config.lock();
         let submit_queues = queue_config_guard.submit_queues;
diff --git a/rust/kernel/block/mq/tag_set/flags.rs b/rust/kernel/block/mq/tag_set/flags.rs
index 2561d7090c49..afc9d31ed998 100644
--- a/rust/kernel/block/mq/tag_set/flags.rs
+++ b/rust/kernel/block/mq/tag_set/flags.rs
@@ -21,5 +21,9 @@ pub enum Flag {
         /// Select 'none' during queue registration in case of a single hwq or shared
         /// hwqs instead of 'mq-deadline'.
         NoDefaultScheduler = bindings::BLK_MQ_F_NO_SCHED_BY_DEFAULT,
+
+        /// Use shared tag bitmap for all submission queues.
+        TagHctxShared = bindings::BLK_MQ_F_TAG_HCTX_SHARED,
+
     }
 }

-- 
2.51.2





* [PATCH v2 83/83] block: rnull: add zone offline and readonly configfs files
  2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
                   ` (81 preceding siblings ...)
  2026-06-09 19:09 ` [PATCH v2 82/83] block: rnull: add `shared_tag_bitmap` config option Andreas Hindborg
@ 2026-06-09 19:09 ` Andreas Hindborg
  82 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-09 19:09 UTC (permalink / raw)
  To: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, Liam R. Howlett,
	Boqun Feng, Lorenzo Stoakes
  Cc: Andreas Hindborg, linux-block, linux-kernel, linux-mm,
	rust-for-linux

Add configfs attributes for managing zone states in the rnull zoned
block device emulation. Writing a sector number to `zone_offline` or
`zone_readonly` sets the zone containing that sector to the offline or
read-only state, which is useful for testing how applications handle
degraded zones. Conventional zones are rejected with `EINVAL`.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
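The store path of the new attributes parses the written bytes as a sector
and looks up the zone that contains it. A minimal sketch of those two
steps in plain Rust: it returns `Option` where the driver returns kernel
error codes, it assumes a power-of-two zone size, and the free functions
are hypothetical stand-ins for the driver's methods.

```rust
/// Parse the bytes written to the attribute as a decimal sector number.
/// Returns `None` on invalid UTF-8 or a non-numeric value.
pub fn parse_sector(page: &[u8]) -> Option<u64> {
    core::str::from_utf8(page).ok()?.trim().parse().ok()
}

/// Index of the zone containing `sector`, assuming `zone_size_sectors`
/// is a power of two so the division reduces to a shift.
pub fn zone_no(sector: u64, zone_size_sectors: u64) -> usize {
    (sector >> zone_size_sectors.ilog2()) as usize
}
```

For a zone size of 524288 sectors, the input `b"524288\n"` parses to sector
524288, which falls in zone 1; sector 524287 is still in zone 0.
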
 drivers/block/rnull/configfs.rs     | 76 +++++++++++++++++++++++++++++++++
 drivers/block/rnull/disk_storage.rs | 59 +++++++++++++-------------
 drivers/block/rnull/rnull.rs        |  2 +-
 drivers/block/rnull/zoned.rs        | 83 ++++++++++++++++++++++++++-----------
 4 files changed, 164 insertions(+), 56 deletions(-)

diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index 1bab38c55698..43d01b419579 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -135,6 +135,8 @@ fn make_group(
                 max_sectors: 29,
                 virt_boundary: 30,
                 shared_tag_bitmap: 31,
+                zone_offline: 32,
+                zone_readonly: 33,
             ],
         };
 
@@ -627,3 +629,77 @@ fn store(this: &DeviceConfig, page: &[u8]) -> Result {
 configfs_simple_field!(DeviceConfig, 29, max_sectors, u32);
 configfs_simple_bool_field!(DeviceConfig, 30, virt_boundary);
 configfs_simple_bool_field!(DeviceConfig, 31, shared_tag_bitmap);
+
+#[cfg(CONFIG_BLK_DEV_ZONED)]
+fn set_zone_condition(
+    this: &DeviceConfig,
+    sector: u64,
+    cb: impl FnOnce(
+        &crate::zoned::ZoneOptions,
+        &DiskStorage,
+        &mut crate::zoned::ZoneDescriptor,
+    ) -> Result,
+) -> Result {
+    use crate::zoned::ZoneType;
+    let data_guard = this.data.lock();
+    let null_disk = data_guard.disk.as_ref().ok_or(EBUSY)?.queue_data();
+    let storage = &null_disk.storage;
+    let zone_options = &null_disk.zoned;
+    zone_options.enabled.then_some(()).ok_or(EINVAL)?;
+    let mut zone = zone_options.zone(sector)?.lock();
+
+    if zone.kind == ZoneType::Conventional {
+        return Err(EINVAL);
+    }
+
+    cb(zone_options, storage, &mut zone)
+}
+
+#[cfg(CONFIG_BLK_DEV_ZONED)]
+configfs_attribute!(
+    DeviceConfig,
+    32,
+    show: |_this, _page| Ok(0),
+    store: |this, page| {
+        let text = core::str::from_utf8(page)?.trim();
+        let sector = text.parse().map_err(|_| EINVAL)?;
+
+        set_zone_condition(this, sector, |zone_options, storage, zone| {
+            zone_options.offline_zone(storage, zone)
+        })?;
+        Ok(())
+    },
+);
+
+#[cfg(CONFIG_BLK_DEV_ZONED)]
+configfs_attribute!(
+    DeviceConfig,
+    33,
+    show: |_this, _page| Ok(0),
+    store: |this, page| {
+        let text = core::str::from_utf8(page)?.trim();
+        let sector = text.parse().map_err(|_| EINVAL)?;
+
+        set_zone_condition(this, sector, |zone_options, storage, zone| {
+            zone_options.read_only_zone(storage, zone)
+        })?;
+
+        Ok(())
+    },
+);
+
+#[cfg(not(CONFIG_BLK_DEV_ZONED))]
+configfs_attribute!(
+    DeviceConfig,
+    32,
+    show: |_this, _page| Ok(0),
+    store: |_this, _page| Err(ENOTSUPP),
+);
+
+#[cfg(not(CONFIG_BLK_DEV_ZONED))]
+configfs_attribute!(
+    DeviceConfig,
+    33,
+    show: |_this, _page| Ok(0),
+    store: |_this, _page| Err(ENOTSUPP),
+);
diff --git a/drivers/block/rnull/disk_storage.rs b/drivers/block/rnull/disk_storage.rs
index 6797b7996da3..879dd5d96e65 100644
--- a/drivers/block/rnull/disk_storage.rs
+++ b/drivers/block/rnull/disk_storage.rs
@@ -65,27 +65,45 @@ pub(crate) fn lock(&self) -> SpinLockGuard<'_, Pin<KBox<TreeContainer>>> {
         self.trees.lock()
     }
 
-    pub(crate) fn discard(
-        &self,
-        hw_data: &Pin<&SpinLock<HwQueueContext>>,
-        mut sector: u64,
-        sectors: u32,
-    ) {
-        let mut tree_guard = self.lock();
-        let mut hw_data_guard = hw_data.lock();
-
-        let mut access = self.access(&mut tree_guard, &mut hw_data_guard, None);
+    pub(crate) fn discard(&self, mut sector: u64, sectors: u32) {
+        let tree_guard = self.lock();
+        let mut cache_guard = tree_guard.cache_tree.lock();
+        let mut disk_guard = tree_guard.disk_tree.lock(); // backing tree; field name assumed
 
         let mut remaining_bytes = sectors_to_bytes(sectors);
 
         while remaining_bytes > 0 {
-            access.free_sector(sector);
+            self.free_sector(&mut cache_guard, &mut disk_guard, sector);
             let processed = remaining_bytes.min(self.block_size);
             sector += Into::<u64>::into(bytes_to_sectors(processed));
             remaining_bytes -= processed;
         }
     }
 
+    fn free_sector_tree(tree_access: &mut xarray::Guard<'_, TreeNode>, sector: u64) {
+        let index = DiskStorageAccess::to_index(sector);
+        if let Some(page) = tree_access.get_mut(index) {
+            page.set_free(sector);
+
+            if page.is_empty() {
+                tree_access.remove(index);
+            }
+        }
+    }
+
+    pub(crate) fn free_sector<'a>(
+        &self,
+        cache_guard: &mut xarray::Guard<'a, TreeNode>,
+        disk_guard: &mut xarray::Guard<'a, TreeNode>,
+        sector: u64,
+    ) {
+        if self.cache_size > 0 {
+            Self::free_sector_tree(cache_guard, sector);
+        }
+
+        Self::free_sector_tree(disk_guard, sector);
+    }
+
     pub(crate) fn flush(&self, hw_data: &Pin<&SpinLock<HwQueueContext>>) {
         let mut tree_guard = self.lock();
         let mut hw_data_guard = hw_data.lock();
@@ -286,25 +304,6 @@ pub(crate) fn get_read_page(&self, sector: u64) -> Option<&NullBlockPage> {
             self.disk_guard.get(index)
         }
     }
-
-    fn free_sector_tree(tree_access: &mut xarray::Guard<'_, TreeNode>, sector: u64) {
-        let index = Self::to_index(sector);
-        if let Some(page) = tree_access.get_mut(index) {
-            page.set_free(sector);
-
-            if page.is_empty() {
-                tree_access.remove(index);
-            }
-        }
-    }
-
-    pub(crate) fn free_sector(&mut self, sector: u64) {
-        if self.disk_storage.cache_size > 0 {
-            Self::free_sector_tree(&mut self.cache_guard, sector);
-        }
-
-        Self::free_sector_tree(&mut self.disk_guard, sector);
-    }
 }
 
 type TreeNode = KBox<NullBlockPage>;
diff --git a/drivers/block/rnull/rnull.rs b/drivers/block/rnull/rnull.rs
index 81f9e2d03f31..b6371fe4ebeb 100644
--- a/drivers/block/rnull/rnull.rs
+++ b/drivers/block/rnull/rnull.rs
@@ -798,7 +798,7 @@ fn handle_regular_command(
         if self.memory_backed {
             memalloc_scope!(let _noio: NoIo);
             if rq.command() == mq::Command::Discard {
-                self.storage.discard(hw_data, rq.sector(), sectors);
+                self.storage.discard(rq.sector(), sectors);
             } else {
                 self.transfer(hw_data, rq, rq.command(), sectors)?;
             }
diff --git a/drivers/block/rnull/zoned.rs b/drivers/block/rnull/zoned.rs
index 808449cc49e1..cf0eb5d31840 100644
--- a/drivers/block/rnull/zoned.rs
+++ b/drivers/block/rnull/zoned.rs
@@ -179,7 +179,7 @@ pub(crate) fn handle_zoned_command(
         match rq.command() {
             ZoneAppend | Write => self.zoned_write(hw_data, rq)?,
             ZoneReset | ZoneResetAll | ZoneOpen | ZoneClose | ZoneFinish => {
-                self.zone_management(hw_data, rq)?
+                self.zone_management(rq)?
             }
             _ => self.zoned_read(hw_data, rq)?,
         }
@@ -187,18 +187,14 @@ pub(crate) fn handle_zoned_command(
         Ok(())
     }
 
-    fn zone_management(
-        &self,
-        hw_data: &Pin<&SpinLock<HwQueueContext>>,
-        rq: &mut Owned<mq::Request<Self>>,
-    ) -> Result {
+    fn zone_management(&self, rq: &mut Owned<mq::Request<Self>>) -> Result {
         if rq.command() == mq::Command::ZoneResetAll {
             for zone in self.zoned.zones_iter() {
                 let mut zone = zone.lock();
                 use ZoneCondition::*;
                 match zone.condition {
                     Empty | ReadOnly | Offline => continue,
-                    _ => self.zoned.reset_zone(&self.storage, hw_data, &mut zone)?,
+                    _ => self.zoned.reset_zone(&self.storage, &mut zone)?,
                 }
             }
 
@@ -214,10 +210,10 @@ fn zone_management(
 
         use mq::Command::*;
         match rq.command() {
-            ZoneOpen => self.zoned.open_zone(&mut zone, rq.sector()),
+            ZoneOpen => self.zoned.open_zone(&mut zone),
             ZoneClose => self.zoned.close_zone(&mut zone),
-            ZoneReset => self.zoned.reset_zone(&self.storage, hw_data, &mut zone),
-            ZoneFinish => self.zoned.finish_zone(&mut zone, rq.sector()),
+            ZoneReset => self.zoned.reset_zone(&self.storage, &mut zone),
+            ZoneFinish => self.zoned.finish_zone(&mut zone),
             _ => Err(EIO),
         }
     }
@@ -283,7 +279,7 @@ fn zoned_write(
             if self.zoned.use_accounting() {
                 let mut accounting = self.zoned.accounting.lock();
                 self.zoned
-                    .check_zone_resources(&mut accounting, &mut zone, rq.sector())?;
+                    .check_zone_resources(&mut accounting, &mut zone)?;
 
                 if zone.condition == ZoneCondition::Closed {
                     accounting.closed -= 1;
@@ -367,7 +363,7 @@ fn zone_no(&self, sector: u64) -> usize {
         (sector >> self.size_sectors.ilog2()) as usize
     }
 
-    fn zone(&self, sector: u64) -> Result<&Mutex<ZoneDescriptor>> {
+    pub(crate) fn zone(&self, sector: u64) -> Result<&Mutex<ZoneDescriptor>> {
         self.zones.get(self.zone_no(sector)).ok_or(EINVAL)
     }
 
@@ -420,7 +416,7 @@ fn try_close_implicit_open_zone(&self, accounting: &mut ZoneAccounting, sector:
         Err(EINVAL)
     }
 
-    fn open_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
+    fn open_zone(&self, zone: &mut ZoneDescriptor) -> Result {
         if zone.kind == ZoneType::Conventional {
             return Err(EINVAL);
         }
@@ -436,13 +432,13 @@ fn open_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
             let mut accounting = self.accounting.lock();
             match zone.condition {
                 Empty => {
-                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    self.check_zone_resources(&mut accounting, zone)?;
                 }
                 ImplicitOpen => {
                     accounting.implicit_open -= 1;
                 }
                 Closed => {
-                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    self.check_zone_resources(&mut accounting, zone)?;
                     accounting.closed -= 1;
                 }
                 _ => (),
@@ -459,14 +455,13 @@ fn check_zone_resources(
         &self,
         accounting: &mut ZoneAccounting,
         zone: &mut ZoneDescriptor,
-        sector: u64,
     ) -> Result {
         match zone.condition {
             ZoneCondition::Empty => {
                 self.check_active_zones(accounting)?;
-                self.check_open_zones(accounting, sector)
+                self.check_open_zones(accounting, zone.start_sector)
             }
-            ZoneCondition::Closed => self.check_open_zones(accounting, sector),
+            ZoneCondition::Closed => self.check_open_zones(accounting, zone.start_sector),
             _ => Err(EIO),
         }
     }
@@ -535,7 +530,7 @@ fn close_zone(&self, zone: &mut ZoneDescriptor) -> Result {
         Ok(())
     }
 
-    fn finish_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
+    fn finish_zone(&self, zone: &mut ZoneDescriptor) -> Result {
         if zone.kind == ZoneType::Conventional {
             return Err(EINVAL);
         }
@@ -547,12 +542,12 @@ fn finish_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
             match zone.condition {
                 Full => return Ok(()),
                 Empty => {
-                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    self.check_zone_resources(&mut accounting, zone)?;
                 }
                 ImplicitOpen => accounting.implicit_open -= 1,
                 ExplicitOpen => accounting.explicit_open -= 1,
                 Closed => {
-                    self.check_zone_resources(&mut accounting, zone, sector)?;
+                    self.check_zone_resources(&mut accounting, zone)?;
                     accounting.closed -= 1;
                 }
                 _ => return Err(EIO),
@@ -568,7 +563,6 @@ fn finish_zone(&self, zone: &mut ZoneDescriptor, sector: u64) -> Result {
     fn reset_zone(
         &self,
         storage: &crate::disk_storage::DiskStorage,
-        hw_data: &Pin<&SpinLock<HwQueueContext>>,
         zone: &mut ZoneDescriptor,
     ) -> Result {
         if zone.kind == ZoneType::Conventional {
@@ -591,16 +585,55 @@ fn reset_zone(
         zone.condition = ZoneCondition::Empty;
         zone.write_pointer = zone.start_sector;
 
-        storage.discard(hw_data, zone.start_sector, zone.size_sectors);
+        storage.discard(zone.start_sector, zone.size_sectors);
+
+        Ok(())
+    }
+
+    fn set_zone_condition(
+        &self,
+        storage: &crate::disk_storage::DiskStorage,
+        zone: &mut ZoneDescriptor,
+        condition: ZoneCondition,
+    ) -> Result {
+        if zone.condition == condition {
+            zone.condition = ZoneCondition::Empty;
+            zone.write_pointer = zone.start_sector;
+            storage.discard(zone.start_sector, zone.size_sectors);
+        } else {
+            if !matches!(
+                zone.condition,
+                ZoneCondition::ReadOnly | ZoneCondition::Offline
+            ) {
+                self.finish_zone(zone)?;
+            }
 
+            zone.condition = condition;
+            zone.write_pointer = u64::MAX;
+        }
         Ok(())
     }
+    pub(crate) fn offline_zone(
+        &self,
+        storage: &crate::disk_storage::DiskStorage,
+        zone: &mut ZoneDescriptor,
+    ) -> Result {
+        self.set_zone_condition(storage, zone, ZoneCondition::Offline)
+    }
+
+    pub(crate) fn read_only_zone(
+        &self,
+        storage: &crate::disk_storage::DiskStorage,
+        zone: &mut ZoneDescriptor,
+    ) -> Result {
+        self.set_zone_condition(storage, zone, ZoneCondition::ReadOnly)
+    }
 }
 
 pub(crate) struct ZoneDescriptor {
     start_sector: u64,
     size_sectors: u32,
-    kind: ZoneType,
+    pub(crate) kind: ZoneType,
     capacity_sectors: u32,
     write_pointer: u64,
     condition: ZoneCondition,
@@ -628,7 +661,7 @@ fn check_bounds_read(&self, sector: u64, sectors: u32) -> Result {
 
 #[derive(Copy, Clone, PartialEq, Eq, Debug)]
 #[repr(u32)]
-enum ZoneType {
+pub(crate) enum ZoneType {
     Conventional = bindings::blk_zone_type_BLK_ZONE_TYPE_CONVENTIONAL,
     SequentialWriteRequired = bindings::blk_zone_type_BLK_ZONE_TYPE_SEQWRITE_REQ,
     #[expect(dead_code)]

-- 
2.51.2




^ permalink raw reply related	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk`
  2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
@ 2026-06-09 20:44   ` Yuan Tan
  2026-06-09 21:45   ` Yuan Tan
  1 sibling, 0 replies; 88+ messages in thread
From: Yuan Tan @ 2026-06-09 20:44 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, linux-block,
	linux-kernel, linux-mm, rust-for-linux

[-- Attachment #1: Type: text/plain, Size: 2693 bytes --]

On Tue, Jun 9, 2026 at 12:13 PM Andreas Hindborg <a.hindborg@kernel.org>
wrote:

> The `Send` implementation for `GenDisk<T>` was conditioned on `T: Send`.
> This constrains the wrong type. `T` is the `Operations` implementation,
> which is typically a zero-sized marker type that carries no data, so `T:
> Send` says nothing about whether the data a `GenDisk` actually owns can be
> moved to another thread.
>
> A `GenDisk<T>` owns the queue data `T::QueueData` (stored as the
> `gendisk`'s `queuedata` and dropped when the `GenDisk` is dropped) and an
> `Arc<TagSet<T>>`. These are the values transferred when a `GenDisk` is sent
> across a thread boundary, so the `Send` bound must constrain exactly them.
> Bound `T::QueueData: Send` and `Arc<TagSet<T>>: Send` instead.
>
> Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
> Suggested-by: Yuan Tan <ytan089@ucr.edu>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
>
> Please take patch from Yuan instead of this one, if they send a fixed
> version [1].
>
> [1]
> https://lore.kernel.org/r/8839ddc5ff54bf454d508cde91d27d00fc3e2dd8.1780633578.git.ytan089@ucr.edu


Sorry, I've been busy with other things and haven't had the chance to send
the fixed version.

Thank you very much for reviewing the patch and for preparing the v2
version.

Could you please add the following when applying this patch?
Reported-by: Priya Bala Govindasamy <pgovind2@uci.edu>
Reported-by: Dylan Zueck <dzueck@uci.edu>

I didn't discover this issue myself. I just helped write the patch and I
don't want them to lose their credit for it.

Please let me know if you would prefer that I send a v3 instead.


>
> ---
>  rust/kernel/block/mq/gen_disk.rs | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
> index 912cb805caf5..b36d24382cc3 100644
> --- a/rust/kernel/block/mq/gen_disk.rs
> +++ b/rust/kernel/block/mq/gen_disk.rs
> @@ -199,8 +199,14 @@ pub struct GenDisk<T: Operations> {
>  }
>
>  // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
> -// `TagSet` It is safe to send this to other threads as long as T is Send.
> -unsafe impl<T: Operations + Send> Send for GenDisk<T> {}
> +// `TagSet`. It is safe to send this to other threads as long as these two are `Send`.
> +unsafe impl<T> Send for GenDisk<T>
> +where
> +    T: Operations,
> +    T::QueueData: Send,
> +    Arc<TagSet<T>>: Send,
> +{
> +}
>
>  impl<T: Operations> Drop for GenDisk<T> {
>      fn drop(&mut self) {
>
> --
> 2.51.2
>
>
>

[-- Attachment #2: Type: text/html, Size: 3770 bytes --]

^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk`
  2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
  2026-06-09 20:44   ` Yuan Tan
@ 2026-06-09 21:45   ` Yuan Tan
  2026-06-10  9:00     ` Andreas Hindborg
  1 sibling, 1 reply; 88+ messages in thread
From: Yuan Tan @ 2026-06-09 21:45 UTC (permalink / raw)
  To: Andreas Hindborg
  Cc: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, linux-block,
	linux-kernel, linux-mm, rust-for-linux, Priya Bala Govindasamy,
	Dylan Zueck, Yuan Tan

On Tue, Jun 9, 2026 at 12:13 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
> The `Send` implementation for `GenDisk<T>` was conditioned on `T: Send`.
> This constrains the wrong type. `T` is the `Operations` implementation,
> which is typically a zero-sized marker type that carries no data, so `T:
> Send` says nothing about whether the data a `GenDisk` actually owns can be
> moved to another thread.
>
> A `GenDisk<T>` owns the queue data `T::QueueData` (stored as the
> `gendisk`'s `queuedata` and dropped when the `GenDisk` is dropped) and an
> `Arc<TagSet<T>>`. These are the values transferred when a `GenDisk` is sent
> across a thread boundary, so the `Send` bound must constrain exactly them.
> Bound `T::QueueData: Send` and `Arc<TagSet<T>>: Send` instead.
>
> Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
> Suggested-by: Yuan Tan <ytan089@ucr.edu>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
>
> Please take patch from Yuan instead of this one, if they send a fixed
> version [1].
>
> [1] https://lore.kernel.org/r/8839ddc5ff54bf454d508cde91d27d00fc3e2dd8.1780633578.git.ytan089@ucr.edu

My last email mistakenly enabled html. So I am here to resend it. Hope
it doesn't disturb anyone.

Sorry, I've been busy with other things and haven't had the chance to
send the fixed version.

Thank you very much for reviewing the patch and for preparing the v2 version.

Could you please add the following when applying this patch?
Reported-by: Priya Bala Govindasamy <pgovind2@uci.edu>
Reported-by: Dylan Zueck <dzueck@uci.edu>

I didn't discover this issue myself. I just helped write the patch and
I don't want them to lose their credit for it.

Please let me know if you would prefer that I send a v3 instead.


> ---
>  rust/kernel/block/mq/gen_disk.rs | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
> index 912cb805caf5..b36d24382cc3 100644
> --- a/rust/kernel/block/mq/gen_disk.rs
> +++ b/rust/kernel/block/mq/gen_disk.rs
> @@ -199,8 +199,14 @@ pub struct GenDisk<T: Operations> {
>  }
>
>  // SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
> -// `TagSet` It is safe to send this to other threads as long as T is Send.
> -unsafe impl<T: Operations + Send> Send for GenDisk<T> {}
> +// `TagSet`. It is safe to send this to other threads as long as these two are `Send`.
> +unsafe impl<T> Send for GenDisk<T>
> +where
> +    T: Operations,
> +    T::QueueData: Send,
> +    Arc<TagSet<T>>: Send,
> +{
> +}
>
>  impl<T: Operations> Drop for GenDisk<T> {
>      fn drop(&mut self) {
>
> --
> 2.51.2
>
>


^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk`
  2026-06-09 21:45   ` Yuan Tan
@ 2026-06-10  9:00     ` Andreas Hindborg
  0 siblings, 0 replies; 88+ messages in thread
From: Andreas Hindborg @ 2026-06-10  9:00 UTC (permalink / raw)
  To: Yuan Tan
  Cc: Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen, Benno Lossin,
	Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross, linux-block,
	linux-kernel, linux-mm, rust-for-linux, Priya Bala Govindasamy,
	Dylan Zueck, Yuan Tan

Yuan Tan <ytan089@ucr.edu> writes:

> On Tue, Jun 9, 2026 at 12:13 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>>
>> The `Send` implementation for `GenDisk<T>` was conditioned on `T: Send`.
>> This constrains the wrong type. `T` is the `Operations` implementation,
>> which is typically a zero-sized marker type that carries no data, so `T:
>> Send` says nothing about whether the data a `GenDisk` actually owns can be
>> moved to another thread.
>>
>> A `GenDisk<T>` owns the queue data `T::QueueData` (stored as the
>> `gendisk`'s `queuedata` and dropped when the `GenDisk` is dropped) and an
>> `Arc<TagSet<T>>`. These are the values transferred when a `GenDisk` is sent
>> across a thread boundary, so the `Send` bound must constrain exactly them.
>> Bound `T::QueueData: Send` and `Arc<TagSet<T>>: Send` instead.
>>
>> Fixes: 3253aba3408a ("rust: block: introduce `kernel::block::mq` module")
>> Suggested-by: Yuan Tan <ytan089@ucr.edu>
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
>> ---
>>
>> Please take patch from Yuan instead of this one, if they send a fixed
>> version [1].
>>
>> [1] https://lore.kernel.org/r/8839ddc5ff54bf454d508cde91d27d00fc3e2dd8.1780633578.git.ytan089@ucr.edu
>
> My last email mistakenly enabled html. So I am here to resend it. Hope
> it doesn't disturb anyone.
>
> Sorry, I've been busy with other things and haven't had the chance to
> send the fixed version.
>
> Thank you very much for reviewing the patch and for preparing the v2 version.
>
> Could you please add the following when applying this patch?
> Reported-by: Priya Bala Govindasamy <pgovind2@uci.edu>
> Reported-by: Dylan Zueck <dzueck@uci.edu>
>
> I didn't discover this issue myself. I just helped write the patch and
> I don't want them to lose their credit for it.
>
> Please let me know if you would prefer that I send a v3 instead.

I would absolutely encourage you to send a v3 so we can pick that one.

Your emails are not arriving in my inbox though. I only see this one on
the list. Please check your setup for outgoing email.

Best regards,
Andreas Hindborg




^ permalink raw reply	[flat|nested] 88+ messages in thread

* Re: [PATCH v2 23/83] block: rnull: add discard support
  2026-06-09 19:08 ` [PATCH v2 23/83] block: rnull: add discard support Andreas Hindborg
@ 2026-06-10 13:55   ` Malte Wechter
  0 siblings, 0 replies; 88+ messages in thread
From: Malte Wechter @ 2026-06-10 13:55 UTC (permalink / raw)
  To: Andreas Hindborg, Liam R. Howlett, Alice Ryhl, Anna-Maria Behnsen,
	Benno Lossin, Björn Roy Baron, Boqun Feng, Danilo Krummrich,
	FUJITA Tomonori, Frederic Weisbecker, Gary Guo, Jens Axboe,
	John Stultz, Lorenzo Stoakes, Lyude Paul, Miguel Ojeda,
	Stephen Boyd, Thomas Gleixner, Trevor Gross
  Cc: linux-block, linux-kernel, linux-mm, rust-for-linux

On Tue Jun 9, 2026 at 9:08 PM CEST, Andreas Hindborg wrote:
> Add support for discard operations to the rnull block driver:
> - Add discard module parameter and configfs attribute.
> - Set max_hw_discard_sectors when discard is enabled.
> - Add sector occupancy tracking.
> - Add discard handling that frees sectors and removes empty pages.
> - Discard operations require memory backing to function.
>
> The discard feature uses a bitmap to track which sectors in each page are
> occupied, allowing cleanup of pages when they are empty.
>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
>  drivers/block/rnull/configfs.rs |  15 +++++
>  drivers/block/rnull/rnull.rs    | 120 +++++++++++++++++++++++++++++++++++-----
>  2 files changed, 121 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
> index 2f3fa81ea121..e47399cd45a4 100644
> --- a/drivers/block/rnull/configfs.rs
> +++ b/drivers/block/rnull/configfs.rs
...
>          }
>      })
>  );
> +
> +configfs_attribute!(DeviceConfig, 10,
> +    show: |this, page| show_field(this.data.lock().discard, page),
> +    store: |this, page| store_with_power_check(this, page, |data, page| {
> +        if !data.memory_backed {
> +            return Err(EINVAL);
> +        }
> +        data.discard = kstrtobool_bytes(page)?;
> +        Ok(())
> +    })
> +);
Should it be OK to set 'discard' to 0 if 'memory_backed' is not set?
In the C null_blk driver, 'discard' defaults to 0 if 'memory_backed' is not set.
It is also ignored (and defaulted to 0) if 'zoned' is enabled.
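
One way to allow that, sketched in plain userspace Rust rather than kernel
code: reject only the attempt to enable 'discard' without memory backing, so
that writing 0 always succeeds. This assumes that rejecting is the desired
behaviour; the C driver, as described above, falls back to 0 instead.
`DeviceConfigData`, `StoreError` and `store_discard` are hypothetical names
used for illustration and are not part of the patch.

```rust
/// Hypothetical stand-in for the driver's configuration data; only the two
/// fields relevant to this comment are modelled.
#[derive(Debug, Default)]
struct DeviceConfigData {
    memory_backed: bool,
    discard: bool,
}

/// Hypothetical error type standing in for the kernel's `EINVAL`.
#[derive(Debug, PartialEq, Eq)]
enum StoreError {
    Invalid,
}

/// Store a new `discard` value. Only the combination that cannot work is
/// rejected: enabling discard on a device without memory backing.
fn store_discard(data: &mut DeviceConfigData, value: bool) -> Result<(), StoreError> {
    if value && !data.memory_backed {
        return Err(StoreError::Invalid);
    }
    data.discard = value;
    Ok(())
}
```

With this shape, writing 0 to a device that is not memory backed is a no-op
that succeeds, while writing 1 still fails.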

Best regards,
Malte


^ permalink raw reply	[flat|nested] 88+ messages in thread

end of thread, other threads:[~2026-06-10 13:55 UTC | newest]

Thread overview: 88+ messages
2026-06-09 19:07 [PATCH v2 00/83] block: rnull: complete the rust null block driver Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 01/83] block: rust: fix `Send` bound for `GenDisk` Andreas Hindborg
2026-06-09 20:44   ` Yuan Tan
2026-06-09 21:45   ` Yuan Tan
2026-06-10  9:00     ` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 02/83] rust: block: rename `SECTOR_MASK` to `PAGE_SECTOR_MASK` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 03/83] block: rnull: adopt new formatting guidelines Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 04/83] block: rnull: add module parameters Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 05/83] block: rnull: add macros to define configfs attributes Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 06/83] block: rust: fix generation of bindings to `BLK_STS_.*` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 07/83] block: rust: change `queue_rq` request type to `Owned` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 08/83] block: rust: add `Request` private data support Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 09/83] block: rust: document the lifetime of `Request` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 10/83] block: rust: allow `hrtimer::Timer` in `RequestData` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 11/83] block: rnull: add timer completion mode Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 12/83] block: rust: introduce `kernel::block::bio` module Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 13/83] block: rust: add `command` getter to `Request` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 14/83] block: rust: mq: use GFP_KERNEL from prelude Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 15/83] block: rust: add `TagSet` flags Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 16/83] block: rnull: add memory backing Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 17/83] block: rnull: add submit queue count config option Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 18/83] block: rnull: add `use_per_node_hctx` " Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 19/83] block: rust: allow specifying home node when constructing `TagSet` Andreas Hindborg
2026-06-09 19:07 ` [PATCH v2 20/83] block: rnull: allow specifying the home numa node Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 21/83] block: rust: add Request::sectors() method Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 22/83] block: rust: mq: add max_hw_discard_sectors support to GenDiskBuilder Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 23/83] block: rnull: add discard support Andreas Hindborg
2026-06-10 13:55   ` Malte Wechter
2026-06-09 19:08 ` [PATCH v2 24/83] block: rust: add `NoDefaultScheduler` flag for `TagSet` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 25/83] block: rnull: add no_sched module parameter and configfs attribute Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 26/83] block: rust: change sector type from usize to u64 Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 27/83] block: rust: add `BadBlocks` for bad block tracking Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 28/83] block: rust: mq: add Request::end() method for custom status codes Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 29/83] block: rnull: add badblocks support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 30/83] block: rnull: add badblocks_once support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 31/83] block: rust: add `Segment::truncate` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 32/83] block: rnull: add partial I/O support for bad blocks Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 33/83] block: rust: add `TagSet` private data support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 34/83] block: rust: add `hctx` " Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 35/83] block: rnull: add volatile cache emulation Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 36/83] block: rust: implement `Sync` for `GenDisk` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 37/83] block: rust: add a back reference feature to `GenDisk` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 38/83] block: rust: introduce an idle type state for `Request` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 39/83] block: rust: add a request queue abstraction Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 40/83] block: rust: add a method to get the request queue for a request Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 41/83] block: rust: introduce `kernel::block::error` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 42/83] block: rust: require `queue_rq` to return a `BlkResult` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 43/83] block: rust: add `GenDisk::queue_data` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 44/83] block: rnull: add bandwidth limiting Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 45/83] block: rnull: add blocking queue mode Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 46/83] block: rnull: add shared tags Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 47/83] block: rnull: add queue depth config option Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 48/83] block: rust: add an abstraction for `bindings::req_op` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 49/83] block: rust: add a method to set the target sector of a request Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 50/83] block: rust: move gendisk vtable construction to separate function Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 51/83] block: rust: add zoned block device support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 52/83] block: rust: add `TagSet::flags` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 53/83] block: rnull: add zoned storage support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 54/83] block: rust: add `map_queues` support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 55/83] block: rust: add an abstraction for `struct blk_mq_queue_map` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 56/83] block: rust: add polled completion support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 57/83] block: rust: add accessors to `TagSet` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 58/83] block: rnull: add polled completion support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 59/83] block: rnull: add REQ_OP_FLUSH support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 60/83] block: rust: add request flags abstraction Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 61/83] block: rust: add abstraction for block queue feature flags Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 62/83] block: rust: allow setting write cache and FUA flags for `GenDisk` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 63/83] block: rust: add `Segment::copy_to_page_limit` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 64/83] block: rnull: add fua support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 65/83] block: rust: add `GenDisk::tag_set` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 66/83] block: rust: add `TagSet::update_hw_queue_count` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 67/83] block: rnull: add an option to change the number of hardware queues Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 68/83] block: rust: add an abstraction for `struct rq_list` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 69/83] block: rust: add `queue_rqs` vtable hook Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 70/83] block: rnull: support queue_rqs Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 71/83] block: rust: remove the `is_poll` parameter from `queue_rq` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 72/83] block: rust: add a debug assert for refcounts Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 73/83] block: rust: add `TagSet::tag_to_rq` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 74/83] block: rust: add `Request::queue_index` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 75/83] block: rust: add `Request::requeue` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 76/83] block: rust: add `request_timeout` hook Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 77/83] block: rnull: add fault injection support Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 78/83] block: rust: add max_sectors option to `GenDiskBuilder` Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 79/83] block: rnull: allow configuration of the maximum IO size Andreas Hindborg
2026-06-09 19:08 ` [PATCH v2 80/83] block: rust: add `virt_boundary_mask` option to `GenDiskBuilder` Andreas Hindborg
2026-06-09 19:09 ` [PATCH v2 81/83] block: rnull: add `virt_boundary` option Andreas Hindborg
2026-06-09 19:09 ` [PATCH v2 82/83] block: rnull: add `shared_tag_bitmap` config option Andreas Hindborg
2026-06-09 19:09 ` [PATCH v2 83/83] block: rnull: add zone offline and readonly configfs files Andreas Hindborg
