Linux block layer
* Re: [PATCH RFC v2 00/18] fs: support freeze/thaw/mark_dead/sync with shared devices
From: Jan Kara @ 2026-06-22 15:40 UTC
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Alexander Viro,
	linux-block, linux-kernel, linux-fsdevel, Carlos Maiolino,
	linux-xfs, Chris Mason, David Sterba, linux-btrfs,
	Theodore Ts'o, linux-ext4, Gao Xiang, linux-erofs, syzbot
In-Reply-To: <20260616-work-super-bdev_holder_global-v2-0-7df6b864028e@kernel.org>

Hi!

On Tue 16-06-26 16:08:16, Christian Brauner wrote:
> This is a generalization of the device number to superblock so it works
> for actual block device and anonymous (or even mtd) devices.
> 
> fs_holder_ops recovers the affected superblock from bdev->bd_holder. That
> forces the holder of a block device to be exactly one superblock and makes
> it impossible for several superblocks to share a single device.
> 
> erofs does exactly that. It can mount read-only "blob" devices that are
> shared between many superblocks: a metadata-only erofs that indexes a set
> of per-layer blobs (one filesystem instead of one per OCI layer), or an
> incremental image whose base device is shared by several updates. Because
> the block layer only tracks a single holder, a freeze, thaw, removal or
> sync on such a device is never propagated to all the superblocks using it,
> and the current infrastructure has no way to find them.
> 
> This series replaces the bd_holder-based lookup with a global, dev_t-keyed
> table mapping each block device to the superblock(s) using it. The holder
> argument becomes purely the block layer's exclusivity token -- a superblock,
> or the file_system_type for a device shared within one filesystem type --
> and the fs_holder_ops callbacks look the device up in the table and act on
> every superblock registered for it: 1:1 for most filesystems, 1:many for
> erofs.

So I was thinking about this also in the light of Christoph's complaints. I
agree with you, Christian, that this translation table maintains the
abstraction of the holder - holder ops define how to transition from a bdev
to its holder(s) and how to translate the .sync, .freeze and other
operations for the holders - and that is kept since your changes are
specific to fs_holder_ops.

What I'm wondering about a bit is whether we want this complexity for the
only user, which is erofs (i.e., whether this wouldn't be better implemented
in erofs-specific holder ops, which could arguably be simpler than this
generic solution). On the other hand, that would likely have to replicate
the locking dances we do in bdev_super_lock(), and I'm not sure whether
spreading this locking complexity into filesystems is better than this
more complex VFS mapping code.

One more thing I was considering is that the need to transition from one
bdev to multiple holders isn't actually unique to erofs. For example, device
mapper will need the same thing, and arguably partition bdevs could also be
made holders of the complete bdev so that events are propagated from the
whole bdev into partition bdevs properly (which currently happens in a kind
of ad hoc manner and only in some cases). Currently your translation
mechanism is tied to mapping to a superblock, but actually rather weakly - we
only need the guarantee that the holder stays alive while the mapping entry
exists; the rest is protected by the mapping entry refcount AFAICS. So with
a bit of effort we could make this a generic bdev -> holders mapping
mechanism usable from whichever holder ops decide to employ it, which would
then be quite attractive IMO.
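
As a rough illustration of what a generic "device -> holders" mapping could
look like, here is a minimal userspace sketch. It is not code from the
series: the fixed-size array, devnum_t, and the holder_*() names are all
hypothetical, and locking and entry refcounting are deliberately left out.
It only shows one device number fanning out to several opaque holders, each
with its own claim count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names come from the series. */
#define MAX_ENTRIES 16

typedef uint32_t devnum_t;		/* stand-in for the kernel's dev_t */

struct holder_entry {
	devnum_t dev;			/* device the holder claimed */
	void *holder;			/* opaque holder (superblock, dm table, ...) */
	unsigned int claims;		/* 0 means the slot is unused */
};

static struct holder_entry table[MAX_ENTRIES];

/* Register one claim of @holder on @dev; returns 0, or -1 if the table is full. */
int holder_register(devnum_t dev, void *holder)
{
	struct holder_entry *free_slot = NULL;

	assert(holder != NULL);
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct holder_entry *e = &table[i];

		if (e->claims && e->dev == dev && e->holder == holder) {
			e->claims++;	/* same (dev, holder) pair claimed again */
			return 0;
		}
		if (!e->claims && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return -1;
	free_slot->dev = dev;
	free_slot->holder = holder;
	free_slot->claims = 1;
	return 0;
}

/* Drop one claim; the slot becomes reusable once the last claim is gone. */
void holder_unregister(devnum_t dev, void *holder)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct holder_entry *e = &table[i];

		if (e->claims && e->dev == dev && e->holder == holder) {
			e->claims--;
			return;
		}
	}
}

/* Call @fn for every holder registered on @dev; returns how many were called. */
int holder_for_each(devnum_t dev, void (*fn)(void *holder, void *arg), void *arg)
{
	int n = 0;

	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		if (table[i].claims && table[i].dev == dev) {
			fn(table[i].holder, arg);
			n++;
		}
	}
	return n;
}

/* Example callback: counts how many holders saw the event. */
void count_event(void *holder, void *arg)
{
	(void)holder;
	(*(int *)arg)++;
}
```

In this model a freeze or mark_dead event on a device would become one
holder_for_each() call, and nothing in the table cares whether the opaque
holder is a superblock or something else.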

But I guess let's leave lifting the mapping code out of super.c and
converting it into a generic mapping mechanism for the moment when we really
get to implementing another user.

All this is a long way of saying that I'm OK with the mapping mechanism
like this :).

								Honza

> Filesystems claim and release their devices through new
> fs_bdev_file_open_by_{dev,path}() and fs_bdev_file_release() helpers; the
> per-fs patches convert xfs, btrfs, ext4, f2fs and erofs over to them and
> fix cramfs and romfs, which released the registered main device with a
> raw bdev_fput().
> 
> Since every superblock is registered under its s_dev the table also
> replaces the last s_dev-keyed walk of the super_blocks list:
> user_get_super() resolves device numbers through it, so ustat() and
> quotactl() now work on any device a filesystem claims and no longer
> take sb_lock.
> 
> The longer-term motivation is to let userspace decide which devices may be
> onlined from one central place, without having to teach every filesystem
> about it individually.
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> ---
> Changes in v2:
> - super: rework the device-to-superblock table reference counting: each
>   (device, superblock) entry carries a single claim count and holds one
>   passive reference on its superblock for the entry's lifetime. New prep
>   patches convert s_count to refcount_t s_passive and make put_super()
>   self-locking.
> - super: preallocate the entry in alloc_super() and register it from the
>   set callbacks through set_anon_super()/set_bdev_super(); an insert
>   failure unwinds exactly like a set callback failure. The superblock
>   stashes the entry in sb->s_super_dev and kill_super_notify() drops the
>   claim through it.
> - super: initialize the table from mnt_init(); the rootfs and shm mounts
>   are created long before any initcall runs.
> - super: fold the v1 "refuse to claim a frozen block device" patch into
>   the registration helper and restore the EBUSY check for the primary
>   device in setup_bdev_super(): additional devices (the xfs log, the ext4
>   journal, erofs blobs) are now refused while frozen as well, answering
>   Jan's question on v1 3/8.
> - Split the core patch into table/helpers/switch-over and move the
>   xfs/btrfs/ext4 conversions before the fs_holder_ops switch so no
>   freeze/mark_dead events are lost mid-series; erofs follows the switch.
> - New prep patches: the ext4 KUnit tests allocate anonymous devices and
>   ocfs2 stops resetting s_dev on dismount.
> - New: convert user_get_super() to the device table, plus a ustat()
>   selftest.
> - New: fix a pre-existing double release of the realtime device file and
>   dangling buftarg pointers in xfs_open_devices()'s error unwind.
> - New: convert f2fs's additional devices to the helpers; fix cramfs and
>   romfs releasing the registered main device with a raw bdev_fput().
> - erofs: drop the .shutdown() and .remove_bdev() implementations and the
>   per-device "dead" flag. Immutable filesystems don't need them: the block
>   layer sets GD_DEAD before fs_bdev_mark_dead() so in-flight bios fail
>   anyway, erofs has no write path or journal to stop, and the read-only
>   loop_change_fd() case must not be forced to -EIO. Patch from Gao Xiang,
>   applied verbatim - thanks!
> - btrfs: fix a general protection fault in close_fs_devices() on a failed
>   mount (reported by syzbot). The release path took the superblock from
>   device->fs_info, which is still NULL if open_ctree() fails before
>   btrfs_init_devices_late(); it now uses bdev_file->private_data.
> - erofs: the v1 conversion was sent with a generic boilerplate changelog;
>   superseded by Gao's patch above.
> - Collect Reviewed-by from Jan Kara and Tested-by from syzbot.
> - Rebase onto v7.1-rc1.
> - Link to v1: https://patch.msgid.link/20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org
> 
> ---
> Christian Brauner (18):
>       xfs: fix the error unwind in xfs_open_devices()
>       super: convert s_count to refcount_t s_passive
>       super: take lock after last reference count
>       fs, block: move blk_mode_t and fop_flags_t into <linux/types.h>
>       ext4: use anonymous devices for KUnit test superblocks
>       ocfs2: don't reset s_dev on dismount
>       fs: maintain a global device-to-superblock table
>       fs: add dedicated block device open helpers for filesystems
>       xfs: port to fs_bdev_file_open_by_path()
>       btrfs: open via dedicated fs bdev helpers
>       ext4: open via dedicated fs bdev helpers
>       fs: look up superblocks via the device table in fs_holder_ops
>       fs: tolerate per-superblock freeze errors on shared devices
>       erofs: open via dedicated fs bdev helpers
>       f2fs: open via dedicated fs bdev helpers
>       super: make fs_holder_ops private
>       fs: look up the superblock via the device table in user_get_super()
>       selftests/filesystems: add ustat() coverage
> 
>  fs/btrfs/volumes.c                               |  31 +-
>  fs/cramfs/inode.c                                |   2 +-
>  fs/erofs/super.c                                 |  35 +-
>  fs/ext4/extents-test.c                           |   9 +-
>  fs/ext4/mballoc-test.c                           |   9 +-
>  fs/ext4/super.c                                  |  12 +-
>  fs/f2fs/super.c                                  |   6 +-
>  fs/internal.h                                    |   1 +
>  fs/namespace.c                                   |   2 +
>  fs/ocfs2/super.c                                 |   1 -
>  fs/romfs/super.c                                 |   2 +-
>  fs/super.c                                       | 620 ++++++++++++++++-------
>  fs/xfs/xfs_buf.c                                 |   2 +-
>  fs/xfs/xfs_super.c                               |  13 +-
>  include/linux/blkdev.h                           |   9 -
>  include/linux/fs.h                               |   2 -
>  include/linux/fs/super.h                         |   8 +
>  include/linux/fs/super_types.h                   |   4 +-
>  include/linux/types.h                            |   2 +
>  tools/testing/selftests/filesystems/.gitignore   |   1 +
>  tools/testing/selftests/filesystems/Makefile     |   2 +-
>  tools/testing/selftests/filesystems/ustat_test.c | 135 +++++
>  22 files changed, 647 insertions(+), 261 deletions(-)
> ---
> base-commit: 0c0d974f62e6603d4514e1a8035658edb353c68f
> change-id: 20260602-work-super-bdev_holder_global-8cba5e52bed5
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 03/18] super: take lock after last reference count
From: Jan Kara @ 2026-06-22 13:50 UTC
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Alexander Viro,
	linux-block, linux-kernel, linux-fsdevel, Carlos Maiolino,
	linux-xfs, Chris Mason, David Sterba, linux-btrfs,
	Theodore Ts'o, linux-ext4, Gao Xiang, linux-erofs
In-Reply-To: <20260616-work-super-bdev_holder_global-v2-3-7df6b864028e@kernel.org>

On Tue 16-06-26 16:08:19, Christian Brauner wrote:
> __put_super() required the caller to hold sb_lock, so put_super()
> wrapped it. The per-device superblock table introduced later drops its
> passive references from contexts that do not hold sb_lock, so make
> put_super() self-locking: drop the count first and take sb_lock only for
> the final list_del.
> 
> With the count now dropped outside sb_lock a superblock can briefly sit
> on @super_blocks with s_passive == 0 before it is unlinked, so the list
> walkers (__iterate_supers(), iterate_supers_type(), user_get_super())
> switch to refcount_inc_not_zero() and skip it.
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Looks good, just one style nit below. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

> -static void __put_super(struct super_block *s)
> +void put_super(struct super_block *s)
>  {
>  	if (refcount_dec_and_test(&s->s_passive)) {
> +

I'd delete this empty line.

> +		spin_lock(&sb_lock);
>  		list_del_init(&s->s_list);
> +		spin_unlock(&sb_lock);
> +


								Honza
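
For readers who want to see the pattern from the commit message in
isolation, the following is a minimal userspace sketch, not the kernel
code. It assumes C11 atomics in place of refcount_t, an atomic_flag spin
lock in place of sb_lock, and a hand-rolled singly linked list in place of
@super_blocks; all obj_*() names are made up. The final put drops the count
first and only then takes the lock to unlink, so the walker must use an
"increment unless zero" operation to skip entries in that window.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a list-linked, refcounted object. */
struct obj {
	atomic_uint passive;	/* models s_passive */
	struct obj *next;	/* models the s_list linkage */
	bool freed;		/* set instead of freeing, so callers can inspect it */
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;	/* models sb_lock */
static struct obj *list_head;

static void lock_list(void)
{
	while (atomic_flag_test_and_set_explicit(&list_lock, memory_order_acquire))
		;	/* spin until the flag was clear */
}

static void unlock_list(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Models refcount_inc_not_zero(): take a reference unless the count is zero. */
bool obj_get_not_zero(struct obj *o)
{
	unsigned int old = atomic_load(&o->passive);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->passive, &old, old + 1))
			return true;
	}
	return false;
}

/* Initialize @o with one reference and link it at the head of the list. */
void obj_add(struct obj *o)
{
	atomic_init(&o->passive, 1);
	o->freed = false;
	lock_list();
	o->next = list_head;
	list_head = o;
	unlock_list();
}

/* Drop the count first; take the lock only to unlink on the final put. */
void obj_put(struct obj *o)
{
	unsigned int old = atomic_fetch_sub(&o->passive, 1);

	assert(old > 0);	/* an underflow would be a caller bug */
	if (old != 1)
		return;

	/* Window: @o is still on the list here with passive == 0. */
	lock_list();
	for (struct obj **p = &list_head; *p; p = &(*p)->next) {
		if (*p == o) {
			*p = o->next;
			break;
		}
	}
	unlock_list();
	o->freed = true;
}

/* List walker: returns the first entry it could take a reference on. */
struct obj *obj_find_live(void)
{
	struct obj *found = NULL;

	lock_list();
	for (struct obj *o = list_head; o; o = o->next) {
		if (obj_get_not_zero(o)) {
			found = o;
			break;
		}
	}
	unlock_list();
	return found;
}
```

The walker never resurrects an object whose count already reached zero,
which is the property the switch to refcount_inc_not_zero() in the list
walkers is meant to provide.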
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 02/18] super: convert s_count to refcount_t s_passive
From: Jan Kara @ 2026-06-22 13:48 UTC
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Alexander Viro,
	linux-block, linux-kernel, linux-fsdevel, Carlos Maiolino,
	linux-xfs, Chris Mason, David Sterba, linux-btrfs,
	Theodore Ts'o, linux-ext4, Gao Xiang, linux-erofs
In-Reply-To: <20260616-work-super-bdev_holder_global-v2-2-7df6b864028e@kernel.org>

On Tue 16-06-26 16:08:18, Christian Brauner wrote:
> The superblock carries two counters: s_active, the active reference
> count that keeps the filesystem usable, and s_count, the passive
> reference count that merely keeps the structure itself alive. Turn the
> passive count into a refcount_t and rename it to s_passive to make the
> pairing with s_active obvious.
> 
> Everything is still serialized by sb_lock, so there is no functional
> change; the conversion buys the usual refcount_t saturation and
> underflow checking. The following patches start dropping passive
> references without holding sb_lock and make the device-to-superblock
> table hold one passive reference per registered entry, which a plain
> integer cannot support.
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Yeah, looks like a reasonable cleanup. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/super.c                     | 18 +++++++++---------
>  include/linux/fs/super_types.h |  2 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index a8fd61136aaf..25dd72b550e0 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -102,7 +102,7 @@ static bool super_flags(const struct super_block *sb, unsigned int flags)
>   * creation will succeed and SB_BORN is set by vfs_get_tree() or we're
>   * woken and we'll see SB_DYING.
>   *
> - * The caller must have acquired a temporary reference on @sb->s_count.
> + * The caller must have acquired a temporary reference on @sb->s_passive.
>   *
>   * Return: The function returns true if SB_BORN was set and with
>   *         s_umount held. The function returns false if SB_DYING was
> @@ -367,7 +367,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  	spin_lock_init(&s->s_inode_wblist_lock);
>  	fserror_mount(s);
>  
> -	s->s_count = 1;
> +	refcount_set(&s->s_passive, 1);
>  	atomic_set(&s->s_active, 1);
>  	mutex_init(&s->s_vfs_rename_mutex);
>  	lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
> @@ -407,7 +407,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>   */
>  static void __put_super(struct super_block *s)
>  {
> -	if (!--s->s_count) {
> +	if (refcount_dec_and_test(&s->s_passive)) {
>  		list_del_init(&s->s_list);
>  		WARN_ON(s->s_dentry_lru.node);
>  		WARN_ON(s->s_inode_lru.node);
> @@ -529,7 +529,7 @@ static bool grab_super(struct super_block *sb)
>  {
>  	bool locked;
>  
> -	sb->s_count++;
> +	refcount_inc(&sb->s_passive);
>  	spin_unlock(&sb_lock);
>  	locked = super_lock_excl(sb);
>  	if (locked) {
> @@ -556,7 +556,7 @@ static bool grab_super(struct super_block *sb)
>   *	lock held in read mode in case of success. On successful return,
>   *	the caller must drop the s_umount lock when done.
>   *
> - *	Note that unlike get_super() et.al. this one does *not* bump ->s_count.
> + *	Note that unlike get_super() et.al. this one does *not* bump ->s_passive.
>   *	The reason why it's safe is that we are OK with doing trylock instead
>   *	of down_read().  There's a couple of places that are OK with that, but
>   *	it's very much not a general-purpose interface.
> @@ -858,7 +858,7 @@ static void __iterate_supers(void (*f)(struct super_block *, void *), void *arg,
>  	     sb = next_super(sb, flags)) {
>  		if (super_flags(sb, SB_DYING))
>  			continue;
> -		sb->s_count++;
> +		refcount_inc(&sb->s_passive);
>  		spin_unlock(&sb_lock);
>  
>  		if (flags & SUPER_ITER_UNLOCKED) {
> @@ -903,7 +903,7 @@ void iterate_supers_type(struct file_system_type *type,
>  		if (super_flags(sb, SB_DYING))
>  			continue;
>  
> -		sb->s_count++;
> +		refcount_inc(&sb->s_passive);
>  		spin_unlock(&sb_lock);
>  
>  		locked = super_lock_shared(sb);
> @@ -935,7 +935,7 @@ struct super_block *user_get_super(dev_t dev, bool excl)
>  		if (sb->s_dev != dev)
>  			continue;
>  
> -		sb->s_count++;
> +		refcount_inc(&sb->s_passive);
>  		spin_unlock(&sb_lock);
>  
>  		locked = super_lock(sb, excl);
> @@ -1369,7 +1369,7 @@ static struct super_block *bdev_super_lock(struct block_device *bdev, bool excl)
>  
>  	/* Make sure sb doesn't go away from under us */
>  	spin_lock(&sb_lock);
> -	sb->s_count++;
> +	refcount_inc(&sb->s_passive);
>  	spin_unlock(&sb_lock);
>  
>  	mutex_unlock(&bdev->bd_holder_lock);
> diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
> index ef7941e9dc79..68747182abf9 100644
> --- a/include/linux/fs/super_types.h
> +++ b/include/linux/fs/super_types.h
> @@ -145,7 +145,7 @@ struct super_block {
>  	unsigned long				s_magic;
>  	struct dentry				*s_root;
>  	struct rw_semaphore			s_umount;
> -	int					s_count;
> +	refcount_t				s_passive;
>  	atomic_t				s_active;
>  #ifdef CONFIG_SECURITY
>  	void					*s_security;
> 
> -- 
> 2.47.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 05/18] ext4: use anonymous devices for KUnit test superblocks
From: Jan Kara @ 2026-06-22 13:48 UTC
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Alexander Viro,
	linux-block, linux-kernel, linux-fsdevel, Carlos Maiolino,
	linux-xfs, Chris Mason, David Sterba, linux-btrfs,
	Theodore Ts'o, linux-ext4, Gao Xiang, linux-erofs
In-Reply-To: <20260616-work-super-bdev_holder_global-v2-5-7df6b864028e@kernel.org>

On Tue 16-06-26 16:08:21, Christian Brauner wrote:
> The mballoc and extents KUnit tests create superblocks through
> sget_fc() with a set callback that never assigns s_dev and a kill_sb
> that only calls generic_shutdown_super().
> 
> The upcoming global device-to-superblock table registers every
> superblock under its s_dev, so each superblock needs a unique device
> number. Allocate a proper anonymous device via set_anon_super_fc() and
> release it through kill_anon_super().
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Ok. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/extents-test.c | 9 ++-------
>  fs/ext4/mballoc-test.c | 9 ++-------
>  2 files changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/ext4/extents-test.c b/fs/ext4/extents-test.c
> index bd7795a82607..c3836ecb89f9 100644
> --- a/fs/ext4/extents-test.c
> +++ b/fs/ext4/extents-test.c
> @@ -126,11 +126,6 @@ struct kunit_ext_test_param {
>  	struct kunit_ext_data_state exp_data_state[3];
>  };
>  
> -static void ext_kill_sb(struct super_block *sb)
> -{
> -	generic_shutdown_super(sb);
> -}
> -
>  static int ext_init_fs_context(struct fs_context *fc)
>  {
>  	return 0;
> @@ -138,13 +133,13 @@ static int ext_init_fs_context(struct fs_context *fc)
>  
>  static int ext_set(struct super_block *sb, struct fs_context *fc)
>  {
> -	return 0;
> +	return set_anon_super_fc(sb, fc);
>  }
>  
>  static struct file_system_type ext_fs_type = {
>  	.name		 = "extents test",
>  	.init_fs_context = ext_init_fs_context,
> -	.kill_sb	 = ext_kill_sb,
> +	.kill_sb	 = kill_anon_super,
>  };
>  
>  static void extents_kunit_exit(struct kunit *test)
> diff --git a/fs/ext4/mballoc-test.c b/fs/ext4/mballoc-test.c
> index d90da44aadbd..a3b33ed2c172 100644
> --- a/fs/ext4/mballoc-test.c
> +++ b/fs/ext4/mballoc-test.c
> @@ -59,11 +59,6 @@ static const struct super_operations mbt_sops = {
>  	.free_inode	= mbt_free_inode,
>  };
>  
> -static void mbt_kill_sb(struct super_block *sb)
> -{
> -	generic_shutdown_super(sb);
> -}
> -
>  static int mbt_init_fs_context(struct fs_context *fc)
>  {
>  	return 0;
> @@ -72,7 +67,7 @@ static int mbt_init_fs_context(struct fs_context *fc)
>  static struct file_system_type mbt_fs_type = {
>  	.name			= "mballoc test",
>  	.init_fs_context	= mbt_init_fs_context,
> -	.kill_sb		= mbt_kill_sb,
> +	.kill_sb		= kill_anon_super,
>  };
>  
>  static int mbt_mb_init(struct super_block *sb)
> @@ -136,7 +131,7 @@ static void mbt_mb_release(struct super_block *sb)
>  
>  static int mbt_set(struct super_block *sb, struct fs_context *fc)
>  {
> -	return 0;
> +	return set_anon_super_fc(sb, fc);
>  }
>  
>  static struct super_block *mbt_ext4_alloc_super_block(void)
> 
> -- 
> 2.47.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH RFC v2 01/18] xfs: fix the error unwind in xfs_open_devices()
From: Jan Kara @ 2026-06-22 13:35 UTC
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Alexander Viro,
	linux-block, linux-kernel, linux-fsdevel, Carlos Maiolino,
	linux-xfs, Chris Mason, David Sterba, linux-btrfs,
	Theodore Ts'o, linux-ext4, Gao Xiang, linux-erofs
In-Reply-To: <20260616-work-super-bdev_holder_global-v2-1-7df6b864028e@kernel.org>

On Tue 16-06-26 16:08:17, Christian Brauner wrote:
> Since the rt and log block devices are closed in xfs_free_buftarg() the
> buftarg owns the device file. The error unwind does not respect that:
> when the log buftarg allocation fails, out_free_rtdev_targ frees the rt
> buftarg - releasing rtdev_file - and then falls through to
> out_close_rtdev and releases it a second time.
> 
> The unwind also leaves mp->m_rtdev_targp and mp->m_ddev_targp pointing
> to the freed buftargs. The failed mount continues into
> deactivate_locked_super() -> xfs_kill_sb() -> xfs_mount_free(), which
> frees them again.
> 
> Clear the buftarg pointers once the unwind freed them and clear
> rtdev_file once the rt buftarg owns it, so nothing is released twice.
> 
> Reachable when a buftarg allocation fails after the data buftarg was
> set up: an I/O error in sync_blockdev() or an allocation failure in
> xfs_init_buftarg() while mounting with external rt and log devices.
> 
> Fixes: 41233576e9a4 ("xfs: close the RT and log block devices in xfs_free_buftarg")
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>

Looks good to me. As a small nit I'd probably do rtdev_file = NULL just
after we've successfully allocated m_rtdev_targp but that's really minor.
Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
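
The ownership rule behind this fix can be shown with a small userspace
sketch. It is only a model under stated assumptions: struct dev_file,
struct target and the functions below are hypothetical stand-ins for the
device file and the buftarg, not XFS code. The local file pointer is
cleared as soon as the target owns the file, so the fall-through unwind
labels cannot release it a second time, and the output pointer is cleared
so later teardown does not see a freed target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the XFS types. */
struct dev_file { int releases; };	/* counts how often it was released */
struct target { struct dev_file *file; };	/* owns @file once allocated */

void file_release(struct dev_file *f)
{
	f->releases++;
}

/* Allocate a target that takes ownership of @f; NULL if @fail or out of memory. */
struct target *target_alloc(struct dev_file *f, bool fail)
{
	struct target *t;

	if (fail)
		return NULL;
	t = malloc(sizeof(*t));
	if (!t)
		return NULL;
	t->file = f;
	return t;
}

/* Freeing the target also releases the file it owns. */
void target_free(struct target *t)
{
	assert(t && t->file);
	file_release(t->file);
	free(t);
}

/*
 * Set up a target for @rt_file. @fail_second models a later step failing
 * after the target already exists. Returns 0 on success, -1 on failure.
 */
int open_devices(struct dev_file *rt_file, bool fail_second, struct target **rt_out)
{
	struct target *rt;

	*rt_out = NULL;
	rt = target_alloc(rt_file, false);
	if (!rt)
		goto out_close_file;
	rt_file = NULL;		/* ownership moved into @rt */

	if (fail_second)
		goto out_free_target;

	*rt_out = rt;
	return 0;

out_free_target:
	target_free(rt);	/* releases the file exactly once */
out_close_file:
	if (rt_file)		/* only true if the target never owned it */
		file_release(rt_file);
	return -1;
}
```

Clearing the local pointer right after the successful allocation, rather
than in the unwind path, keeps the ownership hand-over next to the line
that causes it.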

> ---
>  fs/xfs/xfs_super.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index eac7f9503805..8531d526fc44 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -534,8 +534,11 @@ xfs_open_devices(
>   out_free_rtdev_targ:
>  	if (mp->m_rtdev_targp)
>  		xfs_free_buftarg(mp->m_rtdev_targp);
> +	mp->m_rtdev_targp = NULL;
> +	rtdev_file = NULL;	/* released by xfs_free_buftarg() */
>   out_free_ddev_targ:
>  	xfs_free_buftarg(mp->m_ddev_targp);
> +	mp->m_ddev_targp = NULL;
>   out_close_rtdev:
>  	 if (rtdev_file)
>  		bdev_fput(rtdev_file);
> 
> -- 
> 2.47.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH 1/2 blktests] src/miniublk: switch to ioctl-encoded ublk commands
From: Sebastian Chlad @ 2026-06-22 13:34 UTC
  To: Shin'ichiro Kawasaki; +Cc: Sebastian Chlad, linux-block
In-Reply-To: <ajSud4Y4PmCu2X_5@shinmob>

Hi Shin'ichiro,

On Fri, Jun 19, 2026 at 5:26 AM Shin'ichiro Kawasaki
<shinichiro.kawasaki@wdc.com> wrote:
>
> Hi Sebastian,
>
> Thanks for the patches. I agree that this direction is good: it's better to
> shift away from the legacy interface.
>
> One point I noticed is that src/miniublk.c can no longer be built with the
> kernel headers of what is probably just the v6.1.y LTS kernel (v5.15.y does
> not have ublk and v6.6.y supports the new interface). This is a rather small
> window and may be acceptable, but I wonder what you think about it.
>
> If we drop the miniublk build with v6.1.y kernel headers, it might be better
> to check before building miniublk. I quickly created a Makefile change [1] for
> that purpose.

You're right, sorry for the omission. I'll incorporate your Makefile
fix into v2 with a Suggested-by tag.

>
> Also, please find a comment in line below.
>
> On Jun 17, 2026 / 09:25, Sebastian Chlad wrote:
> > Kernels built without CONFIG_BLKDEV_UBLK_LEGACY_OPCODES reject the
> > legacy raw UBLK_CMD_* and UBLK_IO_* opcodes. Switch miniublk to use
> > the ioctl-encoded UBLK_U_CMD_* and UBLK_U_IO_* variants defined in
> > linux/ublk_cmd.h instead.
> >
> > For IO commands, the ioctl-encoded opcode is used for submission while
> > _IOC_NR() extracts the raw NR bits for build_user_data(), keeping the
> > user_data tag encoding intact.
> >
> > Signed-off-by: Sebastian Chlad <sebastian.chlad@suse.com>
> > ---
> >  src/miniublk.c | 30 +++++++++++++++---------------
> >  1 file changed, 15 insertions(+), 15 deletions(-)
> >
> > diff --git a/src/miniublk.c b/src/miniublk.c
> > index f98f850..5a35ca7 100644
> > --- a/src/miniublk.c
> > +++ b/src/miniublk.c
> [...]
> > @@ -624,9 +624,9 @@ static int ublk_queue_io_cmd(struct ublk_queue *q,
> >               return 0;
> >
> >       if (io->flags & UBLKSRV_NEED_COMMIT_RQ_COMP)
> > -             cmd_op = UBLK_IO_COMMIT_AND_FETCH_REQ;
> > -     else if (io->flags & UBLKSRV_NEED_FETCH_RQ)
> > -             cmd_op = UBLK_IO_FETCH_REQ;
> > +             cmd_op = UBLK_U_IO_COMMIT_AND_FETCH_REQ;
> > +     else
> > +             cmd_op = UBLK_U_IO_FETCH_REQ;
>
> The hunk above changes the "else if" part, is this intentional?
>

Yes, this is intentional. The function already checks
    if (!(io->flags &
        (UBLKSRV_NEED_FETCH_RQ | UBLKSRV_NEED_COMMIT_RQ_COMP)))
and returns early if neither flag is set. Once the
UBLKSRV_NEED_COMMIT_RQ_COMP test fails, UBLKSRV_NEED_FETCH_RQ must be set,
so a second check is redundant: we know we need UBLK_U_IO_FETCH_REQ.

However, if you think it's safer to still check io->flags &
UBLKSRV_NEED_FETCH_RQ, I can implement it that way in v2.
Let me know what you prefer.
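
To make the argument about the early return concrete, here is a small
standalone sketch. The flag values and the enum cmd_op names are
assumptions chosen for illustration and are not taken from miniublk or
linux/ublk_cmd.h; only the control flow mirrors the hunk under discussion.

```c
#include <assert.h>

/* Assumed flag values and opcode stand-ins, for illustration only. */
enum {
	NEED_FETCH_RQ		= 1 << 0,
	NEED_COMMIT_RQ_COMP	= 1 << 1,
};

enum cmd_op { OP_NONE, OP_FETCH, OP_COMMIT_AND_FETCH };

/* Pick the command to queue for an I/O slot with the given flags. */
enum cmd_op pick_cmd_op(unsigned int flags)
{
	/* Early return: nothing to queue if neither flag is set. */
	if (!(flags & (NEED_FETCH_RQ | NEED_COMMIT_RQ_COMP)))
		return OP_NONE;

	if (flags & NEED_COMMIT_RQ_COMP)
		return OP_COMMIT_AND_FETCH;

	/* At least one flag is set and it is not COMMIT, so it is FETCH. */
	assert(flags & NEED_FETCH_RQ);
	return OP_FETCH;
}
```

The assert documents the invariant that the dropped "else if" used to
test; keeping the explicit check instead would be equivalent in behaviour.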

I will hold v2 until your reply and then address either the Makefile
change alone or both changes, depending on your input.

>
> [1]
>
> diff --git a/src/Makefile b/src/Makefile
> index d8833bf..adfe3ef 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -8,6 +8,10 @@ HAVE_C_MACRO = $(shell if echo "$(H)include <$(1)>" |  \
>                 $(CC) $(CFLAGS) -E - 2>&1 /dev/null | grep $(2) > /dev/null 2>&1; \
>                 then echo 1;else echo 0; fi)
>
> +HAVE_C_DEF = $(shell if echo -e "$(H)include <$(1)>\n#ifdef $(2)\nHAVE_$(2)\n#endif" | \
> +               $(CC) $(CFLAGS) -E - 2>&1 /dev/null | grep HAVE_$(2) > /dev/null 2>&1; \
> +               then echo 1;else echo 0; fi)
> +
>  C_TARGETS := \
>         dio-offsets \
>         loblksize \
> @@ -27,6 +31,7 @@ C_UBLK_TARGETS := miniublk
>
>  HAVE_LIBURING := $(call HAVE_C_MACRO,liburing.h,IORING_OP_URING_CMD)
>  HAVE_UBLK_HEADER := $(call HAVE_C_HEADER,linux/ublk_cmd.h,1)
> +HAVE_NEW_UBLK_INTF := $(call HAVE_C_DEF,linux/ublk_cmd.h,UBLK_U_CMD_START_DEV)
>
>  CXX_TARGETS := \
>         discontiguous-io
> @@ -37,8 +42,12 @@ SYZKALLER_TARGETS := \
>  TARGETS := $(C_TARGETS) $(CXX_TARGETS) $(SYZKALLER_TARGETS)
>
>  ifeq ($(HAVE_UBLK_HEADER), 1)
> +ifeq ($(HAVE_NEW_UBLK_INTF), 1)
>  C_URING_TARGETS += $(C_UBLK_TARGETS)
>  else
> +$(info Skip $(C_UBLK_TARGETS) build due to missing new ublk interface(v6.4+))
> +endif
> +else
>  $(info Skip $(C_UBLK_TARGETS) build due to missing kernel header(v6.0+))
>  endif
>
>


* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Gary Guo @ 2026-06-22 13:06 UTC
  To: Alvin Sun, Gary Guo, Miguel Ojeda, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Luis Chamberlain, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <2d54f3e0-3f35-4f97-b6af-b3ceb1aca246@linux.dev>

On Mon Jun 22, 2026 at 1:52 PM BST, Alvin Sun wrote:
>
> On 6/22/26 18:50, Gary Guo wrote:
>> On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
>>> Since `const_refs_to_static` has been stable as of the MSRV bump, a
>>> `ThisModule` pointer can now be used in const contexts.
>>>
>>> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
>>> can provide their `ThisModule` pointer in const contexts such as static
>>> `file_operations`.
>>>
>>> Move the `THIS_MODULE` static from the `module!` macro into the
>>> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
>>> to use it.
>>>
>>> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
>>> ---
>>>   rust/kernel/lib.rs    |  8 ++++++++
>>>   rust/macros/module.rs | 34 +++++++++++++++++-----------------
>>>   2 files changed, 25 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>> index b72b2fbe046d6..50f5a7b5f028e 100644
>>> --- a/rust/kernel/lib.rs
>>> +++ b/rust/kernel/lib.rs
>>> @@ -184,6 +184,14 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Erro
>>>   pub trait ModuleMetadata {
>>>       /// The name of the module as specified in the `module!` macro.
>>>       const NAME: &'static crate::str::CStr;
>>> +
>>> +    /// The module's `THIS_MODULE` pointer.
>>> +    const THIS_MODULE: ThisModule;
>>> +}
>>> +
>>> +/// Returns a reference to the `THIS_MODULE` of the given module type.
>>> +pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
>>> +    &M::THIS_MODULE
>>>   }
>> Also, FWIW I think this should not be put in the crate root. Perhaps create a
>> modules.rs?
>
> Makes sense. I'll create a new `module.rs` and move the module-related items
> (`ModuleMetadata`, `ThisModule`, `this_module()`) there, then re-export from
> `lib.rs`.

Please do not re-export `this_module`. For the other two, I think it's fine to
re-export to avoid tree-wide changes, but please do update the users in code
that would be routed via the Rust tree.

Best,
Gary


* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Alvin Sun @ 2026-06-22 12:52 UTC
  To: Gary Guo, Miguel Ojeda, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <DJFIQPLOVO4T.1K8T0VZM30LDA@garyguo.net>


On 6/22/26 18:50, Gary Guo wrote:
> On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
>> Since `const_refs_to_static` has been stable as of the MSRV bump, a
>> `ThisModule` pointer can now be used in const contexts.
>>
>> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
>> can provide their `ThisModule` pointer in const contexts such as static
>> `file_operations`.
>>
>> Move the `THIS_MODULE` static from the `module!` macro into the
>> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
>> to use it.
>>
>> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
>> ---
>>   rust/kernel/lib.rs    |  8 ++++++++
>>   rust/macros/module.rs | 34 +++++++++++++++++-----------------
>>   2 files changed, 25 insertions(+), 17 deletions(-)
>>
>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>> index b72b2fbe046d6..50f5a7b5f028e 100644
>> --- a/rust/kernel/lib.rs
>> +++ b/rust/kernel/lib.rs
>> @@ -184,6 +184,14 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Erro
>>   pub trait ModuleMetadata {
>>       /// The name of the module as specified in the `module!` macro.
>>       const NAME: &'static crate::str::CStr;
>> +
>> +    /// The module's `THIS_MODULE` pointer.
>> +    const THIS_MODULE: ThisModule;
>> +}
>> +
>> +/// Returns a reference to the `THIS_MODULE` of the given module type.
>> +pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
>> +    &M::THIS_MODULE
>>   }
> Also, FWIW I think this should not be put in the crate root. Perhaps create a
> modules.rs?

Makes sense. I'll create a new `module.rs` and move the module-related items
(`ModuleMetadata`, `ThisModule`, `this_module()`) there, then re-export from
`lib.rs`.
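
A minimal sketch of that layout, using simplified stand-ins rather than the
real kernel types: the items live in a `module` submodule and the crate root
re-exports only `ModuleMetadata` and `ThisModule`, since a later reply in this
thread asks that `this_module` not be re-exported. The `id` field, `new()`,
`id()` and the `Demo` type are hypothetical and exist only so the example
builds outside the kernel tree.

```rust
// Stand-in for a new `rust/kernel/module.rs` (assumed file name).
pub mod module {
    /// Simplified stand-in for `ThisModule`; the real type wraps a pointer
    /// to the C `struct module`.
    pub struct ThisModule {
        id: usize,
    }

    impl ThisModule {
        /// Hypothetical constructor, used only by this sketch.
        pub const fn new(id: usize) -> Self {
            Self { id }
        }

        /// Hypothetical accessor, used only by this sketch.
        pub const fn id(&self) -> usize {
            self.id
        }
    }

    pub trait ModuleMetadata {
        const NAME: &'static str;
        const THIS_MODULE: ThisModule;
    }

    /// Returns a reference to the `THIS_MODULE` of the given module type.
    pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
        &M::THIS_MODULE
    }
}

// Crate-root re-exports keep existing `kernel::ThisModule` and
// `kernel::ModuleMetadata` paths working; `this_module` stays behind
// the `module::` path.
pub use module::{ModuleMetadata, ThisModule};

/// Hypothetical module type standing in for what `module!` generates.
pub struct Demo;

impl ModuleMetadata for Demo {
    const NAME: &'static str = "demo";
    const THIS_MODULE: ThisModule = ThisModule::new(7);
}
```

With this shape, callers would write `module::this_module::<Demo>()`, while
code that only names the trait or the type needs no change.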

Best regards,
Alvin Sun

>
> Best,
> Gary
>
>>   
>>   /// Equivalent to `THIS_MODULE` in the C API.
>> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
>> index 06c18e2075083..b9fdee2f2af47 100644
>> --- a/rust/macros/module.rs
>> +++ b/rust/macros/module.rs
>> @@ -497,28 +497,28 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
>>           /// Used by the printing macros, e.g. [`info!`].
>>           const __LOG_PREFIX: &[u8] = #name_cstr.to_bytes_with_nul();
>>   
>> -        // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
>> -        // freed until the module is unloaded.
>> -        #[cfg(MODULE)]
>> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
>> -            extern "C" {
>> -                static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
>> -            };
>> -
>> -            ::kernel::ThisModule::from_ptr(__this_module.get())
>> -        };
>> -
>> -        #[cfg(not(MODULE))]
>> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
>> -            ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
>> -        };
>> -
>>           /// The `LocalModule` type is the type of the module created by `module!`,
>>           /// `module_pci_driver!`, `module_platform_driver!`, etc.
>>           type LocalModule = #type_;
>>   
>>           impl ::kernel::ModuleMetadata for #type_ {
>>               const NAME: &'static ::kernel::str::CStr = #name_cstr;
>> +
>> +            #[cfg(MODULE)]
>> +            const THIS_MODULE: ::kernel::ThisModule = {
>> +                extern "C" {
>> +                    static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
>> +                }
>> +
>> +                // SAFETY: `__this_module` is constructed by the kernel at load time
>> +                // and lives until the module is unloaded.
>> +                unsafe { ::kernel::ThisModule::from_ptr(__this_module.get()) }
>> +            };
>> +
>> +            #[cfg(not(MODULE))]
>> +            const THIS_MODULE: ::kernel::ThisModule = unsafe {
>> +                ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
>> +            };
>>           }
>>   
>>           // Double nested modules, since then nobody can access the public items inside.
>> @@ -616,7 +616,7 @@ pub extern "C" fn #ident_exit() {
>>                   /// This function must only be called once.
>>                   unsafe fn __init() -> ::kernel::ffi::c_int {
>>                       let initer = <super::super::LocalModule as ::kernel::InPlaceModule>::init(
>> -                        &super::super::THIS_MODULE
>> +                        ::kernel::this_module::<super::super::LocalModule>()
>>                       );
>>                       // SAFETY: No data race, since `__MOD` can only be accessed by this module
>>                       // and there only `__init` and `__exit` access it. These functions are only
>

* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Alvin Sun @ 2026-06-22 12:42 UTC (permalink / raw)
  To: Gary Guo, Miguel Ojeda, Boqun Feng, Björn Roy Baron,
	Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <DJFIKIXLML05.3KYOXUGZYJRDZ@garyguo.net>


On 6/22/26 18:42, Gary Guo wrote:
> On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
>> Since `const_refs_to_static` has been stable as of the MSRV bump, a
>> `ThisModule` pointer can now be used in const contexts.
>>
>> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
>> can provide their `ThisModule` pointer in const contexts such as static
>> `file_operations`.
>>
>> Move the `THIS_MODULE` static from the `module!` macro into the
>> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
>> to use it.
> Doesn't this break existing users of THIS_MODULE?

You are right, I missed binder. I will add a patch to update binder to use
`this_module::<LocalModule>()` in the next version.
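
For illustration, a self-contained sketch of what such a call site could look
like when the owner is needed in a const context. The types here are
simplified stand-ins, not the kernel's: `FileOps`, the `id` field and
`owner_id()` are hypothetical, and the real `ThisModule` wraps a pointer to the
C `struct module`.

```rust
/// Simplified stand-in for `kernel::ThisModule`.
pub struct ThisModule {
    id: usize,
}

pub trait ModuleMetadata {
    const NAME: &'static str;
    const THIS_MODULE: ThisModule;
}

/// Same shape as the helper in the patch: usable in const contexts.
pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
    &M::THIS_MODULE
}

/// Hypothetical ops table with an `owner` field, loosely modelled on
/// `file_operations`.
pub struct FileOps {
    pub owner: &'static ThisModule,
}

/// Stand-in for the type alias that `module!` generates.
pub struct LocalModule;

impl ModuleMetadata for LocalModule {
    const NAME: &'static str = "demo";
    const THIS_MODULE: ThisModule = ThisModule { id: 42 };
}

// The owner is filled in by a static initializer, without referring to a
// module-local `THIS_MODULE` static.
pub static FOPS: FileOps = FileOps {
    owner: this_module::<LocalModule>(),
};

/// Hypothetical helper that reads the owner back, for demonstration only.
pub fn owner_id() -> usize {
    FOPS.owner.id
}
```

The point of the sketch is only that the generic helper can be evaluated in a
static initializer; how binder, rnull and the configfs macros are actually
converted is up to the next version of the series.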

While looking into this, I also noticed `gen_disk.rs` has a `// TODO: Set to
THIS_MODULE` with `owner` still set to `null_mut()`.

Best regards,
Alvin Sun

>
> Binder, rnull and configfs macros are using it.
>
> Best,
> Gary
>
>> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
>> ---
>>   rust/kernel/lib.rs    |  8 ++++++++
>>   rust/macros/module.rs | 34 +++++++++++++++++-----------------
>>   2 files changed, 25 insertions(+), 17 deletions(-)
>>
>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>> index b72b2fbe046d6..50f5a7b5f028e 100644
>> --- a/rust/kernel/lib.rs
>> +++ b/rust/kernel/lib.rs
>> @@ -184,6 +184,14 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Erro
>>   pub trait ModuleMetadata {
>>       /// The name of the module as specified in the `module!` macro.
>>       const NAME: &'static crate::str::CStr;
>> +
>> +    /// The module's `THIS_MODULE` pointer.
>> +    const THIS_MODULE: ThisModule;
>> +}
>> +
>> +/// Returns a reference to the `THIS_MODULE` of the given module type.
>> +pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
>> +    &M::THIS_MODULE
>>   }
>>   
>>   /// Equivalent to `THIS_MODULE` in the C API.
>> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
>> index 06c18e2075083..b9fdee2f2af47 100644
>> --- a/rust/macros/module.rs
>> +++ b/rust/macros/module.rs
>> @@ -497,28 +497,28 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
>>           /// Used by the printing macros, e.g. [`info!`].
>>           const __LOG_PREFIX: &[u8] = #name_cstr.to_bytes_with_nul();
>>   
>> -        // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
>> -        // freed until the module is unloaded.
>> -        #[cfg(MODULE)]
>> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
>> -            extern "C" {
>> -                static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
>> -            };
>> -
>> -            ::kernel::ThisModule::from_ptr(__this_module.get())
>> -        };
>> -
>> -        #[cfg(not(MODULE))]
>> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
>> -            ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
>> -        };
>> -
>>           /// The `LocalModule` type is the type of the module created by `module!`,
>>           /// `module_pci_driver!`, `module_platform_driver!`, etc.
>>           type LocalModule = #type_;
>>   
>>           impl ::kernel::ModuleMetadata for #type_ {
>>               const NAME: &'static ::kernel::str::CStr = #name_cstr;
>> +
>> +            #[cfg(MODULE)]
>> +            const THIS_MODULE: ::kernel::ThisModule = {
>> +                extern "C" {
>> +                    static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
>> +                }
>> +
>> +                // SAFETY: `__this_module` is constructed by the kernel at load time
>> +                // and lives until the module is unloaded.
>> +                unsafe { ::kernel::ThisModule::from_ptr(__this_module.get()) }
>> +            };
>> +
>> +            #[cfg(not(MODULE))]
>> +            const THIS_MODULE: ::kernel::ThisModule = unsafe {
>> +                ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
>> +            };
>>           }
>>   
>>           // Double nested modules, since then nobody can access the public items inside.
>> @@ -616,7 +616,7 @@ pub extern "C" fn #ident_exit() {
>>                   /// This function must only be called once.
>>                   unsafe fn __init() -> ::kernel::ffi::c_int {
>>                       let initer = <super::super::LocalModule as ::kernel::InPlaceModule>::init(
>> -                        &super::super::THIS_MODULE
>> +                        ::kernel::this_module::<super::super::LocalModule>()
>>                       );
>>                       // SAFETY: No data race, since `__MOD` can only be accessed by this module
>>                       // and there only `__init` and `__exit` access it. These functions are only
>

* Re: [PATCH blktests] Fix _get_page_size()
From: Shin'ichiro Kawasaki @ 2026-06-22 11:38 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jeff Moyer, linux-block, osandov, kch
In-Reply-To: <089e0281-4df8-4358-91ce-1f5cc0f0ec4b@acm.org>

On Jun 20, 2026 / 09:11, Bart Van Assche wrote:
> On 6/20/26 6:51 AM, Shin'ichiro Kawasaki wrote:
> > On Jun 20, 2026 / 05:55, Bart Van Assche wrote:
> > > On 6/20/26 3:26 AM, Shin'ichiro Kawasaki wrote:
> > > > This is a rather fundamental change, so I would like to ask for opinions from
> > > > other blktests users, especially Omar and Chaitanya. What do you think about
> > > > the idea of adding getconf to the requirement list?
> > > 
> > > CONFIG_PAGE_SHIFT was introduced in the Linux kernel in February 2024
> > > (commit ba89f9c8ccba ("arch: consolidate existing CONFIG_PAGE_SIZE_*KB
> > > definitions")). Older kernels had CONFIG_PAGE_SIZE_4KB,
> > > CONFIG_PAGE_SIZE_16KB, etc. This means that it is possible to derive the
> > > kernel page size from the kernel configuration file for all upstream and
> > > distro kernels, isn't it?
> > 
> > I checked, and the commit is in tag v6.9. My Debian bookworm system runs kernel
> > v6.1, so, as expected, the config file in /boot does not have CONFIG_PAGE_SHIFT.
> > But it does not have CONFIG_PAGE_SIZE_* either... I'm still afraid that the
> > kernel config file approach is not reliable.
> 
> Right, for older kernels CONFIG_PAGE_SIZE_*KB is only available for some
> but not for all supported architectures.
> 
> It is not clear to me where the desire to avoid the dependency on
> getconf comes from. As far as I know it is available on all Linux
> distros. Since it is typically included in the C library package, it
> should not introduce a new dependency.

I think fewer dependencies are better in general, and I wanted to confirm that
this is fine for everybody. If nobody objects, I will create a patch to add
getconf to the requirement list.

* Re: [PATCH v3 0/7] Prepare mutable list iterators to cache cursor state
From: David Hildenbrand (Arm) @ 2026-06-22 11:27 UTC (permalink / raw)
  To: Alexei Starovoitov, Kaitao Cheng
  Cc: Andrew Morton, Jens Axboe, Tejun Heo, Alexander Viro,
	Christian Brauner, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Johannes Weiner, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Thomas Gleixner,
	Juri Lelli, Vincent Guittot, Paul Moore, Andy Shevchenko,
	Paul E. McKenney, Shakeel Butt, Christian König,
	David Howells, Simona Vetter, Randy Dunlap, Luca Ceresoli,
	Philipp Stanner, linux-block, LKML,
	open list:CONTROL GROUP (CGROUP), linux-ntfs-dev, Linux-Fsdevel,
	io-uring, audit, bpf, Network Development, dri-devel,
	linux-perf-use., linux-trace-kernel, kexec, live-patching,
	linux-modules, Linux Crypto Mailing List, Linux Power Management,
	rcu, sched-ext, linux-mm, virtualization, damon,
	clang-built-linux, chengkaitao
In-Reply-To: <CAADnVQJmPWFT01b7DuLdtafv=8FyB84GYHNZ8zSTck+9Aw0JpA@mail.gmail.com>

On 6/22/26 07:28, Alexei Starovoitov wrote:
> On Sun, Jun 21, 2026 at 9:06 PM Kaitao Cheng <kaitao.cheng@linux.dev> wrote:
>>
>> From: chengkaitao <chengkaitao@kylinos.cn>
>>
>> The list_for_each*_safe() helpers are used when the loop body may remove
>> the current entry.  Their current interface, however, forces every caller
>> to define a temporary cursor outside the macro and pass it in, even when
>> the caller never uses that cursor directly.  For most call sites this
>> extra cursor is just boilerplate required by the macro implementation.
>>
>> This is awkward because the saved next pointer is an internal detail of
>> the iteration.  Callers that only remove or move the current entry do not
>> need to spell it out.
>>
>> The _safe() suffix has also caused confusion.  Christian Koenig pointed
>> out that the name is easy to read as a thread-safe variant, especially
>> for beginners, even though it only means that the iterator keeps enough
>> state to tolerate removal of the current entry.  He suggested _mutable()
>> as a clearer description of what the loop permits.
>>
>> Add *_mutable() iterator variants for list, hlist and llist.  The new
>> helpers are variadic and support both forms.  In the common case, the
>> caller omits the temporary cursor and the macro creates a unique internal
>> cursor with typeof(pos) and __UNIQUE_ID().  If a loop really needs an
>> explicit temporary cursor, the caller can still pass it and the helper
>> keeps the existing *_safe() behaviour.
>>
>> For example, a call site may use the shorter form:
>>
>>   list_for_each_entry_mutable(pos, head, member)
>>
>> or keep the explicit temporary cursor form:
>>
>>   list_for_each_entry_mutable(pos, tmp, head, member)
>>
>> The existing *_safe() helpers remain available for compatibility.  This
>> series only converts users in mm, block, kernel, init and io_uring.  If
>> this approach looks acceptable, the remaining users can be converted in
>> follow-up series.
>>
>> Changes in v3 (Christian König, Andy Shevchenko):
>> - Convert safe list walks to mutable iterators
>>
>> Changes in v2 (Muchun Song, Andy Shevchenko):
>> - Drop the list_for_each_entry_mutable*() helpers from v1 and make the
>>   cursor change directly in the existing list_for_each_entry*() helpers.
>> - Open-code special list walks that rely on updating the loop cursor in
>>   the body, preserving their existing traversal semantics.
>>
>> Link to v2:
>> https://lore.kernel.org/all/20260609061347.93688-1-kaitao.cheng@linux.dev/
>>
>> Link to v1:
>> https://lore.kernel.org/all/20260529082149.76764-1-kaitao.cheng@linux.dev/
>>
>> Kaitao Cheng (7):
>>   list: Add mutable iterator variants
>>   llist: Add mutable iterator variants
>>   mm: Use mutable list iterators
>>   block: Use mutable list iterators
>>   kernel: Use mutable list iterators
>>   initramfs: Use mutable list iterator
>>   io_uring: Use mutable list iterators
>>
>>  block/bfq-iosched.c                 |  17 +-
>>  block/blk-cgroup.c                  |  12 +-
>>  block/blk-flush.c                   |   4 +-
>>  block/blk-iocost.c                  |  18 +-
>>  block/blk-mq.c                      |   8 +-
>>  block/blk-throttle.c                |   4 +-
>>  block/kyber-iosched.c               |   4 +-
>>  block/partitions/ldm.c              |   8 +-
>>  block/sed-opal.c                    |   4 +-
>>  include/linux/list.h                | 269 ++++++++++++++++++++++++----
>>  include/linux/llist.h               |  81 +++++++--
>>  init/initramfs.c                    |   5 +-
>>  io_uring/cancel.c                   |   6 +-
>>  io_uring/poll.c                     |   3 +-
>>  io_uring/rw.c                       |   4 +-
>>  io_uring/timeout.c                  |   8 +-
>>  io_uring/uring_cmd.c                |   3 +-
>>  kernel/audit_tree.c                 |   4 +-
>>  kernel/audit_watch.c                |  16 +-
>>  kernel/auditfilter.c                |   4 +-
>>  kernel/auditsc.c                    |   4 +-
>>  kernel/bpf/arena.c                  |  10 +-
>>  kernel/bpf/arraymap.c               |   8 +-
>>  kernel/bpf/bpf_local_storage.c      |   3 +-
>>  kernel/bpf/bpf_lru_list.c           |  25 ++-
>>  kernel/bpf/btf.c                    |  18 +-
>>  kernel/bpf/cgroup.c                 |   7 +-
>>  kernel/bpf/cpumap.c                 |   4 +-
>>  kernel/bpf/devmap.c                 |  10 +-
>>  kernel/bpf/helpers.c                |   8 +-
>>  kernel/bpf/local_storage.c          |   4 +-
>>  kernel/bpf/memalloc.c               |  16 +-
>>  kernel/bpf/offload.c                |   8 +-
>>  kernel/bpf/states.c                 |   4 +-
>>  kernel/bpf/stream.c                 |   4 +-
>>  kernel/bpf/verifier.c               |   6 +-
>>  kernel/cgroup/cgroup-v1.c           |   4 +-
>>  kernel/cgroup/cgroup.c              |  54 +++---
>>  kernel/cgroup/dmem.c                |  12 +-
>>  kernel/cgroup/rdma.c                |   8 +-
>>  kernel/events/core.c                |  44 +++--
>>  kernel/events/uprobes.c             |  12 +-
>>  kernel/exit.c                       |   8 +-
>>  kernel/fail_function.c              |   4 +-
>>  kernel/gcov/clang.c                 |   4 +-
>>  kernel/irq_work.c                   |   4 +-
>>  kernel/kexec_core.c                 |   4 +-
>>  kernel/kprobes.c                    |  16 +-
>>  kernel/livepatch/core.c             |   4 +-
>>  kernel/livepatch/core.h             |   4 +-
>>  kernel/liveupdate/kho_block.c       |   4 +-
>>  kernel/liveupdate/luo_flb.c         |   4 +-
>>  kernel/locking/rwsem.c              |   2 +-
>>  kernel/locking/test-ww_mutex.c      |   2 +-
>>  kernel/module/main.c                |  11 +-
>>  kernel/padata.c                     |   4 +-
>>  kernel/power/snapshot.c             |   8 +-
>>  kernel/power/wakelock.c             |   4 +-
>>  kernel/printk/printk.c              |  11 +-
>>  kernel/ptrace.c                     |   4 +-
>>  kernel/rcu/rcutorture.c             |   3 +-
>>  kernel/rcu/tasks.h                  |   9 +-
>>  kernel/rcu/tree.c                   |   6 +-
>>  kernel/resource.c                   |   4 +-
>>  kernel/sched/core.c                 |   4 +-
>>  kernel/sched/ext.c                  |  22 +--
>>  kernel/sched/fair.c                 |  28 +--
>>  kernel/sched/topology.c             |   4 +-
>>  kernel/sched/wait.c                 |   4 +-
>>  kernel/seccomp.c                    |   4 +-
>>  kernel/signal.c                     |  11 +-
>>  kernel/smp.c                        |   4 +-
>>  kernel/taskstats.c                  |   8 +-
>>  kernel/time/clockevents.c           |   6 +-
>>  kernel/time/clocksource.c           |   4 +-
>>  kernel/time/posix-cpu-timers.c      |   4 +-
>>  kernel/time/posix-timers.c          |   3 +-
>>  kernel/torture.c                    |   3 +-
>>  kernel/trace/bpf_trace.c            |   4 +-
>>  kernel/trace/ftrace.c               |  49 +++--
>>  kernel/trace/ring_buffer.c          |  25 ++-
>>  kernel/trace/trace.c                |  12 +-
>>  kernel/trace/trace_dynevent.c       |   6 +-
>>  kernel/trace/trace_dynevent.h       |   5 +-
>>  kernel/trace/trace_events.c         |  35 ++--
>>  kernel/trace/trace_events_filter.c  |   4 +-
>>  kernel/trace/trace_events_hist.c    |   8 +-
>>  kernel/trace/trace_events_trigger.c |  17 +-
>>  kernel/trace/trace_events_user.c    |  16 +-
>>  kernel/trace/trace_stat.c           |   4 +-
>>  kernel/user-return-notifier.c       |   3 +-
>>  kernel/workqueue.c                  |  16 +-
>>  mm/backing-dev.c                    |   8 +-
>>  mm/balloon.c                        |   8 +-
>>  mm/cma.c                            |   4 +-
>>  mm/compaction.c                     |   4 +-
>>  mm/damon/core.c                     |   4 +-
>>  mm/damon/sysfs-schemes.c            |   4 +-
>>  mm/dmapool.c                        |   4 +-
>>  mm/huge_memory.c                    |   8 +-
>>  mm/hugetlb.c                        |  56 +++---
>>  mm/hugetlb_vmemmap.c                |  16 +-
>>  mm/khugepaged.c                     |  14 +-
>>  mm/kmemleak.c                       |   7 +-
>>  mm/ksm.c                            |  25 +--
>>  mm/list_lru.c                       |   4 +-
>>  mm/memcontrol-v1.c                  |   8 +-
>>  mm/memory-failure.c                 |  12 +-
>>  mm/memory-tiers.c                   |   4 +-
>>  mm/migrate.c                        |  23 ++-
>>  mm/mmu_notifier.c                   |   9 +-
>>  mm/page_alloc.c                     |   8 +-
>>  mm/page_reporting.c                 |   2 +-
>>  mm/percpu.c                         |  11 +-
>>  mm/pgtable-generic.c                |   4 +-
>>  mm/rmap.c                           |  10 +-
>>  mm/shmem.c                          |   9 +-
>>  mm/slab_common.c                    |  14 +-
>>  mm/slub.c                           |  33 ++--
>>  mm/swapfile.c                       |   4 +-
>>  mm/userfaultfd.c                    |  12 +-
>>  mm/vmalloc.c                        |  24 +--
>>  mm/vmscan.c                         |   7 +-
>>  mm/zsmalloc.c                       |   4 +-
>>  124 files changed, 875 insertions(+), 681 deletions(-)
> 
> Not sure what you were thinking, but this diff stat
> is not landable.

Agreed. If we decide we want this, I guess we should target per-subsystem
conversions.

If this goes through the MM tree, I would even appreciate doing this on a per-MM
component granularity.

(unless we have some magic "Linus converts all of them" script, which I doubt we
will have)

Is there a way forward to replace list_for_each_*_safe entirely, possibly just
reusing the old name but simply dropping the parameter?

-- 
Cheers,

David

* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Gary Guo @ 2026-06-22 10:50 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Luis Chamberlain, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <20260622-fix-fops-owner-v3-1-49d45cb37032@linux.dev>

On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
> Since `const_refs_to_static` has been stable as of the MSRV bump, a
> `ThisModule` pointer can now be used in const contexts.
>
> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
> can provide their `ThisModule` pointer in const contexts such as static
> `file_operations`.
>
> Move the `THIS_MODULE` static from the `module!` macro into the
> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
> to use it.
>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
> ---
>  rust/kernel/lib.rs    |  8 ++++++++
>  rust/macros/module.rs | 34 +++++++++++++++++-----------------
>  2 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index b72b2fbe046d6..50f5a7b5f028e 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -184,6 +184,14 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Erro
>  pub trait ModuleMetadata {
>      /// The name of the module as specified in the `module!` macro.
>      const NAME: &'static crate::str::CStr;
> +
> +    /// The module's `THIS_MODULE` pointer.
> +    const THIS_MODULE: ThisModule;
> +}
> +
> +/// Returns a reference to the `THIS_MODULE` of the given module type.
> +pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
> +    &M::THIS_MODULE
>  }

Also, FWIW I think this should not be put in the crate root. Perhaps create a
modules.rs?

Best,
Gary

>  
>  /// Equivalent to `THIS_MODULE` in the C API.
> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
> index 06c18e2075083..b9fdee2f2af47 100644
> --- a/rust/macros/module.rs
> +++ b/rust/macros/module.rs
> @@ -497,28 +497,28 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
>          /// Used by the printing macros, e.g. [`info!`].
>          const __LOG_PREFIX: &[u8] = #name_cstr.to_bytes_with_nul();
>  
> -        // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
> -        // freed until the module is unloaded.
> -        #[cfg(MODULE)]
> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
> -            extern "C" {
> -                static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
> -            };
> -
> -            ::kernel::ThisModule::from_ptr(__this_module.get())
> -        };
> -
> -        #[cfg(not(MODULE))]
> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
> -            ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
> -        };
> -
>          /// The `LocalModule` type is the type of the module created by `module!`,
>          /// `module_pci_driver!`, `module_platform_driver!`, etc.
>          type LocalModule = #type_;
>  
>          impl ::kernel::ModuleMetadata for #type_ {
>              const NAME: &'static ::kernel::str::CStr = #name_cstr;
> +
> +            #[cfg(MODULE)]
> +            const THIS_MODULE: ::kernel::ThisModule = {
> +                extern "C" {
> +                    static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
> +                }
> +
> +                // SAFETY: `__this_module` is constructed by the kernel at load time
> +                // and lives until the module is unloaded.
> +                unsafe { ::kernel::ThisModule::from_ptr(__this_module.get()) }
> +            };
> +
> +            #[cfg(not(MODULE))]
> +            const THIS_MODULE: ::kernel::ThisModule = unsafe {
> +                ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
> +            };
>          }
>  
>          // Double nested modules, since then nobody can access the public items inside.
> @@ -616,7 +616,7 @@ pub extern "C" fn #ident_exit() {
>                  /// This function must only be called once.
>                  unsafe fn __init() -> ::kernel::ffi::c_int {
>                      let initer = <super::super::LocalModule as ::kernel::InPlaceModule>::init(
> -                        &super::super::THIS_MODULE
> +                        ::kernel::this_module::<super::super::LocalModule>()
>                      );
>                      // SAFETY: No data race, since `__MOD` can only be accessed by this module
>                      // and there only `__init` and `__exit` access it. These functions are only



* Re: [PATCH v3 4/6] rust: drm: set fops.owner from driver module pointer
From: Gary Guo @ 2026-06-22 10:48 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Luis Chamberlain, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <20260622-fix-fops-owner-v3-4-49d45cb37032@linux.dev>

On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
> Change `create_fops()` to accept an owner module pointer instead of
> hardcoding `null_mut()`, ensuring the kernel correctly tracks the
> module owning the DRM device's file operations.
> 
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Gary Guo <gary@garyguo.net>

How are the patch logistics going to be handled? Should this series be routed
via the rust tree, perhaps as fixes?

Best,
Gary

> ---
>  rust/kernel/drm/device.rs  | 3 ++-
>  rust/kernel/drm/gem/mod.rs | 4 ++--
>  2 files changed, 4 insertions(+), 3 deletions(-)


* Re: [PATCH v3 0/7] Prepare mutable list iterators to cache cursor state
From: Andy Shevchenko @ 2026-06-22 10:46 UTC (permalink / raw)
  To: Kaitao Cheng
  Cc: Alexei Starovoitov, Andrew Morton, David Hildenbrand, Jens Axboe,
	Tejun Heo, Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Thomas Gleixner, Juri Lelli, Vincent Guittot, Paul Moore,
	Paul E. McKenney, Shakeel Butt, Christian König,
	David Howells, Simona Vetter, Randy Dunlap, Luca Ceresoli,
	Philipp Stanner, linux-block, LKML,
	open list:CONTROL GROUP (CGROUP), linux-ntfs-dev, Linux-Fsdevel,
	io-uring, audit, bpf, Network Development, dri-devel,
	linux-perf-use., linux-trace-kernel, kexec, live-patching,
	linux-modules, Linux Crypto Mailing List, Linux Power Management,
	rcu, sched-ext, linux-mm, virtualization, damon,
	clang-built-linux, chengkaitao, Muchun Song
In-Reply-To: <8c8f1849-86d3-4c69-be27-30bbdffdf616@linux.dev>

On Mon, Jun 22, 2026 at 02:15:01PM +0800, Kaitao Cheng wrote:
> On 2026/6/22 13:28, Alexei Starovoitov wrote:
> > On Sun, Jun 21, 2026 at 9:06 PM Kaitao Cheng <kaitao.cheng@linux.dev> wrote:

...

> >>  block/bfq-iosched.c                 |  17 +-
> >>  block/blk-cgroup.c                  |  12 +-
> >>  block/blk-flush.c                   |   4 +-
> >>  block/blk-iocost.c                  |  18 +-
> >>  block/blk-mq.c                      |   8 +-
> >>  block/blk-throttle.c                |   4 +-
> >>  block/kyber-iosched.c               |   4 +-
> >>  block/partitions/ldm.c              |   8 +-
> >>  block/sed-opal.c                    |   4 +-
> >>  include/linux/list.h                | 269 ++++++++++++++++++++++++----
> >>  include/linux/llist.h               |  81 +++++++--
> >>  init/initramfs.c                    |   5 +-
> >>  io_uring/cancel.c                   |   6 +-
> >>  io_uring/poll.c                     |   3 +-
> >>  io_uring/rw.c                       |   4 +-
> >>  io_uring/timeout.c                  |   8 +-
> >>  io_uring/uring_cmd.c                |   3 +-
> >>  kernel/audit_tree.c                 |   4 +-
> >>  kernel/audit_watch.c                |  16 +-
> >>  kernel/auditfilter.c                |   4 +-
> >>  kernel/auditsc.c                    |   4 +-
> >>  kernel/bpf/arena.c                  |  10 +-
> >>  kernel/bpf/arraymap.c               |   8 +-
> >>  kernel/bpf/bpf_local_storage.c      |   3 +-
> >>  kernel/bpf/bpf_lru_list.c           |  25 ++-
> >>  kernel/bpf/btf.c                    |  18 +-
> >>  kernel/bpf/cgroup.c                 |   7 +-
> >>  kernel/bpf/cpumap.c                 |   4 +-
> >>  kernel/bpf/devmap.c                 |  10 +-
> >>  kernel/bpf/helpers.c                |   8 +-
> >>  kernel/bpf/local_storage.c          |   4 +-
> >>  kernel/bpf/memalloc.c               |  16 +-
> >>  kernel/bpf/offload.c                |   8 +-
> >>  kernel/bpf/states.c                 |   4 +-
> >>  kernel/bpf/stream.c                 |   4 +-
> >>  kernel/bpf/verifier.c               |   6 +-
> >>  kernel/cgroup/cgroup-v1.c           |   4 +-
> >>  kernel/cgroup/cgroup.c              |  54 +++---
> >>  kernel/cgroup/dmem.c                |  12 +-
> >>  kernel/cgroup/rdma.c                |   8 +-
> >>  kernel/events/core.c                |  44 +++--
> >>  kernel/events/uprobes.c             |  12 +-
> >>  kernel/exit.c                       |   8 +-
> >>  kernel/fail_function.c              |   4 +-
> >>  kernel/gcov/clang.c                 |   4 +-
> >>  kernel/irq_work.c                   |   4 +-
> >>  kernel/kexec_core.c                 |   4 +-
> >>  kernel/kprobes.c                    |  16 +-
> >>  kernel/livepatch/core.c             |   4 +-
> >>  kernel/livepatch/core.h             |   4 +-
> >>  kernel/liveupdate/kho_block.c       |   4 +-
> >>  kernel/liveupdate/luo_flb.c         |   4 +-
> >>  kernel/locking/rwsem.c              |   2 +-
> >>  kernel/locking/test-ww_mutex.c      |   2 +-
> >>  kernel/module/main.c                |  11 +-
> >>  kernel/padata.c                     |   4 +-
> >>  kernel/power/snapshot.c             |   8 +-
> >>  kernel/power/wakelock.c             |   4 +-
> >>  kernel/printk/printk.c              |  11 +-
> >>  kernel/ptrace.c                     |   4 +-
> >>  kernel/rcu/rcutorture.c             |   3 +-
> >>  kernel/rcu/tasks.h                  |   9 +-
> >>  kernel/rcu/tree.c                   |   6 +-
> >>  kernel/resource.c                   |   4 +-
> >>  kernel/sched/core.c                 |   4 +-
> >>  kernel/sched/ext.c                  |  22 +--
> >>  kernel/sched/fair.c                 |  28 +--
> >>  kernel/sched/topology.c             |   4 +-
> >>  kernel/sched/wait.c                 |   4 +-
> >>  kernel/seccomp.c                    |   4 +-
> >>  kernel/signal.c                     |  11 +-
> >>  kernel/smp.c                        |   4 +-
> >>  kernel/taskstats.c                  |   8 +-
> >>  kernel/time/clockevents.c           |   6 +-
> >>  kernel/time/clocksource.c           |   4 +-
> >>  kernel/time/posix-cpu-timers.c      |   4 +-
> >>  kernel/time/posix-timers.c          |   3 +-
> >>  kernel/torture.c                    |   3 +-
> >>  kernel/trace/bpf_trace.c            |   4 +-
> >>  kernel/trace/ftrace.c               |  49 +++--
> >>  kernel/trace/ring_buffer.c          |  25 ++-
> >>  kernel/trace/trace.c                |  12 +-
> >>  kernel/trace/trace_dynevent.c       |   6 +-
> >>  kernel/trace/trace_dynevent.h       |   5 +-
> >>  kernel/trace/trace_events.c         |  35 ++--
> >>  kernel/trace/trace_events_filter.c  |   4 +-
> >>  kernel/trace/trace_events_hist.c    |   8 +-
> >>  kernel/trace/trace_events_trigger.c |  17 +-
> >>  kernel/trace/trace_events_user.c    |  16 +-
> >>  kernel/trace/trace_stat.c           |   4 +-
> >>  kernel/user-return-notifier.c       |   3 +-
> >>  kernel/workqueue.c                  |  16 +-
> >>  mm/backing-dev.c                    |   8 +-
> >>  mm/balloon.c                        |   8 +-
> >>  mm/cma.c                            |   4 +-
> >>  mm/compaction.c                     |   4 +-
> >>  mm/damon/core.c                     |   4 +-
> >>  mm/damon/sysfs-schemes.c            |   4 +-
> >>  mm/dmapool.c                        |   4 +-
> >>  mm/huge_memory.c                    |   8 +-
> >>  mm/hugetlb.c                        |  56 +++---
> >>  mm/hugetlb_vmemmap.c                |  16 +-
> >>  mm/khugepaged.c                     |  14 +-
> >>  mm/kmemleak.c                       |   7 +-
> >>  mm/ksm.c                            |  25 +--
> >>  mm/list_lru.c                       |   4 +-
> >>  mm/memcontrol-v1.c                  |   8 +-
> >>  mm/memory-failure.c                 |  12 +-
> >>  mm/memory-tiers.c                   |   4 +-
> >>  mm/migrate.c                        |  23 ++-
> >>  mm/mmu_notifier.c                   |   9 +-
> >>  mm/page_alloc.c                     |   8 +-
> >>  mm/page_reporting.c                 |   2 +-
> >>  mm/percpu.c                         |  11 +-
> >>  mm/pgtable-generic.c                |   4 +-
> >>  mm/rmap.c                           |  10 +-
> >>  mm/shmem.c                          |   9 +-
> >>  mm/slab_common.c                    |  14 +-
> >>  mm/slub.c                           |  33 ++--
> >>  mm/swapfile.c                       |   4 +-
> >>  mm/userfaultfd.c                    |  12 +-
> >>  mm/vmalloc.c                        |  24 +--
> >>  mm/vmscan.c                         |   7 +-
> >>  mm/zsmalloc.c                       |   4 +-
> >>  124 files changed, 875 insertions(+), 681 deletions(-)
> > 
> > Not sure what you were thinking, but this diff stat
> > is not landable.
> 
> [PATCH v3 1/7] and [PATCH v3 2/7] contain the main logic and can
> be merged directly. They are also compatible with the old API.
> [PATCH v3 3/7] through [PATCH v3 7/7] are just simple interface
> replacements and do not change any functional logic. They can be
> left unmerged for now; individual modules can pick them up later
> if needed.
> 
> In v2, Andy Shevchenko mentioned: "If it's done by Linus himself
> during the day when he prepares -rc1, it's fine."

Yes, but you need to get his blessing first to go with this.
Have you communicated with him on this?

> Even so, the
> changes in this patch series are indeed quite large and touch
> almost every subsystem. I have only converted part of them for
> now, so I wanted to send this out first and see what people think.

That's why it's better to provide a conversion script (e.g., a Coccinelle
semantic patch) instead of tons of patches.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 3/6] rust: macros: auto-insert OwnerModule in #[vtable]
From: Gary Guo @ 2026-06-22 10:44 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Luis Chamberlain, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <20260622-fix-fops-owner-v3-3-49d45cb37032@linux.dev>

On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
> Auto-add `type OwnerModule: ::kernel::ModuleMetadata;` as a required
> associated type on the trait side if not already defined, and
> auto-insert `type OwnerModule = crate::LocalModule;` on the impl side
> if not explicitly provided, eliminating the need to manually declare
> and implement `OwnerModule` in every vtable trait and impl.
> 
> Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
> Suggested-by: Gary Guo <gary@garyguo.net>
> Link: https://lore.kernel.org/all/DIMMWHUOLPSH.13JFRHDKDQJGO@garyguo.net
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Gary Guo <gary@garyguo.net>

> ---
>  rust/macros/lib.rs    |  6 ++++++
>  rust/macros/vtable.rs | 41 ++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 42 insertions(+), 5 deletions(-)



* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Gary Guo @ 2026-06-22 10:42 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Danilo Krummrich, Luis Chamberlain, Petr Pavlu,
	Daniel Gomez, Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel
In-Reply-To: <20260622-fix-fops-owner-v3-1-49d45cb37032@linux.dev>

On Mon Jun 22, 2026 at 3:44 AM BST, Alvin Sun wrote:
> Since `const_refs_to_static` has been stable as of the MSRV bump, a
> `ThisModule` pointer can now be used in const contexts.
>
> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
> can provide their `ThisModule` pointer in const contexts such as static
> `file_operations`.
>
> Move the `THIS_MODULE` static from the `module!` macro into the
> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
> to use it.

Doesn't this break existing users of THIS_MODULE?

Binder, rnull and configfs macros are using it.

Best,
Gary

>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
> ---
>  rust/kernel/lib.rs    |  8 ++++++++
>  rust/macros/module.rs | 34 +++++++++++++++++-----------------
>  2 files changed, 25 insertions(+), 17 deletions(-)
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index b72b2fbe046d6..50f5a7b5f028e 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -184,6 +184,14 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Erro
>  pub trait ModuleMetadata {
>      /// The name of the module as specified in the `module!` macro.
>      const NAME: &'static crate::str::CStr;
> +
> +    /// The module's `THIS_MODULE` pointer.
> +    const THIS_MODULE: ThisModule;
> +}
> +
> +/// Returns a reference to the `THIS_MODULE` of the given module type.
> +pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
> +    &M::THIS_MODULE
>  }
>  
>  /// Equivalent to `THIS_MODULE` in the C API.
> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
> index 06c18e2075083..b9fdee2f2af47 100644
> --- a/rust/macros/module.rs
> +++ b/rust/macros/module.rs
> @@ -497,28 +497,28 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
>          /// Used by the printing macros, e.g. [`info!`].
>          const __LOG_PREFIX: &[u8] = #name_cstr.to_bytes_with_nul();
>  
> -        // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
> -        // freed until the module is unloaded.
> -        #[cfg(MODULE)]
> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
> -            extern "C" {
> -                static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
> -            };
> -
> -            ::kernel::ThisModule::from_ptr(__this_module.get())
> -        };
> -
> -        #[cfg(not(MODULE))]
> -        static THIS_MODULE: ::kernel::ThisModule = unsafe {
> -            ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
> -        };
> -
>          /// The `LocalModule` type is the type of the module created by `module!`,
>          /// `module_pci_driver!`, `module_platform_driver!`, etc.
>          type LocalModule = #type_;
>  
>          impl ::kernel::ModuleMetadata for #type_ {
>              const NAME: &'static ::kernel::str::CStr = #name_cstr;
> +
> +            #[cfg(MODULE)]
> +            const THIS_MODULE: ::kernel::ThisModule = {
> +                extern "C" {
> +                    static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
> +                }
> +
> +                // SAFETY: `__this_module` is constructed by the kernel at load time
> +                // and lives until the module is unloaded.
> +                unsafe { ::kernel::ThisModule::from_ptr(__this_module.get()) }
> +            };
> +
> +            #[cfg(not(MODULE))]
> +            const THIS_MODULE: ::kernel::ThisModule = unsafe {
> +                ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
> +            };
>          }
>  
>          // Double nested modules, since then nobody can access the public items inside.
> @@ -616,7 +616,7 @@ pub extern "C" fn #ident_exit() {
>                  /// This function must only be called once.
>                  unsafe fn __init() -> ::kernel::ffi::c_int {
>                      let initer = <super::super::LocalModule as ::kernel::InPlaceModule>::init(
> -                        &super::super::THIS_MODULE
> +                        ::kernel::this_module::<super::super::LocalModule>()
>                      );
>                      // SAFETY: No data race, since `__MOD` can only be accessed by this module
>                      // and there only `__init` and `__exit` access it. These functions are only




* Re: [PATCH 1/1] block: validate user space vectors during extraction
From: kernel test robot @ 2026-06-22 10:05 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-fsdevel
  Cc: oe-kbuild-all, dm-devel, hch, axboe, brauner, djwong, viro,
	Keith Busch, stable
In-Reply-To: <20260617233235.1016063-2-kbusch@meta.com>

Hi Keith,

kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe/for-next]
[also build test WARNING on brauner-vfs/vfs.all akpm-mm/mm-nonmm-unstable linus/master v7.1 next-20260619]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Keith-Busch/block-validate-user-space-vectors-during-extraction/20260618-073522
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux.git for-next
patch link:    https://lore.kernel.org/r/20260617233235.1016063-2-kbusch%40meta.com
patch subject: [PATCH 1/1] block: validate user space vectors during extraction
config: openrisc-allnoconfig (https://download.01.org/0day-ci/archive/20260622/202606221846.H7g3giF8-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 16.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260622/202606221846.H7g3giF8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606221846.H7g3giF8-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Warning: block/bio.c:1245 function parameter 'vec_align_mask' not described in 'bio_iov_iter_get_pages'
>> Warning: block/bio.c:1245 function parameter 'vec_align_mask' not described in 'bio_iov_iter_get_pages'
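
The warning means that the kernel-doc comment above
bio_iov_iter_get_pages() has no '@vec_align_mask:' line for that
parameter. As a rough user-space sketch of a comment that describes every
parameter (the helper vec_is_aligned(), its 'base' and 'len' parameters, and
the assumed mask semantics are hypothetical and are not taken from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/**
 * vec_is_aligned - check one user vector against an alignment mask
 * @base:		start address of the vector (hypothetical parameter)
 * @len:		length of the vector in bytes (hypothetical parameter)
 * @vec_align_mask:	alignment mask; assumed here to have the form 2^n - 1,
 *			so every bit set in it must be clear in @base and @len
 *
 * Return: true if the vector satisfies the mask, false otherwise.
 */
static bool vec_is_aligned(uintptr_t base, size_t len,
			   unsigned int vec_align_mask)
{
	/* Assumption of this sketch only: the mask is a power of two minus one. */
	assert((vec_align_mask & (vec_align_mask + 1)) == 0);

	return ((base | len) & vec_align_mask) == 0;
}
```

Each parameter in the prototype has a matching '@name:' line, which is
what the W=1 kernel-doc check looks for.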

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v3 6/6] rust: configfs: use `LocalModule` for `THIS_MODULE`
From: Andreas Hindborg @ 2026-06-22  9:38 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel, Alvin Sun
In-Reply-To: <20260622-fix-fops-owner-v3-6-49d45cb37032@linux.dev>

Alvin Sun <alvin.sun@linux.dev> writes:

> Replace the `THIS_MODULE` static reference in the `configfs_attrs!`
> macro with `this_module::<LocalModule>()`, and update
> rnull to import `LocalModule` instead of `THIS_MODULE`, consistent
> with the move of `THIS_MODULE` into the `ModuleMetadata` trait.
>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg





* Re: [PATCH v3 4/6] rust: drm: set fops.owner from driver module pointer
From: Andreas Hindborg @ 2026-06-22  9:39 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel, Alvin Sun
In-Reply-To: <20260622-fix-fops-owner-v3-4-49d45cb37032@linux.dev>

Alvin Sun <alvin.sun@linux.dev> writes:

> Change `create_fops()` to accept an owner module pointer instead of
> hardcoding `null_mut()`, ensuring the kernel correctly tracks the
> module owning the DRM device's file operations.
>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg





* Re: [PATCH v3 1/6] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Andreas Hindborg @ 2026-06-22  9:44 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel, Alvin Sun
In-Reply-To: <20260622-fix-fops-owner-v3-1-49d45cb37032@linux.dev>

Alvin Sun <alvin.sun@linux.dev> writes:

> Since `const_refs_to_static` has been stable as of the MSRV bump, a
> `ThisModule` pointer can now be used in const contexts.
>
> Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
> can provide their `ThisModule` pointer in const contexts such as static
> `file_operations`.
>
> Move the `THIS_MODULE` static from the `module!` macro into the
> `ModuleMetadata` impl, add a `this_module()` helper, and update `__init`
> to use it.
>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg





* Re: [PATCH v3 5/6] rust: miscdevice: set fops.owner from driver module pointer
From: Andreas Hindborg @ 2026-06-22  9:38 UTC (permalink / raw)
  To: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
	Björn Roy Baron, Benno Lossin, Alice Ryhl, Trevor Gross,
	Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
	Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
	Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
	Jens Axboe
  Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
	linux-kselftest, kunit-dev, linux-block, linux-kernel, Alvin Sun
In-Reply-To: <20260622-fix-fops-owner-v3-5-49d45cb37032@linux.dev>

Alvin Sun <alvin.sun@linux.dev> writes:

> Set the miscdevice fops owner field from the driver module pointer
> via the `this_module::<T::OwnerModule>()` helper, instead of
> defaulting to null.
>
> Signed-off-by: Alvin Sun <alvin.sun@linux.dev>

Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>


Best regards,
Andreas Hindborg





* [PATCH] block/cgroup: Drop stale -EBUSY retry from blkg_conf_prep()
From: Yang Xiuwei @ 2026-06-22  8:56 UTC (permalink / raw)
  To: Tejun Heo, Josef Bacik, Jens Axboe; +Cc: cgroups, linux-block, Yang Xiuwei

Since commit 8f4236d9008b ("block: remove QUEUE_FLAG_BYPASS and
->bypass"), nothing in the blkcg blkg lookup/creation path returns
-EBUSY anymore. blkg_conf_prep() nevertheless still retries at
fail_exit with msleep(10) and restart_syscall(). That logic was added
in 2012, when blk_queue_bypass() could cause blkg lookup/creation to
fail with -EBUSY while the queue was temporarily bypassed during
elevator changes.

Drop the stale retry.

Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
---
 block/blk-cgroup.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 3093c1c03902..259f2240e7df 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -919,16 +919,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	spin_unlock_irq(&q->queue_lock);
 fail_exit:
 	mutex_unlock(&q->blkcg_mutex);
-	/*
-	 * If queue was bypassing, we should retry.  Do so after a
-	 * short msleep().  It isn't strictly necessary but queue
-	 * can be bypassing for some time and it's always nice to
-	 * avoid busy looping.
-	 */
-	if (ret == -EBUSY) {
-		msleep(10);
-		ret = restart_syscall();
-	}
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blkg_conf_prep);
-- 
2.25.1



* Re: [PATCH v3 1/7] list: Add mutable iterator variants
From: Christian König @ 2026-06-22  8:51 UTC (permalink / raw)
  To: Kaitao Cheng, Andrew Morton, David Hildenbrand, Jens Axboe,
	Tejun Heo, Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Thomas Gleixner, Juri Lelli, Vincent Guittot, Paul Moore,
	Andy Shevchenko, Paul E. McKenney, Shakeel Butt
  Cc: David Howells, Simona Vetter, Randy Dunlap, Luca Ceresoli,
	Philipp Stanner, linux-block, linux-kernel, cgroups,
	linux-ntfs-dev, linux-fsdevel, io-uring, audit, bpf, netdev,
	dri-devel, linux-perf-users, linux-trace-kernel, kexec,
	live-patching, linux-modules, linux-crypto, linux-pm, rcu,
	sched-ext, linux-mm, virtualization, damon, llvm, Kaitao Cheng
In-Reply-To: <20260622040533.29824-2-kaitao.cheng@linux.dev>

On 6/22/26 06:05, Kaitao Cheng wrote:
> From: Kaitao Cheng <chengkaitao@kylinos.cn>
> 
> The list_for_each*_safe() helpers are used when the loop body may
> remove the current entry.  Their API exposes the temporary cursor at
> every call site, even though most users only need it for the iterator
> implementation and never reference it in the loop body.
> 
> Add *_mutable() variants for list and hlist iteration.  The new helpers
> support both forms: callers may keep passing an explicit temporary cursor
> when they need to inspect or reset it, or omit it and let the helper use
> a unique internal cursor.

That sounds like a bad idea to me. A macro should really do one job, and do it as well as it can.

> This makes call sites that only mutate the list through the current entry
> less noisy, while keeping the existing *_safe() helpers available for
> compatibility.

This can be used perfectly well for code that really needs the separate variable for the next entry.

Regards,
Christian.
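
For readers following the thread, the argument-count dispatch under
discussion can be sketched in user space. This is a simplified illustration
under stated assumptions, not the code from the patch: the demo_* and DEMO_*
names are hypothetical stand-ins for the kernel's list helpers, COUNT_ARGS()
and CONCATENATE(), and the hidden cursor is named via __LINE__ where the
patch uses __UNIQUE_ID().

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct item { struct list_head link; int value; };

#define DEMO_ITEM(p) \
	((struct item *)((char *)(p) - offsetof(struct item, link)))

static void demo_list_init(struct list_head *h) { h->next = h->prev = h; }

static void demo_list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

static void demo_list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;	/* a stale dereference now faults */
}

/* Expands to 1 or 2: the number of arguments that follow pos. */
#define DEMO_NARGS_(a, b, n, ...)	n
#define DEMO_NARGS(...)			DEMO_NARGS_(__VA_ARGS__, 2, 1, 0)
#define DEMO_CAT_(a, b)			a##b
#define DEMO_CAT(a, b)			DEMO_CAT_(a, b)

/* Two-argument form: the caller supplies the temporary cursor n. */
#define demo_for_each_mutable2(pos, n, head)				\
	for (pos = (head)->next, n = pos->next; pos != (head);		\
	     pos = n, n = pos->next)

/* One-argument form: the temporary cursor is declared by the loop itself. */
#define demo_for_each_mutable_internal(pos, tmp, head)			\
	for (struct list_head *tmp = (pos = (head)->next)->next;	\
	     pos != (head); pos = tmp, tmp = pos->next)
#define demo_for_each_mutable1(pos, head)				\
	demo_for_each_mutable_internal(pos, DEMO_CAT(demo_tmp_, __LINE__), head)

/* Pick the variant from the number of arguments after pos. */
#define demo_for_each_mutable(pos, ...)					\
	DEMO_CAT(demo_for_each_mutable, DEMO_NARGS(__VA_ARGS__))(pos, __VA_ARGS__)

/* Unlink pos if its value is even, otherwise count it as kept. */
static void demo_visit(struct list_head *pos, int *left)
{
	if (DEMO_ITEM(pos)->value % 2 == 0)
		demo_list_del(pos);
	else
		(*left)++;
}

/* Unlink every even-valued item; return how many items are left. */
static int demo_drop_even(struct list_head *head, int explicit_cursor)
{
	struct list_head *pos, *n;
	int left = 0;

	if (explicit_cursor) {
		demo_for_each_mutable(pos, n, head)
			demo_visit(pos, &left);
	} else {
		demo_for_each_mutable(pos, head)
			demo_visit(pos, &left);
	}
	return left;
}

/* Build the list 0..5, drop the even values, return the survivor count. */
static int demo_run(int explicit_cursor)
{
	struct item items[6];
	struct list_head head;
	int i;

	demo_list_init(&head);
	for (i = 0; i < 6; i++) {
		items[i].value = i;
		demo_list_add_tail(&items[i].link, &head);
	}
	i = demo_drop_even(&head, explicit_cursor);
	assert(DEMO_ITEM(head.next)->value == 1);	/* first survivor */
	return i;
}
```

The two call forms run the same loop and differ only in whether the
temporary cursor is visible at the call site.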


> 
> Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
> ---
>  include/linux/list.h | 269 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 231 insertions(+), 38 deletions(-)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 09d979976b3b..1081def7cea9 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -7,6 +7,7 @@
>  #include <linux/stddef.h>
>  #include <linux/poison.h>
>  #include <linux/const.h>
> +#include <linux/args.h>
>  
>  #include <asm/barrier.h>
>  
> @@ -763,28 +764,72 @@ static inline void list_splice_tail_init(struct list_head *list,
>  #define list_for_each_prev(pos, head) \
>  	for (pos = (head)->prev; !list_is_head(pos, (head)); pos = pos->prev)
>  
> -/**
> - * list_for_each_safe - iterate over a list safe against removal of list entry
> - * @pos:	the &struct list_head to use as a loop cursor.
> - * @n:		another &struct list_head to use as temporary storage
> - * @head:	the head for your list.
> +/*
> + * list_for_each_safe is an old interface, use list_for_each_mutable instead.
>   */
>  #define list_for_each_safe(pos, n, head) \
>  	for (pos = (head)->next, n = pos->next; \
>  	     !list_is_head(pos, (head)); \
>  	     pos = n, n = pos->next)
>  
> +#define __list_for_each_mutable_internal(pos, tmp, head)		\
> +	for (typeof(pos) tmp = (pos = (head)->next)->next;		\
> +	     !list_is_head(pos, (head));				\
> +	     pos = tmp, tmp = pos->next)
> +
> +#define __list_for_each_mutable1(pos, head)				\
> +	__list_for_each_mutable_internal(pos, __UNIQUE_ID(next), head)
> +
> +#define __list_for_each_mutable2(pos, next, head)			\
> +	list_for_each_safe(pos, next, head)
> +
>  /**
> - * list_for_each_prev_safe - iterate over a list backwards safe against removal of list entry
> + * list_for_each_mutable - iterate over a list safe against entry removal
>   * @pos:	the &struct list_head to use as a loop cursor.
> - * @n:		another &struct list_head to use as temporary storage
> - * @head:	the head for your list.
> + * @...:	either (head) or (next, head)
> + *
> + * next:	another &struct list_head to use as optional temporary storage.
> + *		The temporary cursor is internal unless explicitly supplied by
> + *		the caller.
> + * head:	the head for your list.
> + */
> +#define list_for_each_mutable(pos, ...)					\
> +	CONCATENATE(__list_for_each_mutable, COUNT_ARGS(__VA_ARGS__))	\
> +		(pos, __VA_ARGS__)
> +
> +/*
> + * list_for_each_prev_safe is an old interface, use list_for_each_prev_mutable instead.
>   */
>  #define list_for_each_prev_safe(pos, n, head) \
>  	for (pos = (head)->prev, n = pos->prev; \
>  	     !list_is_head(pos, (head)); \
>  	     pos = n, n = pos->prev)
>  
> +#define __list_for_each_prev_mutable_internal(pos, tmp, head)		\
> +	for (typeof(pos) tmp = (pos = (head)->prev)->prev;		\
> +	     !list_is_head(pos, (head));				\
> +	     pos = tmp, tmp = pos->prev)
> +
> +#define __list_for_each_prev_mutable1(pos, head)			\
> +	__list_for_each_prev_mutable_internal(pos, __UNIQUE_ID(prev), head)
> +
> +#define __list_for_each_prev_mutable2(pos, prev, head)			\
> +	list_for_each_prev_safe(pos, prev, head)
> +
> +/**
> + * list_for_each_prev_mutable - iterate over a list backwards safe against entry removal
> + * @pos:	the &struct list_head to use as a loop cursor.
> + * @...:	either (head) or (prev, head)
> + *
> + * prev:	another &struct list_head to use as optional temporary storage.
> + *		The temporary cursor is internal unless explicitly supplied by
> + *		the caller.
> + * head:	the head for your list.
> + */
> +#define list_for_each_prev_mutable(pos, ...)				\
> +	CONCATENATE(__list_for_each_prev_mutable, COUNT_ARGS(__VA_ARGS__)) \
> +		(pos, __VA_ARGS__)
> +
>  /**
>   * list_count_nodes - count nodes in the list
>   * @head:	the head for your list.
> @@ -895,12 +940,8 @@ static inline size_t list_count_nodes(struct list_head *head)
>  	for (; !list_entry_is_head(pos, head, member);			\
>  	     pos = list_prev_entry(pos, member))
>  
> -/**
> - * list_for_each_entry_safe - iterate over list of given type safe against removal of list entry
> - * @pos:	the type * to use as a loop cursor.
> - * @n:		another type * to use as temporary storage
> - * @head:	the head for your list.
> - * @member:	the name of the list_head within the struct.
> +/*
> + * list_for_each_entry_safe is an old interface, use list_for_each_entry_mutable instead.
>   */
>  #define list_for_each_entry_safe(pos, n, head, member)			\
>  	for (pos = list_first_entry(head, typeof(*pos), member),	\
> @@ -908,15 +949,36 @@ static inline size_t list_count_nodes(struct list_head *head)
>  	     !list_entry_is_head(pos, head, member); 			\
>  	     pos = n, n = list_next_entry(n, member))
>  
> +#define __list_for_each_entry_mutable_internal(pos, tmp, head, member)	\
> +	for (typeof(pos) tmp = list_next_entry(pos =			\
> +		list_first_entry(head, typeof(*pos), member), member);	\
> +	     !list_entry_is_head(pos, head, member);			\
> +	     pos = tmp, tmp = list_next_entry(tmp, member))
> +
> +#define __list_for_each_entry_mutable2(pos, head, member)		\
> +	__list_for_each_entry_mutable_internal(pos, __UNIQUE_ID(next), head, member)
> +
> +#define __list_for_each_entry_mutable3(pos, next, head, member)		\
> +	list_for_each_entry_safe(pos, next, head, member)
> +
>  /**
> - * list_for_each_entry_safe_continue - continue list iteration safe against removal
> + * list_for_each_entry_mutable - iterate over a list safe against entry removal
>   * @pos:	the type * to use as a loop cursor.
> - * @n:		another type * to use as temporary storage
> - * @head:	the head for your list.
> - * @member:	the name of the list_head within the struct.
> + * @...:	either (head, member) or (next, head, member)
>   *
> - * Iterate over list of given type, continuing after current point,
> - * safe against removal of list entry.
> + * next:	another type * to use as optional temporary storage. The
> + *		temporary cursor is internal unless explicitly supplied by the
> + *		caller.
> + * head:	the head for your list.
> + * member:	the name of the list_head within the struct.
> + */
> +#define list_for_each_entry_mutable(pos, ...)				\
> +	CONCATENATE(__list_for_each_entry_mutable, COUNT_ARGS(__VA_ARGS__)) \
> +		(pos, __VA_ARGS__)
> +
> +/*
> + * list_for_each_entry_safe_continue is an old interface,
> + * use list_for_each_entry_mutable_continue instead.
>   */
>  #define list_for_each_entry_safe_continue(pos, n, head, member) 		\
>  	for (pos = list_next_entry(pos, member), 				\
> @@ -924,30 +986,79 @@ static inline size_t list_count_nodes(struct list_head *head)
>  	     !list_entry_is_head(pos, head, member);				\
>  	     pos = n, n = list_next_entry(n, member))
>  
> +#define __list_for_each_entry_mutable_continue_internal(pos, tmp, head, member) \
> +	for (typeof(pos) tmp = list_next_entry(pos =			\
> +		list_next_entry(pos, member), member);			\
> +	     !list_entry_is_head(pos, head, member);			\
> +	     pos = tmp, tmp = list_next_entry(tmp, member))
> +
> +#define __list_for_each_entry_mutable_continue2(pos, head, member)	\
> +	__list_for_each_entry_mutable_continue_internal(pos,		\
> +		__UNIQUE_ID(next), head, member)
> +
> +#define __list_for_each_entry_mutable_continue3(pos, next, head, member) \
> +	list_for_each_entry_safe_continue(pos, next, head, member)
> +
>  /**
> - * list_for_each_entry_safe_from - iterate over list from current point safe against removal
> + * list_for_each_entry_mutable_continue - continue list iteration safe against removal
>   * @pos:	the type * to use as a loop cursor.
> - * @n:		another type * to use as temporary storage
> - * @head:	the head for your list.
> - * @member:	the name of the list_head within the struct.
> + * @...:	either (head, member) or (next, head, member)
>   *
> - * Iterate over list of given type from current point, safe against
> - * removal of list entry.
> + * next:	another type * to use as optional temporary storage. The
> + *		temporary cursor is internal unless explicitly supplied by the
> + *		caller.
> + * head:	the head for your list.
> + * member:	the name of the list_head within the struct.
> + *
> + * Iterate over list of given type, continuing after current point,
> + * safe against removal of list entry.
> + */
> +#define list_for_each_entry_mutable_continue(pos, ...)			\
> +	CONCATENATE(__list_for_each_entry_mutable_continue,		\
> +		COUNT_ARGS(__VA_ARGS__))(pos, __VA_ARGS__)
> +
> +/*
> + * list_for_each_entry_safe_from is an old interface,
> + * use list_for_each_entry_mutable_from instead.
>   */
>  #define list_for_each_entry_safe_from(pos, n, head, member) 			\
>  	for (n = list_next_entry(pos, member);					\
>  	     !list_entry_is_head(pos, head, member);				\
>  	     pos = n, n = list_next_entry(n, member))
>  
> +#define __list_for_each_entry_mutable_from_internal(pos, tmp, head, member) \
> +	for (typeof(pos) tmp = list_next_entry(pos, member);		\
> +	     !list_entry_is_head(pos, head, member);			\
> +	     pos = tmp, tmp = list_next_entry(tmp, member))
> +
> +#define __list_for_each_entry_mutable_from2(pos, head, member)		\
> +	__list_for_each_entry_mutable_from_internal(pos,		\
> +		__UNIQUE_ID(next), head, member)
> +
> +#define __list_for_each_entry_mutable_from3(pos, next, head, member)	\
> +	list_for_each_entry_safe_from(pos, next, head, member)
> +
>  /**
> - * list_for_each_entry_safe_reverse - iterate backwards over list safe against removal
> + * list_for_each_entry_mutable_from - iterate over list from current point safe against removal
>   * @pos:	the type * to use as a loop cursor.
> - * @n:		another type * to use as temporary storage
> - * @head:	the head for your list.
> - * @member:	the name of the list_head within the struct.
> + * @...:	either (head, member) or (next, head, member)
>   *
> - * Iterate backwards over list of given type, safe against removal
> - * of list entry.
> + * next:	another type * to use as optional temporary storage. The
> + *		temporary cursor is internal unless explicitly supplied by the
> + *		caller.
> + * head:	the head for your list.
> + * member:	the name of the list_head within the struct.
> + *
> + * Iterate over list of given type from current point, safe against
> + * removal of list entry.
> + */
> +#define list_for_each_entry_mutable_from(pos, ...)			\
> +	CONCATENATE(__list_for_each_entry_mutable_from,			\
> +		COUNT_ARGS(__VA_ARGS__))(pos, __VA_ARGS__)
> +
> +/*
> + * list_for_each_entry_safe_reverse is an old interface,
> + * use list_for_each_entry_mutable_reverse instead.
>   */
>  #define list_for_each_entry_safe_reverse(pos, n, head, member)		\
>  	for (pos = list_last_entry(head, typeof(*pos), member),		\
> @@ -955,6 +1066,37 @@ static inline size_t list_count_nodes(struct list_head *head)
>  	     !list_entry_is_head(pos, head, member); 			\
>  	     pos = n, n = list_prev_entry(n, member))
>  
> +#define __list_for_each_entry_mutable_reverse_internal(pos, tmp, head, member) \
> +	for (typeof(pos) tmp = list_prev_entry(pos =			\
> +		list_last_entry(head, typeof(*pos), member), member);	\
> +	     !list_entry_is_head(pos, head, member);			\
> +	     pos = tmp, tmp = list_prev_entry(tmp, member))
> +
> +#define __list_for_each_entry_mutable_reverse2(pos, head, member)	\
> +	__list_for_each_entry_mutable_reverse_internal(pos,		\
> +		__UNIQUE_ID(prev), head, member)
> +
> +#define __list_for_each_entry_mutable_reverse3(pos, prev, head, member)	\
> +	list_for_each_entry_safe_reverse(pos, prev, head, member)
> +
> +/**
> + * list_for_each_entry_mutable_reverse - iterate backwards over list safe against removal
> + * @pos:	the type * to use as a loop cursor.
> + * @...:	either (head, member) or (prev, head, member)
> + *
> + * prev:	another type * to use as optional temporary storage. The
> + *		temporary cursor is internal unless explicitly supplied by the
> + *		caller.
> + * head:	the head for your list.
> + * member:	the name of the list_head within the struct.
> + *
> + * Iterate backwards over list of given type, safe against removal
> + * of list entry.
> + */
> +#define list_for_each_entry_mutable_reverse(pos, ...)			\
> +	CONCATENATE(__list_for_each_entry_mutable_reverse,		\
> +		COUNT_ARGS(__VA_ARGS__))(pos, __VA_ARGS__)
> +
>  /**
>   * list_safe_reset_next - reset a stale list_for_each_entry_safe loop
>   * @pos:	the loop cursor used in the list_for_each_entry_safe loop
> @@ -1189,6 +1331,31 @@ static inline void hlist_splice_init(struct hlist_head *from,
>  	for (pos = (head)->first; pos && ({ n = pos->next; 1; }); \
>  	     pos = n)
>  
> +#define __hlist_for_each_mutable_internal(pos, tmp, head)		\
> +	for (typeof(pos) tmp = (pos = (head)->first) ? pos->next : NULL; \
> +	     pos;							\
> +	     pos = tmp, tmp = pos ? pos->next : NULL)
> +
> +#define __hlist_for_each_mutable1(pos, head)				\
> +	__hlist_for_each_mutable_internal(pos, __UNIQUE_ID(next), head)
> +
> +#define __hlist_for_each_mutable2(pos, next, head)			\
> +	hlist_for_each_safe(pos, next, head)
> +
> +/**
> + * hlist_for_each_mutable - iterate over a hlist safe against entry removal
> + * @pos:	the &struct hlist_node to use as a loop cursor.
> + * @...:	either (head) or (next, head)
> + *
> + * next:	another &struct hlist_node to use as optional temporary storage.
> + *		The temporary cursor is internal unless explicitly supplied by
> + *		the caller.
> + * head:	the head for your hlist.
> + */
> +#define hlist_for_each_mutable(pos, ...)				\
> +	CONCATENATE(__hlist_for_each_mutable, COUNT_ARGS(__VA_ARGS__))	\
> +		(pos, __VA_ARGS__)
> +
>  #define hlist_entry_safe(ptr, type, member) \
>  	({ typeof(ptr) ____ptr = (ptr); \
>  	   ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
> @@ -1224,18 +1391,44 @@ static inline void hlist_splice_init(struct hlist_head *from,
>  	for (; pos;							\
>  	     pos = hlist_entry_safe((pos)->member.next, typeof(*(pos)), member))
>  
> -/**
> - * hlist_for_each_entry_safe - iterate over list of given type safe against removal of list entry
> - * @pos:	the type * to use as a loop cursor.
> - * @n:		a &struct hlist_node to use as temporary storage
> - * @head:	the head for your list.
> - * @member:	the name of the hlist_node within the struct.
> +/*
> + * hlist_for_each_entry_safe is an old interface, use hlist_for_each_entry_mutable instead.
>   */
>  #define hlist_for_each_entry_safe(pos, n, head, member) 		\
>  	for (pos = hlist_entry_safe((head)->first, typeof(*pos), member);\
>  	     pos && ({ n = pos->member.next; 1; });			\
>  	     pos = hlist_entry_safe(n, typeof(*pos), member))
>  
> +#define __hlist_for_each_entry_mutable_internal(pos, tmp, head, member)	\
> +	for (struct hlist_node *tmp = (pos =				\
> +		hlist_entry_safe((head)->first, typeof(*pos), member)) ? \
> +		pos->member.next : NULL;				\
> +	     pos;							\
> +	     pos = hlist_entry_safe((tmp), typeof(*pos), member),	\
> +		tmp = pos ? pos->member.next : NULL)
> +
> +#define __hlist_for_each_entry_mutable2(pos, head, member)		\
> +	__hlist_for_each_entry_mutable_internal(pos,			\
> +		__UNIQUE_ID(next), head, member)
> +
> +#define __hlist_for_each_entry_mutable3(pos, next, head, member)	\
> +	hlist_for_each_entry_safe(pos, next, head, member)
> +
> +/**
> + * hlist_for_each_entry_mutable - iterate over hlist safe against entry removal
> + * @pos:	the type * to use as a loop cursor.
> + * @...:	either (head, member) or (next, head, member)
> + *
> + * next:	a &struct hlist_node to use as optional temporary storage. The
> + *		temporary cursor is internal unless explicitly supplied by the
> + *		caller.
> + * head:	the head for your hlist.
> + * member:	the name of the hlist_node within the struct.
> + */
> +#define hlist_for_each_entry_mutable(pos, ...)				\
> +	CONCATENATE(__hlist_for_each_entry_mutable,			\
> +		COUNT_ARGS(__VA_ARGS__))(pos, __VA_ARGS__)
> +
>  /**
>   * hlist_count_nodes - count nodes in the hlist
>   * @head:	the head for your hlist.



* Re: [PATCH v3 1/7] list: Add mutable iterator variants
From: David Laight @ 2026-06-22  8:42 UTC (permalink / raw)
  To: Kaitao Cheng
  Cc: Andrew Morton, David Hildenbrand, Jens Axboe, Tejun Heo,
	Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Thomas Gleixner, Juri Lelli, Vincent Guittot, Paul Moore,
	Andy Shevchenko, Paul E. McKenney, Shakeel Butt,
	Christian König, David Howells, Simona Vetter, Randy Dunlap,
	Luca Ceresoli, Philipp Stanner, linux-block, linux-kernel,
	cgroups, linux-ntfs-dev, linux-fsdevel, io-uring, audit, bpf,
	netdev, dri-devel, linux-perf-users, linux-trace-kernel, kexec,
	live-patching, linux-modules, linux-crypto, linux-pm, rcu,
	sched-ext, linux-mm, virtualization, damon, llvm, Kaitao Cheng
In-Reply-To: <20260622040533.29824-2-kaitao.cheng@linux.dev>

On Mon, 22 Jun 2026 12:05:31 +0800
Kaitao Cheng <kaitao.cheng@linux.dev> wrote:

> From: Kaitao Cheng <chengkaitao@kylinos.cn>
> 
> The list_for_each*_safe() helpers are used when the loop body may
> remove the current entry.  Their API exposes the temporary cursor at
> every call site, even though most users only need it for the iterator
> implementation and never reference it in the loop body.
> 
> Add *_mutable() variants for list and hlist iteration.  The new helpers
> support both forms: callers may keep passing an explicit temporary cursor
> when they need to inspect or reset it, or omit it and let the helper use
> a unique internal cursor.

I'm not really sure 'mutable' means anything either.
It is possible to make it valid for the loop body (or even other threads)
to delete arbitrary list items, but that needs significant extra overhead.

It might be worth doing something that doesn't need the extra variable,
but there is little point doing all the churn just to rename things.
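
To illustrate the point about arbitrary deletion: the cached-next scheme
only tolerates removal of the entry currently being visited. The following
is a minimal user-space sketch, not kernel code. list_unlink() and
stale_visits() are hypothetical names invented for the illustration, and
list_unlink() deliberately skips the pointer poisoning that the kernel's
list_del() performs, so that the stale cursor can be observed instead of
faulting.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/*
 * Plain unlink (hypothetical helper). Unlike the kernel's list_del(),
 * it does not poison e->next/e->prev, so a stale cursor stays
 * dereferenceable in this demo.
 */
static void list_unlink(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Same shape as the existing helper: n caches the successor of pos. */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

/*
 * While visiting the first of four entries, unlink either that entry
 * (remove_next == 0) or its successor (remove_next != 0). Returns how
 * many times the loop body ran on an entry that was already unlinked.
 */
static int stale_visits(int remove_next)
{
	struct list_head head, nodes[4], *pos, *n, *gone = NULL;
	int stale = 0;

	list_init(&head);
	for (int i = 0; i < 4; i++)
		list_add_tail(&nodes[i], &head);

	list_for_each_safe(pos, n, &head) {
		if (pos == gone)
			stale++;
		if (pos == &nodes[0]) {
			gone = remove_next ? &nodes[1] : pos;
			list_unlink(gone);
		}
	}
	return stale;
}
```

Removing the current entry never revisits it, whereas removing the
successor leaves the cached cursor pointing at an entry that is no longer
on the list. With the kernel's poisoning list_del() that second case would
dereference LIST_POISON1 rather than silently walk on.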

> 
> This makes call sites that only mutate the list through the current entry
> less noisy, while keeping the existing *_safe() helpers available for
> compatibility.
> 
> Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
> ---
>  include/linux/list.h | 269 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 231 insertions(+), 38 deletions(-)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 09d979976b3b..1081def7cea9 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -7,6 +7,7 @@
>  #include <linux/stddef.h>
>  #include <linux/poison.h>
>  #include <linux/const.h>
> +#include <linux/args.h>
>  
>  #include <asm/barrier.h>
>  
> @@ -763,28 +764,72 @@ static inline void list_splice_tail_init(struct list_head *list,
>  #define list_for_each_prev(pos, head) \
>  	for (pos = (head)->prev; !list_is_head(pos, (head)); pos = pos->prev)
>  
> -/**
> - * list_for_each_safe - iterate over a list safe against removal of list entry
> - * @pos:	the &struct list_head to use as a loop cursor.
> - * @n:		another &struct list_head to use as temporary storage
> - * @head:	the head for your list.
> +/*
> + * list_for_each_safe is an old interface, use list_for_each_mutable instead.
>   */
>  #define list_for_each_safe(pos, n, head) \
>  	for (pos = (head)->next, n = pos->next; \
>  	     !list_is_head(pos, (head)); \
>  	     pos = n, n = pos->next)
>  
> +#define __list_for_each_mutable_internal(pos, tmp, head)		\
> +	for (typeof(pos) tmp = (pos = (head)->next)->next;		\

Use auto

> +	     !list_is_head(pos, (head));				\
> +	     pos = tmp, tmp = pos->next)
> +
> +#define __list_for_each_mutable1(pos, head)				\
> +	__list_for_each_mutable_internal(pos, __UNIQUE_ID(next), head)
> +
> +#define __list_for_each_mutable2(pos, next, head)			\
> +	list_for_each_safe(pos, next, head)
> +
>  /**
> - * list_for_each_prev_safe - iterate over a list backwards safe against removal of list entry
> + * list_for_each_mutable - iterate over a list safe against entry removal
>   * @pos:	the &struct list_head to use as a loop cursor.
> - * @n:		another &struct list_head to use as temporary storage
> - * @head:	the head for your list.
> + * @...:	either (head) or (next, head)
> + *
> + * next:	another &struct list_head to use as optional temporary storage.
> + *		The temporary cursor is internal unless explicitly supplied by
> + *		the caller.
> + * head:	the head for your list.
> + */
> +#define list_for_each_mutable(pos, ...)					\
> +	CONCATENATE(__list_for_each_mutable, COUNT_ARGS(__VA_ARGS__))	\
> +		(pos, __VA_ARGS__)

The variable argument count logic really just slows down compilation.
Maybe there aren't enough copies of this code to make that significant.
But just because you can do it doesn't mean it is a good idea.
I'm also not sure it really adds anything to the readability.

And, if you are going to make the middle argument optional there is
no need to change the macro name.
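
For readers who have not seen the argument-count dispatch being discussed,
here is a rough user-space sketch of the mechanism. It is an illustration
under stated assumptions, not the patch itself: COUNT_ARGS(), CONCATENATE()
and UNIQUE_ID() below are simplified stand-ins for the kernel helpers
(handling at most three arguments), list_del() does not poison, and
struct item, drop_even(), drop_all() and demo() are made-up names. It
needs GNU C for __typeof__ and __COUNTER__.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's argument-counting helpers. */
#define COUNT_ARGS_(_1, _2, _3, n, ...) n
#define COUNT_ARGS(...) COUNT_ARGS_(__VA_ARGS__, 3, 2, 1, 0)
#define CONCAT_(a, b) a##b
#define CONCATENATE(a, b) CONCAT_(a, b)
#define UNIQUE_ID(prefix) CONCATENATE(prefix, __COUNTER__)

struct list_head { struct list_head *next, *prev; };
struct item { int val; struct list_head node; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev; n->next = h;
	h->prev->next = n; h->prev = n;
}
static void list_del(struct list_head *e)	/* no poisoning here */
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Explicit-cursor form: the caller declares and passes n. */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

/* Hidden-cursor form: tmp is declared in the for-init clause. */
#define list_for_each_mutable_internal(pos, tmp, head) \
	for (__typeof__(pos) tmp = (pos = (head)->next)->next; \
	     pos != (head); pos = tmp, tmp = pos->next)

#define list_for_each_mutable1(pos, head) \
	list_for_each_mutable_internal(pos, UNIQUE_ID(next_), head)
#define list_for_each_mutable2(pos, next, head) \
	list_for_each_safe(pos, next, head)

/* Dispatch on the number of arguments that follow pos. */
#define list_for_each_mutable(pos, ...) \
	CONCATENATE(list_for_each_mutable, COUNT_ARGS(__VA_ARGS__)) \
		(pos, __VA_ARGS__)

/* One trailing argument: hidden cursor. Returns entries left. */
static int drop_even(struct list_head *head)
{
	struct list_head *pos;
	int left = 0;

	list_for_each_mutable(pos, head) {
		struct item *it = (struct item *)
			((char *)pos - offsetof(struct item, node));
		if (it->val % 2 == 0)
			list_del(pos);
		else
			left++;
	}
	return left;
}

/* Two trailing arguments: explicit cursor. Returns entries removed. */
static int drop_all(struct list_head *head)
{
	struct list_head *pos, *next;
	int removed = 0;

	list_for_each_mutable(pos, next, head) {
		list_del(pos);
		removed++;
	}
	return removed;
}

/* Build a list holding 0..5, then exercise both forms. */
static int demo(void)
{
	struct list_head head;
	struct item items[6];
	int left;

	list_init(&head);
	for (int i = 0; i < 6; i++) {
		items[i].val = i;
		list_add_tail(&items[i].node, &head);
	}
	left = drop_even(&head);
	assert(drop_all(&head) == left);
	assert(head.next == &head && head.prev == &head);
	return left;
}
```

Both call sites use the same macro name; only the argument count selects
the expansion, which is why the optional argument ends up in the middle.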

	David




* Re: [PATCH v3 0/7] Prepare mutable list iterators to cache cursor state
From: Jani Nikula @ 2026-06-22  8:37 UTC (permalink / raw)
  To: Kaitao Cheng, Andrew Morton, David Hildenbrand, Jens Axboe,
	Tejun Heo, Alexander Viro, Christian Brauner, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Thomas Gleixner, Juri Lelli, Vincent Guittot, Paul Moore,
	Andy Shevchenko, Paul E. McKenney, Shakeel Butt,
	Christian König
  Cc: David Howells, Simona Vetter, Randy Dunlap, Luca Ceresoli,
	Philipp Stanner, linux-block, linux-kernel, cgroups,
	linux-ntfs-dev, linux-fsdevel, io-uring, audit, bpf, netdev,
	dri-devel, linux-perf-users, linux-trace-kernel, kexec,
	live-patching, linux-modules, linux-crypto, linux-pm, rcu,
	sched-ext, linux-mm, virtualization, damon, llvm, chengkaitao
In-Reply-To: <20260622040533.29824-1-kaitao.cheng@linux.dev>

On Mon, 22 Jun 2026, Kaitao Cheng <kaitao.cheng@linux.dev> wrote:
> Add *_mutable() iterator variants for list, hlist and llist.  The new
> helpers are variadic and support both forms.  In the common case, the
> caller omits the temporary cursor and the macro creates a unique internal
> cursor with typeof(pos) and __UNIQUE_ID().  If a loop really needs an
> explicit temporary cursor, the caller can still pass it and the helper
> keeps the existing *_safe() behaviour.
>
> For example, a call site may use the shorter form:
>
>   list_for_each_entry_mutable(pos, head, member)
>
> or keep the explicit temporary cursor form:
>
>   list_for_each_entry_mutable(pos, tmp, head, member)

I'm unconvinced it's a good idea to allow two forms with macro trickery,
*especially* when it's not the last argument you can omit. I think it's
a footgun.

IMO stick with the first form only, and there'll always be the _safe
variant that can be used when the temp pointer is needed.


BR,
Jani.


-- 
Jani Nikula, Intel
