Linux block layer
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-block@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,  Carlos Maiolino <cem@kernel.org>,
	linux-xfs@vger.kernel.org,  Chris Mason <clm@fb.com>,
	David Sterba <dsterba@suse.com>,
	 linux-btrfs@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
	 linux-ext4@vger.kernel.org, Gao Xiang <xiang@kernel.org>,
	 linux-erofs@lists.ozlabs.org,
	 "Christian Brauner (Amutable)" <brauner@kernel.org>,
	 syzbot@syzkaller.appspotmail.com, Gao Xiang <xiang@kernel.org>
Subject: [PATCH RFC v2 00/18] fs: support freeze/thaw/mark_dead/sync with shared devices
Date: Tue, 16 Jun 2026 16:08:16 +0200	[thread overview]
Message-ID: <20260616-work-super-bdev_holder_global-v2-0-7df6b864028e@kernel.org> (raw)

This is a generalization of the device number to superblock so it works
for actual block device and anonymous (or even mtd) devices.

fs_holder_ops recovers the affected superblock from bdev->bd_holder. That
forces the holder of a block device to be exactly one superblock and makes
it impossible for several superblocks to share a single device.

erofs does exactly that. It can mount read-only "blob" devices that are
shared between many superblocks: a metadata-only erofs that indexes a set
of per-layer blobs (one filesystem instead of one per OCI layer), or an
incremental image whose base device is shared by several updates. Because
the block layer only tracks a single holder, a freeze, thaw, removal or
sync on such a device is never propagated to all the superblocks using it,
and the current infrastructure has no way to find them.

This series replaces the bd_holder-based lookup with a global, dev_t-keyed
table mapping each block device to the superblock(s) using it. The holder
argument becomes purely the block layer's exclusivity token -- a superblock,
or the file_system_type for a device shared within one filesystem type --
and the fs_holder_ops callbacks look the device up in the table and act on
every superblock registered for it: 1:1 for most filesystems, 1:many for
erofs.

Filesystems claim and release their devices through new
fs_bdev_file_open_by_{dev,path}() and fs_bdev_file_release() helpers; the
per-fs patches convert xfs, btrfs, ext4, f2fs and erofs over to them and
fix cramfs and romfs, which released the registered main device with a
raw bdev_fput().

Since every superblock is registered under its s_dev the table also
replaces the last s_dev-keyed walk of the super_blocks list:
user_get_super() resolves device numbers through it, so ustat() and
quotactl() now work on any device a filesystem claims and no longer
take sb_lock.

The longer-term motivation is to let userspace decide which devices may be
onlined from one central place, without having to teach every filesystem
about it individually.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
Changes in v2:
- super: rework the device-to-superblock table reference counting: each
  (device, superblock) entry carries a single claim count and holds one
  passive reference on its superblock for the entry's lifetime. New prep
  patches convert s_count to refcount_t s_passive and make put_super()
  self-locking.
- super: preallocate the entry in alloc_super() and register it from the
  set callbacks through set_anon_super()/set_bdev_super(); an insert
  failure unwinds exactly like a set callback failure. The superblock
  stashes the entry in sb->s_super_dev and kill_super_notify() drops the
  claim through it.
- super: initialize the table from mnt_init(); the rootfs and shm mounts
  are created long before any initcall runs.
- super: fold the v1 "refuse to claim a frozen block device" patch into
  the registration helper and restore the EBUSY check for the primary
  device in setup_bdev_super(): additional devices (the xfs log, the ext4
  journal, erofs blobs) are now refused while frozen as well, answering
  Jan's question on v1 3/8.
- Split the core patch into table/helpers/switch-over and move the
  xfs/btrfs/ext4 conversions before the fs_holder_ops switch so no
  freeze/mark_dead events are lost mid-series; erofs follows the switch.
- New prep patches: the ext4 KUnit tests allocate anonymous devices and
  ocfs2 stops resetting s_dev on dismount.
- New: convert user_get_super() to the device table, plus a ustat()
  selftest.
- New: fix a pre-existing double release of the realtime device file and
  dangling buftarg pointers in xfs_open_devices()'s error unwind.
- New: convert f2fs's additional devices to the helpers; fix cramfs and
  romfs releasing the registered main device with a raw bdev_fput().
- erofs: drop the .shutdown() and .remove_bdev() implementations and the
  per-device "dead" flag. Immutable filesystems don't need them: the block
  layer sets GD_DEAD before fs_bdev_mark_dead() so in-flight bios fail
  anyway, erofs has no write path or journal to stop, and the read-only
  loop_change_fd() case must not be forced to -EIO. Patch from Gao Xiang,
  applied verbatim - thanks!
- btrfs: fix a general protection fault in close_fs_devices() on a failed
  mount (reported by syzbot). The release path took the superblock from
  device->fs_info, which is still NULL if open_ctree() fails before
  btrfs_init_devices_late(); it now uses bdev_file->private_data.
- erofs: the v1 conversion was sent with a generic boilerplate changelog;
  superseded by Gao's patch above.
- Collect Reviewed-by from Jan Kara and Tested-by from syzbot.
- Rebase onto v7.1-rc1.
- Link to v1: https://patch.msgid.link/20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org

---
Christian Brauner (18):
      xfs: fix the error unwind in xfs_open_devices()
      super: convert s_count to refcount_t s_passive
      super: take lock after last reference count
      fs, block: move blk_mode_t and fop_flags_t into <linux/types.h>
      ext4: use anonymous devices for KUnit test superblocks
      ocfs2: don't reset s_dev on dismount
      fs: maintain a global device-to-superblock table
      fs: add dedicated block device open helpers for filesystems
      xfs: port to fs_bdev_file_open_by_path()
      btrfs: open via dedicated fs bdev helpers
      ext4: open via dedicated fs bdev helpers
      fs: look up superblocks via the device table in fs_holder_ops
      fs: tolerate per-superblock freeze errors on shared devices
      erofs: open via dedicated fs bdev helpers
      f2fs: open via dedicated fs bdev helpers
      super: make fs_holder_ops private
      fs: look up the superblock via the device table in user_get_super()
      selftests/filesystems: add ustat() coverage

 fs/btrfs/volumes.c                               |  31 +-
 fs/cramfs/inode.c                                |   2 +-
 fs/erofs/super.c                                 |  35 +-
 fs/ext4/extents-test.c                           |   9 +-
 fs/ext4/mballoc-test.c                           |   9 +-
 fs/ext4/super.c                                  |  12 +-
 fs/f2fs/super.c                                  |   6 +-
 fs/internal.h                                    |   1 +
 fs/namespace.c                                   |   2 +
 fs/ocfs2/super.c                                 |   1 -
 fs/romfs/super.c                                 |   2 +-
 fs/super.c                                       | 620 ++++++++++++++++-------
 fs/xfs/xfs_buf.c                                 |   2 +-
 fs/xfs/xfs_super.c                               |  13 +-
 include/linux/blkdev.h                           |   9 -
 include/linux/fs.h                               |   2 -
 include/linux/fs/super.h                         |   8 +
 include/linux/fs/super_types.h                   |   4 +-
 include/linux/types.h                            |   2 +
 tools/testing/selftests/filesystems/.gitignore   |   1 +
 tools/testing/selftests/filesystems/Makefile     |   2 +-
 tools/testing/selftests/filesystems/ustat_test.c | 135 +++++
 22 files changed, 647 insertions(+), 261 deletions(-)
---
base-commit: 0c0d974f62e6603d4514e1a8035658edb353c68f
change-id: 20260602-work-super-bdev_holder_global-8cba5e52bed5


             reply	other threads:[~2026-06-16 14:08 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 14:08 Christian Brauner [this message]
2026-06-16 14:08 ` [PATCH RFC v2 01/18] xfs: fix the error unwind in xfs_open_devices() Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 02/18] super: convert s_count to refcount_t s_passive Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 03/18] super: take lock after last reference count Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 04/18] fs, block: move blk_mode_t and fop_flags_t into <linux/types.h> Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 05/18] ext4: use anonymous devices for KUnit test superblocks Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 06/18] ocfs2: don't reset s_dev on dismount Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 07/18] fs: maintain a global device-to-superblock table Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 08/18] fs: add dedicated block device open helpers for filesystems Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 09/18] xfs: port to fs_bdev_file_open_by_path() Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 10/18] btrfs: open via dedicated fs bdev helpers Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 11/18] ext4: " Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 12/18] fs: look up superblocks via the device table in fs_holder_ops Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 13/18] fs: tolerate per-superblock freeze errors on shared devices Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 14/18] erofs: open via dedicated fs bdev helpers Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 15/18] f2fs: " Christian Brauner
2026-06-17  3:17   ` Chao Yu
2026-06-16 14:08 ` [PATCH RFC v2 16/18] super: make fs_holder_ops private Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 17/18] fs: look up the superblock via the device table in user_get_super() Christian Brauner
2026-06-16 14:08 ` [PATCH RFC v2 18/18] selftests/filesystems: add ustat() coverage Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260616-work-super-bdev_holder_global-v2-0-7df6b864028e@kernel.org \
    --to=brauner@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=cem@kernel.org \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=syzbot@syzkaller.appspotmail.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=xiang@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox