linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2 00/34] Open block devices as files
@ 2024-01-23 13:26 Christian Brauner
  2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
                   ` (37 more replies)
  0 siblings, 38 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Hey Christoph,
Hey Jan,
Hey Jens,

This opens block devices as files. Instead of introducing a separate
indirection into bdev_open_by_*() via struct bdev_handle we can just
make bdev_file_open_by_*() return a struct file. Opening and closing a
block device from setup_bdev_super() and in all other places just
becomes equivalent to opening and closing a file.

This has held up in xfstests and in blktests so far and it seems stable
and clean. The equivalence of opening and closing block devices to
regular files is a win in and of itself imho. Added to that is the
ability to do away with struct bdev_handle completely and make various
low-level helpers private to the block layer.

In all places where we currently stash a struct bdev_handle we now just
stash a file and use an accessor such as file_bdev(), akin to I_BDEV(),
to get to the block device.

It's now also possible to use file->f_mapping as a replacement for
bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as
an alternative to bdev->bd_inode allowing us to significantly reduce or
even fully remove bdev->bd_inode in follow-up patches.
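
To illustrate the accessor pattern described above, here is a minimal
userspace sketch, not kernel code. The struct layouts are heavily reduced
stand-ins, and mock_bdev_file_init() is a hypothetical helper that exists
only for this illustration; file_bdev() follows the shape of the helper
added in patch 01.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real kernel structs carry far more state. */
struct inode;
struct address_space { struct inode *host; };
struct inode { struct address_space *i_mapping; };
struct block_device { struct inode *bd_inode; };
struct bdev_handle { struct block_device *bdev; };
struct file {
	struct inode *f_inode;
	struct address_space *f_mapping;
	void *private_data;
};

/* Same shape as file_bdev() in patch 01: go through private_data. */
static struct block_device *file_bdev(struct file *bdev_file)
{
	struct bdev_handle *handle = bdev_file->private_data;
	return handle->bdev;
}

/* Hypothetical helper: wire the mock objects together the way an opened
 * block device file is expected to look. */
static void mock_bdev_file_init(struct file *f, struct bdev_handle *h,
				struct block_device *bdev,
				struct inode *inode,
				struct address_space *mapping)
{
	assert(f && h && bdev && inode && mapping);
	mapping->host = inode;
	inode->i_mapping = mapping;
	bdev->bd_inode = inode;
	h->bdev = bdev;
	f->f_inode = inode;
	f->f_mapping = mapping;
	f->private_data = h;
}
```

With the objects wired up this way, file->f_mapping, file->f_inode and
file->f_mapping->host reach the same objects as bdev->bd_inode->i_mapping
and bdev->bd_inode, so a caller holding only the file needs no direct
bd_inode access.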

In addition, we could get rid of sb->s_bdev and various other places
that stash the block device directly and instead stash the block device
file. Again, this is follow-up work.

Thanks!
Christian

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Changes in v2:
- This is not an RFC anymore.
- The patches to convert all of fs/buffer.c and associated helpers to
  struct file have been split out of the core infrastructure.
- Various renaming of helpers in response to v1.
- Link to v1: https://lore.kernel.org/r/20240103-vfs-bdev-file-v1-0-6c8ee55fb6ef@kernel.org

---
Christian Brauner (34):
      bdev: open block device as files
      block/ioctl: port blkdev_bszset() to file
      block/genhd: port disk_scan_partitions() to file
      md: port block device access to file
      swap: port block device usage to file
      power: port block device access to file
      xfs: port block device access to files
      drbd: port block device access to file
      pktcdvd: port block device access to file
      rnbd: port block device access to file
      xen: port block device access to file
      zram: port block device access to file
      bcache: port block device access to files
      block2mtd: port device access to files
      nvme: port block device access to file
      s390: port block device access to file
      target: port block device access to file
      bcachefs: port block device access to file
      btrfs: port device access to file
      erofs: port device access to file
      ext4: port block device access to file
      f2fs: port block device access to files
      jfs: port block device access to file
      nfs: port block device access to files
      ocfs2: port block device access to file
      reiserfs: port block device access to file
      bdev: remove bdev_open_by_path()
      bdev: make bdev_release() private to block layer
      bdev: make struct bdev_handle private to the block layer
      bdev: remove bdev pointer from struct bdev_handle
      block: use file->f_op to indicate restricted writes
      block: remove bdev_handle completely
      block: expose bdev_file_inode()
      ext4: rely on sb->f_bdev only

 block/bdev.c                        | 249 ++++++++++++++++++++++--------------
 block/blk.h                         |   6 +
 block/fops.c                        |  48 ++++---
 block/genhd.c                       |  12 +-
 block/ioctl.c                       |   9 +-
 drivers/block/drbd/drbd_int.h       |   4 +-
 drivers/block/drbd/drbd_nl.c        |  58 ++++-----
 drivers/block/pktcdvd.c             |  68 +++++-----
 drivers/block/rnbd/rnbd-srv.c       |  28 ++--
 drivers/block/rnbd/rnbd-srv.h       |   2 +-
 drivers/block/xen-blkback/blkback.c |   4 +-
 drivers/block/xen-blkback/common.h  |   4 +-
 drivers/block/xen-blkback/xenbus.c  |  37 +++---
 drivers/block/zram/zram_drv.c       |  26 ++--
 drivers/block/zram/zram_drv.h       |   2 +-
 drivers/md/bcache/bcache.h          |   4 +-
 drivers/md/bcache/super.c           |  74 +++++------
 drivers/md/dm.c                     |  23 ++--
 drivers/md/md.c                     |  12 +-
 drivers/md/md.h                     |   2 +-
 drivers/mtd/devices/block2mtd.c     |  46 +++----
 drivers/nvme/target/io-cmd-bdev.c   |  16 +--
 drivers/nvme/target/nvmet.h         |   2 +-
 drivers/s390/block/dasd.c           |  10 +-
 drivers/s390/block/dasd_genhd.c     |  36 +++---
 drivers/s390/block/dasd_int.h       |   2 +-
 drivers/s390/block/dasd_ioctl.c     |   2 +-
 drivers/target/target_core_iblock.c |  18 +--
 drivers/target/target_core_iblock.h |   2 +-
 drivers/target/target_core_pscsi.c  |  22 ++--
 drivers/target/target_core_pscsi.h  |   2 +-
 fs/bcachefs/super-io.c              |  20 +--
 fs/bcachefs/super_types.h           |   2 +-
 fs/btrfs/dev-replace.c              |  14 +-
 fs/btrfs/ioctl.c                    |  16 +--
 fs/btrfs/volumes.c                  |  92 ++++++-------
 fs/btrfs/volumes.h                  |   4 +-
 fs/cramfs/inode.c                   |   2 +-
 fs/erofs/data.c                     |   6 +-
 fs/erofs/internal.h                 |   2 +-
 fs/erofs/super.c                    |  16 +--
 fs/ext4/dir.c                       |   2 +-
 fs/ext4/ext4.h                      |   2 +-
 fs/ext4/ext4_jbd2.c                 |   2 +-
 fs/ext4/fast_commit.c               |   2 +-
 fs/ext4/fsmap.c                     |   8 +-
 fs/ext4/super.c                     |  87 ++++++-------
 fs/f2fs/f2fs.h                      |   2 +-
 fs/f2fs/super.c                     |  12 +-
 fs/file_table.c                     |  36 +++++-
 fs/jfs/jfs_logmgr.c                 |  26 ++--
 fs/jfs/jfs_logmgr.h                 |   2 +-
 fs/jfs/jfs_mount.c                  |   2 +-
 fs/nfs/blocklayout/blocklayout.h    |   2 +-
 fs/nfs/blocklayout/dev.c            |  68 +++++-----
 fs/ocfs2/cluster/heartbeat.c        |  32 ++---
 fs/reiserfs/journal.c               |  38 +++---
 fs/reiserfs/procfs.c                |   2 +-
 fs/reiserfs/reiserfs.h              |   8 +-
 fs/romfs/super.c                    |   2 +-
 fs/super.c                          |  18 +--
 fs/xfs/xfs_buf.c                    |  10 +-
 fs/xfs/xfs_buf.h                    |   4 +-
 fs/xfs/xfs_super.c                  |  43 +++----
 include/linux/blkdev.h              |  18 +--
 include/linux/device-mapper.h       |   2 +-
 include/linux/file.h                |   2 +
 include/linux/fs.h                  |   4 +-
 include/linux/pktcdvd.h             |   4 +-
 include/linux/swap.h                |   2 +-
 kernel/power/swap.c                 |  28 ++--
 mm/swapfile.c                       |  22 ++--
 72 files changed, 791 insertions(+), 705 deletions(-)
---
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
change-id: 20240103-vfs-bdev-file-1208da73d7ea



* [PATCH v2 01/34] bdev: open block device as files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:02   ` Christoph Hellwig
  2024-03-13  2:32   ` Christoph Hellwig
  2024-01-23 13:26 ` [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file Christian Brauner
                   ` (36 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Add two new helpers to allow opening block devices as files.
This is not the final infrastructure. This still opens the block device
before opening a struct file. Until we have removed all references to
struct bdev_handle we can't switch the order:

* Introduce blk_to_file_flags() to translate from block-specific flags
  to flags usable to open a new file.
* Introduce bdev_file_open_by_{dev,path}().
* Introduce temporary sb_bdev_handle() helper to retrieve a struct
  bdev_handle from a block device file and update places that directly
  reference struct bdev_handle to rely on it.
* Don't count block device opens against the number of open files. A
  bdev_file_open_by_{dev,path}() file is never installed into any
  file descriptor table.
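
The flag translation from the first bullet can be tried out in userspace
with the sketch below. It is an illustration only: the BLK_OPEN_* bit
values are made up for this example (the kernel defines the real ones in
include/linux/blkdev.h), the WARN_ON_ONCE() for a mode with neither read
nor write is omitted, and POSIX O_NONBLOCK stands in for the O_NDELAY used
in the patch.

```c
#include <fcntl.h>

/* Hypothetical bit values, chosen only for this sketch. */
typedef unsigned int blk_mode_t;
#define BLK_OPEN_READ		(1u << 0)
#define BLK_OPEN_WRITE		(1u << 1)
#define BLK_OPEN_NDELAY		(1u << 3)
#define BLK_OPEN_WRITE_IOCTL	(1u << 4)

static unsigned int blk_to_file_flags(blk_mode_t mode)
{
	unsigned int flags = 0;

	/* Map the read/write bits onto the O_* access mode. */
	if ((mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) ==
	    (BLK_OPEN_READ | BLK_OPEN_WRITE))
		flags |= O_RDWR;
	else if (mode & BLK_OPEN_WRITE)
		flags |= O_WRONLY;
	else if (mode & BLK_OPEN_READ)
		flags |= O_RDONLY;

	/* O_EXCL is deliberately never raised, as in the patch. */
	if (mode & BLK_OPEN_NDELAY)
		flags |= O_NONBLOCK;

	/* Floppy quirk: ioctl-only opens set both access mode values. */
	if (mode & BLK_OPEN_WRITE_IOCTL)
		flags |= O_RDWR | O_WRONLY;

	return flags;
}
```

For example, passing BLK_OPEN_READ | BLK_OPEN_WRITE yields an access mode
of O_RDWR, while BLK_OPEN_WRITE alone yields O_WRONLY.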

One idea that came to mind was to use kernel_tmpfile_open() which
would require us to pass a path and it would then call do_dentry_open()
going through the regular fops->open::blkdev_open() path. But that has
drawbacks that I consider fatal. We're back to the problem of
routing block specific flags such as BLK_OPEN_RESTRICT_WRITES through
the open path and would have to waste FMODE_* flags every time we add a
new one. Second, it would prohibit us from later using custom fops to
indicate that this is a restricted write operation as well. Overall, we
can avoid using an fmode flag and we have way more leeway in how we open
block devices from bdev_open_by_{dev,path}().
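
The ordering this patch keeps for now (open the block device first, then
wrap it in a file, and undo the open if the file cannot be allocated) can
be sketched in userspace as follows. Every name prefixed with mock_ is
hypothetical, the refcounting is far simpler than the kernel's, and the
sketch assumes that the final fput() is what releases the handle, as the
replacement of bdev_release() with fput() in this series suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int open_handles;	/* mock handles currently open */

struct bdev_handle { int dev; };
struct file { int refcount; void *private_data; };

static struct bdev_handle *mock_bdev_open(int dev)
{
	struct bdev_handle *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->dev = dev;
	open_handles++;
	return h;
}

static void mock_bdev_release(struct bdev_handle *h)
{
	open_handles--;
	free(h);
}

/* @fail_alloc simulates a failing file allocation. */
static struct file *mock_bdev_file_open(int dev, int fail_alloc)
{
	struct bdev_handle *h = mock_bdev_open(dev);
	struct file *f;

	if (!h)
		return NULL;
	f = fail_alloc ? NULL : malloc(sizeof(*f));
	if (!f) {
		/* Undo the device open if the file can't be created. */
		mock_bdev_release(h);
		return NULL;
	}
	f->refcount = 1;
	f->private_data = h;
	return f;
}

/* Dropping the last reference releases the device as well. */
static void mock_fput(struct file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount > 0)
		return;
	mock_bdev_release(f->private_data);
	free(f);
}
```

A caller only ever pairs mock_bdev_file_open() with mock_fput(); the
handle never has to be released separately, which is the simplification
the callers converted in this patch rely on.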

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 104 +++++++++++++++++++++++++++++++++++++++++++++++--
 fs/cramfs/inode.c      |   2 +-
 fs/f2fs/super.c        |   2 +-
 fs/file_table.c        |  36 +++++++++++++----
 fs/jfs/jfs_logmgr.c    |   2 +-
 fs/romfs/super.c       |   2 +-
 fs/super.c             |  18 ++++-----
 fs/xfs/xfs_super.c     |   2 +-
 include/linux/blkdev.h |   7 ++++
 include/linux/file.h   |   2 +
 include/linux/fs.h     |  10 ++++-
 11 files changed, 160 insertions(+), 27 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index e9f1b12bd75c..4246a57a7c5a 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -49,6 +49,13 @@ struct block_device *I_BDEV(struct inode *inode)
 }
 EXPORT_SYMBOL(I_BDEV);
 
+struct block_device *file_bdev(struct file *bdev_file)
+{
+	struct bdev_handle *handle = bdev_file->private_data;
+	return handle->bdev;
+}
+EXPORT_SYMBOL(file_bdev);
+
 static void bdev_write_inode(struct block_device *bdev)
 {
 	struct inode *inode = bdev->bd_inode;
@@ -368,12 +375,12 @@ static struct file_system_type bd_type = {
 };
 
 struct super_block *blockdev_superblock __ro_after_init;
+struct vfsmount *blockdev_mnt __ro_after_init;
 EXPORT_SYMBOL_GPL(blockdev_superblock);
 
 void __init bdev_cache_init(void)
 {
 	int err;
-	static struct vfsmount *bd_mnt __ro_after_init;
 
 	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
 			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
@@ -382,10 +389,10 @@ void __init bdev_cache_init(void)
 	err = register_filesystem(&bd_type);
 	if (err)
 		panic("Cannot register bdev pseudo-fs");
-	bd_mnt = kern_mount(&bd_type);
-	if (IS_ERR(bd_mnt))
+	blockdev_mnt = kern_mount(&bd_type);
+	if (IS_ERR(blockdev_mnt))
 		panic("Cannot create bdev pseudo-fs");
-	blockdev_superblock = bd_mnt->mnt_sb;   /* For writeback */
+	blockdev_superblock = blockdev_mnt->mnt_sb;   /* For writeback */
 }
 
 struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
@@ -911,6 +918,95 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 }
 EXPORT_SYMBOL(bdev_open_by_dev);
 
+static unsigned blk_to_file_flags(blk_mode_t mode)
+{
+	unsigned int flags = 0;
+
+	if ((mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) ==
+	    (BLK_OPEN_READ | BLK_OPEN_WRITE))
+		flags |= O_RDWR;
+	else if (mode & BLK_OPEN_WRITE)
+		flags |= O_WRONLY;
+	else if (mode & BLK_OPEN_READ)
+		flags |= O_RDONLY;
+	else /* Neither read nor write for a block device requested? */
+		WARN_ON_ONCE(true);
+
+	/*
+	 * O_EXCL is one of those flags that the VFS clears once it's done with
+	 * the operation. So don't raise it here either.
+	 */
+	if (mode & BLK_OPEN_NDELAY)
+		flags |= O_NDELAY;
+
+	/*
+	 * If BLK_OPEN_WRITE_IOCTL is set then this is a historical quirk
+	 * associated with the floppy driver where it has allowed ioctls if the
+	 * file was opened for writing, but does not allow reads or writes.
+	 * Make sure that this quirk is reflected in @f_flags.
+	 */
+	if (mode & BLK_OPEN_WRITE_IOCTL)
+		flags |= O_RDWR | O_WRONLY;
+
+	return flags;
+}
+
+struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
+				   const struct blk_holder_ops *hops)
+{
+	struct file *bdev_file;
+	struct bdev_handle *handle;
+	unsigned int flags;
+
+	handle = bdev_open_by_dev(dev, mode, holder, hops);
+	if (IS_ERR(handle))
+		return ERR_CAST(handle);
+
+	flags = blk_to_file_flags(mode);
+	bdev_file = alloc_file_pseudo_noaccount(handle->bdev->bd_inode,
+			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
+	if (IS_ERR(bdev_file)) {
+		bdev_release(handle);
+		return bdev_file;
+	}
+	ihold(handle->bdev->bd_inode);
+
+	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
+	if (bdev_nowait(handle->bdev))
+		bdev_file->f_mode |= FMODE_NOWAIT;
+
+	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
+	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
+	bdev_file->private_data = handle;
+	return bdev_file;
+}
+EXPORT_SYMBOL(bdev_file_open_by_dev);
+
+struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
+				    void *holder,
+				    const struct blk_holder_ops *hops)
+{
+	struct file *bdev_file;
+	dev_t dev;
+	int error;
+
+	error = lookup_bdev(path, &dev);
+	if (error)
+		return ERR_PTR(error);
+
+	bdev_file = bdev_file_open_by_dev(dev, mode, holder, hops);
+	if (!IS_ERR(bdev_file) && (mode & BLK_OPEN_WRITE)) {
+		struct bdev_handle *handle = bdev_file->private_data;
+		if (bdev_read_only(handle->bdev)) {
+			fput(bdev_file);
+			bdev_file = ERR_PTR(-EACCES);
+		}
+	}
+
+	return bdev_file;
+}
+EXPORT_SYMBOL(bdev_file_open_by_path);
+
 /**
  * bdev_open_by_path - open a block device by name
  * @path: path to the block device to open
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 60dbfa0f8805..39e75131fd5a 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -495,7 +495,7 @@ static void cramfs_kill_sb(struct super_block *sb)
 		sb->s_mtd = NULL;
 	} else if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV) && sb->s_bdev) {
 		sync_blockdev(sb->s_bdev);
-		bdev_release(sb->s_bdev_handle);
+		fput(sb->s_bdev_file);
 	}
 	kfree(sbi);
 }
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index d45ab0992ae5..ea94c148fee5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4247,7 +4247,7 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 
 	for (i = 0; i < max_devices; i++) {
 		if (i == 0)
-			FDEV(0).bdev_handle = sbi->sb->s_bdev_handle;
+			FDEV(0).bdev_handle = sb_bdev_handle(sbi->sb);
 		else if (!RDEV(i).path[0])
 			break;
 
diff --git a/fs/file_table.c b/fs/file_table.c
index b991f90571b4..f61e212b99f4 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -281,13 +281,17 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
  * @path: the (dentry, vfsmount) pair for the new file
  * @flags: O_... flags with which the new file will be opened
  * @fop: the 'struct file_operations' for the new file
+ * @noaccount: whether this is an internal open that shouldn't be counted
  */
 static struct file *alloc_file(const struct path *path, int flags,
-		const struct file_operations *fop)
+		const struct file_operations *fop, bool noaccount)
 {
 	struct file *file;
 
-	file = alloc_empty_file(flags, current_cred());
+	if (noaccount)
+		file = alloc_empty_file_noaccount(flags, current_cred());
+	else
+		file = alloc_empty_file(flags, current_cred());
 	if (IS_ERR(file))
 		return file;
 
@@ -312,9 +316,11 @@ static struct file *alloc_file(const struct path *path, int flags,
 	return file;
 }
 
-struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
-				const char *name, int flags,
-				const struct file_operations *fops)
+static struct file *__alloc_file_pseudo(struct inode *inode,
+					struct vfsmount *mnt, const char *name,
+					int flags,
+					const struct file_operations *fops,
+					bool noaccount)
 {
 	struct qstr this = QSTR_INIT(name, strlen(name));
 	struct path path;
@@ -325,19 +331,35 @@ struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
 		return ERR_PTR(-ENOMEM);
 	path.mnt = mntget(mnt);
 	d_instantiate(path.dentry, inode);
-	file = alloc_file(&path, flags, fops);
+	file = alloc_file(&path, flags, fops, noaccount);
 	if (IS_ERR(file)) {
 		ihold(inode);
 		path_put(&path);
 	}
 	return file;
 }
+
+struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
+				const char *name, int flags,
+				const struct file_operations *fops)
+{
+	return __alloc_file_pseudo(inode, mnt, name, flags, fops, false);
+}
 EXPORT_SYMBOL(alloc_file_pseudo);
 
+struct file *alloc_file_pseudo_noaccount(struct inode *inode,
+					 struct vfsmount *mnt, const char *name,
+					 int flags,
+					 const struct file_operations *fops)
+{
+	return __alloc_file_pseudo(inode, mnt, name, flags, fops, true);
+}
+EXPORT_SYMBOL_GPL(alloc_file_pseudo_noaccount);
+
 struct file *alloc_file_clone(struct file *base, int flags,
 				const struct file_operations *fops)
 {
-	struct file *f = alloc_file(&base->f_path, flags, fops);
+	struct file *f = alloc_file(&base->f_path, flags, fops, false);
 	if (!IS_ERR(f)) {
 		path_get(&f->f_path);
 		f->f_mapping = base->f_mapping;
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index cb6d1fda66a7..8691463956d1 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -1162,7 +1162,7 @@ static int open_inline_log(struct super_block *sb)
 	init_waitqueue_head(&log->syncwait);
 
 	set_bit(log_INLINELOG, &log->flag);
-	log->bdev_handle = sb->s_bdev_handle;
+	log->bdev_handle = sb_bdev_handle(sb);
 	log->base = addressPXD(&JFS_SBI(sb)->logpxd);
 	log->size = lengthPXD(&JFS_SBI(sb)->logpxd) >>
 	    (L2LOGPSIZE - sb->s_blocksize_bits);
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index 545ad44f96b8..1ed468c03557 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -594,7 +594,7 @@ static void romfs_kill_sb(struct super_block *sb)
 #ifdef CONFIG_ROMFS_ON_BLOCK
 	if (sb->s_bdev) {
 		sync_blockdev(sb->s_bdev);
-		bdev_release(sb->s_bdev_handle);
+		fput(sb->s_bdev_file);
 	}
 #endif
 }
diff --git a/fs/super.c b/fs/super.c
index d35e85295489..08dcc3371aa0 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1532,16 +1532,16 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 		struct fs_context *fc)
 {
 	blk_mode_t mode = sb_open_mode(sb_flags);
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bdev;
 
-	bdev_handle = bdev_open_by_dev(sb->s_dev, mode, sb, &fs_holder_ops);
-	if (IS_ERR(bdev_handle)) {
+	bdev_file = bdev_file_open_by_dev(sb->s_dev, mode, sb, &fs_holder_ops);
+	if (IS_ERR(bdev_file)) {
 		if (fc)
 			errorf(fc, "%s: Can't open blockdev", fc->source);
-		return PTR_ERR(bdev_handle);
+		return PTR_ERR(bdev_file);
 	}
-	bdev = bdev_handle->bdev;
+	bdev = file_bdev(bdev_file);
 
 	/*
 	 * This really should be in blkdev_get_by_dev, but right now can't due
@@ -1549,7 +1549,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 	 * writable from userspace even for a read-only block device.
 	 */
 	if ((mode & BLK_OPEN_WRITE) && bdev_read_only(bdev)) {
-		bdev_release(bdev_handle);
+		fput(bdev_file);
 		return -EACCES;
 	}
 
@@ -1560,11 +1560,11 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 	if (atomic_read(&bdev->bd_fsfreeze_count) > 0) {
 		if (fc)
 			warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
-		bdev_release(bdev_handle);
+		fput(bdev_file);
 		return -EBUSY;
 	}
 	spin_lock(&sb_lock);
-	sb->s_bdev_handle = bdev_handle;
+	sb->s_bdev_file = bdev_file;
 	sb->s_bdev = bdev;
 	sb->s_bdi = bdi_get(bdev->bd_disk->bdi);
 	if (bdev_stable_writes(bdev))
@@ -1680,7 +1680,7 @@ void kill_block_super(struct super_block *sb)
 	generic_shutdown_super(sb);
 	if (bdev) {
 		sync_blockdev(bdev);
-		bdev_release(sb->s_bdev_handle);
+		fput(sb->s_bdev_file);
 	}
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index aff20ddd4a9f..e5ac0e59ede9 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -467,7 +467,7 @@ xfs_open_devices(
 	 * Setup xfs_mount buffer target pointers
 	 */
 	error = -ENOMEM;
-	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_handle);
+	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb_bdev_handle(sb));
 	if (!mp->m_ddev_targp)
 		goto out_close_rtdev;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 99e4f5e72213..76706aa47316 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
 #include <linux/sbitmap.h>
 #include <linux/uuid.h>
 #include <linux/xarray.h>
+#include <linux/file.h>
 
 struct module;
 struct request_queue;
@@ -1474,6 +1475,7 @@ extern const struct blk_holder_ops fs_holder_ops;
 	(BLK_OPEN_READ | BLK_OPEN_RESTRICT_WRITES | \
 	 (((flags) & SB_RDONLY) ? 0 : BLK_OPEN_WRITE))
 
+/* @bdev_handle will be removed soon. */
 struct bdev_handle {
 	struct block_device *bdev;
 	void *holder;
@@ -1484,6 +1486,10 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct bdev_handle *bdev_open_by_path(const char *path, blk_mode_t mode,
 		void *holder, const struct blk_holder_ops *hops);
+struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
+		const struct blk_holder_ops *hops);
+struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
+		void *holder, const struct blk_holder_ops *hops);
 int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 		const struct blk_holder_ops *hops);
 void bd_abort_claiming(struct block_device *bdev, void *holder);
@@ -1494,6 +1500,7 @@ struct block_device *blkdev_get_no_open(dev_t dev);
 void blkdev_put_no_open(struct block_device *bdev);
 
 struct block_device *I_BDEV(struct inode *inode);
+struct block_device *file_bdev(struct file *bdev_file);
 
 #ifdef CONFIG_BLOCK
 void invalidate_bdev(struct block_device *bdev);
diff --git a/include/linux/file.h b/include/linux/file.h
index 6834a29338c4..169692cb1906 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -24,6 +24,8 @@ struct inode;
 struct path;
 extern struct file *alloc_file_pseudo(struct inode *, struct vfsmount *,
 	const char *, int flags, const struct file_operations *);
+extern struct file *alloc_file_pseudo_noaccount(struct inode *, struct vfsmount *,
+	const char *, int flags, const struct file_operations *);
 extern struct file *alloc_file_clone(struct file *, int flags,
 	const struct file_operations *);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..e9291e27cc47 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1228,8 +1228,8 @@ struct super_block {
 #endif
 	struct hlist_bl_head	s_roots;	/* alternate root dentries for NFS */
 	struct list_head	s_mounts;	/* list of mounts; _not_ for fs use */
-	struct block_device	*s_bdev;
-	struct bdev_handle	*s_bdev_handle;
+	struct block_device	*s_bdev;	/* can go away once we use an accessor for @s_bdev_file */
+	struct file		*s_bdev_file;
 	struct backing_dev_info *s_bdi;
 	struct mtd_info		*s_mtd;
 	struct hlist_node	s_instances;
@@ -1327,6 +1327,12 @@ struct super_block {
 	struct list_head	s_inodes_wb;	/* writeback inodes */
 } __randomize_layout;
 
+/* Temporary helper that will go away. */
+static inline struct bdev_handle *sb_bdev_handle(struct super_block *sb)
+{
+	return sb->s_bdev_file->private_data;
+}
+
 static inline struct user_namespace *i_user_ns(const struct inode *inode)
 {
 	return inode->i_sb->s_user_ns;

-- 
2.43.0



* [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
  2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:10   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 03/34] block/genhd: port disk_scan_partitions() " Christian Brauner
                   ` (35 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/ioctl.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/ioctl.c b/block/ioctl.c
index 9c73a763ef88..5d0619e02e4c 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -471,7 +471,7 @@ static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
 		int __user *argp)
 {
 	int ret, n;
-	struct bdev_handle *handle;
+	struct file *file;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
@@ -483,12 +483,11 @@ static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
 	if (mode & BLK_OPEN_EXCL)
 		return set_blocksize(bdev, n);
 
-	handle = bdev_open_by_dev(bdev->bd_dev, mode, &bdev, NULL);
-	if (IS_ERR(handle))
+	file = bdev_file_open_by_dev(bdev->bd_dev, mode, &bdev, NULL);
+	if (IS_ERR(file))
 		return -EBUSY;
 	ret = set_blocksize(bdev, n);
-	bdev_release(handle);
-
+	fput(file);
 	return ret;
 }
 

-- 
2.43.0



* [PATCH v2 03/34] block/genhd: port disk_scan_partitions() to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
  2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
  2024-01-23 13:26 ` [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:13   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
                   ` (34 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

This may run from a kernel thread via device_add_disk(). So this could
also use __fput_sync() if we were worried about EBUSY. But when it is
called from a kernel thread it's always BLK_OPEN_READ so EBUSY can't
really happen even if we do BLK_OPEN_RESTRICT_WRITES or BLK_OPEN_EXCL.

Otherwise it's called from an ioctl on the block device which is only
called from userspace and can rely on task work.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/genhd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index d74fb5b4ae68..a911d2969c07 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -342,7 +342,7 @@ EXPORT_SYMBOL_GPL(disk_uevent);
 
 int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 {
-	struct bdev_handle *handle;
+	struct file *file;
 	int ret = 0;
 
 	if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
@@ -366,12 +366,12 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 	}
 
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
-	handle = bdev_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL, NULL,
-				  NULL);
-	if (IS_ERR(handle))
-		ret = PTR_ERR(handle);
+	file = bdev_file_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL,
+				     NULL, NULL);
+	if (IS_ERR(file))
+		ret = PTR_ERR(file);
 	else
-		bdev_release(handle);
+		fput(file);
 
 	/*
 	 * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,

-- 
2.43.0



* [PATCH v2 04/34] md: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (2 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 03/34] block/genhd: port disk_scan_partitions() " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
                     ` (2 more replies)
  2024-01-23 13:26 ` [PATCH v2 05/34] swap: port block device usage " Christian Brauner
                   ` (33 subsequent siblings)
  37 siblings, 3 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/md/dm.c               | 23 +++++++++++++----------
 drivers/md/md.c               | 12 ++++++------
 drivers/md/md.h               |  2 +-
 include/linux/device-mapper.h |  2 +-
 4 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 8dcabf84d866..87de5b5682ad 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -726,7 +726,8 @@ static struct table_device *open_table_device(struct mapped_device *md,
 		dev_t dev, blk_mode_t mode)
 {
 	struct table_device *td;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
+	struct block_device *bdev;
 	u64 part_off;
 	int r;
 
@@ -735,34 +736,36 @@ static struct table_device *open_table_device(struct mapped_device *md,
 		return ERR_PTR(-ENOMEM);
 	refcount_set(&td->count, 1);
 
-	bdev_handle = bdev_open_by_dev(dev, mode, _dm_claim_ptr, NULL);
-	if (IS_ERR(bdev_handle)) {
-		r = PTR_ERR(bdev_handle);
+	bdev_file = bdev_file_open_by_dev(dev, mode, _dm_claim_ptr, NULL);
+	if (IS_ERR(bdev_file)) {
+		r = PTR_ERR(bdev_file);
 		goto out_free_td;
 	}
 
+	bdev = file_bdev(bdev_file);
+
 	/*
 	 * We can be called before the dm disk is added.  In that case we can't
 	 * register the holder relation here.  It will be done once add_disk was
 	 * called.
 	 */
 	if (md->disk->slave_dir) {
-		r = bd_link_disk_holder(bdev_handle->bdev, md->disk);
+		r = bd_link_disk_holder(bdev, md->disk);
 		if (r)
 			goto out_blkdev_put;
 	}
 
 	td->dm_dev.mode = mode;
-	td->dm_dev.bdev = bdev_handle->bdev;
-	td->dm_dev.bdev_handle = bdev_handle;
-	td->dm_dev.dax_dev = fs_dax_get_by_bdev(bdev_handle->bdev, &part_off,
+	td->dm_dev.bdev = bdev;
+	td->dm_dev.bdev_file = bdev_file;
+	td->dm_dev.dax_dev = fs_dax_get_by_bdev(bdev, &part_off,
 						NULL, NULL);
 	format_dev_t(td->dm_dev.name, dev);
 	list_add(&td->list, &md->table_devices);
 	return td;
 
 out_blkdev_put:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 out_free_td:
 	kfree(td);
 	return ERR_PTR(r);
@@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
 {
 	if (md->disk->slave_dir)
 		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
-	bdev_release(td->dm_dev.bdev_handle);
+	fput(td->dm_dev.bdev_file);
 	put_dax(td->dm_dev.dax_dev);
 	list_del(&td->list);
 	kfree(td);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2266358d8074..0653584db63b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2578,7 +2578,7 @@ static void export_rdev(struct md_rdev *rdev, struct mddev *mddev)
 	if (test_bit(AutoDetected, &rdev->flags))
 		md_autodetect_dev(rdev->bdev->bd_dev);
 #endif
-	bdev_release(rdev->bdev_handle);
+	fput(rdev->bdev_file);
 	rdev->bdev = NULL;
 	kobject_put(&rdev->kobj);
 }
@@ -3773,16 +3773,16 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
 	if (err)
 		goto out_clear_rdev;
 
-	rdev->bdev_handle = bdev_open_by_dev(newdev,
+	rdev->bdev_file = bdev_file_open_by_dev(newdev,
 			BLK_OPEN_READ | BLK_OPEN_WRITE,
 			super_format == -2 ? &claim_rdev : rdev, NULL);
-	if (IS_ERR(rdev->bdev_handle)) {
+	if (IS_ERR(rdev->bdev_file)) {
 		pr_warn("md: could not open device unknown-block(%u,%u).\n",
 			MAJOR(newdev), MINOR(newdev));
-		err = PTR_ERR(rdev->bdev_handle);
+		err = PTR_ERR(rdev->bdev_file);
 		goto out_clear_rdev;
 	}
-	rdev->bdev = rdev->bdev_handle->bdev;
+	rdev->bdev = file_bdev(rdev->bdev_file);
 
 	kobject_init(&rdev->kobj, &rdev_ktype);
 
@@ -3813,7 +3813,7 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
 	return rdev;
 
 out_blkdev_put:
-	bdev_release(rdev->bdev_handle);
+	fput(rdev->bdev_file);
 out_clear_rdev:
 	md_rdev_clear(rdev);
 out_free_rdev:
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 8d881cc59799..a079ee9b6190 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -59,7 +59,7 @@ struct md_rdev {
 	 */
 	struct block_device *meta_bdev;
 	struct block_device *bdev;	/* block device handle */
-	struct bdev_handle *bdev_handle;	/* Handle from open for bdev */
+	struct file *bdev_file;		/* Handle from open for bdev */
 
 	struct page	*sb_page, *bb_page;
 	int		sb_loaded;
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 772ab4d74d94..82b2195efaca 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -165,7 +165,7 @@ void dm_error(const char *message);
 
 struct dm_dev {
 	struct block_device *bdev;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct dax_device *dax_dev;
 	blk_mode_t mode;
 	char name[16];

-- 
2.43.0



* [PATCH v2 05/34] swap: port block device usage to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (3 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:15   ` Christoph Hellwig
  2024-01-31 18:16   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 06/34] power: port block device access " Christian Brauner
                   ` (32 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 include/linux/swap.h |  2 +-
 mm/swapfile.c        | 22 +++++++++++-----------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4db00ddad261..e5b82bc05e60 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -298,7 +298,7 @@ struct swap_info_struct {
 	unsigned int __percpu *cluster_next_cpu; /*percpu index for next allocation */
 	struct percpu_cluster __percpu *percpu_cluster; /* per cpu's swap location */
 	struct rb_root swap_extent_root;/* root of the swap extent rbtree */
-	struct bdev_handle *bdev_handle;/* open handle of the bdev */
+	struct file *bdev_file;		/* open handle of the bdev */
 	struct block_device *bdev;	/* swap device or bdev of swap file */
 	struct file *swap_file;		/* seldom referenced */
 	unsigned int old_block_size;	/* seldom referenced */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 556ff7347d5f..73edd6fed6a2 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2532,10 +2532,10 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	exit_swap_address_space(p->type);
 
 	inode = mapping->host;
-	if (p->bdev_handle) {
+	if (p->bdev_file) {
 		set_blocksize(p->bdev, old_block_size);
-		bdev_release(p->bdev_handle);
-		p->bdev_handle = NULL;
+		fput(p->bdev_file);
+		p->bdev_file = NULL;
 	}
 
 	inode_lock(inode);
@@ -2765,14 +2765,14 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
 	int error;
 
 	if (S_ISBLK(inode->i_mode)) {
-		p->bdev_handle = bdev_open_by_dev(inode->i_rdev,
+		p->bdev_file = bdev_file_open_by_dev(inode->i_rdev,
 				BLK_OPEN_READ | BLK_OPEN_WRITE, p, NULL);
-		if (IS_ERR(p->bdev_handle)) {
-			error = PTR_ERR(p->bdev_handle);
-			p->bdev_handle = NULL;
+		if (IS_ERR(p->bdev_file)) {
+			error = PTR_ERR(p->bdev_file);
+			p->bdev_file = NULL;
 			return error;
 		}
-		p->bdev = p->bdev_handle->bdev;
+		p->bdev = file_bdev(p->bdev_file);
 		p->old_block_size = block_size(p->bdev);
 		error = set_blocksize(p->bdev, PAGE_SIZE);
 		if (error < 0)
@@ -3208,10 +3208,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	p->percpu_cluster = NULL;
 	free_percpu(p->cluster_next_cpu);
 	p->cluster_next_cpu = NULL;
-	if (p->bdev_handle) {
+	if (p->bdev_file) {
 		set_blocksize(p->bdev, p->old_block_size);
-		bdev_release(p->bdev_handle);
-		p->bdev_handle = NULL;
+		fput(p->bdev_file);
+		p->bdev_file = NULL;
 	}
 	inode = NULL;
 	destroy_swap_extents(p);

-- 
2.43.0



* [PATCH v2 06/34] power: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (4 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 05/34] swap: port block device usage " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:15   ` Christoph Hellwig
  2024-01-31 18:17   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 07/34] xfs: port block device access to files Christian Brauner
                   ` (31 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 kernel/power/swap.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 6053ddddaf65..692f12fe60c1 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -222,7 +222,7 @@ int swsusp_swap_in_use(void)
  */
 
 static unsigned short root_swap = 0xffff;
-static struct bdev_handle *hib_resume_bdev_handle;
+static struct file *hib_resume_bdev_file;
 
 struct hib_bio_batch {
 	atomic_t		count;
@@ -276,7 +276,7 @@ static int hib_submit_io(blk_opf_t opf, pgoff_t page_off, void *addr,
 	struct bio *bio;
 	int error = 0;
 
-	bio = bio_alloc(hib_resume_bdev_handle->bdev, 1, opf,
+	bio = bio_alloc(file_bdev(hib_resume_bdev_file), 1, opf,
 			GFP_NOIO | __GFP_HIGH);
 	bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
 
@@ -357,14 +357,14 @@ static int swsusp_swap_check(void)
 		return res;
 	root_swap = res;
 
-	hib_resume_bdev_handle = bdev_open_by_dev(swsusp_resume_device,
+	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
 			BLK_OPEN_WRITE, NULL, NULL);
-	if (IS_ERR(hib_resume_bdev_handle))
-		return PTR_ERR(hib_resume_bdev_handle);
+	if (IS_ERR(hib_resume_bdev_file))
+		return PTR_ERR(hib_resume_bdev_file);
 
-	res = set_blocksize(hib_resume_bdev_handle->bdev, PAGE_SIZE);
+	res = set_blocksize(file_bdev(hib_resume_bdev_file), PAGE_SIZE);
 	if (res < 0)
-		bdev_release(hib_resume_bdev_handle);
+		fput(hib_resume_bdev_file);
 
 	return res;
 }
@@ -1523,10 +1523,10 @@ int swsusp_check(bool exclusive)
 	void *holder = exclusive ? &swsusp_holder : NULL;
 	int error;
 
-	hib_resume_bdev_handle = bdev_open_by_dev(swsusp_resume_device,
+	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
 				BLK_OPEN_READ, holder, NULL);
-	if (!IS_ERR(hib_resume_bdev_handle)) {
-		set_blocksize(hib_resume_bdev_handle->bdev, PAGE_SIZE);
+	if (!IS_ERR(hib_resume_bdev_file)) {
+		set_blocksize(file_bdev(hib_resume_bdev_file), PAGE_SIZE);
 		clear_page(swsusp_header);
 		error = hib_submit_io(REQ_OP_READ, swsusp_resume_block,
 					swsusp_header, NULL);
@@ -1551,11 +1551,11 @@ int swsusp_check(bool exclusive)
 
 put:
 		if (error)
-			bdev_release(hib_resume_bdev_handle);
+			fput(hib_resume_bdev_file);
 		else
 			pr_debug("Image signature found, resuming\n");
 	} else {
-		error = PTR_ERR(hib_resume_bdev_handle);
+		error = PTR_ERR(hib_resume_bdev_file);
 	}
 
 	if (error)
@@ -1570,12 +1570,12 @@ int swsusp_check(bool exclusive)
 
 void swsusp_close(void)
 {
-	if (IS_ERR(hib_resume_bdev_handle)) {
+	if (IS_ERR(hib_resume_bdev_file)) {
 		pr_debug("Image device not initialised\n");
 		return;
 	}
 
-	bdev_release(hib_resume_bdev_handle);
+	fput(hib_resume_bdev_file);
 }
 
 /**

-- 
2.43.0



* [PATCH v2 07/34] xfs: port block device access to files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (5 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 06/34] power: port block device access " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:17   ` Christoph Hellwig
  2024-01-31 18:19   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 08/34] drbd: port block device access to file Christian Brauner
                   ` (30 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/xfs/xfs_buf.c   | 10 +++++-----
 fs/xfs/xfs_buf.h   |  4 ++--
 fs/xfs/xfs_super.c | 43 +++++++++++++++++++++----------------------
 3 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 8e5bd50d29fe..01b41fabbe3c 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1951,7 +1951,7 @@ xfs_free_buftarg(
 	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
 	/* the main block device is closed by kill_block_super */
 	if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
-		bdev_release(btp->bt_bdev_handle);
+		fput(btp->bt_bdev_file);
 
 	kmem_free(btp);
 }
@@ -1994,7 +1994,7 @@ xfs_setsize_buftarg_early(
 struct xfs_buftarg *
 xfs_alloc_buftarg(
 	struct xfs_mount	*mp,
-	struct bdev_handle	*bdev_handle)
+	struct file		*bdev_file)
 {
 	xfs_buftarg_t		*btp;
 	const struct dax_holder_operations *ops = NULL;
@@ -2005,9 +2005,9 @@ xfs_alloc_buftarg(
 	btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
 
 	btp->bt_mount = mp;
-	btp->bt_bdev_handle = bdev_handle;
-	btp->bt_dev = bdev_handle->bdev->bd_dev;
-	btp->bt_bdev = bdev_handle->bdev;
+	btp->bt_bdev_file = bdev_file;
+	btp->bt_bdev = file_bdev(bdev_file);
+	btp->bt_dev = btp->bt_bdev->bd_dev;
 	btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
 					    mp, ops);
 
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index b470de08a46c..304e858d04fb 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -98,7 +98,7 @@ typedef unsigned int xfs_buf_flags_t;
  */
 typedef struct xfs_buftarg {
 	dev_t			bt_dev;
-	struct bdev_handle	*bt_bdev_handle;
+	struct file		*bt_bdev_file;
 	struct block_device	*bt_bdev;
 	struct dax_device	*bt_daxdev;
 	u64			bt_dax_part_off;
@@ -366,7 +366,7 @@ xfs_buf_update_cksum(struct xfs_buf *bp, unsigned long cksum_offset)
  *	Handling of buftargs.
  */
 struct xfs_buftarg *xfs_alloc_buftarg(struct xfs_mount *mp,
-		struct bdev_handle *bdev_handle);
+		struct file *bdev_file);
 extern void xfs_free_buftarg(struct xfs_buftarg *);
 extern void xfs_buftarg_wait(struct xfs_buftarg *);
 extern void xfs_buftarg_drain(struct xfs_buftarg *);
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index e5ac0e59ede9..4a076c464177 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -362,16 +362,16 @@ STATIC int
 xfs_blkdev_get(
 	xfs_mount_t		*mp,
 	const char		*name,
-	struct bdev_handle	**handlep)
+	struct file		**bdev_filep)
 {
 	int			error = 0;
 
-	*handlep = bdev_open_by_path(name,
+	*bdev_filep = bdev_file_open_by_path(name,
 		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
 		mp->m_super, &fs_holder_ops);
-	if (IS_ERR(*handlep)) {
-		error = PTR_ERR(*handlep);
-		*handlep = NULL;
+	if (IS_ERR(*bdev_filep)) {
+		error = PTR_ERR(*bdev_filep);
+		*bdev_filep = NULL;
 		xfs_warn(mp, "Invalid device [%s], error=%d", name, error);
 	}
 
@@ -436,26 +436,25 @@ xfs_open_devices(
 {
 	struct super_block	*sb = mp->m_super;
 	struct block_device	*ddev = sb->s_bdev;
-	struct bdev_handle	*logdev_handle = NULL, *rtdev_handle = NULL;
+	struct file		*logdev_file = NULL, *rtdev_file = NULL;
 	int			error;
 
 	/*
 	 * Open real time and log devices - order is important.
 	 */
 	if (mp->m_logname) {
-		error = xfs_blkdev_get(mp, mp->m_logname, &logdev_handle);
+		error = xfs_blkdev_get(mp, mp->m_logname, &logdev_file);
 		if (error)
 			return error;
 	}
 
 	if (mp->m_rtname) {
-		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_handle);
+		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_file);
 		if (error)
 			goto out_close_logdev;
 
-		if (rtdev_handle->bdev == ddev ||
-		    (logdev_handle &&
-		     rtdev_handle->bdev == logdev_handle->bdev)) {
+		if (file_bdev(rtdev_file) == ddev ||
+		    (logdev_file && file_bdev(rtdev_file) == file_bdev(logdev_file))) {
 			xfs_warn(mp,
 	"Cannot mount filesystem with identical rtdev and ddev/logdev.");
 			error = -EINVAL;
@@ -467,25 +466,25 @@ xfs_open_devices(
 	 * Setup xfs_mount buffer target pointers
 	 */
 	error = -ENOMEM;
-	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb_bdev_handle(sb));
+	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_file);
 	if (!mp->m_ddev_targp)
 		goto out_close_rtdev;
 
-	if (rtdev_handle) {
-		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_handle);
+	if (rtdev_file) {
+		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_file);
 		if (!mp->m_rtdev_targp)
 			goto out_free_ddev_targ;
 	}
 
-	if (logdev_handle && logdev_handle->bdev != ddev) {
-		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_handle);
+	if (logdev_file && file_bdev(logdev_file) != ddev) {
+		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
 		if (!mp->m_logdev_targp)
 			goto out_free_rtdev_targ;
 	} else {
 		mp->m_logdev_targp = mp->m_ddev_targp;
 		/* Handle won't be used, drop it */
-		if (logdev_handle)
-			bdev_release(logdev_handle);
+		if (logdev_file)
+			fput(logdev_file);
 	}
 
 	return 0;
@@ -496,11 +495,11 @@ xfs_open_devices(
  out_free_ddev_targ:
 	xfs_free_buftarg(mp->m_ddev_targp);
  out_close_rtdev:
-	 if (rtdev_handle)
-		bdev_release(rtdev_handle);
+	 if (rtdev_file)
+		fput(rtdev_file);
  out_close_logdev:
-	if (logdev_handle)
-		bdev_release(logdev_handle);
+	if (logdev_file)
+		fput(logdev_file);
 	return error;
 }
 

-- 
2.43.0



* [PATCH v2 08/34] drbd: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (6 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 07/34] xfs: port block device access to files Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-31 18:22   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 09/34] pktcdvd: " Christian Brauner
                   ` (29 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/block/drbd/drbd_int.h |  4 +--
 drivers/block/drbd/drbd_nl.c  | 58 +++++++++++++++++++++----------------------
 2 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index c21e3732759e..94dc0a235919 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -524,9 +524,9 @@ struct drbd_md {
 
 struct drbd_backing_dev {
 	struct block_device *backing_bdev;
-	struct bdev_handle *backing_bdev_handle;
+	struct file *backing_bdev_file;
 	struct block_device *md_bdev;
-	struct bdev_handle *md_bdev_handle;
+	struct file *f_md_bdev;
 	struct drbd_md md;
 	struct disk_conf *disk_conf; /* RCU, for updates: resource->conf_update */
 	sector_t known_size; /* last known size of that backing device */
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 43747a1aae43..6aed67278e8b 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1635,45 +1635,45 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info)
 	return 0;
 }
 
-static struct bdev_handle *open_backing_dev(struct drbd_device *device,
+static struct file *open_backing_dev(struct drbd_device *device,
 		const char *bdev_path, void *claim_ptr, bool do_bd_link)
 {
-	struct bdev_handle *handle;
+	struct file *file;
 	int err = 0;
 
-	handle = bdev_open_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
-				   claim_ptr, NULL);
-	if (IS_ERR(handle)) {
+	file = bdev_file_open_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
+				      claim_ptr, NULL);
+	if (IS_ERR(file)) {
 		drbd_err(device, "open(\"%s\") failed with %ld\n",
-				bdev_path, PTR_ERR(handle));
-		return handle;
+				bdev_path, PTR_ERR(file));
+		return file;
 	}
 
 	if (!do_bd_link)
-		return handle;
+		return file;
 
-	err = bd_link_disk_holder(handle->bdev, device->vdisk);
+	err = bd_link_disk_holder(file_bdev(file), device->vdisk);
 	if (err) {
-		bdev_release(handle);
+		fput(file);
 		drbd_err(device, "bd_link_disk_holder(\"%s\", ...) failed with %d\n",
 				bdev_path, err);
-		handle = ERR_PTR(err);
+		file = ERR_PTR(err);
 	}
-	return handle;
+	return file;
 }
 
 static int open_backing_devices(struct drbd_device *device,
 		struct disk_conf *new_disk_conf,
 		struct drbd_backing_dev *nbc)
 {
-	struct bdev_handle *handle;
+	struct file *file;
 
-	handle = open_backing_dev(device, new_disk_conf->backing_dev, device,
+	file = open_backing_dev(device, new_disk_conf->backing_dev, device,
 				  true);
-	if (IS_ERR(handle))
+	if (IS_ERR(file))
 		return ERR_OPEN_DISK;
-	nbc->backing_bdev = handle->bdev;
-	nbc->backing_bdev_handle = handle;
+	nbc->backing_bdev = file_bdev(file);
+	nbc->backing_bdev_file = file;
 
 	/*
 	 * meta_dev_idx >= 0: external fixed size, possibly multiple
@@ -1683,7 +1683,7 @@ static int open_backing_devices(struct drbd_device *device,
 	 * should check it for you already; but if you don't, or
 	 * someone fooled it, we need to double check here)
 	 */
-	handle = open_backing_dev(device, new_disk_conf->meta_dev,
+	file = open_backing_dev(device, new_disk_conf->meta_dev,
 		/* claim ptr: device, if claimed exclusively; shared drbd_m_holder,
 		 * if potentially shared with other drbd minors */
 			(new_disk_conf->meta_dev_idx < 0) ? (void*)device : (void*)drbd_m_holder,
@@ -1691,21 +1691,21 @@ static int open_backing_devices(struct drbd_device *device,
 		 * as would happen with internal metadata. */
 			(new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_FLEX_INT &&
 			 new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_INTERNAL));
-	if (IS_ERR(handle))
+	if (IS_ERR(file))
 		return ERR_OPEN_MD_DISK;
-	nbc->md_bdev = handle->bdev;
-	nbc->md_bdev_handle = handle;
+	nbc->md_bdev = file_bdev(file);
+	nbc->f_md_bdev = file;
 	return NO_ERROR;
 }
 
 static void close_backing_dev(struct drbd_device *device,
-		struct bdev_handle *handle, bool do_bd_unlink)
+		struct file *bdev_file, bool do_bd_unlink)
 {
-	if (!handle)
+	if (!bdev_file)
 		return;
 	if (do_bd_unlink)
-		bd_unlink_disk_holder(handle->bdev, device->vdisk);
-	bdev_release(handle);
+		bd_unlink_disk_holder(file_bdev(bdev_file), device->vdisk);
+	fput(bdev_file);
 }
 
 void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *ldev)
@@ -1713,9 +1713,9 @@ void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *
 	if (ldev == NULL)
 		return;
 
-	close_backing_dev(device, ldev->md_bdev_handle,
+	close_backing_dev(device, ldev->f_md_bdev,
 			  ldev->md_bdev != ldev->backing_bdev);
-	close_backing_dev(device, ldev->backing_bdev_handle, true);
+	close_backing_dev(device, ldev->backing_bdev_file, true);
 
 	kfree(ldev->disk_conf);
 	kfree(ldev);
@@ -2131,9 +2131,9 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
  fail:
 	conn_reconfig_done(connection);
 	if (nbc) {
-		close_backing_dev(device, nbc->md_bdev_handle,
+		close_backing_dev(device, nbc->f_md_bdev,
 			  nbc->md_bdev != nbc->backing_bdev);
-		close_backing_dev(device, nbc->backing_bdev_handle, true);
+		close_backing_dev(device, nbc->backing_bdev_file, true);
 		kfree(nbc);
 	}
 	kfree(new_disk_conf);

-- 
2.43.0



* [PATCH v2 09/34] pktcdvd: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (7 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 08/34] drbd: port block device access to file Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-31 18:26   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 10/34] rnbd: " Christian Brauner
                   ` (28 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/block/pktcdvd.c | 68 ++++++++++++++++++++++++-------------------------
 include/linux/pktcdvd.h |  4 +--
 2 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index d56d972aadb3..c21444716e43 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -340,8 +340,8 @@ static ssize_t device_map_show(const struct class *c, const struct class_attribu
 		n += sysfs_emit_at(data, n, "%s %u:%u %u:%u\n",
 			pd->disk->disk_name,
 			MAJOR(pd->pkt_dev), MINOR(pd->pkt_dev),
-			MAJOR(pd->bdev_handle->bdev->bd_dev),
-			MINOR(pd->bdev_handle->bdev->bd_dev));
+			MAJOR(file_bdev(pd->bdev_file)->bd_dev),
+			MINOR(file_bdev(pd->bdev_file)->bd_dev));
 	}
 	mutex_unlock(&ctl_mutex);
 	return n;
@@ -438,7 +438,7 @@ static int pkt_seq_show(struct seq_file *m, void *p)
 	int states[PACKET_NUM_STATES];
 
 	seq_printf(m, "Writer %s mapped to %pg:\n", pd->disk->disk_name,
-		   pd->bdev_handle->bdev);
+		   file_bdev(pd->bdev_file));
 
 	seq_printf(m, "\nSettings:\n");
 	seq_printf(m, "\tpacket size:\t\t%dkB\n", pd->settings.size / 2);
@@ -715,7 +715,7 @@ static void pkt_rbtree_insert(struct pktcdvd_device *pd, struct pkt_rb_node *nod
  */
 static int pkt_generic_packet(struct pktcdvd_device *pd, struct packet_command *cgc)
 {
-	struct request_queue *q = bdev_get_queue(pd->bdev_handle->bdev);
+	struct request_queue *q = bdev_get_queue(file_bdev(pd->bdev_file));
 	struct scsi_cmnd *scmd;
 	struct request *rq;
 	int ret = 0;
@@ -1048,7 +1048,7 @@ static void pkt_gather_data(struct pktcdvd_device *pd, struct packet_data *pkt)
 			continue;
 
 		bio = pkt->r_bios[f];
-		bio_init(bio, pd->bdev_handle->bdev, bio->bi_inline_vecs, 1,
+		bio_init(bio, file_bdev(pd->bdev_file), bio->bi_inline_vecs, 1,
 			 REQ_OP_READ);
 		bio->bi_iter.bi_sector = pkt->sector + f * (CD_FRAMESIZE >> 9);
 		bio->bi_end_io = pkt_end_io_read;
@@ -1264,7 +1264,7 @@ static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt)
 	struct device *ddev = disk_to_dev(pd->disk);
 	int f;
 
-	bio_init(pkt->w_bio, pd->bdev_handle->bdev, pkt->w_bio->bi_inline_vecs,
+	bio_init(pkt->w_bio, file_bdev(pd->bdev_file), pkt->w_bio->bi_inline_vecs,
 		 pkt->frames, REQ_OP_WRITE);
 	pkt->w_bio->bi_iter.bi_sector = pkt->sector;
 	pkt->w_bio->bi_end_io = pkt_end_io_packet_write;
@@ -2162,20 +2162,20 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
 	int ret;
 	long lba;
 	struct request_queue *q;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 
 	/*
 	 * We need to re-open the cdrom device without O_NONBLOCK to be able
 	 * to read/write from/to it. It is already opened in O_NONBLOCK mode
 	 * so open should not fail.
 	 */
-	bdev_handle = bdev_open_by_dev(pd->bdev_handle->bdev->bd_dev,
+	bdev_file = bdev_file_open_by_dev(file_bdev(pd->bdev_file)->bd_dev,
 				       BLK_OPEN_READ, pd, NULL);
-	if (IS_ERR(bdev_handle)) {
-		ret = PTR_ERR(bdev_handle);
+	if (IS_ERR(bdev_file)) {
+		ret = PTR_ERR(bdev_file);
 		goto out;
 	}
-	pd->open_bdev_handle = bdev_handle;
+	pd->f_open_bdev = bdev_file;
 
 	ret = pkt_get_last_written(pd, &lba);
 	if (ret) {
@@ -2184,9 +2184,9 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
 	}
 
 	set_capacity(pd->disk, lba << 2);
-	set_capacity_and_notify(pd->bdev_handle->bdev->bd_disk, lba << 2);
+	set_capacity_and_notify(file_bdev(pd->bdev_file)->bd_disk, lba << 2);
 
-	q = bdev_get_queue(pd->bdev_handle->bdev);
+	q = bdev_get_queue(file_bdev(pd->bdev_file));
 	if (write) {
 		ret = pkt_open_write(pd);
 		if (ret)
@@ -2218,7 +2218,7 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
 	return 0;
 
 out_putdev:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 out:
 	return ret;
 }
@@ -2237,8 +2237,8 @@ static void pkt_release_dev(struct pktcdvd_device *pd, int flush)
 	pkt_lock_door(pd, 0);
 
 	pkt_set_speed(pd, MAX_SPEED, MAX_SPEED);
-	bdev_release(pd->open_bdev_handle);
-	pd->open_bdev_handle = NULL;
+	fput(pd->f_open_bdev);
+	pd->f_open_bdev = NULL;
 
 	pkt_shrink_pktlist(pd);
 }
@@ -2326,7 +2326,7 @@ static void pkt_end_io_read_cloned(struct bio *bio)
 
 static void pkt_make_request_read(struct pktcdvd_device *pd, struct bio *bio)
 {
-	struct bio *cloned_bio = bio_alloc_clone(pd->bdev_handle->bdev, bio,
+	struct bio *cloned_bio = bio_alloc_clone(file_bdev(pd->bdev_file), bio,
 		GFP_NOIO, &pkt_bio_set);
 	struct packet_stacked_data *psd = mempool_alloc(&psd_pool, GFP_NOIO);
 
@@ -2497,7 +2497,7 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 {
 	struct device *ddev = disk_to_dev(pd->disk);
 	int i;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct scsi_device *sdev;
 
 	if (pd->pkt_dev == dev) {
@@ -2508,9 +2508,9 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 		struct pktcdvd_device *pd2 = pkt_devs[i];
 		if (!pd2)
 			continue;
-		if (pd2->bdev_handle->bdev->bd_dev == dev) {
+		if (file_bdev(pd2->bdev_file)->bd_dev == dev) {
 			dev_err(ddev, "%pg already setup\n",
-				pd2->bdev_handle->bdev);
+				file_bdev(pd2->bdev_file));
 			return -EBUSY;
 		}
 		if (pd2->pkt_dev == dev) {
@@ -2519,13 +2519,13 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 		}
 	}
 
-	bdev_handle = bdev_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_NDELAY,
+	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_NDELAY,
 				       NULL, NULL);
-	if (IS_ERR(bdev_handle))
-		return PTR_ERR(bdev_handle);
-	sdev = scsi_device_from_queue(bdev_handle->bdev->bd_disk->queue);
+	if (IS_ERR(bdev_file))
+		return PTR_ERR(bdev_file);
+	sdev = scsi_device_from_queue(file_bdev(bdev_file)->bd_disk->queue);
 	if (!sdev) {
-		bdev_release(bdev_handle);
+		fput(bdev_file);
 		return -EINVAL;
 	}
 	put_device(&sdev->sdev_gendev);
@@ -2533,8 +2533,8 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 	/* This is safe, since we have a reference from open(). */
 	__module_get(THIS_MODULE);
 
-	pd->bdev_handle = bdev_handle;
-	set_blocksize(bdev_handle->bdev, CD_FRAMESIZE);
+	pd->bdev_file = bdev_file;
+	set_blocksize(file_bdev(bdev_file), CD_FRAMESIZE);
 
 	pkt_init_queue(pd);
 
@@ -2546,11 +2546,11 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 	}
 
 	proc_create_single_data(pd->disk->disk_name, 0, pkt_proc, pkt_seq_show, pd);
-	dev_notice(ddev, "writer mapped to %pg\n", bdev_handle->bdev);
+	dev_notice(ddev, "writer mapped to %pg\n", file_bdev(bdev_file));
 	return 0;
 
 out_mem:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	/* This is safe: open() is still holding a reference. */
 	module_put(THIS_MODULE);
 	return -ENOMEM;
@@ -2605,9 +2605,9 @@ static unsigned int pkt_check_events(struct gendisk *disk,
 
 	if (!pd)
 		return 0;
-	if (!pd->bdev_handle)
+	if (!pd->bdev_file)
 		return 0;
-	attached_disk = pd->bdev_handle->bdev->bd_disk;
+	attached_disk = file_bdev(pd->bdev_file)->bd_disk;
 	if (!attached_disk || !attached_disk->fops->check_events)
 		return 0;
 	return attached_disk->fops->check_events(attached_disk, clearing);
@@ -2692,7 +2692,7 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
 		goto out_mem2;
 
 	/* inherit events of the host device */
-	disk->events = pd->bdev_handle->bdev->bd_disk->events;
+	disk->events = file_bdev(pd->bdev_file)->bd_disk->events;
 
 	ret = add_disk(disk);
 	if (ret)
@@ -2757,7 +2757,7 @@ static int pkt_remove_dev(dev_t pkt_dev)
 	pkt_debugfs_dev_remove(pd);
 	pkt_sysfs_dev_remove(pd);
 
-	bdev_release(pd->bdev_handle);
+	fput(pd->bdev_file);
 
 	remove_proc_entry(pd->disk->disk_name, pkt_proc);
 	dev_notice(ddev, "writer unmapped\n");
@@ -2784,7 +2784,7 @@ static void pkt_get_status(struct pkt_ctrl_command *ctrl_cmd)
 
 	pd = pkt_find_dev_from_minor(ctrl_cmd->dev_index);
 	if (pd) {
-		ctrl_cmd->dev = new_encode_dev(pd->bdev_handle->bdev->bd_dev);
+		ctrl_cmd->dev = new_encode_dev(file_bdev(pd->bdev_file)->bd_dev);
 		ctrl_cmd->pkt_dev = new_encode_dev(pd->pkt_dev);
 	} else {
 		ctrl_cmd->dev = 0;
diff --git a/include/linux/pktcdvd.h b/include/linux/pktcdvd.h
index 79594aeb160d..2f1b952d596a 100644
--- a/include/linux/pktcdvd.h
+++ b/include/linux/pktcdvd.h
@@ -154,9 +154,9 @@ struct packet_stacked_data
 
 struct pktcdvd_device
 {
-	struct bdev_handle	*bdev_handle;	/* dev attached */
+	struct file		*bdev_file;	/* dev attached */
 	/* handle acquired for bdev during pkt_open_dev() */
-	struct bdev_handle	*open_bdev_handle;
+	struct file		*f_open_bdev;
 	dev_t			pkt_dev;	/* our dev */
 	struct packet_settings	settings;
 	struct packet_stats	stats;

-- 
2.43.0



* [PATCH v2 10/34] rnbd: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (8 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 09/34] pktcdvd: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-31 18:28   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 11/34] xen: " Christian Brauner
                   ` (27 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/block/rnbd/rnbd-srv.c | 28 ++++++++++++++--------------
 drivers/block/rnbd/rnbd-srv.h |  2 +-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index 3a0d5dcec6f2..f6e3a3c4b76c 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -145,7 +145,7 @@ static int process_rdma(struct rnbd_srv_session *srv_sess,
 	priv->sess_dev = sess_dev;
 	priv->id = id;
 
-	bio = bio_alloc(sess_dev->bdev_handle->bdev, 1,
+	bio = bio_alloc(file_bdev(sess_dev->bdev_file), 1,
 			rnbd_to_bio_flags(le32_to_cpu(msg->rw)), GFP_KERNEL);
 	if (bio_add_page(bio, virt_to_page(data), datalen,
 			offset_in_page(data)) != datalen) {
@@ -219,7 +219,7 @@ void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev, bool keep_id)
 	rnbd_put_sess_dev(sess_dev);
 	wait_for_completion(&dc); /* wait for inflights to drop to zero */
 
-	bdev_release(sess_dev->bdev_handle);
+	fput(sess_dev->bdev_file);
 	mutex_lock(&sess_dev->dev->lock);
 	list_del(&sess_dev->dev_list);
 	if (!sess_dev->readonly)
@@ -534,7 +534,7 @@ rnbd_srv_get_or_create_srv_dev(struct block_device *bdev,
 static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
 					struct rnbd_srv_sess_dev *sess_dev)
 {
-	struct block_device *bdev = sess_dev->bdev_handle->bdev;
+	struct block_device *bdev = file_bdev(sess_dev->bdev_file);
 
 	rsp->hdr.type = cpu_to_le16(RNBD_MSG_OPEN_RSP);
 	rsp->device_id = cpu_to_le32(sess_dev->device_id);
@@ -560,7 +560,7 @@ static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
 static struct rnbd_srv_sess_dev *
 rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
 			      const struct rnbd_msg_open *open_msg,
-			      struct bdev_handle *handle, bool readonly,
+			      struct file *bdev_file, bool readonly,
 			      struct rnbd_srv_dev *srv_dev)
 {
 	struct rnbd_srv_sess_dev *sdev = rnbd_sess_dev_alloc(srv_sess);
@@ -572,7 +572,7 @@ rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
 
 	strscpy(sdev->pathname, open_msg->dev_name, sizeof(sdev->pathname));
 
-	sdev->bdev_handle	= handle;
+	sdev->bdev_file		= bdev_file;
 	sdev->sess		= srv_sess;
 	sdev->dev		= srv_dev;
 	sdev->readonly		= readonly;
@@ -678,7 +678,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 	struct rnbd_srv_dev *srv_dev;
 	struct rnbd_srv_sess_dev *srv_sess_dev;
 	const struct rnbd_msg_open *open_msg = msg;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	blk_mode_t open_flags = BLK_OPEN_READ;
 	char *full_path;
 	struct rnbd_msg_open_rsp *rsp = data;
@@ -716,15 +716,15 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 		goto reject;
 	}
 
-	bdev_handle = bdev_open_by_path(full_path, open_flags, NULL, NULL);
-	if (IS_ERR(bdev_handle)) {
-		ret = PTR_ERR(bdev_handle);
+	bdev_file = bdev_file_open_by_path(full_path, open_flags, NULL, NULL);
+	if (IS_ERR(bdev_file)) {
+		ret = PTR_ERR(bdev_file);
 		pr_err("Opening device '%s' on session %s failed, failed to open the block device, err: %pe\n",
-		       full_path, srv_sess->sessname, bdev_handle);
+		       full_path, srv_sess->sessname, bdev_file);
 		goto free_path;
 	}
 
-	srv_dev = rnbd_srv_get_or_create_srv_dev(bdev_handle->bdev, srv_sess,
+	srv_dev = rnbd_srv_get_or_create_srv_dev(file_bdev(bdev_file), srv_sess,
 						  open_msg->access_mode);
 	if (IS_ERR(srv_dev)) {
 		pr_err("Opening device '%s' on session %s failed, creating srv_dev failed, err: %pe\n",
@@ -734,7 +734,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 	}
 
 	srv_sess_dev = rnbd_srv_create_set_sess_dev(srv_sess, open_msg,
-				bdev_handle,
+				bdev_file,
 				open_msg->access_mode == RNBD_ACCESS_RO,
 				srv_dev);
 	if (IS_ERR(srv_sess_dev)) {
@@ -750,7 +750,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 	 */
 	mutex_lock(&srv_dev->lock);
 	if (!srv_dev->dev_kobj.state_in_sysfs) {
-		ret = rnbd_srv_create_dev_sysfs(srv_dev, bdev_handle->bdev);
+		ret = rnbd_srv_create_dev_sysfs(srv_dev, file_bdev(bdev_file));
 		if (ret) {
 			mutex_unlock(&srv_dev->lock);
 			rnbd_srv_err(srv_sess_dev,
@@ -793,7 +793,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 	}
 	rnbd_put_srv_dev(srv_dev);
 blkdev_put:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 free_path:
 	kfree(full_path);
 reject:
diff --git a/drivers/block/rnbd/rnbd-srv.h b/drivers/block/rnbd/rnbd-srv.h
index 343cc682b617..18d873808b8d 100644
--- a/drivers/block/rnbd/rnbd-srv.h
+++ b/drivers/block/rnbd/rnbd-srv.h
@@ -46,7 +46,7 @@ struct rnbd_srv_dev {
 struct rnbd_srv_sess_dev {
 	/* Entry inside rnbd_srv_dev struct */
 	struct list_head		dev_list;
-	struct bdev_handle		*bdev_handle;
+	struct file			*bdev_file;
 	struct rnbd_srv_session		*sess;
 	struct rnbd_srv_dev		*dev;
 	struct kobject                  kobj;

-- 
2.43.0



* [PATCH v2 11/34] xen: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (9 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 10/34] rnbd: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-31 18:31   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 12/34] zram: " Christian Brauner
                   ` (26 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/block/xen-blkback/blkback.c |  4 ++--
 drivers/block/xen-blkback/common.h  |  4 ++--
 drivers/block/xen-blkback/xenbus.c  | 37 ++++++++++++++++++-------------------
 3 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 4defd7f387c7..944576d582fb 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -465,7 +465,7 @@ static int xen_vbd_translate(struct phys_req *req, struct xen_blkif *blkif,
 	}
 
 	req->dev  = vbd->pdevice;
-	req->bdev = vbd->bdev_handle->bdev;
+	req->bdev = file_bdev(vbd->bdev_file);
 	rc = 0;
 
  out:
@@ -969,7 +969,7 @@ static int dispatch_discard_io(struct xen_blkif_ring *ring,
 	int err = 0;
 	int status = BLKIF_RSP_OKAY;
 	struct xen_blkif *blkif = ring->blkif;
-	struct block_device *bdev = blkif->vbd.bdev_handle->bdev;
+	struct block_device *bdev = file_bdev(blkif->vbd.bdev_file);
 	struct phys_req preq;
 
 	xen_blkif_get(blkif);
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 1432c83183d0..b427d54bc120 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -221,7 +221,7 @@ struct xen_vbd {
 	unsigned char		type;
 	/* phys device that this vbd maps to. */
 	u32			pdevice;
-	struct bdev_handle	*bdev_handle;
+	struct file		*bdev_file;
 	/* Cached size parameter. */
 	sector_t		size;
 	unsigned int		flush_support:1;
@@ -360,7 +360,7 @@ struct pending_req {
 };
 
 
-#define vbd_sz(_v)	bdev_nr_sectors((_v)->bdev_handle->bdev)
+#define vbd_sz(_v)	bdev_nr_sectors(file_bdev((_v)->bdev_file))
 
 #define xen_blkif_get(_b) (atomic_inc(&(_b)->refcnt))
 #define xen_blkif_put(_b)				\
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index e34219ea2b05..0621878940ae 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -81,7 +81,7 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
 	int i;
 
 	/* Not ready to connect? */
-	if (!blkif->rings || !blkif->rings[0].irq || !blkif->vbd.bdev_handle)
+	if (!blkif->rings || !blkif->rings[0].irq || !blkif->vbd.bdev_file)
 		return;
 
 	/* Already connected? */
@@ -99,13 +99,12 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
 		return;
 	}
 
-	err = sync_blockdev(blkif->vbd.bdev_handle->bdev);
+	err = sync_blockdev(file_bdev(blkif->vbd.bdev_file));
 	if (err) {
 		xenbus_dev_error(blkif->be->dev, err, "block flush");
 		return;
 	}
-	invalidate_inode_pages2(
-			blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
+	invalidate_inode_pages2(blkif->vbd.bdev_file->f_mapping);
 
 	for (i = 0; i < blkif->nr_rings; i++) {
 		ring = &blkif->rings[i];
@@ -473,9 +472,9 @@ static void xenvbd_sysfs_delif(struct xenbus_device *dev)
 
 static void xen_vbd_free(struct xen_vbd *vbd)
 {
-	if (vbd->bdev_handle)
-		bdev_release(vbd->bdev_handle);
-	vbd->bdev_handle = NULL;
+	if (vbd->bdev_file)
+		fput(vbd->bdev_file);
+	vbd->bdev_file = NULL;
 }
 
 static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
@@ -483,7 +482,7 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
 			  int cdrom)
 {
 	struct xen_vbd *vbd;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 
 	vbd = &blkif->vbd;
 	vbd->handle   = handle;
@@ -492,17 +491,17 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
 
 	vbd->pdevice  = MKDEV(major, minor);
 
-	bdev_handle = bdev_open_by_dev(vbd->pdevice, vbd->readonly ?
+	bdev_file = bdev_file_open_by_dev(vbd->pdevice, vbd->readonly ?
 				 BLK_OPEN_READ : BLK_OPEN_WRITE, NULL, NULL);
 
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		pr_warn("xen_vbd_create: device %08x could not be opened\n",
 			vbd->pdevice);
 		return -ENOENT;
 	}
 
-	vbd->bdev_handle = bdev_handle;
-	if (vbd->bdev_handle->bdev->bd_disk == NULL) {
+	vbd->bdev_file = bdev_file;
+	if (file_bdev(vbd->bdev_file)->bd_disk == NULL) {
 		pr_warn("xen_vbd_create: device %08x doesn't exist\n",
 			vbd->pdevice);
 		xen_vbd_free(vbd);
@@ -510,14 +509,14 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
 	}
 	vbd->size = vbd_sz(vbd);
 
-	if (cdrom || disk_to_cdi(vbd->bdev_handle->bdev->bd_disk))
+	if (cdrom || disk_to_cdi(file_bdev(vbd->bdev_file)->bd_disk))
 		vbd->type |= VDISK_CDROM;
-	if (vbd->bdev_handle->bdev->bd_disk->flags & GENHD_FL_REMOVABLE)
+	if (file_bdev(vbd->bdev_file)->bd_disk->flags & GENHD_FL_REMOVABLE)
 		vbd->type |= VDISK_REMOVABLE;
 
-	if (bdev_write_cache(bdev_handle->bdev))
+	if (bdev_write_cache(file_bdev(bdev_file)))
 		vbd->flush_support = true;
-	if (bdev_max_secure_erase_sectors(bdev_handle->bdev))
+	if (bdev_max_secure_erase_sectors(file_bdev(bdev_file)))
 		vbd->discard_secure = true;
 
 	pr_debug("Successful creation of handle=%04x (dom=%u)\n",
@@ -570,7 +569,7 @@ static void xen_blkbk_discard(struct xenbus_transaction xbt, struct backend_info
 	struct xen_blkif *blkif = be->blkif;
 	int err;
 	int state = 0;
-	struct block_device *bdev = be->blkif->vbd.bdev_handle->bdev;
+	struct block_device *bdev = file_bdev(be->blkif->vbd.bdev_file);
 
 	if (!xenbus_read_unsigned(dev->nodename, "discard-enable", 1))
 		return;
@@ -932,7 +931,7 @@ static void connect(struct backend_info *be)
 	}
 	err = xenbus_printf(xbt, dev->nodename, "sector-size", "%lu",
 			    (unsigned long)bdev_logical_block_size(
-					be->blkif->vbd.bdev_handle->bdev));
+					file_bdev(be->blkif->vbd.bdev_file)));
 	if (err) {
 		xenbus_dev_fatal(dev, err, "writing %s/sector-size",
 				 dev->nodename);
@@ -940,7 +939,7 @@ static void connect(struct backend_info *be)
 	}
 	err = xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%u",
 			    bdev_physical_block_size(
-					be->blkif->vbd.bdev_handle->bdev));
+					file_bdev(be->blkif->vbd.bdev_file)));
 	if (err)
 		xenbus_dev_error(dev, err, "writing %s/physical-sector-size",
 				 dev->nodename);

-- 
2.43.0



* [PATCH v2 12/34] zram: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (10 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 11/34] xen: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-31 18:32   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 13/34] bcache: port block device access to files Christian Brauner
                   ` (25 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
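Note for readers, not part of the commit: backing_dev_store() in the diff
below keeps its existing error handling. When the open fails, the error
is extracted and the local pointer is reset to NULL so that the shared
"out:" path only calls fput() on a real file. The following is a
userspace sketch of that idiom under stated assumptions: the mock_*
helpers are hypothetical re-creations of the ERR_PTR()/IS_ERR()/PTR_ERR()
idea and are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MOCK_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, and decode it again. */
static void *mock_err_ptr(long err) { return (void *)(intptr_t)err; }
static long mock_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

struct mock_file { int refcount; };

static int mock_puts; /* counts how often a file was released */

/* Hypothetical opener: either a file or an encoded error, never NULL. */
static struct mock_file *mock_open(int fail)
{
	struct mock_file *f;

	if (fail)
		return mock_err_ptr(-EBUSY);
	f = calloc(1, sizeof(*f));
	if (!f)
		return mock_err_ptr(-ENOMEM);
	f->refcount = 1;
	return f;
}

static void mock_fput(struct mock_file *f)
{
	mock_puts++;
	free(f);
}

/* Mirrors the control flow of a store handler with one exit label. */
static long mock_store(int fail_open, int fail_later)
{
	struct mock_file *f = NULL;
	long err = 0;

	f = mock_open(fail_open);
	if (mock_is_err(f)) {
		err = mock_ptr_err(f);
		f = NULL; /* exit path must not release an error value */
		goto out;
	}
	if (fail_later) {
		err = -EINVAL;
		goto out;
	}
	/* A real driver would stash f here; the sketch just releases it. */
	mock_fput(f);
	return 0;
out:
	if (f)
		mock_fput(f);
	return err;
}
```

Resetting the pointer is what makes the single cleanup label safe for
both the "open failed" and the "failed after open" cases.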
 drivers/block/zram/zram_drv.c | 26 +++++++++++++-------------
 drivers/block/zram/zram_drv.h |  2 +-
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6772e0c654fa..d96b3851b5d3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -426,11 +426,11 @@ static void reset_bdev(struct zram *zram)
 	if (!zram->backing_dev)
 		return;
 
-	bdev_release(zram->bdev_handle);
+	fput(zram->bdev_file);
 	/* hope filp_close flush all of IO */
 	filp_close(zram->backing_dev, NULL);
 	zram->backing_dev = NULL;
-	zram->bdev_handle = NULL;
+	zram->bdev_file = NULL;
 	zram->disk->fops = &zram_devops;
 	kvfree(zram->bitmap);
 	zram->bitmap = NULL;
@@ -476,7 +476,7 @@ static ssize_t backing_dev_store(struct device *dev,
 	struct address_space *mapping;
 	unsigned int bitmap_sz;
 	unsigned long nr_pages, *bitmap = NULL;
-	struct bdev_handle *bdev_handle = NULL;
+	struct file *bdev_file = NULL;
 	int err;
 	struct zram *zram = dev_to_zram(dev);
 
@@ -513,11 +513,11 @@ static ssize_t backing_dev_store(struct device *dev,
 		goto out;
 	}
 
-	bdev_handle = bdev_open_by_dev(inode->i_rdev,
+	bdev_file = bdev_file_open_by_dev(inode->i_rdev,
 				BLK_OPEN_READ | BLK_OPEN_WRITE, zram, NULL);
-	if (IS_ERR(bdev_handle)) {
-		err = PTR_ERR(bdev_handle);
-		bdev_handle = NULL;
+	if (IS_ERR(bdev_file)) {
+		err = PTR_ERR(bdev_file);
+		bdev_file = NULL;
 		goto out;
 	}
 
@@ -531,7 +531,7 @@ static ssize_t backing_dev_store(struct device *dev,
 
 	reset_bdev(zram);
 
-	zram->bdev_handle = bdev_handle;
+	zram->bdev_file = bdev_file;
 	zram->backing_dev = backing_dev;
 	zram->bitmap = bitmap;
 	zram->nr_pages = nr_pages;
@@ -544,8 +544,8 @@ static ssize_t backing_dev_store(struct device *dev,
 out:
 	kvfree(bitmap);
 
-	if (bdev_handle)
-		bdev_release(bdev_handle);
+	if (bdev_file)
+		fput(bdev_file);
 
 	if (backing_dev)
 		filp_close(backing_dev, NULL);
@@ -587,7 +587,7 @@ static void read_from_bdev_async(struct zram *zram, struct page *page,
 {
 	struct bio *bio;
 
-	bio = bio_alloc(zram->bdev_handle->bdev, 1, parent->bi_opf, GFP_NOIO);
+	bio = bio_alloc(file_bdev(zram->bdev_file), 1, parent->bi_opf, GFP_NOIO);
 	bio->bi_iter.bi_sector = entry * (PAGE_SIZE >> 9);
 	__bio_add_page(bio, page, PAGE_SIZE, 0);
 	bio_chain(bio, parent);
@@ -703,7 +703,7 @@ static ssize_t writeback_store(struct device *dev,
 			continue;
 		}
 
-		bio_init(&bio, zram->bdev_handle->bdev, &bio_vec, 1,
+		bio_init(&bio, file_bdev(zram->bdev_file), &bio_vec, 1,
 			 REQ_OP_WRITE | REQ_SYNC);
 		bio.bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
 		__bio_add_page(&bio, page, PAGE_SIZE, 0);
@@ -785,7 +785,7 @@ static void zram_sync_read(struct work_struct *work)
 	struct bio_vec bv;
 	struct bio bio;
 
-	bio_init(&bio, zw->zram->bdev_handle->bdev, &bv, 1, REQ_OP_READ);
+	bio_init(&bio, file_bdev(zw->zram->bdev_file), &bv, 1, REQ_OP_READ);
 	bio.bi_iter.bi_sector = zw->entry * (PAGE_SIZE >> 9);
 	__bio_add_page(&bio, zw->page, PAGE_SIZE, 0);
 	zw->error = submit_bio_wait(&bio);
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 3b94d12f41b4..37bf29f34d26 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -132,7 +132,7 @@ struct zram {
 	spinlock_t wb_limit_lock;
 	bool wb_limit_enable;
 	u64 bd_wb_limit;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	unsigned long *bitmap;
 	unsigned long nr_pages;
 #endif

-- 
2.43.0



* [PATCH v2 13/34] bcache: port block device access to files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (11 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 12/34] zram: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01  9:45   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 14/34] block2mtd: port " Christian Brauner
                   ` (24 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/md/bcache/bcache.h |  4 +--
 drivers/md/bcache/super.c  | 74 +++++++++++++++++++++++-----------------------
 2 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 6ae2329052c9..4e6afa89921f 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -300,7 +300,7 @@ struct cached_dev {
 	struct list_head	list;
 	struct bcache_device	disk;
 	struct block_device	*bdev;
-	struct bdev_handle	*bdev_handle;
+	struct file		*bdev_file;
 
 	struct cache_sb		sb;
 	struct cache_sb_disk	*sb_disk;
@@ -423,7 +423,7 @@ struct cache {
 
 	struct kobject		kobj;
 	struct block_device	*bdev;
-	struct bdev_handle	*bdev_handle;
+	struct file		*bdev_file;
 
 	struct task_struct	*alloc_thread;
 
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index dc3f50f69714..d00b3abab133 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1369,8 +1369,8 @@ static CLOSURE_CALLBACK(cached_dev_free)
 	if (dc->sb_disk)
 		put_page(virt_to_page(dc->sb_disk));
 
-	if (dc->bdev_handle)
-		bdev_release(dc->bdev_handle);
+	if (dc->bdev_file)
+		fput(dc->bdev_file);
 
 	wake_up(&unregister_wait);
 
@@ -1445,7 +1445,7 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
 /* Cached device - bcache superblock */
 
 static int register_bdev(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
-				 struct bdev_handle *bdev_handle,
+				 struct file *bdev_file,
 				 struct cached_dev *dc)
 {
 	const char *err = "cannot allocate memory";
@@ -1453,8 +1453,8 @@ static int register_bdev(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
 	int ret = -ENOMEM;
 
 	memcpy(&dc->sb, sb, sizeof(struct cache_sb));
-	dc->bdev_handle = bdev_handle;
-	dc->bdev = bdev_handle->bdev;
+	dc->bdev_file = bdev_file;
+	dc->bdev = file_bdev(bdev_file);
 	dc->sb_disk = sb_disk;
 
 	if (cached_dev_init(dc, sb->block_size << 9))
@@ -2218,8 +2218,8 @@ void bch_cache_release(struct kobject *kobj)
 	if (ca->sb_disk)
 		put_page(virt_to_page(ca->sb_disk));
 
-	if (ca->bdev_handle)
-		bdev_release(ca->bdev_handle);
+	if (ca->bdev_file)
+		fput(ca->bdev_file);
 
 	kfree(ca);
 	module_put(THIS_MODULE);
@@ -2339,18 +2339,18 @@ static int cache_alloc(struct cache *ca)
 }
 
 static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
-				struct bdev_handle *bdev_handle,
+				struct file *bdev_file,
 				struct cache *ca)
 {
 	const char *err = NULL; /* must be set for any error case */
 	int ret = 0;
 
 	memcpy(&ca->sb, sb, sizeof(struct cache_sb));
-	ca->bdev_handle = bdev_handle;
-	ca->bdev = bdev_handle->bdev;
+	ca->bdev_file = bdev_file;
+	ca->bdev = file_bdev(bdev_file);
 	ca->sb_disk = sb_disk;
 
-	if (bdev_max_discard_sectors((bdev_handle->bdev)))
+	if (bdev_max_discard_sectors(file_bdev(bdev_file)))
 		ca->discard = CACHE_DISCARD(&ca->sb);
 
 	ret = cache_alloc(ca);
@@ -2361,20 +2361,20 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
 			err = "cache_alloc(): cache device is too small";
 		else
 			err = "cache_alloc(): unknown error";
-		pr_notice("error %pg: %s\n", bdev_handle->bdev, err);
+		pr_notice("error %pg: %s\n", file_bdev(bdev_file), err);
 		/*
 		 * If we failed here, it means ca->kobj is not initialized yet,
 		 * kobject_put() won't be called and there is no chance to
-		 * call bdev_release() to bdev in bch_cache_release(). So
-		 * we explicitly call bdev_release() here.
+		 * call fput() to bdev in bch_cache_release(). So
+		 * we explicitly call fput() on the block device here.
 		 */
-		bdev_release(bdev_handle);
+		fput(bdev_file);
 		return ret;
 	}
 
-	if (kobject_add(&ca->kobj, bdev_kobj(bdev_handle->bdev), "bcache")) {
+	if (kobject_add(&ca->kobj, bdev_kobj(file_bdev(bdev_file)), "bcache")) {
 		pr_notice("error %pg: error calling kobject_add\n",
-			  bdev_handle->bdev);
+			  file_bdev(bdev_file));
 		ret = -ENOMEM;
 		goto out;
 	}
@@ -2388,7 +2388,7 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
 		goto out;
 	}
 
-	pr_info("registered cache device %pg\n", ca->bdev_handle->bdev);
+	pr_info("registered cache device %pg\n", file_bdev(ca->bdev_file));
 
 out:
 	kobject_put(&ca->kobj);
@@ -2446,7 +2446,7 @@ struct async_reg_args {
 	char *path;
 	struct cache_sb *sb;
 	struct cache_sb_disk *sb_disk;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	void *holder;
 };
 
@@ -2457,7 +2457,7 @@ static void register_bdev_worker(struct work_struct *work)
 		container_of(work, struct async_reg_args, reg_work.work);
 
 	mutex_lock(&bch_register_lock);
-	if (register_bdev(args->sb, args->sb_disk, args->bdev_handle,
+	if (register_bdev(args->sb, args->sb_disk, args->bdev_file,
 			  args->holder) < 0)
 		fail = true;
 	mutex_unlock(&bch_register_lock);
@@ -2478,7 +2478,7 @@ static void register_cache_worker(struct work_struct *work)
 		container_of(work, struct async_reg_args, reg_work.work);
 
 	/* blkdev_put() will be called in bch_cache_release() */
-	if (register_cache(args->sb, args->sb_disk, args->bdev_handle,
+	if (register_cache(args->sb, args->sb_disk, args->bdev_file,
 			   args->holder))
 		fail = true;
 
@@ -2516,7 +2516,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 	char *path = NULL;
 	struct cache_sb *sb;
 	struct cache_sb_disk *sb_disk;
-	struct bdev_handle *bdev_handle, *bdev_handle2;
+	struct file *bdev_file, *bdev_file2;
 	void *holder = NULL;
 	ssize_t ret;
 	bool async_registration = false;
@@ -2549,15 +2549,15 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 
 	ret = -EINVAL;
 	err = "failed to open device";
-	bdev_handle = bdev_open_by_path(strim(path), BLK_OPEN_READ, NULL, NULL);
-	if (IS_ERR(bdev_handle))
+	bdev_file = bdev_file_open_by_path(strim(path), BLK_OPEN_READ, NULL, NULL);
+	if (IS_ERR(bdev_file))
 		goto out_free_sb;
 
 	err = "failed to set blocksize";
-	if (set_blocksize(bdev_handle->bdev, 4096))
+	if (set_blocksize(file_bdev(bdev_file), 4096))
 		goto out_blkdev_put;
 
-	err = read_super(sb, bdev_handle->bdev, &sb_disk);
+	err = read_super(sb, file_bdev(bdev_file), &sb_disk);
 	if (err)
 		goto out_blkdev_put;
 
@@ -2569,13 +2569,13 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 	}
 
 	/* Now reopen in exclusive mode with proper holder */
-	bdev_handle2 = bdev_open_by_dev(bdev_handle->bdev->bd_dev,
+	bdev_file2 = bdev_file_open_by_dev(file_bdev(bdev_file)->bd_dev,
 			BLK_OPEN_READ | BLK_OPEN_WRITE, holder, NULL);
-	bdev_release(bdev_handle);
-	bdev_handle = bdev_handle2;
-	if (IS_ERR(bdev_handle)) {
-		ret = PTR_ERR(bdev_handle);
-		bdev_handle = NULL;
+	fput(bdev_file);
+	bdev_file = bdev_file2;
+	if (IS_ERR(bdev_file)) {
+		ret = PTR_ERR(bdev_file);
+		bdev_file = NULL;
 		if (ret == -EBUSY) {
 			dev_t dev;
 
@@ -2610,7 +2610,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 		args->path	= path;
 		args->sb	= sb;
 		args->sb_disk	= sb_disk;
-		args->bdev_handle	= bdev_handle;
+		args->bdev_file	= bdev_file;
 		args->holder	= holder;
 		register_device_async(args);
 		/* No wait and returns to user space */
@@ -2619,14 +2619,14 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 
 	if (SB_IS_BDEV(sb)) {
 		mutex_lock(&bch_register_lock);
-		ret = register_bdev(sb, sb_disk, bdev_handle, holder);
+		ret = register_bdev(sb, sb_disk, bdev_file, holder);
 		mutex_unlock(&bch_register_lock);
 		/* blkdev_put() will be called in cached_dev_free() */
 		if (ret < 0)
 			goto out_free_sb;
 	} else {
 		/* blkdev_put() will be called in bch_cache_release() */
-		ret = register_cache(sb, sb_disk, bdev_handle, holder);
+		ret = register_cache(sb, sb_disk, bdev_file, holder);
 		if (ret)
 			goto out_free_sb;
 	}
@@ -2642,8 +2642,8 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 out_put_sb_page:
 	put_page(virt_to_page(sb_disk));
 out_blkdev_put:
-	if (bdev_handle)
-		bdev_release(bdev_handle);
+	if (bdev_file)
+		fput(bdev_file);
 out_free_sb:
 	kfree(sb);
 out_free_path:

-- 
2.43.0



* [PATCH v2 14/34] block2mtd: port device access to files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (12 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 13/34] bcache: port block device access to files Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01  9:47   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 15/34] nvme: port block device access to file Christian Brauner
                   ` (23 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
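Note for readers, not part of the commit: besides the usual handle-to-file
conversion, the diff below replaces bdev->bd_inode->i_mapping with
file->f_mapping, which the cover letter describes as an equivalent way to
reach the page cache of the block device. A small userspace sketch of
why the two spellings name the same object, assuming the opener wires
f_mapping to the device inode's mapping; all mock_* types and
mock_bdev_file_init() are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct mock_address_space {
	unsigned long nrpages;
};

struct mock_inode {
	struct mock_address_space *i_mapping;
	struct mock_address_space i_data;
};

struct mock_block_device {
	struct mock_inode *bd_inode;
};

struct mock_file {
	struct mock_address_space *f_mapping;
	struct mock_block_device *bdev;
};

/* Hypothetical open step: point the file at the device's page cache. */
static void mock_bdev_file_init(struct mock_file *file,
				struct mock_block_device *bdev,
				struct mock_inode *inode)
{
	inode->i_mapping = &inode->i_data;
	bdev->bd_inode = inode;
	file->bdev = bdev;
	file->f_mapping = inode->i_mapping;
}

/* Returns 1 when the old and the new spelling reach the same mapping. */
static int mock_same_mapping(void)
{
	struct mock_inode inode = { 0 };
	struct mock_block_device bdev = { 0 };
	struct mock_file file = { 0 };

	mock_bdev_file_init(&file, &bdev, &inode);
	/* Old: file.bdev->bd_inode->i_mapping. New: file.f_mapping. */
	return file.bdev->bd_inode->i_mapping == file.f_mapping;
}
```

With that wiring in place, callers such as _block2mtd_erase() no longer
need to touch bd_inode at all.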
 drivers/mtd/devices/block2mtd.c | 46 +++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 25 deletions(-)

diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index aa44a23ec045..97a00ec9a4d4 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -37,7 +37,7 @@
 /* Info for the block device */
 struct block2mtd_dev {
 	struct list_head list;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct mtd_info mtd;
 	struct mutex write_mutex;
 };
@@ -55,8 +55,7 @@ static struct page *page_read(struct address_space *mapping, pgoff_t index)
 /* erase a specified part of the device */
 static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len)
 {
-	struct address_space *mapping =
-				dev->bdev_handle->bdev->bd_inode->i_mapping;
+	struct address_space *mapping = dev->bdev_file->f_mapping;
 	struct page *page;
 	pgoff_t index = to >> PAGE_SHIFT;	// page index
 	int pages = len >> PAGE_SHIFT;
@@ -106,8 +105,7 @@ static int block2mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
 		size_t *retlen, u_char *buf)
 {
 	struct block2mtd_dev *dev = mtd->priv;
-	struct address_space *mapping =
-				dev->bdev_handle->bdev->bd_inode->i_mapping;
+	struct address_space *mapping = dev->bdev_file->f_mapping;
 	struct page *page;
 	pgoff_t index = from >> PAGE_SHIFT;
 	int offset = from & (PAGE_SIZE-1);
@@ -142,8 +140,7 @@ static int _block2mtd_write(struct block2mtd_dev *dev, const u_char *buf,
 		loff_t to, size_t len, size_t *retlen)
 {
 	struct page *page;
-	struct address_space *mapping =
-				dev->bdev_handle->bdev->bd_inode->i_mapping;
+	struct address_space *mapping = dev->bdev_file->f_mapping;
 	pgoff_t index = to >> PAGE_SHIFT;	// page index
 	int offset = to & ~PAGE_MASK;	// page offset
 	int cpylen;
@@ -198,7 +195,7 @@ static int block2mtd_write(struct mtd_info *mtd, loff_t to, size_t len,
 static void block2mtd_sync(struct mtd_info *mtd)
 {
 	struct block2mtd_dev *dev = mtd->priv;
-	sync_blockdev(dev->bdev_handle->bdev);
+	sync_blockdev(file_bdev(dev->bdev_file));
 	return;
 }
 
@@ -210,10 +207,9 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
 
 	kfree(dev->mtd.name);
 
-	if (dev->bdev_handle) {
-		invalidate_mapping_pages(
-			dev->bdev_handle->bdev->bd_inode->i_mapping, 0, -1);
-		bdev_release(dev->bdev_handle);
+	if (dev->bdev_file) {
+		invalidate_mapping_pages(dev->bdev_file->f_mapping, 0, -1);
+		fput(dev->bdev_file);
 	}
 
 	kfree(dev);
@@ -223,10 +219,10 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
  * This function is marked __ref because it calls the __init marked
  * early_lookup_bdev when called from the early boot code.
  */
-static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
+static struct file __ref *mdtblock_early_get_bdev(const char *devname,
 		blk_mode_t mode, int timeout, struct block2mtd_dev *dev)
 {
-	struct bdev_handle *bdev_handle = ERR_PTR(-ENODEV);
+	struct file *bdev_file = ERR_PTR(-ENODEV);
 #ifndef MODULE
 	int i;
 
@@ -234,7 +230,7 @@ static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
 	 * We can't use early_lookup_bdev from a running system.
 	 */
 	if (system_state >= SYSTEM_RUNNING)
-		return bdev_handle;
+		return bdev_file;
 
 	/*
 	 * We might not have the root device mounted at this point.
@@ -253,20 +249,20 @@ static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
 		wait_for_device_probe();
 
 		if (!early_lookup_bdev(devname, &devt)) {
-			bdev_handle = bdev_open_by_dev(devt, mode, dev, NULL);
-			if (!IS_ERR(bdev_handle))
+			bdev_file = bdev_file_open_by_dev(devt, mode, dev, NULL);
+			if (!IS_ERR(bdev_file))
 				break;
 		}
 	}
 #endif
-	return bdev_handle;
+	return bdev_file;
 }
 
 static struct block2mtd_dev *add_device(char *devname, int erase_size,
 		char *label, int timeout)
 {
 	const blk_mode_t mode = BLK_OPEN_READ | BLK_OPEN_WRITE;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bdev;
 	struct block2mtd_dev *dev;
 	char *name;
@@ -279,16 +275,16 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 		return NULL;
 
 	/* Get a handle on the device */
-	bdev_handle = bdev_open_by_path(devname, mode, dev, NULL);
-	if (IS_ERR(bdev_handle))
-		bdev_handle = mdtblock_early_get_bdev(devname, mode, timeout,
+	bdev_file = bdev_file_open_by_path(devname, mode, dev, NULL);
+	if (IS_ERR(bdev_file))
+		bdev_file = mdtblock_early_get_bdev(devname, mode, timeout,
 						      dev);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		pr_err("error: cannot open device %s\n", devname);
 		goto err_free_block2mtd;
 	}
-	dev->bdev_handle = bdev_handle;
-	bdev = bdev_handle->bdev;
+	dev->bdev_file = bdev_file;
+	bdev = file_bdev(bdev_file);
 
 	if (MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) {
 		pr_err("attempting to use an MTD device as a block device\n");

-- 
2.43.0



* [PATCH v2 15/34] nvme: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (13 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 14/34] block2mtd: port " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01  9:48   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 16/34] s390: " Christian Brauner
                   ` (22 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/nvme/target/io-cmd-bdev.c | 16 ++++++++--------
 drivers/nvme/target/nvmet.h       |  2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index f11400a908f2..6426aac2634a 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -50,10 +50,10 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
 
 void nvmet_bdev_ns_disable(struct nvmet_ns *ns)
 {
-	if (ns->bdev_handle) {
-		bdev_release(ns->bdev_handle);
+	if (ns->bdev_file) {
+		fput(ns->bdev_file);
 		ns->bdev = NULL;
-		ns->bdev_handle = NULL;
+		ns->bdev_file = NULL;
 	}
 }
 
@@ -85,18 +85,18 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 	if (ns->buffered_io)
 		return -ENOTBLK;
 
-	ns->bdev_handle = bdev_open_by_path(ns->device_path,
+	ns->bdev_file = bdev_file_open_by_path(ns->device_path,
 				BLK_OPEN_READ | BLK_OPEN_WRITE, NULL, NULL);
-	if (IS_ERR(ns->bdev_handle)) {
-		ret = PTR_ERR(ns->bdev_handle);
+	if (IS_ERR(ns->bdev_file)) {
+		ret = PTR_ERR(ns->bdev_file);
 		if (ret != -ENOTBLK) {
 			pr_err("failed to open block device %s: (%d)\n",
 					ns->device_path, ret);
 		}
-		ns->bdev_handle = NULL;
+		ns->bdev_file = NULL;
 		return ret;
 	}
-	ns->bdev = ns->bdev_handle->bdev;
+	ns->bdev = file_bdev(ns->bdev_file);
 	ns->size = bdev_nr_bytes(ns->bdev);
 	ns->blksize_shift = blksize_bits(bdev_logical_block_size(ns->bdev));
 
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 6c8acebe1a1a..33e61b4f478b 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -58,7 +58,7 @@
 
 struct nvmet_ns {
 	struct percpu_ref	ref;
-	struct bdev_handle	*bdev_handle;
+	struct file		*bdev_file;
 	struct block_device	*bdev;
 	struct file		*file;
 	bool			readonly;

-- 
2.43.0



* [PATCH v2 16/34] s390: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (14 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 15/34] nvme: port block device access to file Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:11   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 17/34] target: " Christian Brauner
                   ` (21 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/s390/block/dasd.c       | 10 +++++-----
 drivers/s390/block/dasd_genhd.c | 36 ++++++++++++++++++------------------
 drivers/s390/block/dasd_int.h   |  2 +-
 drivers/s390/block/dasd_ioctl.c |  2 +-
 4 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index 7327e81352e9..c833a7c7d7b2 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -412,7 +412,7 @@ dasd_state_ready_to_online(struct dasd_device * device)
 					KOBJ_CHANGE);
 			return 0;
 		}
-		disk_uevent(device->block->bdev_handle->bdev->bd_disk,
+		disk_uevent(file_bdev(device->block->bdev_file)->bd_disk,
 			    KOBJ_CHANGE);
 	}
 	return 0;
@@ -433,7 +433,7 @@ static int dasd_state_online_to_ready(struct dasd_device *device)
 
 	device->state = DASD_STATE_READY;
 	if (device->block && !(device->features & DASD_FEATURE_USERAW))
-		disk_uevent(device->block->bdev_handle->bdev->bd_disk,
+		disk_uevent(file_bdev(device->block->bdev_file)->bd_disk,
 			    KOBJ_CHANGE);
 	return 0;
 }
@@ -3588,7 +3588,7 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
 	 * in the other openers.
 	 */
 	if (device->block) {
-		max_count = device->block->bdev_handle ? 0 : -1;
+		max_count = device->block->bdev_file ? 0 : -1;
 		open_count = atomic_read(&device->block->open_count);
 		if (open_count > max_count) {
 			if (open_count > 0)
@@ -3634,8 +3634,8 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
 		 * so sync bdev first and then wait for our queues to become
 		 * empty
 		 */
-		if (device->block && device->block->bdev_handle)
-			bdev_mark_dead(device->block->bdev_handle->bdev, false);
+		if (device->block && device->block->bdev_file)
+			bdev_mark_dead(file_bdev(device->block->bdev_file), false);
 		dasd_schedule_device_bh(device);
 		rc = wait_event_interruptible(shutdown_waitq,
 					      _wait_for_empty_queues(device));
diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
index 55e3abe94cde..8bf2cf0ccc15 100644
--- a/drivers/s390/block/dasd_genhd.c
+++ b/drivers/s390/block/dasd_genhd.c
@@ -127,15 +127,15 @@ void dasd_gendisk_free(struct dasd_block *block)
  */
 int dasd_scan_partitions(struct dasd_block *block)
 {
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int rc;
 
-	bdev_handle = bdev_open_by_dev(disk_devt(block->gdp), BLK_OPEN_READ,
+	bdev_file = bdev_file_open_by_dev(disk_devt(block->gdp), BLK_OPEN_READ,
 				       NULL, NULL);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		DBF_DEV_EVENT(DBF_ERR, block->base,
 			      "scan partitions error, blkdev_get returned %ld",
-			      PTR_ERR(bdev_handle));
+			      PTR_ERR(bdev_file));
 		return -ENODEV;
 	}
 
@@ -147,15 +147,15 @@ int dasd_scan_partitions(struct dasd_block *block)
 				"scan partitions error, rc %d", rc);
 
 	/*
-	 * Since the matching bdev_release() call to the
-	 * bdev_open_by_path() in this function is not called before
+	 * Since the matching fput() call to the
+	 * bdev_file_open_by_dev() in this function is not called before
 	 * dasd_destroy_partitions the offline open_count limit needs to be
-	 * increased from 0 to 1. This is done by setting device->bdev_handle
+	 * increased from 0 to 1. This is done by setting device->bdev_file
 	 * (see dasd_generic_set_offline). As long as the partition detection
 	 * is running no offline should be allowed. That is why the assignment
-	 * to block->bdev_handle is done AFTER the BLKRRPART ioctl.
+	 * to block->bdev_file is done AFTER the BLKRRPART ioctl.
 	 */
-	block->bdev_handle = bdev_handle;
+	block->bdev_file = bdev_file;
 	return 0;
 }
 
@@ -165,21 +165,21 @@ int dasd_scan_partitions(struct dasd_block *block)
  */
 void dasd_destroy_partitions(struct dasd_block *block)
 {
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 
 	/*
-	 * Get the bdev_handle pointer from the device structure and clear
-	 * device->bdev_handle to lower the offline open_count limit again.
+	 * Get the bdev_file pointer from the device structure and clear
+	 * device->bdev_file to lower the offline open_count limit again.
 	 */
-	bdev_handle = block->bdev_handle;
-	block->bdev_handle = NULL;
+	bdev_file = block->bdev_file;
+	block->bdev_file = NULL;
 
-	mutex_lock(&bdev_handle->bdev->bd_disk->open_mutex);
-	bdev_disk_changed(bdev_handle->bdev->bd_disk, true);
-	mutex_unlock(&bdev_handle->bdev->bd_disk->open_mutex);
+	mutex_lock(&file_bdev(bdev_file)->bd_disk->open_mutex);
+	bdev_disk_changed(file_bdev(bdev_file)->bd_disk, true);
+	mutex_unlock(&file_bdev(bdev_file)->bd_disk->open_mutex);
 
 	/* Matching blkdev_put to the blkdev_get in dasd_scan_partitions. */
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 }
 
 int dasd_gendisk_init(void)
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index 1b1b8a41c4d4..aecd502aec51 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -650,7 +650,7 @@ struct dasd_block {
 	struct gendisk *gdp;
 	spinlock_t request_queue_lock;
 	struct blk_mq_tag_set tag_set;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	atomic_t open_count;
 
 	unsigned long blocks;	   /* size of volume in blocks */
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 61b9675e2a67..de85a5e4e21b 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -537,7 +537,7 @@ static int __dasd_ioctl_information(struct dasd_block *block,
 	 * This must be hidden from user-space.
 	 */
 	dasd_info->open_count = atomic_read(&block->open_count);
-	if (!block->bdev_handle)
+	if (!block->bdev_file)
 		dasd_info->open_count++;
 
 	/*

-- 
2.43.0
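
The open_count limit described in the dasd_scan_partitions() comment can be
modelled in userspace. This is only a minimal sketch, not the driver code:
struct model_block and model_offline_allowed() are hypothetical names, and it
assumes open_count starts at -1 and goes up by one per open, which is what the
max_count values 0 and -1 in dasd_generic_set_offline() suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two struct dasd_block fields used. */
struct model_block {
	void *bdev_file;	/* non-NULL once the partition scan has stashed its file */
	int open_count;		/* plain int in place of the driver's atomic_t */
};

/*
 * Models the check in dasd_generic_set_offline(): the internal open held
 * by the partition scan is tolerated only once bdev_file has been set.
 */
static bool model_offline_allowed(const struct model_block *block)
{
	int max_count;

	assert(block != NULL);
	max_count = block->bdev_file ? 0 : -1;
	return block->open_count <= max_count;
}
```

Because the assignment to block->bdev_file happens only after the rescan, an
offline request that races with partition detection sees open_count 0 against
a limit of -1 and is refused.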



* [PATCH v2 17/34] target: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (15 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 16/34] s390: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:12   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 18/34] bcachefs: " Christian Brauner
                   ` (20 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/target/target_core_iblock.c | 18 +++++++++---------
 drivers/target/target_core_iblock.h |  2 +-
 drivers/target/target_core_pscsi.c  | 22 +++++++++++-----------
 drivers/target/target_core_pscsi.h  |  2 +-
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 8eb9eb7ce5df..7f6ca8177845 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -91,7 +91,7 @@ static int iblock_configure_device(struct se_device *dev)
 {
 	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
 	struct request_queue *q;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bd;
 	struct blk_integrity *bi;
 	blk_mode_t mode = BLK_OPEN_READ;
@@ -117,14 +117,14 @@ static int iblock_configure_device(struct se_device *dev)
 	else
 		dev->dev_flags |= DF_READ_ONLY;
 
-	bdev_handle = bdev_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev,
+	bdev_file = bdev_file_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev,
 					NULL);
-	if (IS_ERR(bdev_handle)) {
-		ret = PTR_ERR(bdev_handle);
+	if (IS_ERR(bdev_file)) {
+		ret = PTR_ERR(bdev_file);
 		goto out_free_bioset;
 	}
-	ib_dev->ibd_bdev_handle = bdev_handle;
-	ib_dev->ibd_bd = bd = bdev_handle->bdev;
+	ib_dev->ibd_bdev_file = bdev_file;
+	ib_dev->ibd_bd = bd = file_bdev(bdev_file);
 
 	q = bdev_get_queue(bd);
 
@@ -180,7 +180,7 @@ static int iblock_configure_device(struct se_device *dev)
 	return 0;
 
 out_blkdev_put:
-	bdev_release(ib_dev->ibd_bdev_handle);
+	fput(ib_dev->ibd_bdev_file);
 out_free_bioset:
 	bioset_exit(&ib_dev->ibd_bio_set);
 out:
@@ -205,8 +205,8 @@ static void iblock_destroy_device(struct se_device *dev)
 {
 	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
 
-	if (ib_dev->ibd_bdev_handle)
-		bdev_release(ib_dev->ibd_bdev_handle);
+	if (ib_dev->ibd_bdev_file)
+		fput(ib_dev->ibd_bdev_file);
 	bioset_exit(&ib_dev->ibd_bio_set);
 }
 
diff --git a/drivers/target/target_core_iblock.h b/drivers/target/target_core_iblock.h
index 683f9a55945b..91f6f4280666 100644
--- a/drivers/target/target_core_iblock.h
+++ b/drivers/target/target_core_iblock.h
@@ -32,7 +32,7 @@ struct iblock_dev {
 	u32	ibd_flags;
 	struct bio_set	ibd_bio_set;
 	struct block_device *ibd_bd;
-	struct bdev_handle *ibd_bdev_handle;
+	struct file *ibd_bdev_file;
 	bool ibd_readonly;
 	struct iblock_dev_plug *ibd_plug;
 } ____cacheline_aligned;
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 41b7489d37ce..9aedd682d10c 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -352,7 +352,7 @@ static int pscsi_create_type_disk(struct se_device *dev, struct scsi_device *sd)
 	struct pscsi_hba_virt *phv = dev->se_hba->hba_ptr;
 	struct pscsi_dev_virt *pdv = PSCSI_DEV(dev);
 	struct Scsi_Host *sh = sd->host;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int ret;
 
 	if (scsi_device_get(sd)) {
@@ -366,18 +366,18 @@ static int pscsi_create_type_disk(struct se_device *dev, struct scsi_device *sd)
 	 * Claim exclusive struct block_device access to struct scsi_device
 	 * for TYPE_DISK and TYPE_ZBC using supplied udev_path
 	 */
-	bdev_handle = bdev_open_by_path(dev->udev_path,
+	bdev_file = bdev_file_open_by_path(dev->udev_path,
 				BLK_OPEN_WRITE | BLK_OPEN_READ, pdv, NULL);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		pr_err("pSCSI: bdev_open_by_path() failed\n");
 		scsi_device_put(sd);
-		return PTR_ERR(bdev_handle);
+		return PTR_ERR(bdev_file);
 	}
-	pdv->pdv_bdev_handle = bdev_handle;
+	pdv->pdv_bdev_file = bdev_file;
 
 	ret = pscsi_add_device_to_list(dev, sd);
 	if (ret) {
-		bdev_release(bdev_handle);
+		fput(bdev_file);
 		scsi_device_put(sd);
 		return ret;
 	}
@@ -564,9 +564,9 @@ static void pscsi_destroy_device(struct se_device *dev)
 		 * from pscsi_create_type_disk()
 		 */
 		if ((sd->type == TYPE_DISK || sd->type == TYPE_ZBC) &&
-		    pdv->pdv_bdev_handle) {
-			bdev_release(pdv->pdv_bdev_handle);
-			pdv->pdv_bdev_handle = NULL;
+		    pdv->pdv_bdev_file) {
+			fput(pdv->pdv_bdev_file);
+			pdv->pdv_bdev_file = NULL;
 		}
 		/*
 		 * For HBA mode PHV_LLD_SCSI_HOST_NO, release the reference
@@ -994,8 +994,8 @@ static sector_t pscsi_get_blocks(struct se_device *dev)
 {
 	struct pscsi_dev_virt *pdv = PSCSI_DEV(dev);
 
-	if (pdv->pdv_bdev_handle)
-		return bdev_nr_sectors(pdv->pdv_bdev_handle->bdev);
+	if (pdv->pdv_bdev_file)
+		return bdev_nr_sectors(file_bdev(pdv->pdv_bdev_file));
 	return 0;
 }
 
diff --git a/drivers/target/target_core_pscsi.h b/drivers/target/target_core_pscsi.h
index b0a3ef136592..9acaa21e4c78 100644
--- a/drivers/target/target_core_pscsi.h
+++ b/drivers/target/target_core_pscsi.h
@@ -37,7 +37,7 @@ struct pscsi_dev_virt {
 	int	pdv_channel_id;
 	int	pdv_target_id;
 	int	pdv_lun_id;
-	struct bdev_handle *pdv_bdev_handle;
+	struct file *pdv_bdev_file;
 	struct scsi_device *pdv_sd;
 	struct Scsi_Host *pdv_lld_host;
 } ____cacheline_aligned;

-- 
2.43.0



* [PATCH v2 18/34] bcachefs: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (16 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 17/34] target: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:13   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 19/34] btrfs: port " Christian Brauner
                   ` (19 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/bcachefs/super-io.c    | 20 ++++++++++----------
 fs/bcachefs/super_types.h |  2 +-
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/bcachefs/super-io.c b/fs/bcachefs/super-io.c
index d60c7d27a047..ce8cf2d91f84 100644
--- a/fs/bcachefs/super-io.c
+++ b/fs/bcachefs/super-io.c
@@ -142,8 +142,8 @@ void bch2_sb_field_delete(struct bch_sb_handle *sb,
 void bch2_free_super(struct bch_sb_handle *sb)
 {
 	kfree(sb->bio);
-	if (!IS_ERR_OR_NULL(sb->bdev_handle))
-		bdev_release(sb->bdev_handle);
+	if (!IS_ERR_OR_NULL(sb->s_bdev_file))
+		fput(sb->s_bdev_file);
 	kfree(sb->holder);
 	kfree(sb->sb_name);
 
@@ -704,22 +704,22 @@ static int __bch2_read_super(const char *path, struct bch_opts *opts,
 	if (!opt_get(*opts, nochanges))
 		sb->mode |= BLK_OPEN_WRITE;
 
-	sb->bdev_handle = bdev_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
-	if (IS_ERR(sb->bdev_handle) &&
-	    PTR_ERR(sb->bdev_handle) == -EACCES &&
+	sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
+	if (IS_ERR(sb->s_bdev_file) &&
+	    PTR_ERR(sb->s_bdev_file) == -EACCES &&
 	    opt_get(*opts, read_only)) {
 		sb->mode &= ~BLK_OPEN_WRITE;
 
-		sb->bdev_handle = bdev_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
-		if (!IS_ERR(sb->bdev_handle))
+		sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
+		if (!IS_ERR(sb->s_bdev_file))
 			opt_set(*opts, nochanges, true);
 	}
 
-	if (IS_ERR(sb->bdev_handle)) {
-		ret = PTR_ERR(sb->bdev_handle);
+	if (IS_ERR(sb->s_bdev_file)) {
+		ret = PTR_ERR(sb->s_bdev_file);
 		goto out;
 	}
-	sb->bdev = sb->bdev_handle->bdev;
+	sb->bdev = file_bdev(sb->s_bdev_file);
 
 	ret = bch2_sb_realloc(sb, 0);
 	if (ret) {
diff --git a/fs/bcachefs/super_types.h b/fs/bcachefs/super_types.h
index 0e5a14fc8e7f..ec784d975f66 100644
--- a/fs/bcachefs/super_types.h
+++ b/fs/bcachefs/super_types.h
@@ -4,7 +4,7 @@
 
 struct bch_sb_handle {
 	struct bch_sb		*sb;
-	struct bdev_handle	*bdev_handle;
+	struct file		*s_bdev_file;
 	struct block_device	*bdev;
 	char			*sb_name;
 	struct bio		*bio;

-- 
2.43.0



* [PATCH v2 19/34] btrfs: port device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (17 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 18/34] bcachefs: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:16   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 20/34] erofs: " Christian Brauner
                   ` (18 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/btrfs/dev-replace.c | 14 ++++----
 fs/btrfs/ioctl.c       | 16 ++++-----
 fs/btrfs/volumes.c     | 92 +++++++++++++++++++++++++-------------------------
 fs/btrfs/volumes.h     |  4 +--
 4 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 1502d664c892..2eb11fe4bd05 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -246,7 +246,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 	struct btrfs_device *device;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bdev;
 	u64 devid = BTRFS_DEV_REPLACE_DEVID;
 	int ret = 0;
@@ -257,13 +257,13 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 		return -EINVAL;
 	}
 
-	bdev_handle = bdev_open_by_path(device_path, BLK_OPEN_WRITE,
+	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
 					fs_info->bdev_holder, NULL);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		btrfs_err(fs_info, "target device %s is invalid!", device_path);
-		return PTR_ERR(bdev_handle);
+		return PTR_ERR(bdev_file);
 	}
-	bdev = bdev_handle->bdev;
+	bdev = file_bdev(bdev_file);
 
 	if (!btrfs_check_device_zone_type(fs_info, bdev)) {
 		btrfs_err(fs_info,
@@ -314,7 +314,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	device->commit_bytes_used = device->bytes_used;
 	device->fs_info = fs_info;
 	device->bdev = bdev;
-	device->bdev_handle = bdev_handle;
+	device->bdev_file = bdev_file;
 	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
 	set_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 	device->dev_stats_valid = 1;
@@ -335,7 +335,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	return 0;
 
 error:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 41b479861b3c..9e0b3932d90c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2691,7 +2691,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_ioctl_vol_args_v2 *vol_args;
-	struct bdev_handle *bdev_handle = NULL;
+	struct file *bdev_file = NULL;
 	int ret;
 	bool cancel = false;
 
@@ -2728,7 +2728,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 		goto err_drop;
 
 	/* Exclusive operation is now claimed */
-	ret = btrfs_rm_device(fs_info, &args, &bdev_handle);
+	ret = btrfs_rm_device(fs_info, &args, &bdev_file);
 
 	btrfs_exclop_finish(fs_info);
 
@@ -2742,8 +2742,8 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 	}
 err_drop:
 	mnt_drop_write_file(file);
-	if (bdev_handle)
-		bdev_release(bdev_handle);
+	if (bdev_file)
+		fput(bdev_file);
 out:
 	btrfs_put_dev_args_from_path(&args);
 	kfree(vol_args);
@@ -2756,7 +2756,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	struct inode *inode = file_inode(file);
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_ioctl_vol_args *vol_args;
-	struct bdev_handle *bdev_handle = NULL;
+	struct file *bdev_file = NULL;
 	int ret;
 	bool cancel = false;
 
@@ -2783,15 +2783,15 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE,
 					   cancel);
 	if (ret == 0) {
-		ret = btrfs_rm_device(fs_info, &args, &bdev_handle);
+		ret = btrfs_rm_device(fs_info, &args, &bdev_file);
 		if (!ret)
 			btrfs_info(fs_info, "disk deleted %s", vol_args->name);
 		btrfs_exclop_finish(fs_info);
 	}
 
 	mnt_drop_write_file(file);
-	if (bdev_handle)
-		bdev_release(bdev_handle);
+	if (bdev_file)
+		fput(bdev_file);
 out:
 	btrfs_put_dev_args_from_path(&args);
 	kfree(vol_args);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4c32497311d2..769a1dc4b756 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -468,39 +468,39 @@ static noinline struct btrfs_fs_devices *find_fsid(
 
 static int
 btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
-		      int flush, struct bdev_handle **bdev_handle,
+		      int flush, struct file **bdev_file,
 		      struct btrfs_super_block **disk_super)
 {
 	struct block_device *bdev;
 	int ret;
 
-	*bdev_handle = bdev_open_by_path(device_path, flags, holder, NULL);
+	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, NULL);
 
-	if (IS_ERR(*bdev_handle)) {
-		ret = PTR_ERR(*bdev_handle);
+	if (IS_ERR(*bdev_file)) {
+		ret = PTR_ERR(*bdev_file);
 		goto error;
 	}
-	bdev = (*bdev_handle)->bdev;
+	bdev = file_bdev(*bdev_file);
 
 	if (flush)
 		sync_blockdev(bdev);
 	ret = set_blocksize(bdev, BTRFS_BDEV_BLOCKSIZE);
 	if (ret) {
-		bdev_release(*bdev_handle);
+		fput(*bdev_file);
 		goto error;
 	}
 	invalidate_bdev(bdev);
 	*disk_super = btrfs_read_dev_super(bdev);
 	if (IS_ERR(*disk_super)) {
 		ret = PTR_ERR(*disk_super);
-		bdev_release(*bdev_handle);
+		fput(*bdev_file);
 		goto error;
 	}
 
 	return 0;
 
 error:
-	*bdev_handle = NULL;
+	*bdev_file = NULL;
 	return ret;
 }
 
@@ -643,7 +643,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 			struct btrfs_device *device, blk_mode_t flags,
 			void *holder)
 {
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct btrfs_super_block *disk_super;
 	u64 devid;
 	int ret;
@@ -654,7 +654,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 		return -EINVAL;
 
 	ret = btrfs_get_bdev_and_sb(device->name->str, flags, holder, 1,
-				    &bdev_handle, &disk_super);
+				    &bdev_file, &disk_super);
 	if (ret)
 		return ret;
 
@@ -678,20 +678,20 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 		clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
 		fs_devices->seeding = true;
 	} else {
-		if (bdev_read_only(bdev_handle->bdev))
+		if (bdev_read_only(file_bdev(bdev_file)))
 			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
 		else
 			set_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
 	}
 
-	if (!bdev_nonrot(bdev_handle->bdev))
+	if (!bdev_nonrot(file_bdev(bdev_file)))
 		fs_devices->rotating = true;
 
-	if (bdev_max_discard_sectors(bdev_handle->bdev))
+	if (bdev_max_discard_sectors(file_bdev(bdev_file)))
 		fs_devices->discardable = true;
 
-	device->bdev_handle = bdev_handle;
-	device->bdev = bdev_handle->bdev;
+	device->bdev_file = bdev_file;
+	device->bdev = file_bdev(bdev_file);
 	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
 
 	fs_devices->open_devices++;
@@ -706,7 +706,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 
 error_free_page:
 	btrfs_release_disk_super(disk_super);
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 
 	return -EINVAL;
 }
@@ -1015,10 +1015,10 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
 		if (device->devid == BTRFS_DEV_REPLACE_DEVID)
 			continue;
 
-		if (device->bdev_handle) {
-			bdev_release(device->bdev_handle);
+		if (device->bdev_file) {
+			fput(device->bdev_file);
 			device->bdev = NULL;
-			device->bdev_handle = NULL;
+			device->bdev_file = NULL;
 			fs_devices->open_devices--;
 		}
 		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
@@ -1063,7 +1063,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
 		invalidate_bdev(device->bdev);
 	}
 
-	bdev_release(device->bdev_handle);
+	fput(device->bdev_file);
 }
 
 static void btrfs_close_one_device(struct btrfs_device *device)
@@ -1316,7 +1316,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	struct btrfs_super_block *disk_super;
 	bool new_device_added = false;
 	struct btrfs_device *device = NULL;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	u64 bytenr, bytenr_orig;
 	int ret;
 
@@ -1339,18 +1339,18 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	 * values temporarily, as the device paths of the fsid are the only
 	 * required information for assembling the volume.
 	 */
-	bdev_handle = bdev_open_by_path(path, flags, NULL, NULL);
-	if (IS_ERR(bdev_handle))
-		return ERR_CAST(bdev_handle);
+	bdev_file = bdev_file_open_by_path(path, flags, NULL, NULL);
+	if (IS_ERR(bdev_file))
+		return ERR_CAST(bdev_file);
 
 	bytenr_orig = btrfs_sb_offset(0);
-	ret = btrfs_sb_log_location_bdev(bdev_handle->bdev, 0, READ, &bytenr);
+	ret = btrfs_sb_log_location_bdev(file_bdev(bdev_file), 0, READ, &bytenr);
 	if (ret) {
 		device = ERR_PTR(ret);
 		goto error_bdev_put;
 	}
 
-	disk_super = btrfs_read_disk_super(bdev_handle->bdev, bytenr,
+	disk_super = btrfs_read_disk_super(file_bdev(bdev_file), bytenr,
 					   bytenr_orig);
 	if (IS_ERR(disk_super)) {
 		device = ERR_CAST(disk_super);
@@ -1381,7 +1381,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	btrfs_release_disk_super(disk_super);
 
 error_bdev_put:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 
 	return device;
 }
@@ -2057,7 +2057,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info,
 
 int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 		    struct btrfs_dev_lookup_args *args,
-		    struct bdev_handle **bdev_handle)
+		    struct file **bdev_file)
 {
 	struct btrfs_trans_handle *trans;
 	struct btrfs_device *device;
@@ -2166,7 +2166,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 
 	btrfs_assign_next_active_device(device, NULL);
 
-	if (device->bdev_handle) {
+	if (device->bdev_file) {
 		cur_devices->open_devices--;
 		/* remove sysfs entry */
 		btrfs_sysfs_remove_device(device);
@@ -2182,9 +2182,9 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 	 * free the device.
 	 *
 	 * We cannot call btrfs_close_bdev() here because we're holding the sb
-	 * write lock, and bdev_release() will pull in the ->open_mutex on
-	 * the block device and it's dependencies.  Instead just flush the
-	 * device and let the caller do the final bdev_release.
+	 * write lock, and fput() on the block device will pull in the
+	 * ->open_mutex on the block device and its dependencies.  Instead
+	 * just flush the device and let the caller do the final fput().
 	 */
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		btrfs_scratch_superblocks(fs_info, device->bdev,
@@ -2195,7 +2195,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 		}
 	}
 
-	*bdev_handle = device->bdev_handle;
+	*bdev_file = device->bdev_file;
 	synchronize_rcu();
 	btrfs_free_device(device);
 
@@ -2332,7 +2332,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
 				 const char *path)
 {
 	struct btrfs_super_block *disk_super;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int ret;
 
 	if (!path || !path[0])
@@ -2350,7 +2350,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
 	}
 
 	ret = btrfs_get_bdev_and_sb(path, BLK_OPEN_READ, NULL, 0,
-				    &bdev_handle, &disk_super);
+				    &bdev_file, &disk_super);
 	if (ret) {
 		btrfs_put_dev_args_from_path(args);
 		return ret;
@@ -2363,7 +2363,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
 	else
 		memcpy(args->fsid, disk_super->fsid, BTRFS_FSID_SIZE);
 	btrfs_release_disk_super(disk_super);
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	return 0;
 }
 
@@ -2583,7 +2583,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	struct btrfs_root *root = fs_info->dev_root;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_device *device;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct super_block *sb = fs_info->sb;
 	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 	struct btrfs_fs_devices *seed_devices = NULL;
@@ -2596,12 +2596,12 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	if (sb_rdonly(sb) && !fs_devices->seeding)
 		return -EROFS;
 
-	bdev_handle = bdev_open_by_path(device_path, BLK_OPEN_WRITE,
+	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
 					fs_info->bdev_holder, NULL);
-	if (IS_ERR(bdev_handle))
-		return PTR_ERR(bdev_handle);
+	if (IS_ERR(bdev_file))
+		return PTR_ERR(bdev_file);
 
-	if (!btrfs_check_device_zone_type(fs_info, bdev_handle->bdev)) {
+	if (!btrfs_check_device_zone_type(fs_info, file_bdev(bdev_file))) {
 		ret = -EINVAL;
 		goto error;
 	}
@@ -2613,11 +2613,11 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		locked = true;
 	}
 
-	sync_blockdev(bdev_handle->bdev);
+	sync_blockdev(file_bdev(bdev_file));
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
-		if (device->bdev == bdev_handle->bdev) {
+		if (device->bdev == file_bdev(bdev_file)) {
 			ret = -EEXIST;
 			rcu_read_unlock();
 			goto error;
@@ -2633,8 +2633,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	}
 
 	device->fs_info = fs_info;
-	device->bdev_handle = bdev_handle;
-	device->bdev = bdev_handle->bdev;
+	device->bdev_file = bdev_file;
+	device->bdev = file_bdev(bdev_file);
 	ret = lookup_bdev(device_path, &device->devt);
 	if (ret)
 		goto error_free_device;
@@ -2817,7 +2817,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 error_free_device:
 	btrfs_free_device(device);
 error:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	if (locked) {
 		mutex_unlock(&uuid_mutex);
 		up_write(&sb->s_umount);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 53f87f398da7..a11854912d53 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -90,7 +90,7 @@ struct btrfs_device {
 
 	u64 generation;
 
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bdev;
 
 	struct btrfs_zoned_device_info *zone_info;
@@ -661,7 +661,7 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
 void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args);
 int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 		    struct btrfs_dev_lookup_args *args,
-		    struct bdev_handle **bdev_handle);
+		    struct file **bdev_file);
 void __exit btrfs_cleanup_fs_uuids(void);
 int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len);
 int btrfs_grow_device(struct btrfs_trans_handle *trans,

-- 
2.43.0
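
The deferred release that the btrfs_rm_device() comment describes (hand the
file back through an out-parameter and let the ioctl caller drop it after
mnt_drop_write_file()) can be sketched in userspace. Every name below is a
hypothetical stand-in rather than a kernel API, and the "write access held"
flag is a simplification of the real locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_file { int refs; };
struct model_device { struct model_file *bdev_file; };

static bool model_write_held;	/* models the sb write access taken by the ioctl */

/* Stand-in for fput(): must not run while write access is still held. */
static void model_fput(struct model_file *f)
{
	assert(!model_write_held);
	f->refs--;
}

/* Models btrfs_rm_device(): detach the file and hand it to the caller. */
static int model_rm_device(struct model_device *dev, struct model_file **out)
{
	*out = dev->bdev_file;	/* transfer ownership instead of releasing here */
	dev->bdev_file = NULL;
	return 0;
}

/* Models the ioctl path: release only after write access has been dropped. */
static int model_ioctl_rm_dev(struct model_device *dev)
{
	struct model_file *bdev_file = NULL;
	int ret;

	model_write_held = true;	/* models mnt_want_write_file() */
	ret = model_rm_device(dev, &bdev_file);
	model_write_held = false;	/* models mnt_drop_write_file() */

	if (bdev_file)
		model_fput(bdev_file);
	return ret;
}
```

The ordering in model_ioctl_rm_dev() mirrors the err_drop path in the diff:
write access is dropped first and the final reference is put afterwards.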



* [PATCH v2 20/34] erofs: port device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (18 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 19/34] btrfs: port " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:16   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 21/34] ext4: port block " Christian Brauner
                   ` (17 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/erofs/data.c     |  6 +++---
 fs/erofs/internal.h |  2 +-
 fs/erofs/super.c    | 16 ++++++++--------
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index c98aeda8abb2..433fc39ba423 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -220,7 +220,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 			up_read(&devs->rwsem);
 			return 0;
 		}
-		map->m_bdev = dif->bdev_handle ? dif->bdev_handle->bdev : NULL;
+		map->m_bdev = dif->bdev_file ? file_bdev(dif->bdev_file) : NULL;
 		map->m_daxdev = dif->dax_dev;
 		map->m_dax_part_off = dif->dax_part_off;
 		map->m_fscache = dif->fscache;
@@ -238,8 +238,8 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 			if (map->m_pa >= startoff &&
 			    map->m_pa < startoff + length) {
 				map->m_pa -= startoff;
-				map->m_bdev = dif->bdev_handle ?
-					      dif->bdev_handle->bdev : NULL;
+				map->m_bdev = dif->bdev_file ?
+					      file_bdev(dif->bdev_file) : NULL;
 				map->m_daxdev = dif->dax_dev;
 				map->m_dax_part_off = dif->dax_part_off;
 				map->m_fscache = dif->fscache;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index b0409badb017..0f0706325b7b 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -49,7 +49,7 @@ typedef u32 erofs_blk_t;
 struct erofs_device_info {
 	char *path;
 	struct erofs_fscache *fscache;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct dax_device *dax_dev;
 	u64 dax_part_off;
 
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 5f60f163bd56..9b4b66dcdd4f 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -177,7 +177,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 	struct erofs_fscache *fscache;
 	struct erofs_deviceslot *dis;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	void *ptr;
 
 	ptr = erofs_read_metabuf(buf, sb, erofs_blknr(sb, *pos), EROFS_KMAP);
@@ -201,12 +201,12 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 			return PTR_ERR(fscache);
 		dif->fscache = fscache;
 	} else if (!sbi->devs->flatdev) {
-		bdev_handle = bdev_open_by_path(dif->path, BLK_OPEN_READ,
+		bdev_file = bdev_file_open_by_path(dif->path, BLK_OPEN_READ,
 						sb->s_type, NULL);
-		if (IS_ERR(bdev_handle))
-			return PTR_ERR(bdev_handle);
-		dif->bdev_handle = bdev_handle;
-		dif->dax_dev = fs_dax_get_by_bdev(bdev_handle->bdev,
+		if (IS_ERR(bdev_file))
+			return PTR_ERR(bdev_file);
+		dif->bdev_file = bdev_file;
+		dif->dax_dev = fs_dax_get_by_bdev(file_bdev(bdev_file),
 				&dif->dax_part_off, NULL, NULL);
 	}
 
@@ -754,8 +754,8 @@ static int erofs_release_device_info(int id, void *ptr, void *data)
 	struct erofs_device_info *dif = ptr;
 
 	fs_put_dax(dif->dax_dev, NULL);
-	if (dif->bdev_handle)
-		bdev_release(dif->bdev_handle);
+	if (dif->bdev_file)
+		fput(dif->bdev_file);
 	erofs_fscache_unregister_cookie(dif->fscache);
 	dif->fscache = NULL;
 	kfree(dif->path);

-- 
2.43.0



* [PATCH v2 21/34] ext4: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (19 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 20/34] erofs: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:18   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 22/34] f2fs: port block device access to files Christian Brauner
                   ` (16 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
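Note on the error paths touched here: ext4_get_journal_blkdev() keeps its
shape, returning either a usable pointer or an error pointer, and dropping
the reference with fput() where it used to call bdev_release(). The
following is a rough userspace model of that shape under stated
assumptions: ERR_PTR/IS_ERR/PTR_ERR are re-created locally, and
mock_open_by_dev(), mock_fput(), open_files and get_journal_file() are
hypothetical names invented for the sketch. The error codes are
illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file {
	int logical_block_size;	/* mock: property of the opened device */
};

static int open_files;	/* files opened and not yet put */

/* Mock open: a negative device number stands for "no such device". */
static inline struct file *mock_open_by_dev(int dev, int logical_block_size)
{
	struct file *f;

	if (dev < 0)
		return ERR_PTR(-ENODEV);
	f = malloc(sizeof(*f));
	if (!f)
		return ERR_PTR(-ENOMEM);
	f->logical_block_size = logical_block_size;
	open_files++;
	return f;
}

static inline void mock_fput(struct file *f)
{
	open_files--;
	free(f);
}

/* Open, validate, and release on failure before returning an error. */
struct file *get_journal_file(int dev, int hblock, int blocksize)
{
	struct file *bdev_file = mock_open_by_dev(dev, hblock);

	if (IS_ERR(bdev_file))
		return bdev_file;	/* nothing to release yet */

	if (blocksize < bdev_file->logical_block_size) {
		mock_fput(bdev_file);	/* plays the role of out_bdev */
		return ERR_PTR(-EINVAL);
	}
	return bdev_file;
}
```

A caller checks IS_ERR() on the result and otherwise owns one reference,
which it eventually drops with a single put.
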
 fs/ext4/ext4.h  |  2 +-
 fs/ext4/fsmap.c |  8 ++++----
 fs/ext4/super.c | 52 ++++++++++++++++++++++++++--------------------------
 3 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index a5d784872303..dcdad5da419e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1548,7 +1548,7 @@ struct ext4_sb_info {
 	unsigned long s_commit_interval;
 	u32 s_max_batch_time;
 	u32 s_min_batch_time;
-	struct bdev_handle *s_journal_bdev_handle;
+	struct file *s_journal_bdev_file;
 #ifdef CONFIG_QUOTA
 	/* Names of quota files with journalled quota */
 	char __rcu *s_qf_names[EXT4_MAXQUOTAS];
diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
index 11e6f33677a2..df853c4d3a8c 100644
--- a/fs/ext4/fsmap.c
+++ b/fs/ext4/fsmap.c
@@ -576,9 +576,9 @@ static bool ext4_getfsmap_is_valid_device(struct super_block *sb,
 	if (fm->fmr_device == 0 || fm->fmr_device == UINT_MAX ||
 	    fm->fmr_device == new_encode_dev(sb->s_bdev->bd_dev))
 		return true;
-	if (EXT4_SB(sb)->s_journal_bdev_handle &&
+	if (EXT4_SB(sb)->s_journal_bdev_file &&
 	    fm->fmr_device ==
-	    new_encode_dev(EXT4_SB(sb)->s_journal_bdev_handle->bdev->bd_dev))
+	    new_encode_dev(file_bdev(EXT4_SB(sb)->s_journal_bdev_file)->bd_dev))
 		return true;
 	return false;
 }
@@ -648,9 +648,9 @@ int ext4_getfsmap(struct super_block *sb, struct ext4_fsmap_head *head,
 	memset(handlers, 0, sizeof(handlers));
 	handlers[0].gfd_dev = new_encode_dev(sb->s_bdev->bd_dev);
 	handlers[0].gfd_fn = ext4_getfsmap_datadev;
-	if (EXT4_SB(sb)->s_journal_bdev_handle) {
+	if (EXT4_SB(sb)->s_journal_bdev_file) {
 		handlers[1].gfd_dev = new_encode_dev(
-			EXT4_SB(sb)->s_journal_bdev_handle->bdev->bd_dev);
+			file_bdev(EXT4_SB(sb)->s_journal_bdev_file)->bd_dev);
 		handlers[1].gfd_fn = ext4_getfsmap_logdev;
 	}
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dcba0f85dfe2..aa007710cfc3 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1359,14 +1359,14 @@ static void ext4_put_super(struct super_block *sb)
 
 	sync_blockdev(sb->s_bdev);
 	invalidate_bdev(sb->s_bdev);
-	if (sbi->s_journal_bdev_handle) {
+	if (sbi->s_journal_bdev_file) {
 		/*
 		 * Invalidate the journal device's buffers.  We don't want them
 		 * floating about in memory - the physical journal device may
 		 * hotswapped, and it breaks the `ro-after' testing code.
 		 */
-		sync_blockdev(sbi->s_journal_bdev_handle->bdev);
-		invalidate_bdev(sbi->s_journal_bdev_handle->bdev);
+		sync_blockdev(file_bdev(sbi->s_journal_bdev_file));
+		invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
 	}
 
 	ext4_xattr_destroy_cache(sbi->s_ea_inode_cache);
@@ -4233,7 +4233,7 @@ int ext4_calculate_overhead(struct super_block *sb)
 	 * Add the internal journal blocks whether the journal has been
 	 * loaded or not
 	 */
-	if (sbi->s_journal && !sbi->s_journal_bdev_handle)
+	if (sbi->s_journal && !sbi->s_journal_bdev_file)
 		overhead += EXT4_NUM_B2C(sbi, sbi->s_journal->j_total_len);
 	else if (ext4_has_feature_journal(sb) && !sbi->s_journal && j_inum) {
 		/* j_inum for internal journal is non-zero */
@@ -5670,9 +5670,9 @@ failed_mount9: __maybe_unused
 #endif
 	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
 	brelse(sbi->s_sbh);
-	if (sbi->s_journal_bdev_handle) {
-		invalidate_bdev(sbi->s_journal_bdev_handle->bdev);
-		bdev_release(sbi->s_journal_bdev_handle);
+	if (sbi->s_journal_bdev_file) {
+		invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
+		fput(sbi->s_journal_bdev_file);
 	}
 out_fail:
 	invalidate_bdev(sb->s_bdev);
@@ -5842,30 +5842,30 @@ static journal_t *ext4_open_inode_journal(struct super_block *sb,
 	return journal;
 }
 
-static struct bdev_handle *ext4_get_journal_blkdev(struct super_block *sb,
+static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 					dev_t j_dev, ext4_fsblk_t *j_start,
 					ext4_fsblk_t *j_len)
 {
 	struct buffer_head *bh;
 	struct block_device *bdev;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int hblock, blocksize;
 	ext4_fsblk_t sb_block;
 	unsigned long offset;
 	struct ext4_super_block *es;
 	int errno;
 
-	bdev_handle = bdev_open_by_dev(j_dev,
+	bdev_file = bdev_file_open_by_dev(j_dev,
 		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
 		sb, &fs_holder_ops);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		ext4_msg(sb, KERN_ERR,
 			 "failed to open journal device unknown-block(%u,%u) %ld",
-			 MAJOR(j_dev), MINOR(j_dev), PTR_ERR(bdev_handle));
-		return bdev_handle;
+			 MAJOR(j_dev), MINOR(j_dev), PTR_ERR(bdev_file));
+		return bdev_file;
 	}
 
-	bdev = bdev_handle->bdev;
+	bdev = file_bdev(bdev_file);
 	blocksize = sb->s_blocksize;
 	hblock = bdev_logical_block_size(bdev);
 	if (blocksize < hblock) {
@@ -5912,12 +5912,12 @@ static struct bdev_handle *ext4_get_journal_blkdev(struct super_block *sb,
 	*j_start = sb_block + 1;
 	*j_len = ext4_blocks_count(es);
 	brelse(bh);
-	return bdev_handle;
+	return bdev_file;
 
 out_bh:
 	brelse(bh);
 out_bdev:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	return ERR_PTR(errno);
 }
 
@@ -5927,14 +5927,14 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 	journal_t *journal;
 	ext4_fsblk_t j_start;
 	ext4_fsblk_t j_len;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int errno = 0;
 
-	bdev_handle = ext4_get_journal_blkdev(sb, j_dev, &j_start, &j_len);
-	if (IS_ERR(bdev_handle))
-		return ERR_CAST(bdev_handle);
+	bdev_file = ext4_get_journal_blkdev(sb, j_dev, &j_start, &j_len);
+	if (IS_ERR(bdev_file))
+		return ERR_CAST(bdev_file);
 
-	journal = jbd2_journal_init_dev(bdev_handle->bdev, sb->s_bdev, j_start,
+	journal = jbd2_journal_init_dev(file_bdev(bdev_file), sb->s_bdev, j_start,
 					j_len, sb->s_blocksize);
 	if (IS_ERR(journal)) {
 		ext4_msg(sb, KERN_ERR, "failed to create device journal");
@@ -5949,14 +5949,14 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 		goto out_journal;
 	}
 	journal->j_private = sb;
-	EXT4_SB(sb)->s_journal_bdev_handle = bdev_handle;
+	EXT4_SB(sb)->s_journal_bdev_file = bdev_file;
 	ext4_init_journal_params(sb, journal);
 	return journal;
 
 out_journal:
 	jbd2_journal_destroy(journal);
 out_bdev:
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 	return ERR_PTR(errno);
 }
 
@@ -7314,12 +7314,12 @@ static inline int ext3_feature_set_ok(struct super_block *sb)
 static void ext4_kill_sb(struct super_block *sb)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
-	struct bdev_handle *handle = sbi ? sbi->s_journal_bdev_handle : NULL;
+	struct file *bdev_file = sbi ? sbi->s_journal_bdev_file : NULL;
 
 	kill_block_super(sb);
 
-	if (handle)
-		bdev_release(handle);
+	if (bdev_file)
+		fput(bdev_file);
 }
 
 static struct file_system_type ext4_fs_type = {

-- 
2.43.0



* [PATCH v2 22/34] f2fs: port block device access to files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (20 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 21/34] ext4: port block " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:19   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 23/34] jfs: port block device access to file Christian Brauner
                   ` (15 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
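Note on ownership in this patch: slot 0 of the device list reuses the
superblock's file (sb->s_bdev_file) while every further slot opens its own,
which is why destroy_device_list() only puts slots above 0. A small
userspace sketch of that rule follows. It is a model under assumptions, not
f2fs code: sbi_model, mock_open(), mock_fput() and the refcount field are
hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 4

struct file {
	int refcount;	/* mock stand-in for the real reference count */
};

/* Mock open: hands out a reference the caller must later drop. */
static inline struct file *mock_open(struct file *f)
{
	f->refcount++;
	return f;
}

static inline void mock_fput(struct file *f)
{
	f->refcount--;
}

struct sbi_model {
	struct file *s_bdev_file;	/* owned by the superblock */
	struct file *dev_file[MAX_DEVICES];
	int s_ndevs;
};

/* Slot 0 borrows the superblock's file; other slots open their own. */
void scan_devices(struct sbi_model *sbi, struct file **extra, int nr_extra)
{
	int i;

	sbi->dev_file[0] = sbi->s_bdev_file;
	sbi->s_ndevs = 1;
	for (i = 0; i < nr_extra && sbi->s_ndevs < MAX_DEVICES; i++)
		sbi->dev_file[sbi->s_ndevs++] = mock_open(extra[i]);
}

/* Only slots above 0 hold a reference of their own to drop. */
void destroy_device_list(struct sbi_model *sbi)
{
	int i;

	for (i = 0; i < sbi->s_ndevs; i++) {
		if (i > 0)
			mock_fput(sbi->dev_file[i]);
		sbi->dev_file[i] = NULL;
	}
	sbi->s_ndevs = 0;
}
```

In the model, the primary file's count is untouched by scan and destroy,
while each extra device goes up by one and back down by one.
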
 fs/f2fs/f2fs.h  |  2 +-
 fs/f2fs/super.c | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 65294e3b0bef..6fc172c99915 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1239,7 +1239,7 @@ struct f2fs_bio_info {
 #define FDEV(i)				(sbi->devs[i])
 #define RDEV(i)				(raw_super->devs[i])
 struct f2fs_dev_info {
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct block_device *bdev;
 	char path[MAX_PATH_LEN];
 	unsigned int total_segments;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index ea94c148fee5..557ea5c6c926 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1605,7 +1605,7 @@ static void destroy_device_list(struct f2fs_sb_info *sbi)
 
 	for (i = 0; i < sbi->s_ndevs; i++) {
 		if (i > 0)
-			bdev_release(FDEV(i).bdev_handle);
+			fput(FDEV(i).bdev_file);
 #ifdef CONFIG_BLK_DEV_ZONED
 		kvfree(FDEV(i).blkz_seq);
 #endif
@@ -4247,7 +4247,7 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 
 	for (i = 0; i < max_devices; i++) {
 		if (i == 0)
-			FDEV(0).bdev_handle = sb_bdev_handle(sbi->sb);
+			FDEV(0).bdev_file = sbi->sb->s_bdev_file;
 		else if (!RDEV(i).path[0])
 			break;
 
@@ -4267,14 +4267,14 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 				FDEV(i).end_blk = FDEV(i).start_blk +
 					(FDEV(i).total_segments <<
 					sbi->log_blocks_per_seg) - 1;
-				FDEV(i).bdev_handle = bdev_open_by_path(
+				FDEV(i).bdev_file = bdev_file_open_by_path(
 					FDEV(i).path, mode, sbi->sb, NULL);
 			}
 		}
-		if (IS_ERR(FDEV(i).bdev_handle))
-			return PTR_ERR(FDEV(i).bdev_handle);
+		if (IS_ERR(FDEV(i).bdev_file))
+			return PTR_ERR(FDEV(i).bdev_file);
 
-		FDEV(i).bdev = FDEV(i).bdev_handle->bdev;
+		FDEV(i).bdev = file_bdev(FDEV(i).bdev_file);
 		/* to release errored devices */
 		sbi->s_ndevs = i + 1;
 

-- 
2.43.0



* [PATCH v2 23/34] jfs: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (21 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 22/34] f2fs: port block device access to files Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:19   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 24/34] nfs: port block device access to files Christian Brauner
                   ` (14 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/jfs/jfs_logmgr.c | 26 +++++++++++++-------------
 fs/jfs/jfs_logmgr.h |  2 +-
 fs/jfs/jfs_mount.c  |  2 +-
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 8691463956d1..73389c68e251 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -1058,7 +1058,7 @@ void jfs_syncpt(struct jfs_log *log, int hard_sync)
 int lmLogOpen(struct super_block *sb)
 {
 	int rc;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	struct jfs_log *log;
 	struct jfs_sb_info *sbi = JFS_SBI(sb);
 
@@ -1070,7 +1070,7 @@ int lmLogOpen(struct super_block *sb)
 
 	mutex_lock(&jfs_log_mutex);
 	list_for_each_entry(log, &jfs_external_logs, journal_list) {
-		if (log->bdev_handle->bdev->bd_dev == sbi->logdev) {
+		if (file_bdev(log->bdev_file)->bd_dev == sbi->logdev) {
 			if (!uuid_equal(&log->uuid, &sbi->loguuid)) {
 				jfs_warn("wrong uuid on JFS journal");
 				mutex_unlock(&jfs_log_mutex);
@@ -1100,14 +1100,14 @@ int lmLogOpen(struct super_block *sb)
 	 * file systems to log may have n-to-1 relationship;
 	 */
 
-	bdev_handle = bdev_open_by_dev(sbi->logdev,
+	bdev_file = bdev_file_open_by_dev(sbi->logdev,
 			BLK_OPEN_READ | BLK_OPEN_WRITE, log, NULL);
-	if (IS_ERR(bdev_handle)) {
-		rc = PTR_ERR(bdev_handle);
+	if (IS_ERR(bdev_file)) {
+		rc = PTR_ERR(bdev_file);
 		goto free;
 	}
 
-	log->bdev_handle = bdev_handle;
+	log->bdev_file = bdev_file;
 	uuid_copy(&log->uuid, &sbi->loguuid);
 
 	/*
@@ -1141,7 +1141,7 @@ int lmLogOpen(struct super_block *sb)
 	lbmLogShutdown(log);
 
       close:		/* close external log device */
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 
       free:		/* free log descriptor */
 	mutex_unlock(&jfs_log_mutex);
@@ -1162,7 +1162,7 @@ static int open_inline_log(struct super_block *sb)
 	init_waitqueue_head(&log->syncwait);
 
 	set_bit(log_INLINELOG, &log->flag);
-	log->bdev_handle = sb_bdev_handle(sb);
+	log->bdev_file = sb->s_bdev_file;
 	log->base = addressPXD(&JFS_SBI(sb)->logpxd);
 	log->size = lengthPXD(&JFS_SBI(sb)->logpxd) >>
 	    (L2LOGPSIZE - sb->s_blocksize_bits);
@@ -1436,7 +1436,7 @@ int lmLogClose(struct super_block *sb)
 {
 	struct jfs_sb_info *sbi = JFS_SBI(sb);
 	struct jfs_log *log = sbi->log;
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	int rc = 0;
 
 	jfs_info("lmLogClose: log:0x%p", log);
@@ -1482,10 +1482,10 @@ int lmLogClose(struct super_block *sb)
 	 *	external log as separate logical volume
 	 */
 	list_del(&log->journal_list);
-	bdev_handle = log->bdev_handle;
+	bdev_file = log->bdev_file;
 	rc = lmLogShutdown(log);
 
-	bdev_release(bdev_handle);
+	fput(bdev_file);
 
 	kfree(log);
 
@@ -1972,7 +1972,7 @@ static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp)
 
 	bp->l_flag |= lbmREAD;
 
-	bio = bio_alloc(log->bdev_handle->bdev, 1, REQ_OP_READ, GFP_NOFS);
+	bio = bio_alloc(file_bdev(log->bdev_file), 1, REQ_OP_READ, GFP_NOFS);
 	bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9);
 	__bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset);
 	BUG_ON(bio->bi_iter.bi_size != LOGPSIZE);
@@ -2115,7 +2115,7 @@ static void lbmStartIO(struct lbuf * bp)
 	jfs_info("lbmStartIO");
 
 	if (!log->no_integrity)
-		bdev = log->bdev_handle->bdev;
+		bdev = file_bdev(log->bdev_file);
 
 	bio = bio_alloc(bdev, 1, REQ_OP_WRITE | REQ_SYNC,
 			GFP_NOFS);
diff --git a/fs/jfs/jfs_logmgr.h b/fs/jfs/jfs_logmgr.h
index 84aa2d253907..8b8994e48cd0 100644
--- a/fs/jfs/jfs_logmgr.h
+++ b/fs/jfs/jfs_logmgr.h
@@ -356,7 +356,7 @@ struct jfs_log {
 				 *    before writing syncpt.
 				 */
 	struct list_head journal_list; /* Global list */
-	struct bdev_handle *bdev_handle; /* 4: log lv pointer */
+	struct file *bdev_file;	/* 4: log lv pointer */
 	int serial;		/* 4: log mount serial number */
 
 	s64 base;		/* @8: log extent address (inline log ) */
diff --git a/fs/jfs/jfs_mount.c b/fs/jfs/jfs_mount.c
index 9b5c6a20b30c..98f9a432c336 100644
--- a/fs/jfs/jfs_mount.c
+++ b/fs/jfs/jfs_mount.c
@@ -431,7 +431,7 @@ int updateSuper(struct super_block *sb, uint state)
 	if (state == FM_MOUNT) {
 		/* record log's dev_t and mount serial number */
 		j_sb->s_logdev = cpu_to_le32(
-			new_encode_dev(sbi->log->bdev_handle->bdev->bd_dev));
+			new_encode_dev(file_bdev(sbi->log->bdev_file)->bd_dev));
 		j_sb->s_logserial = cpu_to_le32(sbi->log->serial);
 	} else if (state == FM_CLEAN) {
 		/*

-- 
2.43.0



* [PATCH v2 24/34] nfs: port block device access to files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (22 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 23/34] jfs: port block device access to file Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:22   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 25/34] ocfs2: port block device access to file Christian Brauner
                   ` (13 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
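Note on bl_parse_scsi(): the lookup order is unchanged by the conversion.
The dm-multipath by-id name is tried first and the wwn name is the
fallback, with the error pointer from the last attempt returned if both
fail. A userspace sketch of that fallback, assuming mocked helpers:
mock_open_by_path(), present_path, the_device, open_path() and open_scsi()
are hypothetical names for this example, the error-pointer helpers are
re-created locally, and the id is passed as a ready-made hex string rather
than formatted from a designator.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file {
	int unused;	/* mock: contents do not matter here */
};

static const char *present_path;	/* the one path the mock can open */
static struct file the_device;

static inline struct file *mock_open_by_path(const char *path)
{
	if (present_path && strcmp(path, present_path) == 0)
		return &the_device;
	return ERR_PTR(-ENOENT);
}

/* Build the by-id name from prefix and id, then try to open it. */
struct file *open_path(const char *id, const char *prefix)
{
	char devname[128];
	int n = snprintf(devname, sizeof(devname), "/dev/disk/by-id/%s%s",
			 prefix, id);

	if (n < 0 || (size_t)n >= sizeof(devname))
		return ERR_PTR(-ENAMETOOLONG);
	return mock_open_by_path(devname);
}

/* Prefer the multipath name; fall back to the plain wwn name. */
struct file *open_scsi(const char *id)
{
	struct file *bdev_file = open_path(id, "dm-uuid-mpath-0x");

	if (IS_ERR(bdev_file))
		bdev_file = open_path(id, "wwn-0x");
	return bdev_file;
}
```

Because the second attempt overwrites the first result, the caller only
ever sees one pointer to test with IS_ERR().
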
 fs/nfs/blocklayout/blocklayout.h |  2 +-
 fs/nfs/blocklayout/dev.c         | 68 ++++++++++++++++++++--------------------
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/fs/nfs/blocklayout/blocklayout.h b/fs/nfs/blocklayout/blocklayout.h
index b4294a8aa2d4..f1eeb4914199 100644
--- a/fs/nfs/blocklayout/blocklayout.h
+++ b/fs/nfs/blocklayout/blocklayout.h
@@ -108,7 +108,7 @@ struct pnfs_block_dev {
 	struct pnfs_block_dev		*children;
 	u64				chunk_size;
 
-	struct bdev_handle		*bdev_handle;
+	struct file			*bdev_file;
 	u64				disk_offset;
 
 	u64				pr_key;
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index c97ebc42ec0f..93ef7f864980 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -25,17 +25,17 @@ bl_free_device(struct pnfs_block_dev *dev)
 	} else {
 		if (dev->pr_registered) {
 			const struct pr_ops *ops =
-				dev->bdev_handle->bdev->bd_disk->fops->pr_ops;
+				file_bdev(dev->bdev_file)->bd_disk->fops->pr_ops;
 			int error;
 
-			error = ops->pr_register(dev->bdev_handle->bdev,
+			error = ops->pr_register(file_bdev(dev->bdev_file),
 				dev->pr_key, 0, false);
 			if (error)
 				pr_err("failed to unregister PR key.\n");
 		}
 
-		if (dev->bdev_handle)
-			bdev_release(dev->bdev_handle);
+		if (dev->bdev_file)
+			fput(dev->bdev_file);
 	}
 }
 
@@ -169,7 +169,7 @@ static bool bl_map_simple(struct pnfs_block_dev *dev, u64 offset,
 	map->start = dev->start;
 	map->len = dev->len;
 	map->disk_offset = dev->disk_offset;
-	map->bdev = dev->bdev_handle->bdev;
+	map->bdev = file_bdev(dev->bdev_file);
 	return true;
 }
 
@@ -236,26 +236,26 @@ bl_parse_simple(struct nfs_server *server, struct pnfs_block_dev *d,
 		struct pnfs_block_volume *volumes, int idx, gfp_t gfp_mask)
 {
 	struct pnfs_block_volume *v = &volumes[idx];
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	dev_t dev;
 
 	dev = bl_resolve_deviceid(server, v, gfp_mask);
 	if (!dev)
 		return -EIO;
 
-	bdev_handle = bdev_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
+	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
 				       NULL, NULL);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		printk(KERN_WARNING "pNFS: failed to open device %d:%d (%ld)\n",
-			MAJOR(dev), MINOR(dev), PTR_ERR(bdev_handle));
-		return PTR_ERR(bdev_handle);
+			MAJOR(dev), MINOR(dev), PTR_ERR(bdev_file));
+		return PTR_ERR(bdev_file);
 	}
-	d->bdev_handle = bdev_handle;
-	d->len = bdev_nr_bytes(bdev_handle->bdev);
+	d->bdev_file = bdev_file;
+	d->len = bdev_nr_bytes(file_bdev(bdev_file));
 	d->map = bl_map_simple;
 
 	printk(KERN_INFO "pNFS: using block device %s\n",
-		bdev_handle->bdev->bd_disk->disk_name);
+		file_bdev(bdev_file)->bd_disk->disk_name);
 	return 0;
 }
 
@@ -300,10 +300,10 @@ bl_validate_designator(struct pnfs_block_volume *v)
 	}
 }
 
-static struct bdev_handle *
+static struct file *
 bl_open_path(struct pnfs_block_volume *v, const char *prefix)
 {
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	const char *devname;
 
 	devname = kasprintf(GFP_KERNEL, "/dev/disk/by-id/%s%*phN",
@@ -311,15 +311,15 @@ bl_open_path(struct pnfs_block_volume *v, const char *prefix)
 	if (!devname)
 		return ERR_PTR(-ENOMEM);
 
-	bdev_handle = bdev_open_by_path(devname, BLK_OPEN_READ | BLK_OPEN_WRITE,
+	bdev_file = bdev_file_open_by_path(devname, BLK_OPEN_READ | BLK_OPEN_WRITE,
 					NULL, NULL);
-	if (IS_ERR(bdev_handle)) {
+	if (IS_ERR(bdev_file)) {
 		pr_warn("pNFS: failed to open device %s (%ld)\n",
-			devname, PTR_ERR(bdev_handle));
+			devname, PTR_ERR(bdev_file));
 	}
 
 	kfree(devname);
-	return bdev_handle;
+	return bdev_file;
 }
 
 static int
@@ -327,7 +327,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 		struct pnfs_block_volume *volumes, int idx, gfp_t gfp_mask)
 {
 	struct pnfs_block_volume *v = &volumes[idx];
-	struct bdev_handle *bdev_handle;
+	struct file *bdev_file;
 	const struct pr_ops *ops;
 	int error;
 
@@ -340,14 +340,14 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 	 * On other distributions like Debian, the default SCSI by-id path will
 	 * point to the dm-multipath device if one exists.
 	 */
-	bdev_handle = bl_open_path(v, "dm-uuid-mpath-0x");
-	if (IS_ERR(bdev_handle))
-		bdev_handle = bl_open_path(v, "wwn-0x");
-	if (IS_ERR(bdev_handle))
-		return PTR_ERR(bdev_handle);
-	d->bdev_handle = bdev_handle;
-
-	d->len = bdev_nr_bytes(d->bdev_handle->bdev);
+	bdev_file = bl_open_path(v, "dm-uuid-mpath-0x");
+	if (IS_ERR(bdev_file))
+		bdev_file = bl_open_path(v, "wwn-0x");
+	if (IS_ERR(bdev_file))
+		return PTR_ERR(bdev_file);
+	d->bdev_file = bdev_file;
+
+	d->len = bdev_nr_bytes(file_bdev(d->bdev_file));
 	d->map = bl_map_simple;
 	d->pr_key = v->scsi.pr_key;
 
@@ -355,20 +355,20 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 		return -ENODEV;
 
 	pr_info("pNFS: using block device %s (reservation key 0x%llx)\n",
-		d->bdev_handle->bdev->bd_disk->disk_name, d->pr_key);
+		file_bdev(d->bdev_file)->bd_disk->disk_name, d->pr_key);
 
-	ops = d->bdev_handle->bdev->bd_disk->fops->pr_ops;
+	ops = file_bdev(d->bdev_file)->bd_disk->fops->pr_ops;
 	if (!ops) {
 		pr_err("pNFS: block device %s does not support reservations.",
-				d->bdev_handle->bdev->bd_disk->disk_name);
+				file_bdev(d->bdev_file)->bd_disk->disk_name);
 		error = -EINVAL;
 		goto out_blkdev_put;
 	}
 
-	error = ops->pr_register(d->bdev_handle->bdev, 0, d->pr_key, true);
+	error = ops->pr_register(file_bdev(d->bdev_file), 0, d->pr_key, true);
 	if (error) {
 		pr_err("pNFS: failed to register key for block device %s.",
-				d->bdev_handle->bdev->bd_disk->disk_name);
+				file_bdev(d->bdev_file)->bd_disk->disk_name);
 		goto out_blkdev_put;
 	}
 
@@ -376,7 +376,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
 	return 0;
 
 out_blkdev_put:
-	bdev_release(d->bdev_handle);
+	fput(d->bdev_file);
 	return error;
 }
 

-- 
2.43.0



* [PATCH v2 25/34] ocfs2: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (23 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 24/34] nfs: port block device access to files Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:22   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 26/34] reiserfs: " Christian Brauner
                   ` (12 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
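Note on the stashed pointer: several paths in heartbeat.c test
reg->hr_bdev_file for non-NULL to decide whether a device is configured,
so the store path must never leave an error pointer in the field. A
minimal userspace model of that invariant is sketched below. It is not the
ocfs2 code: region_model, region_dev_store(), region_release(),
mock_open_by_dev(), mock_fput() and open_files are hypothetical, and the
error-pointer helpers are re-created locally.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file {
	int unused;	/* mock: contents do not matter here */
};

static struct file the_file;
static int open_files;	/* references handed out and not yet put */

/* Mock open: a negative device number stands for "no such device". */
static inline struct file *mock_open_by_dev(int dev)
{
	if (dev < 0)
		return ERR_PTR(-ENODEV);
	open_files++;
	return &the_file;
}

static inline void mock_fput(struct file *f)
{
	(void)f;
	open_files--;
}

struct region_model {
	struct file *hr_bdev_file;	/* NULL means "not configured" */
};

/* Store either a valid file or NULL; never an error pointer. */
int region_dev_store(struct region_model *reg, int dev)
{
	if (reg->hr_bdev_file)
		return -EINVAL;	/* already configured */

	reg->hr_bdev_file = mock_open_by_dev(dev);
	if (IS_ERR(reg->hr_bdev_file)) {
		int ret = (int)PTR_ERR(reg->hr_bdev_file);

		reg->hr_bdev_file = NULL;
		return ret;
	}
	return 0;
}

/* Safe to call whether or not a device was ever stored. */
void region_release(struct region_model *reg)
{
	if (reg->hr_bdev_file) {
		mock_fput(reg->hr_bdev_file);
		reg->hr_bdev_file = NULL;
	}
}
```

With the field reset on failure, the plain NULL checks elsewhere stay
valid and the release path cannot put an error pointer.
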
 fs/ocfs2/cluster/heartbeat.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 4d7efefa98c5..1bde1281d514 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -213,7 +213,7 @@ struct o2hb_region {
 	unsigned int		hr_num_pages;
 
 	struct page             **hr_slot_data;
-	struct bdev_handle	*hr_bdev_handle;
+	struct file		*hr_bdev_file;
 	struct o2hb_disk_slot	*hr_slots;
 
 	/* live node map of this region */
@@ -263,7 +263,7 @@ struct o2hb_region {
 
 static inline struct block_device *reg_bdev(struct o2hb_region *reg)
 {
-	return reg->hr_bdev_handle ? reg->hr_bdev_handle->bdev : NULL;
+	return reg->hr_bdev_file ? file_bdev(reg->hr_bdev_file) : NULL;
 }
 
 struct o2hb_bio_wait_ctxt {
@@ -1509,8 +1509,8 @@ static void o2hb_region_release(struct config_item *item)
 		kfree(reg->hr_slot_data);
 	}
 
-	if (reg->hr_bdev_handle)
-		bdev_release(reg->hr_bdev_handle);
+	if (reg->hr_bdev_file)
+		fput(reg->hr_bdev_file);
 
 	kfree(reg->hr_slots);
 
@@ -1569,7 +1569,7 @@ static ssize_t o2hb_region_block_bytes_store(struct config_item *item,
 	unsigned long block_bytes;
 	unsigned int block_bits;
 
-	if (reg->hr_bdev_handle)
+	if (reg->hr_bdev_file)
 		return -EINVAL;
 
 	status = o2hb_read_block_input(reg, page, &block_bytes,
@@ -1598,7 +1598,7 @@ static ssize_t o2hb_region_start_block_store(struct config_item *item,
 	char *p = (char *)page;
 	ssize_t ret;
 
-	if (reg->hr_bdev_handle)
+	if (reg->hr_bdev_file)
 		return -EINVAL;
 
 	ret = kstrtoull(p, 0, &tmp);
@@ -1623,7 +1623,7 @@ static ssize_t o2hb_region_blocks_store(struct config_item *item,
 	unsigned long tmp;
 	char *p = (char *)page;
 
-	if (reg->hr_bdev_handle)
+	if (reg->hr_bdev_file)
 		return -EINVAL;
 
 	tmp = simple_strtoul(p, &p, 0);
@@ -1642,7 +1642,7 @@ static ssize_t o2hb_region_dev_show(struct config_item *item, char *page)
 {
 	unsigned int ret = 0;
 
-	if (to_o2hb_region(item)->hr_bdev_handle)
+	if (to_o2hb_region(item)->hr_bdev_file)
 		ret = sprintf(page, "%pg\n", reg_bdev(to_o2hb_region(item)));
 
 	return ret;
@@ -1753,7 +1753,7 @@ static int o2hb_populate_slot_data(struct o2hb_region *reg)
 }
 
 /*
- * this is acting as commit; we set up all of hr_bdev_handle and hr_task or
+ * this is acting as commit; we set up all of hr_bdev_file and hr_task or
  * nothing
  */
 static ssize_t o2hb_region_dev_store(struct config_item *item,
@@ -1769,7 +1769,7 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
 	ssize_t ret = -EINVAL;
 	int live_threshold;
 
-	if (reg->hr_bdev_handle)
+	if (reg->hr_bdev_file)
 		goto out;
 
 	/* We can't heartbeat without having had our node number
@@ -1795,11 +1795,11 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
 	if (!S_ISBLK(f.file->f_mapping->host->i_mode))
 		goto out2;
 
-	reg->hr_bdev_handle = bdev_open_by_dev(f.file->f_mapping->host->i_rdev,
+	reg->hr_bdev_file = bdev_file_open_by_dev(f.file->f_mapping->host->i_rdev,
 			BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
-	if (IS_ERR(reg->hr_bdev_handle)) {
-		ret = PTR_ERR(reg->hr_bdev_handle);
-		reg->hr_bdev_handle = NULL;
+	if (IS_ERR(reg->hr_bdev_file)) {
+		ret = PTR_ERR(reg->hr_bdev_file);
+		reg->hr_bdev_file = NULL;
 		goto out2;
 	}
 
@@ -1903,8 +1903,8 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
 
 out3:
 	if (ret < 0) {
-		bdev_release(reg->hr_bdev_handle);
-		reg->hr_bdev_handle = NULL;
+		fput(reg->hr_bdev_file);
+		reg->hr_bdev_file = NULL;
 	}
 out2:
 	fdput(f);

-- 
2.43.0



* [PATCH v2 26/34] reiserfs: port block device access to file
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (24 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 25/34] ocfs2: port block device access to file Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:24   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 27/34] bdev: remove bdev_open_by_path() Christian Brauner
                   ` (11 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/reiserfs/journal.c  | 38 +++++++++++++++++++-------------------
 fs/reiserfs/procfs.c   |  2 +-
 fs/reiserfs/reiserfs.h |  8 ++++----
 3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 171c912af50f..6474529c4253 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2386,7 +2386,7 @@ static int journal_read(struct super_block *sb)
 
 	cur_dblock = SB_ONDISK_JOURNAL_1st_BLOCK(sb);
 	reiserfs_info(sb, "checking transaction log (%pg)\n",
-		      journal->j_bdev_handle->bdev);
+		      file_bdev(journal->j_bdev_file));
 	start = ktime_get_seconds();
 
 	/*
@@ -2447,7 +2447,7 @@ static int journal_read(struct super_block *sb)
 		 * device and journal device to be the same
 		 */
 		d_bh =
-		    reiserfs_breada(journal->j_bdev_handle->bdev, cur_dblock,
+		    reiserfs_breada(file_bdev(journal->j_bdev_file), cur_dblock,
 				    sb->s_blocksize,
 				    SB_ONDISK_JOURNAL_1st_BLOCK(sb) +
 				    SB_ONDISK_JOURNAL_SIZE(sb));
@@ -2588,9 +2588,9 @@ static void journal_list_init(struct super_block *sb)
 
 static void release_journal_dev(struct reiserfs_journal *journal)
 {
-	if (journal->j_bdev_handle) {
-		bdev_release(journal->j_bdev_handle);
-		journal->j_bdev_handle = NULL;
+	if (journal->j_bdev_file) {
+		fput(journal->j_bdev_file);
+		journal->j_bdev_file = NULL;
 	}
 }
 
@@ -2605,7 +2605,7 @@ static int journal_init_dev(struct super_block *super,
 
 	result = 0;
 
-	journal->j_bdev_handle = NULL;
+	journal->j_bdev_file = NULL;
 	jdev = SB_ONDISK_JOURNAL_DEVICE(super) ?
 	    new_decode_dev(SB_ONDISK_JOURNAL_DEVICE(super)) : super->s_dev;
 
@@ -2616,37 +2616,37 @@ static int journal_init_dev(struct super_block *super,
 	if ((!jdev_name || !jdev_name[0])) {
 		if (jdev == super->s_dev)
 			holder = NULL;
-		journal->j_bdev_handle = bdev_open_by_dev(jdev, blkdev_mode,
+		journal->j_bdev_file = bdev_file_open_by_dev(jdev, blkdev_mode,
 							  holder, NULL);
-		if (IS_ERR(journal->j_bdev_handle)) {
-			result = PTR_ERR(journal->j_bdev_handle);
-			journal->j_bdev_handle = NULL;
+		if (IS_ERR(journal->j_bdev_file)) {
+			result = PTR_ERR(journal->j_bdev_file);
+			journal->j_bdev_file = NULL;
 			reiserfs_warning(super, "sh-458",
 					 "cannot init journal device unknown-block(%u,%u): %i",
 					 MAJOR(jdev), MINOR(jdev), result);
 			return result;
 		} else if (jdev != super->s_dev)
-			set_blocksize(journal->j_bdev_handle->bdev,
+			set_blocksize(file_bdev(journal->j_bdev_file),
 				      super->s_blocksize);
 
 		return 0;
 	}
 
-	journal->j_bdev_handle = bdev_open_by_path(jdev_name, blkdev_mode,
+	journal->j_bdev_file = bdev_file_open_by_path(jdev_name, blkdev_mode,
 						   holder, NULL);
-	if (IS_ERR(journal->j_bdev_handle)) {
-		result = PTR_ERR(journal->j_bdev_handle);
-		journal->j_bdev_handle = NULL;
+	if (IS_ERR(journal->j_bdev_file)) {
+		result = PTR_ERR(journal->j_bdev_file);
+		journal->j_bdev_file = NULL;
 		reiserfs_warning(super, "sh-457",
 				 "journal_init_dev: Cannot open '%s': %i",
 				 jdev_name, result);
 		return result;
 	}
 
-	set_blocksize(journal->j_bdev_handle->bdev, super->s_blocksize);
+	set_blocksize(file_bdev(journal->j_bdev_file), super->s_blocksize);
 	reiserfs_info(super,
 		      "journal_init_dev: journal device: %pg\n",
-		      journal->j_bdev_handle->bdev);
+		      file_bdev(journal->j_bdev_file));
 	return 0;
 }
 
@@ -2804,7 +2804,7 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 				 "journal header magic %x (device %pg) does "
 				 "not match to magic found in super block %x",
 				 jh->jh_journal.jp_journal_magic,
-				 journal->j_bdev_handle->bdev,
+				 file_bdev(journal->j_bdev_file),
 				 sb_jp_journal_magic(rs));
 		brelse(bhjh);
 		goto free_and_return;
@@ -2828,7 +2828,7 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
 	reiserfs_info(sb, "journal params: device %pg, size %u, "
 		      "journal first block %u, max trans len %u, max batch %u, "
 		      "max commit age %u, max trans age %u\n",
-		      journal->j_bdev_handle->bdev,
+		      file_bdev(journal->j_bdev_file),
 		      SB_ONDISK_JOURNAL_SIZE(sb),
 		      SB_ONDISK_JOURNAL_1st_BLOCK(sb),
 		      journal->j_trans_max,
diff --git a/fs/reiserfs/procfs.c b/fs/reiserfs/procfs.c
index 83cb9402e0f9..5c68a4a52d78 100644
--- a/fs/reiserfs/procfs.c
+++ b/fs/reiserfs/procfs.c
@@ -354,7 +354,7 @@ static int show_journal(struct seq_file *m, void *unused)
 		   "prepare: \t%12lu\n"
 		   "prepare_retry: \t%12lu\n",
 		   DJP(jp_journal_1st_block),
-		   SB_JOURNAL(sb)->j_bdev_handle->bdev,
+		   file_bdev(SB_JOURNAL(sb)->j_bdev_file),
 		   DJP(jp_journal_dev),
 		   DJP(jp_journal_size),
 		   DJP(jp_journal_trans_max),
diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
index 725667880e62..0554903f42a9 100644
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -299,7 +299,7 @@ struct reiserfs_journal {
 	/* oldest journal block.  start here for traverse */
 	struct reiserfs_journal_cnode *j_first;
 
-	struct bdev_handle *j_bdev_handle;
+	struct file *j_bdev_file;
 
 	/* first block on s_dev of reserved area journal */
 	int j_1st_reserved_block;
@@ -2810,10 +2810,10 @@ struct reiserfs_journal_header {
 
 /* We need these to make journal.c code more readable */
 #define journal_find_get_block(s, block) __find_get_block(\
-		SB_JOURNAL(s)->j_bdev_handle->bdev, block, s->s_blocksize)
-#define journal_getblk(s, block) __getblk(SB_JOURNAL(s)->j_bdev_handle->bdev,\
+		file_bdev(SB_JOURNAL(s)->j_bdev_file), block, s->s_blocksize)
+#define journal_getblk(s, block) __getblk(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
 		block, s->s_blocksize)
-#define journal_bread(s, block) __bread(SB_JOURNAL(s)->j_bdev_handle->bdev,\
+#define journal_bread(s, block) __bread(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
 		block, s->s_blocksize)
 
 enum reiserfs_bh_state_bits {

-- 
2.43.0


* [PATCH v2 27/34] bdev: remove bdev_open_by_path()
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (25 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 26/34] reiserfs: " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:17   ` Christoph Hellwig
  2024-02-01 10:24   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 28/34] bdev: make bdev_release() private to block layer Christian Brauner
                   ` (10 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 40 ----------------------------------------
 include/linux/blkdev.h |  2 --
 2 files changed, 42 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 4246a57a7c5a..eb5607af6ec5 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1007,46 +1007,6 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 }
 EXPORT_SYMBOL(bdev_file_open_by_path);
 
-/**
- * bdev_open_by_path - open a block device by name
- * @path: path to the block device to open
- * @mode: open mode (BLK_OPEN_*)
- * @holder: exclusive holder identifier
- * @hops: holder operations
- *
- * Open the block device described by the device file at @path.  If @holder is
- * not %NULL, the block device is opened with exclusive access.  Exclusive opens
- * may nest for the same @holder.
- *
- * CONTEXT:
- * Might sleep.
- *
- * RETURNS:
- * Handle with a reference to the block_device on success, ERR_PTR(-errno) on
- * failure.
- */
-struct bdev_handle *bdev_open_by_path(const char *path, blk_mode_t mode,
-		void *holder, const struct blk_holder_ops *hops)
-{
-	struct bdev_handle *handle;
-	dev_t dev;
-	int error;
-
-	error = lookup_bdev(path, &dev);
-	if (error)
-		return ERR_PTR(error);
-
-	handle = bdev_open_by_dev(dev, mode, holder, hops);
-	if (!IS_ERR(handle) && (mode & BLK_OPEN_WRITE) &&
-	    bdev_read_only(handle->bdev)) {
-		bdev_release(handle);
-		return ERR_PTR(-EACCES);
-	}
-
-	return handle;
-}
-EXPORT_SYMBOL(bdev_open_by_path);
-
 void bdev_release(struct bdev_handle *handle)
 {
 	struct block_device *bdev = handle->bdev;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 76706aa47316..5880d5abfebe 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1484,8 +1484,6 @@ struct bdev_handle {
 
 struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
-struct bdev_handle *bdev_open_by_path(const char *path, blk_mode_t mode,
-		void *holder, const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,

-- 
2.43.0


* [PATCH v2 28/34] bdev: make bdev_release() private to block layer
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (26 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 27/34] bdev: remove bdev_open_by_path() Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:19   ` Christoph Hellwig
  2024-02-01 10:26   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
                   ` (9 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Move bdev_release() and bdev_open_by_dev() to the private block header
and drop their exports. There's no caller outside the block layer
anymore that uses them directly.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 2 --
 block/blk.h            | 4 ++++
 include/linux/blkdev.h | 3 ---
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index eb5607af6ec5..1f64f213c5fa 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -916,7 +916,6 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 	kfree(handle);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL(bdev_open_by_dev);
 
 static unsigned blk_to_file_flags(blk_mode_t mode)
 {
@@ -1045,7 +1044,6 @@ void bdev_release(struct bdev_handle *handle)
 	blkdev_put_no_open(bdev);
 	kfree(handle);
 }
-EXPORT_SYMBOL(bdev_release);
 
 /**
  * lookup_bdev() - Look up a struct block_device by name.
diff --git a/block/blk.h b/block/blk.h
index 1ef920f72e0f..c9630774767d 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -516,4 +516,8 @@ static inline int req_ref_read(struct request *req)
 	return atomic_read(&req->ref);
 }
 
+void bdev_release(struct bdev_handle *handle);
+struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
+		const struct blk_holder_ops *hops);
+
 #endif /* BLK_INTERNAL_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5880d5abfebe..495f55587207 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1482,8 +1482,6 @@ struct bdev_handle {
 	blk_mode_t mode;
 };
 
-struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
@@ -1491,7 +1489,6 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 		const struct blk_holder_ops *hops);
 void bd_abort_claiming(struct block_device *bdev, void *holder);
-void bdev_release(struct bdev_handle *handle);
 
 /* just for blk-cgroup, don't use elsewhere */
 struct block_device *blkdev_get_no_open(dev_t dev);

-- 
2.43.0


* [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (27 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 28/34] bdev: make bdev_release() private to block layer Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:22   ` Christoph Hellwig
                     ` (2 more replies)
  2024-01-23 13:26 ` [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle Christian Brauner
                   ` (8 subsequent siblings)
  37 siblings, 3 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 125 +++++++++++++++++++++++++++----------------------
 block/blk.h            |  12 +++--
 block/fops.c           |  34 ++++++--------
 include/linux/blkdev.h |   7 ---
 include/linux/fs.h     |   6 ---
 5 files changed, 92 insertions(+), 92 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 1f64f213c5fa..34b9a16edb6e 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -703,6 +703,24 @@ static int blkdev_get_part(struct block_device *part, blk_mode_t mode)
 	return ret;
 }
 
+int bdev_permission(dev_t dev, blk_mode_t mode, void *holder)
+{
+	int ret;
+
+	ret = devcgroup_check_permission(
+		DEVCG_DEV_BLOCK, MAJOR(dev), MINOR(dev),
+		((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
+			((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));
+	if (ret)
+		return ret;
+
+	/* Blocking writes requires exclusive opener */
+	if (mode & BLK_OPEN_RESTRICT_WRITES && !holder)
+		return -EINVAL;
+
+	return 0;
+}
+
 static void blkdev_put_part(struct block_device *part)
 {
 	struct block_device *whole = bdev_whole(part);
@@ -795,15 +813,15 @@ static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
 }
 
 /**
- * bdev_open_by_dev - open a block device by device number
- * @dev: device number of block device to open
+ * bdev_open - open a block device
+ * @bdev: block device to open
  * @mode: open mode (BLK_OPEN_*)
  * @holder: exclusive holder identifier
  * @hops: holder operations
+ * @bdev_file: file for the block device
  *
- * Open the block device described by device number @dev. If @holder is not
- * %NULL, the block device is opened with exclusive access.  Exclusive opens may
- * nest for the same @holder.
+ * Open the block device. If @holder is not %NULL, the block device is opened
+ * with exclusive access.  Exclusive opens may nest for the same @holder.
  *
  * Use this interface ONLY if you really do not have anything better - i.e. when
  * you are behind a truly sucky interface and all you are given is a device
@@ -813,52 +831,29 @@ static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
  * Might sleep.
  *
  * RETURNS:
- * Handle with a reference to the block_device on success, ERR_PTR(-errno) on
- * failure.
+ * zero on success, -errno on failure.
  */
-struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-				     const struct blk_holder_ops *hops)
+int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
+	      const struct blk_holder_ops *hops, struct file *bdev_file)
 {
 	struct bdev_handle *handle = kmalloc(sizeof(struct bdev_handle),
 					     GFP_KERNEL);
-	struct block_device *bdev;
 	bool unblock_events = true;
-	struct gendisk *disk;
+	struct gendisk *disk = bdev->bd_disk;
 	int ret;
 
+	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);
 	if (!handle)
-		return ERR_PTR(-ENOMEM);
-
-	ret = devcgroup_check_permission(DEVCG_DEV_BLOCK,
-			MAJOR(dev), MINOR(dev),
-			((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
-			((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));
-	if (ret)
-		goto free_handle;
-
-	/* Blocking writes requires exclusive opener */
-	if (mode & BLK_OPEN_RESTRICT_WRITES && !holder) {
-		ret = -EINVAL;
-		goto free_handle;
-	}
-
-	bdev = blkdev_get_no_open(dev);
-	if (!bdev) {
-		ret = -ENXIO;
-		goto free_handle;
-	}
-	disk = bdev->bd_disk;
+		return -ENOMEM;
 
 	if (holder) {
 		mode |= BLK_OPEN_EXCL;
 		ret = bd_prepare_to_claim(bdev, holder, hops);
 		if (ret)
-			goto put_blkdev;
+			return ret;
 	} else {
-		if (WARN_ON_ONCE(mode & BLK_OPEN_EXCL)) {
-			ret = -EIO;
-			goto put_blkdev;
-		}
+		if (WARN_ON_ONCE(mode & BLK_OPEN_EXCL))
+			return -EIO;
 	}
 
 	disk_block_events(disk);
@@ -902,7 +897,22 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 	handle->bdev = bdev;
 	handle->holder = holder;
 	handle->mode = mode;
-	return handle;
+
+	/*
+	 * Preserve backwards compatibility and allow large file access
+	 * even if userspace doesn't ask for it explicitly. Some mkfs
+	 * binary needs it. We might want to drop this workaround
+	 * during an unstable branch.
+	 */
+	bdev_file->f_flags |= O_LARGEFILE;
+	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
+	if (bdev_nowait(bdev))
+		bdev_file->f_mode |= FMODE_NOWAIT;
+	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
+	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
+	bdev_file->private_data = handle;
+
+	return 0;
 put_module:
 	module_put(disk->fops->owner);
 abort_claiming:
@@ -910,11 +920,8 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		bd_abort_claiming(bdev, holder);
 	mutex_unlock(&disk->open_mutex);
 	disk_unblock_events(disk);
-put_blkdev:
-	blkdev_put_no_open(bdev);
-free_handle:
 	kfree(handle);
-	return ERR_PTR(ret);
+	return ret;
 }
 
 static unsigned blk_to_file_flags(blk_mode_t mode)
@@ -954,29 +961,35 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 				   const struct blk_holder_ops *hops)
 {
 	struct file *bdev_file;
-	struct bdev_handle *handle;
+	struct block_device *bdev;
 	unsigned int flags;
+	int ret;
 
-	handle = bdev_open_by_dev(dev, mode, holder, hops);
-	if (IS_ERR(handle))
-		return ERR_CAST(handle);
+	ret = bdev_permission(dev, 0, holder);
+	if (ret)
+		return ERR_PTR(ret);
+
+	bdev = blkdev_get_no_open(dev);
+	if (!bdev)
+		return ERR_PTR(-ENXIO);
 
 	flags = blk_to_file_flags(mode);
-	bdev_file = alloc_file_pseudo_noaccount(handle->bdev->bd_inode,
+	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
 			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
 	if (IS_ERR(bdev_file)) {
-		bdev_release(handle);
+		blkdev_put_no_open(bdev);
 		return bdev_file;
 	}
-	ihold(handle->bdev->bd_inode);
+	bdev_file->f_mode &= ~FMODE_OPENED;
 
-	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
-	if (bdev_nowait(handle->bdev))
-		bdev_file->f_mode |= FMODE_NOWAIT;
-
-	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
-	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
-	bdev_file->private_data = handle;
+	ihold(bdev->bd_inode);
+	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
+	if (ret) {
+		fput(bdev_file);
+		return ERR_PTR(ret);
+	}
+	/* Now that thing is opened. */
+	bdev_file->f_mode |= FMODE_OPENED;
 	return bdev_file;
 }
 EXPORT_SYMBOL(bdev_file_open_by_dev);
diff --git a/block/blk.h b/block/blk.h
index c9630774767d..19b15870284f 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -25,6 +25,12 @@ struct blk_flush_queue {
 	struct request		*flush_rq;
 };
 
+struct bdev_handle {
+	struct block_device *bdev;
+	void *holder;
+	blk_mode_t mode;
+};
+
 bool is_flush_rq(struct request *req);
 
 struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
@@ -517,7 +523,7 @@ static inline int req_ref_read(struct request *req)
 }
 
 void bdev_release(struct bdev_handle *handle);
-struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-		const struct blk_holder_ops *hops);
-
+int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
+	      const struct blk_holder_ops *hops, struct file *bdev_file);
+int bdev_permission(dev_t dev, blk_mode_t mode, void *holder);
 #endif /* BLK_INTERNAL_H */
diff --git a/block/fops.c b/block/fops.c
index 0cf8cf72cdfa..81ff8c0ce32f 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -599,31 +599,25 @@ blk_mode_t file_to_blk_mode(struct file *file)
 
 static int blkdev_open(struct inode *inode, struct file *filp)
 {
-	struct bdev_handle *handle;
+	struct block_device *bdev;
 	blk_mode_t mode;
-
-	/*
-	 * Preserve backwards compatibility and allow large file access
-	 * even if userspace doesn't ask for it explicitly. Some mkfs
-	 * binary needs it. We might want to drop this workaround
-	 * during an unstable branch.
-	 */
-	filp->f_flags |= O_LARGEFILE;
-	filp->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
+	void *holder;
+	int ret;
 
 	mode = file_to_blk_mode(filp);
-	handle = bdev_open_by_dev(inode->i_rdev, mode,
-			mode & BLK_OPEN_EXCL ? filp : NULL, NULL);
-	if (IS_ERR(handle))
-		return PTR_ERR(handle);
+	holder = mode & BLK_OPEN_EXCL ? filp : NULL;
+	ret = bdev_permission(inode->i_rdev, mode, holder);
+	if (ret)
+		return ret;
 
-	if (bdev_nowait(handle->bdev))
-		filp->f_mode |= FMODE_NOWAIT;
+	bdev = blkdev_get_no_open(inode->i_rdev);
+	if (!bdev)
+		return -ENXIO;
 
-	filp->f_mapping = handle->bdev->bd_inode->i_mapping;
-	filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);
-	filp->private_data = handle;
-	return 0;
+	ret = bdev_open(bdev, mode, holder, NULL, filp);
+	if (ret)
+		blkdev_put_no_open(bdev);
+	return ret;
 }
 
 static int blkdev_release(struct inode *inode, struct file *filp)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 495f55587207..2f5dbde23094 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1475,13 +1475,6 @@ extern const struct blk_holder_ops fs_holder_ops;
 	(BLK_OPEN_READ | BLK_OPEN_RESTRICT_WRITES | \
 	 (((flags) & SB_RDONLY) ? 0 : BLK_OPEN_WRITE))
 
-/* @bdev_handle will be removed soon. */
-struct bdev_handle {
-	struct block_device *bdev;
-	void *holder;
-	blk_mode_t mode;
-};
-
 struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e9291e27cc47..6e0714d35d9b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1327,12 +1327,6 @@ struct super_block {
 	struct list_head	s_inodes_wb;	/* writeback inodes */
 } __randomize_layout;
 
-/* Temporary helper that will go away. */
-static inline struct bdev_handle *sb_bdev_handle(struct super_block *sb)
-{
-	return sb->s_bdev_file->private_data;
-}
-
 static inline struct user_namespace *i_user_ns(const struct inode *inode)
 {
 	return inode->i_sb->s_user_ns;

-- 
2.43.0


* [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (28 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:22   ` Christoph Hellwig
  2024-02-01 10:57   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
                   ` (7 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

We can always go directly via:

* I_BDEV(bdev_file->f_inode)
* I_BDEV(bdev_file->f_mapping->host)

So keeping a struct block_device pointer in struct bdev_handle is
redundant.
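
As a rough userspace illustration (not part of the patch), the lookup
can be modelled as below. The structures are cut-down stand-ins that
carry only the fields needed here, and the layout of struct bdev_inode
is assumed to mirror the kernel's, where the block device and its inode
are allocated together.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only
 * the fields used in this sketch are modelled). */
struct inode { int i_ino; };
struct address_space { struct inode *host; };
struct block_device { int bd_dev; };

/* The block device and its inode live in one allocation. */
struct bdev_inode {
	struct block_device bdev;
	struct inode vfs_inode;
};

struct file {
	struct inode *f_inode;
	struct address_space *f_mapping;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Model of I_BDEV(): step back from the embedded inode to the bdev. */
static struct block_device *I_BDEV(struct inode *inode)
{
	return &container_of(inode, struct bdev_inode, vfs_inode)->bdev;
}

/* Model of file_bdev() after this patch: no bdev_handle involved. */
static struct block_device *file_bdev(struct file *bdev_file)
{
	assert(bdev_file && bdev_file->f_mapping);
	return I_BDEV(bdev_file->f_mapping->host);
}
```

Both routes listed above end at the same inode, so they resolve to the
same block device in this model.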

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c | 26 ++++++++++++--------------
 block/blk.h  |  3 +--
 block/fops.c |  2 +-
 3 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 34b9a16edb6e..71eaa1b5b7eb 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -51,8 +51,7 @@ EXPORT_SYMBOL(I_BDEV);
 
 struct block_device *file_bdev(struct file *bdev_file)
 {
-	struct bdev_handle *handle = bdev_file->private_data;
-	return handle->bdev;
+	return I_BDEV(bdev_file->f_mapping->host);
 }
 EXPORT_SYMBOL(file_bdev);
 
@@ -894,7 +893,6 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 
 	if (unblock_events)
 		disk_unblock_events(disk);
-	handle->bdev = bdev;
 	handle->holder = holder;
 	handle->mode = mode;
 
@@ -908,7 +906,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
 	if (bdev_nowait(bdev))
 		bdev_file->f_mode |= FMODE_NOWAIT;
-	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
+	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
 	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
 	bdev_file->private_data = handle;
 
@@ -998,7 +996,7 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 				    void *holder,
 				    const struct blk_holder_ops *hops)
 {
-	struct file *bdev_file;
+	struct file *file;
 	dev_t dev;
 	int error;
 
@@ -1006,22 +1004,22 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 	if (error)
 		return ERR_PTR(error);
 
-	bdev_file = bdev_file_open_by_dev(dev, mode, holder, hops);
-	if (!IS_ERR(bdev_file) && (mode & BLK_OPEN_WRITE)) {
-		struct bdev_handle *handle = bdev_file->private_data;
-		if (bdev_read_only(handle->bdev)) {
-			fput(bdev_file);
-			bdev_file = ERR_PTR(-EACCES);
+	file = bdev_file_open_by_dev(dev, mode, holder, hops);
+	if (!IS_ERR(file) && (mode & BLK_OPEN_WRITE)) {
+		if (bdev_read_only(file_bdev(file))) {
+			fput(file);
+			file = ERR_PTR(-EACCES);
 		}
 	}
 
-	return bdev_file;
+	return file;
 }
 EXPORT_SYMBOL(bdev_file_open_by_path);
 
-void bdev_release(struct bdev_handle *handle)
+void bdev_release(struct file *bdev_file)
 {
-	struct block_device *bdev = handle->bdev;
+	struct block_device *bdev = file_bdev(bdev_file);
+	struct bdev_handle *handle = bdev_file->private_data;
 	struct gendisk *disk = bdev->bd_disk;
 
 	/*
diff --git a/block/blk.h b/block/blk.h
index 19b15870284f..7ca24814f3a0 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -26,7 +26,6 @@ struct blk_flush_queue {
 };
 
 struct bdev_handle {
-	struct block_device *bdev;
 	void *holder;
 	blk_mode_t mode;
 };
@@ -522,7 +521,7 @@ static inline int req_ref_read(struct request *req)
 	return atomic_read(&req->ref);
 }
 
-void bdev_release(struct bdev_handle *handle);
+void bdev_release(struct file *bdev_file);
 int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 	      const struct blk_holder_ops *hops, struct file *bdev_file);
 int bdev_permission(dev_t dev, blk_mode_t mode, void *holder);
diff --git a/block/fops.c b/block/fops.c
index 81ff8c0ce32f..5589bf9c3822 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -622,7 +622,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 
 static int blkdev_release(struct inode *inode, struct file *filp)
 {
-	bdev_release(filp->private_data);
+	bdev_release(filp);
 	return 0;
 }
 

-- 
2.43.0


* [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (29 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:49   ` Christoph Hellwig
  2024-02-01 11:08   ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Jan Kara
  2024-01-23 13:26 ` [PATCH v2 32/34] block: remove bdev_handle completely Christian Brauner
                   ` (6 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Make it possible to detect a block device that was opened with
restricted write access solely from the file operations it was opened
with. This avoids wasting an FMODE_* flag.

def_blk_fops isn't needed to check whether something is a block device;
checking the inode type is enough for that. And def_blk_fops_restricted
can be kept private to the block layer.
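
A minimal userspace sketch of the idea (not part of the patch): two
operation tables with the same methods but distinct addresses, where
the address alone marks the restricted open. The cut-down structures,
demo_open() and bdev_file_restricts_writes() are made-up names for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

struct file;

/* Cut-down operation table with a single method. */
struct file_operations {
	int (*open)(struct file *);
};

struct file {
	const struct file_operations *f_op;
};

static int demo_open(struct file *f)
{
	(void)f;
	return 0;
}

static const struct file_operations def_blk_fops = {
	.open = demo_open,
};

/* Same methods, distinct object: the address is the indicator. */
static const struct file_operations def_blk_fops_restricted = {
	.open = demo_open,
};

/* Hypothetical helper: detect a restricted open from f_op alone. */
static bool bdev_file_restricts_writes(const struct file *f)
{
	assert(f);
	return f->f_op == &def_blk_fops_restricted;
}
```

The comparison relies only on the two tables being distinct objects, so
no per-file flag bit is needed to carry the information.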

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c | 16 ++++++++++++----
 block/blk.h  |  2 ++
 block/fops.c |  3 +++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 71eaa1b5b7eb..9d96a43f198d 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -799,13 +799,16 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
 		bdev->bd_writers++;
 }
 
-static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
+static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
 {
+	struct block_device *bdev;
+
 	if (bdev_allow_write_mounted)
 		return;
 
+	bdev = file_bdev(bdev_file);
 	/* Yield exclusive or shared write access. */
-	if (mode & BLK_OPEN_RESTRICT_WRITES)
+	if (bdev_file->f_op == &def_blk_fops_restricted)
 		bdev_unblock_writes(bdev);
 	else if (mode & BLK_OPEN_WRITE)
 		bdev->bd_writers--;
@@ -959,6 +962,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 				   const struct blk_holder_ops *hops)
 {
 	struct file *bdev_file;
+	const struct file_operations *blk_fops;
 	struct block_device *bdev;
 	unsigned int flags;
 	int ret;
@@ -972,8 +976,12 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		return ERR_PTR(-ENXIO);
 
 	flags = blk_to_file_flags(mode);
+	if (mode & BLK_OPEN_RESTRICT_WRITES)
+		blk_fops = &def_blk_fops_restricted;
+	else
+		blk_fops = &def_blk_fops;
 	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
-			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
+			blockdev_mnt, "", flags | O_LARGEFILE, blk_fops);
 	if (IS_ERR(bdev_file)) {
 		blkdev_put_no_open(bdev);
 		return bdev_file;
@@ -1033,7 +1041,7 @@ void bdev_release(struct file *bdev_file)
 		sync_blockdev(bdev);
 
 	mutex_lock(&disk->open_mutex);
-	bdev_yield_write_access(bdev, handle->mode);
+	bdev_yield_write_access(bdev_file, handle->mode);
 
 	if (handle->holder)
 		bd_end_claim(bdev, handle->holder);
diff --git a/block/blk.h b/block/blk.h
index 7ca24814f3a0..dfa958909c54 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -9,6 +9,8 @@
 
 struct elevator_type;
 
+extern const struct file_operations def_blk_fops_restricted;
+
 /* Max future timer expiry for timeouts */
 #define BLK_MAX_TIMEOUT		(5 * HZ)
 
diff --git a/block/fops.c b/block/fops.c
index 5589bf9c3822..f56bdfe459de 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -862,6 +862,9 @@ const struct file_operations def_blk_fops = {
 	.fallocate	= blkdev_fallocate,
 };
 
+/* Indicator that this block device is opened with restricted write access. */
+const struct file_operations def_blk_fops_restricted = def_blk_fops;
+
 static __init int blkdev_init(void)
 {
 	return bioset_init(&blkdev_dio_pool, 4,

-- 
2.43.0


* [PATCH v2 32/34] block: remove bdev_handle completely
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (30 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-01-29 16:50   ` Christoph Hellwig
  2024-02-01 11:20   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 33/34] block: expose bdev_file_inode() Christian Brauner
                   ` (5 subsequent siblings)
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

We just need to use the holder to indicate whether a block device open
was exclusive or not. We used to do that but had to give it up once we
switched to struct bdev_handle. Before struct bdev_handle we only
stashed stuff in file->private_data if this was an exclusive open, but
after struct bdev_handle we always set file->private_data to a struct
bdev_handle and so we had to use bdev_handle->mode or
bdev_handle->holder. Now that we don't use struct bdev_handle anymore we
can revert to the old behavior.
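
Roughly, in userspace terms (a sketch, not part of the patch):
private_data carries the holder, and a NULL holder means the open was
not exclusive. demo_bdev_open() and demo_opened_exclusive() are
hypothetical names that model only this one assignment and check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down struct file with only the field this sketch needs. */
struct file {
	void *private_data;
};

/* Models the tail of bdev_open(): stash the holder, NULL if shared. */
static void demo_bdev_open(struct file *bdev_file, void *holder)
{
	assert(bdev_file);
	bdev_file->private_data = holder;
}

/* Hypothetical helper: the open was exclusive iff a holder was set. */
static bool demo_opened_exclusive(const struct file *bdev_file)
{
	return bdev_file->private_data != NULL;
}
```

With the holder stored directly there is nothing left to allocate or
free per open, which is what lets the kmalloc()/kfree() pair go away.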

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c | 24 +++++++-----------------
 block/blk.h  |  5 -----
 block/fops.c | 18 +++++++++++-------
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 9d96a43f198d..4b47003d8082 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -799,7 +799,7 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
 		bdev->bd_writers++;
 }
 
-static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
+static void bdev_yield_write_access(struct file *bdev_file)
 {
 	struct block_device *bdev;
 
@@ -810,7 +810,7 @@ static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
 	/* Yield exclusive or shared write access. */
 	if (bdev_file->f_op == &def_blk_fops_restricted)
 		bdev_unblock_writes(bdev);
-	else if (mode & BLK_OPEN_WRITE)
+	else if (bdev_file->f_mode & FMODE_WRITE)
 		bdev->bd_writers--;
 }
 
@@ -838,16 +838,10 @@ static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
 int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 	      const struct blk_holder_ops *hops, struct file *bdev_file)
 {
-	struct bdev_handle *handle = kmalloc(sizeof(struct bdev_handle),
-					     GFP_KERNEL);
 	bool unblock_events = true;
 	struct gendisk *disk = bdev->bd_disk;
 	int ret;
 
-	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);
-	if (!handle)
-		return -ENOMEM;
-
 	if (holder) {
 		mode |= BLK_OPEN_EXCL;
 		ret = bd_prepare_to_claim(bdev, holder, hops);
@@ -896,8 +890,6 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 
 	if (unblock_events)
 		disk_unblock_events(disk);
-	handle->holder = holder;
-	handle->mode = mode;
 
 	/*
 	 * Preserve backwards compatibility and allow large file access
@@ -911,7 +903,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 		bdev_file->f_mode |= FMODE_NOWAIT;
 	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
 	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
-	bdev_file->private_data = handle;
+	bdev_file->private_data = holder;
 
 	return 0;
 put_module:
@@ -921,7 +913,6 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 		bd_abort_claiming(bdev, holder);
 	mutex_unlock(&disk->open_mutex);
 	disk_unblock_events(disk);
-	kfree(handle);
 	return ret;
 }
 
@@ -1027,7 +1018,7 @@ EXPORT_SYMBOL(bdev_file_open_by_path);
 void bdev_release(struct file *bdev_file)
 {
 	struct block_device *bdev = file_bdev(bdev_file);
-	struct bdev_handle *handle = bdev_file->private_data;
+	void *holder = bdev_file->private_data;
 	struct gendisk *disk = bdev->bd_disk;
 
 	/*
@@ -1041,10 +1032,10 @@ void bdev_release(struct file *bdev_file)
 		sync_blockdev(bdev);
 
 	mutex_lock(&disk->open_mutex);
-	bdev_yield_write_access(bdev_file, handle->mode);
+	bdev_yield_write_access(bdev_file);
 
-	if (handle->holder)
-		bd_end_claim(bdev, handle->holder);
+	if (holder)
+		bd_end_claim(bdev, holder);
 
 	/*
 	 * Trigger event checking and tell drivers to flush MEDIA_CHANGE
@@ -1061,7 +1052,6 @@ void bdev_release(struct file *bdev_file)
 
 	module_put(disk->fops->owner);
 	blkdev_put_no_open(bdev);
-	kfree(handle);
 }
 
 /**
diff --git a/block/blk.h b/block/blk.h
index dfa958909c54..cce1ac0ff303 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -27,11 +27,6 @@ struct blk_flush_queue {
 	struct request		*flush_rq;
 };
 
-struct bdev_handle {
-	void *holder;
-	blk_mode_t mode;
-};
-
 bool is_flush_rq(struct request *req);
 
 struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
diff --git a/block/fops.c b/block/fops.c
index f56bdfe459de..a0bff2c0d88d 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -569,7 +569,6 @@ static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
 blk_mode_t file_to_blk_mode(struct file *file)
 {
 	blk_mode_t mode = 0;
-	struct bdev_handle *handle = file->private_data;
 
 	if (file->f_mode & FMODE_READ)
 		mode |= BLK_OPEN_READ;
@@ -579,8 +578,8 @@ blk_mode_t file_to_blk_mode(struct file *file)
 	 * do_dentry_open() clears O_EXCL from f_flags, use handle->mode to
 	 * determine whether the open was exclusive for already open files.
 	 */
-	if (handle)
-		mode |= handle->mode & BLK_OPEN_EXCL;
+	if (file->private_data)
+		mode |= BLK_OPEN_EXCL;
 	else if (file->f_flags & O_EXCL)
 		mode |= BLK_OPEN_EXCL;
 	if (file->f_flags & O_NDELAY)
@@ -601,12 +600,17 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 {
 	struct block_device *bdev;
 	blk_mode_t mode;
-	void *holder;
 	int ret;
 
+	/*
+	 * Use the file private data to store the holder for exclusive opens.
+	 * file_to_blk_mode relies on it being present to set BLK_OPEN_EXCL.
+	 */
+	if (filp->f_flags & O_EXCL)
+		filp->private_data = filp;
+
 	mode = file_to_blk_mode(filp);
-	holder = mode & BLK_OPEN_EXCL ? filp : NULL;
-	ret = bdev_permission(inode->i_rdev, mode, holder);
+	ret = bdev_permission(inode->i_rdev, mode, filp->private_data);
 	if (ret)
 		return ret;
 
@@ -614,7 +618,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	if (!bdev)
 		return -ENXIO;
 
-	ret = bdev_open(bdev, mode, holder, NULL, filp);
+	ret = bdev_open(bdev, mode, filp->private_data, NULL, filp);
 	if (ret)
 		blkdev_put_no_open(bdev);
 	return ret;

-- 
2.43.0



* [PATCH v2 33/34] block: expose bdev_file_inode()
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (31 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 32/34] block: remove bdev_handle completely Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 10:09   ` Jan Kara
  2024-01-23 13:26 ` [PATCH v2 34/34] ext4: rely on sb->f_bdev only Christian Brauner
                   ` (4 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

Now that we open block devices as files we don't need to rely on
bd_inode to get to the correct inode. Use the helper.

We could use bdev_file->f_inode directly here since we know that
@f_inode refers to a bdev fs inode but it is generically correct to use
bdev_file->f_mapping->host since that will also work for bdev_files
opened from userspace.
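
To make the difference between f_inode and f_mapping->host concrete,
here is a small userspace model. All model_* names are hypothetical
stand-ins for the kernel structures; the only parts taken from the
patches are the shape of bdev_file_inode() and the fact that
bdev_open() points f_mapping at the bdev inode's mapping.

```c
#include <assert.h>
#include <stddef.h>

struct model_inode;

struct model_address_space {
	struct model_inode *host;	/* inode that owns this mapping */
};

struct model_inode {
	const char *fs_name;
	struct model_address_space i_data;
};

struct model_file {
	struct model_inode *f_inode;
	struct model_address_space *f_mapping;
};

static void model_inode_init(struct model_inode *inode, const char *fs_name)
{
	inode->fs_name = fs_name;
	inode->i_data.host = inode;
}

/* Same shape as bdev_file_inode() in the patch. */
static struct model_inode *model_bdev_file_inode(const struct model_file *file)
{
	return file->f_mapping->host;
}

/*
 * Model of opening a device node: f_inode is the node's own inode,
 * then the bdev open step redirects f_mapping to the bdev inode.
 */
static void model_open_node(struct model_file *file, struct model_inode *node,
			    struct model_inode *bdev_inode)
{
	file->f_inode = node;
	file->f_mapping = &node->i_data;
	file->f_mapping = &bdev_inode->i_data;	/* as bdev_open() does */
}

/* A node on another filesystem still resolves to the bdev inode. */
static void model_node_check(void)
{
	struct model_inode bdev_ino, tmpfs_ino;
	struct model_file file;

	model_inode_init(&bdev_ino, "bdev");
	model_inode_init(&tmpfs_ino, "tmpfs");
	model_open_node(&file, &tmpfs_ino, &bdev_ino);
	assert(file.f_inode == &tmpfs_ino);
	assert(model_bdev_file_inode(&file) == &bdev_ino);
}
```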

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 2 +-
 block/fops.c           | 5 -----
 include/linux/blkdev.h | 5 +++++
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 4b47003d8082..185c43ebeea5 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -51,7 +51,7 @@ EXPORT_SYMBOL(I_BDEV);
 
 struct block_device *file_bdev(struct file *bdev_file)
 {
-	return I_BDEV(bdev_file->f_mapping->host);
+	return I_BDEV(bdev_file_inode(bdev_file));
 }
 EXPORT_SYMBOL(file_bdev);
 
diff --git a/block/fops.c b/block/fops.c
index a0bff2c0d88d..240d968c281c 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -19,11 +19,6 @@
 #include <linux/module.h>
 #include "blk.h"
 
-static inline struct inode *bdev_file_inode(struct file *file)
-{
-	return file->f_mapping->host;
-}
-
 static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
 {
 	blk_opf_t opf = REQ_OP_WRITE | REQ_SYNC | REQ_IDLE;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2f5dbde23094..4b7080e56e44 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1490,6 +1490,11 @@ void blkdev_put_no_open(struct block_device *bdev);
 struct block_device *I_BDEV(struct inode *inode);
 struct block_device *file_bdev(struct file *bdev_file);
 
+static inline struct inode *bdev_file_inode(struct file *file)
+{
+	return file->f_mapping->host;
+}
+
 #ifdef CONFIG_BLOCK
 void invalidate_bdev(struct block_device *bdev);
 int sync_blockdev(struct block_device *bdev);

-- 
2.43.0



* [PATCH v2 34/34] ext4: rely on sb->f_bdev only
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (32 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 33/34] block: expose bdev_file_inode() Christian Brauner
@ 2024-01-23 13:26 ` Christian Brauner
  2024-02-01 11:34   ` Jan Kara
  2024-01-29  6:17 ` [PATCH v2 00/34] Open block devices as files Christoph Hellwig
                   ` (3 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-23 13:26 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, Christian Brauner

(1) Instead of bdev->bd_inode->i_mapping we do f_bdev->f_mapping
(2) Instead of bdev->bd_inode we could do f_bdev->f_inode

I mention this explicitly because (2) is dependent on how the block
device is opened while (1) isn't. Consider:

mount -t tmpfs tmpfs /mnt
mknod /mnt/foo b <major> <minor>
open("/mnt/foo", O_RDWR);

then (2) doesn't work because f_bdev->f_inode is a tmpfs inode _not_ the
actual bdev filesystem inode. But (1) is still the bd_inode->i_mapping
as that's set up during bdev_open().

IOW, I'm explicitly _not_ going via f_bdev->f_inode but via
f_bdev->f_mapping->host aka bdev_file_inode(f_bdev). Currently this
isn't a problem because sb->s_bdev_file stashes a block device file
opened via bdev_file_open_by_*() which is always a file on the bdev
filesystem.

_If_ we ever wanted to allow userspace to pass a block device file
descriptor during superblock creation, say:

fsconfig(fs_fd, FSCONFIG_CMD_CREATE_EXCL, "source", bdev_fd);

then using f_bdev->f_inode would be very wrong. Another thing to keep in
mind there would be that this would implicitly pin another filesystem.
Say:

mount -t ext4 /dev/sda /mnt
mknod /mnt/foo b <major> <minor>
bdev_fd = open("/mnt/foo", O_RDWR);

fd_fs = fsopen("xfs")
fsconfig(fd_fs, FSCONFIG_CMD_CREATE, "source", bdev_fd);
fd_mnt = fsmount(fd_fs);
move_mount(fd_mnt, "/mnt2");

umount /mnt # EBUSY

The umount fails because the xfs filesystem now pins the ext4 filesystem
via the bdev_file we're keeping. In other words, this is probably a bad
idea and if we allow userspace to do this then we should only use the
provided fd to look up the block device and open our own handle to it.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/ext4/dir.c         |  2 +-
 fs/ext4/ext4_jbd2.c   |  2 +-
 fs/ext4/fast_commit.c |  2 +-
 fs/ext4/super.c       | 37 +++++++++++++++++++------------------
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 3985f8c33f95..0733bc1eec7a 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -192,7 +192,7 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
 					(PAGE_SHIFT - inode->i_blkbits);
 			if (!ra_has_index(&file->f_ra, index))
 				page_cache_sync_readahead(
-					sb->s_bdev->bd_inode->i_mapping,
+					sb->s_bdev_file->f_mapping,
 					&file->f_ra, file,
 					index, 1);
 			file->f_ra.prev_pos = (loff_t)index << PAGE_SHIFT;
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 5d8055161acd..dbb9aff07ac1 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -206,7 +206,7 @@ static void ext4_journal_abort_handle(const char *caller, unsigned int line,
 
 static void ext4_check_bdev_write_error(struct super_block *sb)
 {
-	struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
+	struct address_space *mapping = sb->s_bdev_file->f_mapping;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	int err;
 
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 87c009e0c59a..9a4eb542735e 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1605,7 +1605,7 @@ static int ext4_fc_replay_inode(struct super_block *sb,
 out:
 	iput(inode);
 	if (!ret)
-		blkdev_issue_flush(sb->s_bdev);
+		blkdev_issue_flush(file_bdev(sb->s_bdev_file));
 
 	return 0;
 }
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index aa007710cfc3..edb7221dce18 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -244,7 +244,7 @@ static struct buffer_head *__ext4_sb_bread_gfp(struct super_block *sb,
 struct buffer_head *ext4_sb_bread(struct super_block *sb, sector_t block,
 				   blk_opf_t op_flags)
 {
-	gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_inode->i_mapping,
+	gfp_t gfp = mapping_gfp_constraint(sb->s_bdev_file->f_mapping,
 			~__GFP_FS) | __GFP_MOVABLE;
 
 	return __ext4_sb_bread_gfp(sb, block, op_flags, gfp);
@@ -253,7 +253,7 @@ struct buffer_head *ext4_sb_bread(struct super_block *sb, sector_t block,
 struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
 					    sector_t block)
 {
-	gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_inode->i_mapping,
+	gfp_t gfp = mapping_gfp_constraint(sb->s_bdev_file->f_mapping,
 			~__GFP_FS);
 
 	return __ext4_sb_bread_gfp(sb, block, 0, gfp);
@@ -261,7 +261,7 @@ struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
 
 void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
 {
-	struct buffer_head *bh = bdev_getblk(sb->s_bdev, block,
+	struct buffer_head *bh = bdev_getblk(file_bdev(sb->s_bdev_file), block,
 			sb->s_blocksize, GFP_NOWAIT | __GFP_NOWARN);
 
 	if (likely(bh)) {
@@ -477,7 +477,7 @@ static void ext4_maybe_update_superblock(struct super_block *sb)
 		return;
 
 	lifetime_write_kbytes = sbi->s_kbytes_written +
-		((part_stat_read(sb->s_bdev, sectors[STAT_WRITE]) -
+		((part_stat_read(file_bdev(sb->s_bdev_file), sectors[STAT_WRITE]) -
 		  sbi->s_sectors_written_start) >> 1);
 
 	/* Get the number of kilobytes not written to disk to account
@@ -502,7 +502,7 @@ static void ext4_maybe_update_superblock(struct super_block *sb)
  */
 static int block_device_ejected(struct super_block *sb)
 {
-	struct inode *bd_inode = sb->s_bdev->bd_inode;
+	struct inode *bd_inode = bdev_file_inode(sb->s_bdev_file);
 	struct backing_dev_info *bdi = inode_to_bdi(bd_inode);
 
 	return bdi->dev == NULL;
@@ -722,7 +722,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
 			jbd2_journal_abort(journal, -EIO);
 	}
 
-	if (!bdev_read_only(sb->s_bdev)) {
+	if (!bdev_read_only(file_bdev(sb->s_bdev_file))) {
 		save_error_info(sb, error, ino, block, func, line);
 		/*
 		 * In case the fs should keep running, we need to writeout
@@ -1084,7 +1084,7 @@ __acquires(bitlock)
 		if (test_opt(sb, WARN_ON_ERROR))
 			WARN_ON_ONCE(1);
 		EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
-		if (!bdev_read_only(sb->s_bdev)) {
+		if (!bdev_read_only(file_bdev(sb->s_bdev_file))) {
 			save_error_info(sb, EFSCORRUPTED, ino, block, function,
 					line);
 			schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
@@ -1357,8 +1357,8 @@ static void ext4_put_super(struct super_block *sb)
 		dump_orphan_list(sb, sbi);
 	ASSERT(list_empty(&sbi->s_orphan));
 
-	sync_blockdev(sb->s_bdev);
-	invalidate_bdev(sb->s_bdev);
+	sync_blockdev(file_bdev(sb->s_bdev_file));
+	invalidate_bdev(file_bdev(sb->s_bdev_file));
 	if (sbi->s_journal_bdev_file) {
 		/*
 		 * Invalidate the journal device's buffers.  We don't want them
@@ -4329,7 +4329,7 @@ static struct ext4_sb_info *ext4_alloc_sbi(struct super_block *sb)
 	if (!sbi)
 		return NULL;
 
-	sbi->s_daxdev = fs_dax_get_by_bdev(sb->s_bdev, &sbi->s_dax_part_off,
+	sbi->s_daxdev = fs_dax_get_by_bdev(file_bdev(sb->s_bdev_file), &sbi->s_dax_part_off,
 					   NULL, NULL);
 
 	sbi->s_blockgroup_lock =
@@ -5222,7 +5222,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 
 	sbi->s_inode_readahead_blks = EXT4_DEF_INODE_READAHEAD_BLKS;
 	sbi->s_sectors_written_start =
-		part_stat_read(sb->s_bdev, sectors[STAT_WRITE]);
+		part_stat_read(file_bdev(sb->s_bdev_file), sectors[STAT_WRITE]);
 
 	err = ext4_load_super(sb, &logical_sb_block, silent);
 	if (err)
@@ -5576,7 +5576,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 	 * used to detect the metadata async write error.
 	 */
 	spin_lock_init(&sbi->s_bdev_wb_lock);
-	errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
+	errseq_check_and_advance(&sb->s_bdev_file->f_mapping->wb_err,
 				 &sbi->s_bdev_wb_err);
 	EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
 	ext4_orphan_cleanup(sb, es);
@@ -5596,7 +5596,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 			goto failed_mount10;
 	}
 
-	if (test_opt(sb, DISCARD) && !bdev_max_discard_sectors(sb->s_bdev))
+	if (test_opt(sb, DISCARD) && !bdev_max_discard_sectors(file_bdev(sb->s_bdev_file)))
 		ext4_msg(sb, KERN_WARNING,
 			 "mounting with \"discard\" option, but the device does not support discard");
 
@@ -5675,7 +5675,7 @@ failed_mount9: __maybe_unused
 		fput(sbi->s_journal_bdev_file);
 	}
 out_fail:
-	invalidate_bdev(sb->s_bdev);
+	invalidate_bdev(file_bdev(sb->s_bdev_file));
 	sb->s_fs_info = NULL;
 	return err;
 }
@@ -5934,7 +5934,8 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 	if (IS_ERR(bdev_file))
 		return ERR_CAST(bdev_file);
 
-	journal = jbd2_journal_init_dev(file_bdev(bdev_file), sb->s_bdev, j_start,
+	journal = jbd2_journal_init_dev(file_bdev(bdev_file),
+					file_bdev(sb->s_bdev_file), j_start,
 					j_len, sb->s_blocksize);
 	if (IS_ERR(journal)) {
 		ext4_msg(sb, KERN_ERR, "failed to create device journal");
@@ -5999,7 +6000,7 @@ static int ext4_load_journal(struct super_block *sb,
 	}
 
 	journal_dev_ro = bdev_read_only(journal->j_dev);
-	really_read_only = bdev_read_only(sb->s_bdev) | journal_dev_ro;
+	really_read_only = bdev_read_only(file_bdev(sb->s_bdev_file)) | journal_dev_ro;
 
 	if (journal_dev_ro && !sb_rdonly(sb)) {
 		ext4_msg(sb, KERN_ERR,
@@ -6116,7 +6117,7 @@ static void ext4_update_super(struct super_block *sb)
 		ext4_update_tstamp(es, s_wtime);
 	es->s_kbytes_written =
 		cpu_to_le64(sbi->s_kbytes_written +
-		    ((part_stat_read(sb->s_bdev, sectors[STAT_WRITE]) -
+		    ((part_stat_read(file_bdev(sb->s_bdev_file), sectors[STAT_WRITE]) -
 		      sbi->s_sectors_written_start) >> 1));
 	if (percpu_counter_initialized(&sbi->s_freeclusters_counter))
 		ext4_free_blocks_count_set(es,
@@ -6350,7 +6351,7 @@ static int ext4_sync_fs(struct super_block *sb, int wait)
 		needs_barrier = true;
 	if (needs_barrier) {
 		int err;
-		err = blkdev_issue_flush(sb->s_bdev);
+		err = blkdev_issue_flush(file_bdev(sb->s_bdev_file));
 		if (!ret)
 			ret = err;
 	}

-- 
2.43.0



* Re: [PATCH v2 00/34] Open block devices as files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (33 preceding siblings ...)
  2024-01-23 13:26 ` [PATCH v2 34/34] ext4: rely on sb->f_bdev only Christian Brauner
@ 2024-01-29  6:17 ` Christoph Hellwig
  2024-01-29 10:17   ` Christian Brauner
  2024-01-29 10:56 ` [PATCH RFC 0/2] fs & block: remove bd_inode Christian Brauner
                   ` (2 subsequent siblings)
  37 siblings, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29  6:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Do you have a git tree for this series? 


* Re: [PATCH v2 00/34] Open block devices as files
  2024-01-29  6:17 ` [PATCH v2 00/34] Open block devices as files Christoph Hellwig
@ 2024-01-29 10:17   ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 10:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 07:17:07AM +0100, Christoph Hellwig wrote:
> Do you have a git tree for this series? 

b4/vfs-bdev-file on vfs.git


* [PATCH RFC 0/2] fs & block: remove bd_inode
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (34 preceding siblings ...)
  2024-01-29  6:17 ` [PATCH v2 00/34] Open block devices as files Christoph Hellwig
@ 2024-01-29 10:56 ` Christian Brauner
  2024-01-29 10:56   ` [PATCH RFC 1/2] fs & block: remove bdev->bd_inode Christian Brauner
  2024-01-29 10:56   ` [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers Christian Brauner
  2024-02-05 11:55 ` [PATCH v2 00/34] Open block devices as files Christian Brauner
  2024-03-21 22:17 ` Matthew Wilcox
  37 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 10:56 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Christian Brauner, Darrick J. Wong, linux-fsdevel, linux-block

Hey Christoph,
Hey Jan,

This is an attempt to remove bdev->bd_inode and restrict direct access
to the block layer, block drivers and a few instances in fs/buffer.c
where it's needed. Suggestions to do better are welcome!

Thanks!
Christian

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Christian Brauner (2):
      fs & block: remove bdev->bd_inode
      fs,drivers: remove bdev_inode() usage outside of block layer and drivers

 block/bdev.c                          | 48 ++++++++++--------
 block/blk-zoned.c                     |  4 +-
 block/fops.c                          |  4 +-
 block/genhd.c                         |  8 +--
 block/ioctl.c                         |  8 +--
 block/partitions/core.c               | 11 ++--
 drivers/gpu/drm/drm_gem_vram_helper.c |  2 +-
 drivers/md/bcache/super.c             |  7 +--
 drivers/md/md-bitmap.c                |  2 +-
 drivers/mtd/devices/block2mtd.c       |  4 +-
 drivers/s390/block/dasd_ioctl.c       |  2 +-
 drivers/scsi/scsicam.c                |  2 +-
 fs/affs/file.c                        |  2 +-
 fs/bcachefs/util.h                    |  5 --
 fs/btrfs/dev-replace.c                |  2 +-
 fs/btrfs/disk-io.c                    | 17 ++++---
 fs/btrfs/disk-io.h                    |  4 +-
 fs/btrfs/inode.c                      |  2 +-
 fs/btrfs/super.c                      |  2 +-
 fs/btrfs/volumes.c                    | 26 +++++-----
 fs/btrfs/volumes.h                    |  2 +-
 fs/btrfs/zoned.c                      | 18 ++++---
 fs/btrfs/zoned.h                      |  4 +-
 fs/buffer.c                           | 95 +++++++++++++++++++----------------
 fs/cramfs/inode.c                     |  2 +-
 fs/direct-io.c                        |  7 +--
 fs/erofs/data.c                       |  5 +-
 fs/erofs/internal.h                   |  1 +
 fs/erofs/zmap.c                       |  2 +-
 fs/ext2/inode.c                       |  4 +-
 fs/ext2/xattr.c                       |  2 +-
 fs/ext4/inode.c                       |  8 +--
 fs/ext4/mmp.c                         |  2 +-
 fs/ext4/page-io.c                     |  4 +-
 fs/ext4/super.c                       |  7 ++-
 fs/ext4/xattr.c                       |  2 +-
 fs/f2fs/data.c                        |  7 ++-
 fs/f2fs/f2fs.h                        |  1 +
 fs/fuse/dax.c                         |  2 +-
 fs/gfs2/aops.c                        |  2 +-
 fs/gfs2/bmap.c                        |  2 +-
 fs/gfs2/glock.c                       |  2 +-
 fs/gfs2/meta_io.c                     |  2 +-
 fs/gfs2/ops_fstype.c                  |  2 +-
 fs/hpfs/file.c                        |  2 +-
 fs/iomap/buffered-io.c                |  8 +--
 fs/iomap/direct-io.c                  | 10 ++--
 fs/iomap/swapfile.c                   |  2 +-
 fs/iomap/trace.h                      |  2 +-
 fs/jbd2/commit.c                      |  2 +-
 fs/jbd2/journal.c                     | 29 ++++++-----
 fs/jbd2/recovery.c                    |  6 +--
 fs/jbd2/revoke.c                      | 10 ++--
 fs/jbd2/transaction.c                 |  8 +--
 fs/mpage.c                            | 26 ++++++----
 fs/nilfs2/btnode.c                    |  4 +-
 fs/nilfs2/gcinode.c                   |  2 +-
 fs/nilfs2/mdt.c                       |  2 +-
 fs/nilfs2/page.c                      |  4 +-
 fs/nilfs2/recovery.c                  | 20 ++++----
 fs/nilfs2/segment.c                   |  2 +-
 fs/nilfs2/the_nilfs.c                 |  1 +
 fs/nilfs2/the_nilfs.h                 |  1 +
 fs/ntfs/aops.c                        |  6 +--
 fs/ntfs/file.c                        |  2 +-
 fs/ntfs/mft.c                         |  4 +-
 fs/ntfs3/fsntfs.c                     |  8 +--
 fs/ntfs3/inode.c                      |  2 +-
 fs/ntfs3/super.c                      |  2 +-
 fs/ocfs2/aops.c                       |  2 +-
 fs/ocfs2/journal.c                    |  2 +-
 fs/reiserfs/fix_node.c                |  2 +-
 fs/reiserfs/journal.c                 | 10 ++--
 fs/reiserfs/prints.c                  |  4 +-
 fs/reiserfs/reiserfs.h                |  6 +--
 fs/reiserfs/stree.c                   |  2 +-
 fs/reiserfs/tail_conversion.c         |  2 +-
 fs/xfs/xfs_iomap.c                    |  4 +-
 fs/zonefs/file.c                      |  4 +-
 include/linux/blk_types.h             |  1 -
 include/linux/blkdev.h                |  6 ++-
 include/linux/buffer_head.h           | 68 ++++++++++++++++---------
 include/linux/fs.h                    |  4 +-
 include/linux/iomap.h                 | 13 ++++-
 include/linux/jbd2.h                  | 10 ++--
 include/trace/events/block.h          |  2 +-
 86 files changed, 363 insertions(+), 291 deletions(-)
---
base-commit: 0bd1bf95a554f5f877724c27dbe33d4db0af4d0c
change-id: 20240129-vfs-bdev-file-bd_inode-385a56c57a51



* [PATCH RFC 1/2] fs & block: remove bdev->bd_inode
  2024-01-29 10:56 ` [PATCH RFC 0/2] fs & block: remove bd_inode Christian Brauner
@ 2024-01-29 10:56   ` Christian Brauner
  2024-02-20 11:57     ` Yu Kuai
  2024-01-29 10:56   ` [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 10:56 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Christian Brauner, Darrick J. Wong, linux-fsdevel, linux-block

In prior patches we introduced the ability to open block devices as
files and made all filesystems stash the opened block device files. With
this patch we remove bdev->bd_inode from struct block_device.

Using files allows us to stop passing struct block_device directly to
almost all buffer_head helpers. Whenever access to the inode of the
block device is needed bdev_file_inode(bdev_file) can be used instead of
bdev->bd_inode.

The only user that doesn't rely on files is the block layer itself in
block/fops.c where we only have access to the block device, since the
bdev filesystem obviously doesn't open block devices as files.

This introduces a union into struct buffer_head and struct iomap. The
union encompasses both struct block_device and struct file. In both
cases a flag is used to differentiate whether a block device or a proper
file was stashed. Simple accessors bh_bdev() and iomap_bdev() are used
to return the block device in the really low-level functions where it's
needed. These are overall just a few callsites.
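
A minimal userspace sketch of the "union plus flag plus accessor" idea,
to show the shape only. The field and flag names below (has_file, u,
model_*) are hypothetical; the real struct buffer_head and struct iomap
layouts in the patch may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_block_device {
	int devno;
};

struct model_file {
	struct model_block_device *bdev;  /* what file_bdev() would return */
};

struct model_buffer_head {
	bool has_file;		/* hypothetical flag: which member is valid */
	union {
		struct model_block_device *bdev;
		struct model_file *bdev_file;
	} u;
};

/* Accessor in the spirit of bh_bdev(): callers never touch the union. */
static struct model_block_device *
model_bh_bdev(const struct model_buffer_head *bh)
{
	if (bh->has_file)
		return bh->u.bdev_file->bdev;
	return bh->u.bdev;
}

/* Both representations resolve to the same block device. */
static void model_bh_check(void)
{
	struct model_block_device dev = { .devno = 8 };
	struct model_file file = { .bdev = &dev };
	struct model_buffer_head a = { .has_file = false, .u = { .bdev = &dev } };
	struct model_buffer_head b = { .has_file = true, .u = { .bdev_file = &file } };

	assert(model_bh_bdev(&a) == model_bh_bdev(&b));
}
```

Keeping the flag check inside one accessor means low-level callers
don't need to know which of the two was stashed.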

The block layer itself continues to get direct access to the block
device inode from the block device. But afaict, it isn't necessary to
have bdev->bd_inode for this. Instead, the few places that need it can
use the bdev_inode() helper which returns the vfs_inode stashed in
struct bdev_inode.
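
The container_of() relationship this relies on can be tried out in
userspace. The sketch below assumes the block device and its VFS inode
are embedded side by side in one structure, as the bdev_inode() and
I_BDEV() helpers in the diff imply; the model_* types and the local
container_of replacement are stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of without the kernel's type checking. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_block_device {
	int bd_dev;
};

struct model_inode {
	unsigned long i_ino;
};

/* Device and inode live in one allocation, so either finds the other. */
struct model_bdev_inode {
	struct model_block_device bdev;
	struct model_inode vfs_inode;
};

/* Counterpart of bdev_inode(): device to inode, no back pointer needed. */
static struct model_inode *model_bdev_inode(struct model_block_device *bdev)
{
	return &model_container_of(bdev, struct model_bdev_inode,
				   bdev)->vfs_inode;
}

/* Counterpart of I_BDEV(): inode to device. */
static struct model_block_device *model_I_BDEV(struct model_inode *inode)
{
	return &model_container_of(inode, struct model_bdev_inode,
				   vfs_inode)->bdev;
}

/* Round trip: device -> inode -> device lands on the same object. */
static void model_round_trip_check(void)
{
	struct model_bdev_inode bi = { .bdev = { .bd_dev = 1 } };

	assert(model_I_BDEV(model_bdev_inode(&bi.bdev)) == &bi.bdev);
}
```

Since the inode is reachable by pointer arithmetic alone, a stored
bd_inode pointer in the block device is redundant in this model.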

That helper is currently exposed to a few users outside of the block
layer and block drivers, such as btrfs. The next patch fixes this.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c                          | 48 ++++++++++--------
 block/blk-zoned.c                     |  4 +-
 block/fops.c                          |  4 +-
 block/genhd.c                         |  8 +--
 block/ioctl.c                         |  8 +--
 block/partitions/core.c               | 11 ++--
 drivers/gpu/drm/drm_gem_vram_helper.c |  2 +-
 drivers/md/bcache/super.c             |  2 +-
 drivers/md/md-bitmap.c                |  2 +-
 drivers/mtd/devices/block2mtd.c       |  4 +-
 drivers/s390/block/dasd_ioctl.c       |  2 +-
 drivers/scsi/scsicam.c                |  2 +-
 fs/affs/file.c                        |  2 +-
 fs/bcachefs/util.h                    |  2 +-
 fs/btrfs/disk-io.c                    |  6 +--
 fs/btrfs/inode.c                      |  2 +-
 fs/btrfs/volumes.c                    |  2 +-
 fs/btrfs/zoned.c                      |  2 +-
 fs/buffer.c                           | 95 +++++++++++++++++++----------------
 fs/cramfs/inode.c                     |  2 +-
 fs/direct-io.c                        |  7 +--
 fs/erofs/data.c                       |  5 +-
 fs/erofs/internal.h                   |  1 +
 fs/erofs/zmap.c                       |  2 +-
 fs/ext2/inode.c                       |  4 +-
 fs/ext2/xattr.c                       |  2 +-
 fs/ext4/inode.c                       |  8 +--
 fs/ext4/mmp.c                         |  2 +-
 fs/ext4/page-io.c                     |  4 +-
 fs/ext4/super.c                       |  7 ++-
 fs/ext4/xattr.c                       |  2 +-
 fs/f2fs/data.c                        |  7 ++-
 fs/f2fs/f2fs.h                        |  1 +
 fs/fuse/dax.c                         |  2 +-
 fs/gfs2/aops.c                        |  2 +-
 fs/gfs2/bmap.c                        |  2 +-
 fs/gfs2/glock.c                       |  2 +-
 fs/gfs2/meta_io.c                     |  2 +-
 fs/gfs2/ops_fstype.c                  |  2 +-
 fs/hpfs/file.c                        |  2 +-
 fs/iomap/buffered-io.c                |  8 +--
 fs/iomap/direct-io.c                  | 10 ++--
 fs/iomap/swapfile.c                   |  2 +-
 fs/iomap/trace.h                      |  2 +-
 fs/jbd2/commit.c                      |  2 +-
 fs/jbd2/journal.c                     | 29 ++++++-----
 fs/jbd2/recovery.c                    |  6 +--
 fs/jbd2/revoke.c                      | 10 ++--
 fs/jbd2/transaction.c                 |  8 +--
 fs/mpage.c                            | 26 ++++++----
 fs/nilfs2/btnode.c                    |  4 +-
 fs/nilfs2/gcinode.c                   |  2 +-
 fs/nilfs2/mdt.c                       |  2 +-
 fs/nilfs2/page.c                      |  4 +-
 fs/nilfs2/recovery.c                  | 20 ++++----
 fs/nilfs2/segment.c                   |  2 +-
 fs/nilfs2/the_nilfs.c                 |  1 +
 fs/nilfs2/the_nilfs.h                 |  1 +
 fs/ntfs/aops.c                        |  6 +--
 fs/ntfs/file.c                        |  2 +-
 fs/ntfs/mft.c                         |  4 +-
 fs/ntfs3/fsntfs.c                     |  8 +--
 fs/ntfs3/inode.c                      |  2 +-
 fs/ntfs3/super.c                      |  2 +-
 fs/ocfs2/aops.c                       |  2 +-
 fs/ocfs2/journal.c                    |  2 +-
 fs/reiserfs/fix_node.c                |  2 +-
 fs/reiserfs/journal.c                 | 10 ++--
 fs/reiserfs/prints.c                  |  4 +-
 fs/reiserfs/reiserfs.h                |  6 +--
 fs/reiserfs/stree.c                   |  2 +-
 fs/reiserfs/tail_conversion.c         |  2 +-
 fs/xfs/xfs_iomap.c                    |  4 +-
 fs/zonefs/file.c                      |  4 +-
 include/linux/blk_types.h             |  1 -
 include/linux/blkdev.h                |  6 ++-
 include/linux/buffer_head.h           | 68 ++++++++++++++++---------
 include/linux/fs.h                    |  4 +-
 include/linux/iomap.h                 | 13 ++++-
 include/linux/jbd2.h                  | 10 ++--
 include/trace/events/block.h          |  2 +-
 81 files changed, 326 insertions(+), 255 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 185c43ebeea5..131cd1a7a877 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -43,6 +43,12 @@ static inline struct bdev_inode *BDEV_I(struct inode *inode)
 	return container_of(inode, struct bdev_inode, vfs_inode);
 }
 
+struct inode *bdev_inode(struct block_device *bdev)
+{
+	return &container_of(bdev, struct bdev_inode, bdev)->vfs_inode;
+}
+EXPORT_SYMBOL_GPL(bdev_inode);
+
 struct block_device *I_BDEV(struct inode *inode)
 {
 	return &BDEV_I(inode)->bdev;
@@ -57,7 +63,7 @@ EXPORT_SYMBOL(file_bdev);
 
 static void bdev_write_inode(struct block_device *bdev)
 {
-	struct inode *inode = bdev->bd_inode;
+	struct inode *inode = bdev_inode(bdev);
 	int ret;
 
 	spin_lock(&inode->i_lock);
@@ -76,7 +82,7 @@ static void bdev_write_inode(struct block_device *bdev)
 /* Kill _all_ buffers and pagecache , dirty or not.. */
 static void kill_bdev(struct block_device *bdev)
 {
-	struct address_space *mapping = bdev->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(bdev)->i_mapping;
 
 	if (mapping_empty(mapping))
 		return;
@@ -88,7 +94,7 @@ static void kill_bdev(struct block_device *bdev)
 /* Invalidate clean unused buffers and pagecache. */
 void invalidate_bdev(struct block_device *bdev)
 {
-	struct address_space *mapping = bdev->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(bdev)->i_mapping;
 
 	if (mapping->nrpages) {
 		invalidate_bh_lrus();
@@ -116,7 +122,7 @@ int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
 			goto invalidate;
 	}
 
-	truncate_inode_pages_range(bdev->bd_inode->i_mapping, lstart, lend);
+	truncate_inode_pages_range(bdev_inode(bdev)->i_mapping, lstart, lend);
 	if (!(mode & BLK_OPEN_EXCL))
 		bd_abort_claiming(bdev, truncate_bdev_range);
 	return 0;
@@ -126,7 +132,7 @@ int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
 	 * Someone else has handle exclusively open. Try invalidating instead.
 	 * The 'end' argument is inclusive so the rounding is safe.
 	 */
-	return invalidate_inode_pages2_range(bdev->bd_inode->i_mapping,
+	return invalidate_inode_pages2_range(bdev_inode(bdev)->i_mapping,
 					     lstart >> PAGE_SHIFT,
 					     lend >> PAGE_SHIFT);
 }
@@ -134,14 +140,14 @@ int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
 static void set_init_blocksize(struct block_device *bdev)
 {
 	unsigned int bsize = bdev_logical_block_size(bdev);
-	loff_t size = i_size_read(bdev->bd_inode);
+	loff_t size = i_size_read(bdev_inode(bdev));
 
 	while (bsize < PAGE_SIZE) {
 		if (size & bsize)
 			break;
 		bsize <<= 1;
 	}
-	bdev->bd_inode->i_blkbits = blksize_bits(bsize);
+	bdev_inode(bdev)->i_blkbits = blksize_bits(bsize);
 }
 
 int set_blocksize(struct block_device *bdev, int size)
@@ -155,9 +161,9 @@ int set_blocksize(struct block_device *bdev, int size)
 		return -EINVAL;
 
 	/* Don't change the size if it is same as current */
-	if (bdev->bd_inode->i_blkbits != blksize_bits(size)) {
+	if (bdev_inode(bdev)->i_blkbits != blksize_bits(size)) {
 		sync_blockdev(bdev);
-		bdev->bd_inode->i_blkbits = blksize_bits(size);
+		bdev_inode(bdev)->i_blkbits = blksize_bits(size);
 		kill_bdev(bdev);
 	}
 	return 0;
@@ -192,7 +198,7 @@ int sync_blockdev_nowait(struct block_device *bdev)
 {
 	if (!bdev)
 		return 0;
-	return filemap_flush(bdev->bd_inode->i_mapping);
+	return filemap_flush(bdev_inode(bdev)->i_mapping);
 }
 EXPORT_SYMBOL_GPL(sync_blockdev_nowait);
 
@@ -204,13 +210,13 @@ int sync_blockdev(struct block_device *bdev)
 {
 	if (!bdev)
 		return 0;
-	return filemap_write_and_wait(bdev->bd_inode->i_mapping);
+	return filemap_write_and_wait(bdev_inode(bdev)->i_mapping);
 }
 EXPORT_SYMBOL(sync_blockdev);
 
 int sync_blockdev_range(struct block_device *bdev, loff_t lstart, loff_t lend)
 {
-	return filemap_write_and_wait_range(bdev->bd_inode->i_mapping,
+	return filemap_write_and_wait_range(bdev_inode(bdev)->i_mapping,
 			lstart, lend);
 }
 EXPORT_SYMBOL(sync_blockdev_range);
@@ -412,7 +418,6 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 	spin_lock_init(&bdev->bd_size_lock);
 	mutex_init(&bdev->bd_holder_lock);
 	bdev->bd_partno = partno;
-	bdev->bd_inode = inode;
 	bdev->bd_queue = disk->queue;
 	if (partno)
 		bdev->bd_has_submit_bio = disk->part0->bd_has_submit_bio;
@@ -430,19 +435,20 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
 void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 {
 	spin_lock(&bdev->bd_size_lock);
-	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
+	i_size_write(bdev_inode(bdev), (loff_t)sectors << SECTOR_SHIFT);
 	bdev->bd_nr_sectors = sectors;
 	spin_unlock(&bdev->bd_size_lock);
 }
 
 void bdev_add(struct block_device *bdev, dev_t dev)
 {
+	struct inode *inode = bdev_inode(bdev);
 	if (bdev_stable_writes(bdev))
-		mapping_set_stable_writes(bdev->bd_inode->i_mapping);
+		mapping_set_stable_writes(inode->i_mapping);
 	bdev->bd_dev = dev;
-	bdev->bd_inode->i_rdev = dev;
-	bdev->bd_inode->i_ino = dev;
-	insert_inode_hash(bdev->bd_inode);
+	inode->i_rdev = dev;
+	inode->i_ino = dev;
+	insert_inode_hash(inode);
 }
 
 long nr_blockdev_pages(void)
@@ -901,7 +907,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
 	if (bdev_nowait(bdev))
 		bdev_file->f_mode |= FMODE_NOWAIT;
-	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
+	bdev_file->f_mapping = bdev_inode(bdev)->i_mapping;
 	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
 	bdev_file->private_data = holder;
 
@@ -971,7 +977,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		blk_fops = &def_blk_fops_restricted;
 	else
 		blk_fops = &def_blk_fops;
-	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
+	bdev_file = alloc_file_pseudo_noaccount(bdev_inode(bdev),
 			blockdev_mnt, "", flags | O_LARGEFILE, blk_fops);
 	if (IS_ERR(bdev_file)) {
 		blkdev_put_no_open(bdev);
@@ -979,7 +985,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 	}
 	bdev_file->f_mode &= ~FMODE_OPENED;
 
-	ihold(bdev->bd_inode);
+	ihold(bdev_inode(bdev));
 	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
 	if (ret) {
 		fput(bdev_file);
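
For readers following the block/bdev.c hunks: bdev_inode() works because
struct block_device and its inode are both embedded in struct bdev_inode, so
container_of() can step from one member to the other without the bd_inode
back-pointer. Below is a minimal userspace sketch of that relationship. The
struct bodies are mock stand-ins (only the embedding matters),
check_round_trip() is a hypothetical helper, and the container_of() macro is
a simplified version of the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock types: the real structures have many more fields. */
struct block_device { int bd_partno; };
struct inode { unsigned long i_ino; };

/* Both objects live side by side in one allocation. */
struct bdev_inode {
	struct block_device bdev;
	struct inode vfs_inode;
};

/* Same shape as the helper added in the hunk above. */
static struct inode *bdev_inode(struct block_device *bdev)
{
	return &container_of(bdev, struct bdev_inode, bdev)->vfs_inode;
}

/* The inverse direction, as I_BDEV() does via BDEV_I(). */
static struct block_device *I_BDEV(struct inode *inode)
{
	return &container_of(inode, struct bdev_inode, vfs_inode)->bdev;
}

/* Hypothetical self-check: both directions land on the sibling member. */
static void check_round_trip(void)
{
	struct bdev_inode bi = { .bdev = { .bd_partno = 1 },
				 .vfs_inode = { .i_ino = 42 } };

	assert(bdev_inode(&bi.bdev) == &bi.vfs_inode);
	assert(I_BDEV(&bi.vfs_inode) == &bi.bdev);
}
```

Since the offset is a compile-time constant, the accessor costs one pointer
adjustment and needs no stored pointer.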
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index d343e5756a9c..232003d70318 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -401,7 +401,7 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
 		op = REQ_OP_ZONE_RESET;
 
 		/* Invalidate the page cache, including dirty pages. */
-		filemap_invalidate_lock(bdev->bd_inode->i_mapping);
+		filemap_invalidate_lock(bdev_inode(bdev)->i_mapping);
 		ret = blkdev_truncate_zone_range(bdev, mode, &zrange);
 		if (ret)
 			goto fail;
@@ -424,7 +424,7 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
 
 fail:
 	if (cmd == BLKRESETZONE)
-		filemap_invalidate_unlock(bdev->bd_inode->i_mapping);
+		filemap_invalidate_unlock(bdev_inode(bdev)->i_mapping);
 
 	return ret;
 }
diff --git a/block/fops.c b/block/fops.c
index 4e65a7ce965e..1557c7bfcf1f 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -381,6 +381,7 @@ static int blkdev_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	loff_t isize = i_size_read(inode);
 
 	iomap->bdev = bdev;
+	iomap->flags |= IOMAP_F_BDEV;
 	iomap->offset = ALIGN_DOWN(offset, bdev_logical_block_size(bdev));
 	if (iomap->offset >= isize)
 		return -EIO;
@@ -402,6 +403,7 @@ static int blkdev_get_block(struct inode *inode, sector_t iblock,
 	bh->b_bdev = I_BDEV(inode);
 	bh->b_blocknr = iblock;
 	set_buffer_mapped(bh);
+	set_buffer_bdev(bh);
 	return 0;
 }
 
@@ -665,7 +667,7 @@ static ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
 	struct block_device *bdev = I_BDEV(file->f_mapping->host);
-	struct inode *bd_inode = bdev->bd_inode;
+	struct inode *bd_inode = bdev_inode(bdev);
 	loff_t size = bdev_nr_bytes(bdev);
 	size_t shorted = 0;
 	ssize_t ret;
diff --git a/block/genhd.c b/block/genhd.c
index a911d2969c07..8e64cc5172c5 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -656,7 +656,7 @@ void del_gendisk(struct gendisk *disk)
 	 */
 	mutex_lock(&disk->open_mutex);
 	xa_for_each(&disk->part_tbl, idx, part)
-		remove_inode_hash(part->bd_inode);
+		remove_inode_hash(bdev_inode(part));
 	mutex_unlock(&disk->open_mutex);
 
 	/*
@@ -745,7 +745,7 @@ void invalidate_disk(struct gendisk *disk)
 	struct block_device *bdev = disk->part0;
 
 	invalidate_bdev(bdev);
-	bdev->bd_inode->i_mapping->wb_err = 0;
+	bdev_inode(bdev)->i_mapping->wb_err = 0;
 	set_capacity(disk, 0);
 }
 EXPORT_SYMBOL(invalidate_disk);
@@ -1191,7 +1191,7 @@ static void disk_release(struct device *dev)
 	if (test_bit(GD_ADDED, &disk->state) && disk->fops->free_disk)
 		disk->fops->free_disk(disk);
 
-	iput(disk->part0->bd_inode);	/* frees the disk */
+	iput(bdev_inode(disk->part0));	/* frees the disk */
 }
 
 static int block_uevent(const struct device *dev, struct kobj_uevent_env *env)
@@ -1381,7 +1381,7 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 out_destroy_part_tbl:
 	xa_destroy(&disk->part_tbl);
 	disk->part0->bd_disk = NULL;
-	iput(disk->part0->bd_inode);
+	iput(bdev_inode(disk->part0));
 out_free_bdi:
 	bdi_put(disk->bdi);
 out_free_bioset:
diff --git a/block/ioctl.c b/block/ioctl.c
index 5d0619e02e4c..376339e0db6c 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -92,7 +92,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 {
 	uint64_t range[2];
 	uint64_t start, len;
-	struct inode *inode = bdev->bd_inode;
+	struct inode *inode = bdev_inode(bdev);
 	int err;
 
 	if (!(mode & BLK_OPEN_WRITE))
@@ -146,12 +146,12 @@ static int blk_ioctl_secure_erase(struct block_device *bdev, blk_mode_t mode,
 	if (start + len > bdev_nr_bytes(bdev))
 		return -EINVAL;
 
-	filemap_invalidate_lock(bdev->bd_inode->i_mapping);
+	filemap_invalidate_lock(bdev_inode(bdev)->i_mapping);
 	err = truncate_bdev_range(bdev, mode, start, start + len - 1);
 	if (!err)
 		err = blkdev_issue_secure_erase(bdev, start >> 9, len >> 9,
 						GFP_KERNEL);
-	filemap_invalidate_unlock(bdev->bd_inode->i_mapping);
+	filemap_invalidate_unlock(bdev_inode(bdev)->i_mapping);
 	return err;
 }
 
@@ -161,7 +161,7 @@ static int blk_ioctl_zeroout(struct block_device *bdev, blk_mode_t mode,
 {
 	uint64_t range[2];
 	uint64_t start, end, len;
-	struct inode *inode = bdev->bd_inode;
+	struct inode *inode = bdev_inode(bdev);
 	int err;
 
 	if (!(mode & BLK_OPEN_WRITE))
diff --git a/block/partitions/core.c b/block/partitions/core.c
index cab0d76a828e..2808c5a2f19e 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -242,8 +242,9 @@ static const struct attribute_group *part_attr_groups[] = {
 
 static void part_release(struct device *dev)
 {
-	put_disk(dev_to_bdev(dev)->bd_disk);
-	iput(dev_to_bdev(dev)->bd_inode);
+	struct block_device *bdev = dev_to_bdev(dev);
+	put_disk(bdev->bd_disk);
+	iput(bdev_inode(bdev));
 }
 
 static int part_uevent(const struct device *dev, struct kobj_uevent_env *env)
@@ -475,7 +476,7 @@ int bdev_del_partition(struct gendisk *disk, int partno)
 	 * Just delete the partition and invalidate it.
 	 */
 
-	remove_inode_hash(part->bd_inode);
+	remove_inode_hash(bdev_inode(part));
 	invalidate_bdev(part);
 	drop_partition(part);
 	ret = 0;
@@ -661,7 +662,7 @@ int bdev_disk_changed(struct gendisk *disk, bool invalidate)
 		 * it cannot be looked up any more even when openers
 		 * still hold references.
 		 */
-		remove_inode_hash(part->bd_inode);
+		remove_inode_hash(bdev_inode(part));
 
 		/*
 		 * If @disk->open_partitions isn't elevated but there's
@@ -710,7 +711,7 @@ EXPORT_SYMBOL_GPL(bdev_disk_changed);
 
 void *read_part_sector(struct parsed_partitions *state, sector_t n, Sector *p)
 {
-	struct address_space *mapping = state->disk->part0->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(state->disk->part0)->i_mapping;
 	struct folio *folio;
 
 	if (n >= get_capacity(state->disk)) {
diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index b67eafa55715..ce9c2d51f1f6 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -935,7 +935,7 @@ static int bo_driver_move(struct ttm_buffer_object *bo,
 static int bo_driver_io_mem_reserve(struct ttm_device *bdev,
 				    struct ttm_resource *mem)
 {
-	struct drm_vram_mm *vmm = drm_vram_mm_of_bdev(bdev);
+	struct drm_vram_mm *vmm = drm_vram_mm_obdev_file(bdev);
 
 	switch (mem->mem_type) {
 	case TTM_PL_SYSTEM:	/* nothing to do */
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index d00b3abab133..8971e769d5e7 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -171,7 +171,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 	struct page *page;
 	unsigned int i;
 
-	page = read_cache_page_gfp(bdev->bd_inode->i_mapping,
+	page = read_cache_page_gfp(bdev_inode(bdev)->i_mapping,
 				   SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
 	if (IS_ERR(page))
 		return "IO error";
diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 9672f75c3050..689f5f543520 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -380,7 +380,7 @@ static int read_file_page(struct file *file, unsigned long index,
 			}
 
 			bh->b_blocknr = block;
-			bh->b_bdev = inode->i_sb->s_bdev;
+			bh->b_bdev_file = inode->i_sb->s_bdev_file;
 			if (count < blocksize)
 				count = 0;
 			else
diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index 97a00ec9a4d4..dc3df3a600cf 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -291,7 +291,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 		goto err_free_block2mtd;
 	}
 
-	if ((long)bdev->bd_inode->i_size % erase_size) {
+	if ((long)bdev_inode(bdev)->i_size % erase_size) {
 		pr_err("erasesize must be a divisor of device size\n");
 		goto err_free_block2mtd;
 	}
@@ -309,7 +309,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 
 	dev->mtd.name = name;
 
-	dev->mtd.size = bdev->bd_inode->i_size & PAGE_MASK;
+	dev->mtd.size = bdev_inode(bdev)->i_size & PAGE_MASK;
 	dev->mtd.erasesize = erase_size;
 	dev->mtd.writesize = 1;
 	dev->mtd.writebufsize = PAGE_SIZE;
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index de85a5e4e21b..c6295ef35437 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -221,7 +221,7 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata)
 	 * enabling the device later.
 	 */
 	if (fdata->start_unit == 0) {
-		block->gdp->part0->bd_inode->i_blkbits =
+		bdev_inode(block->gdp->part0)->i_blkbits =
 			blksize_bits(fdata->blksize);
 	}
 
diff --git a/drivers/scsi/scsicam.c b/drivers/scsi/scsicam.c
index e2c7d8ef205f..de40a5ef7d96 100644
--- a/drivers/scsi/scsicam.c
+++ b/drivers/scsi/scsicam.c
@@ -32,7 +32,7 @@
  */
 unsigned char *scsi_bios_ptable(struct block_device *dev)
 {
-	struct address_space *mapping = bdev_whole(dev)->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(bdev_whole(dev))->i_mapping;
 	unsigned char *res = NULL;
 	struct folio *folio;
 
diff --git a/fs/affs/file.c b/fs/affs/file.c
index 04c018e19602..c0583831c58f 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -365,7 +365,7 @@ affs_get_block(struct inode *inode, sector_t block, struct buffer_head *bh_resul
 err_alloc:
 	brelse(ext_bh);
 	clear_buffer_mapped(bh_result);
-	bh_result->b_bdev = NULL;
+	bh_result->b_bdev_file = NULL;
 	// unlock cache
 	affs_unlock_ext(inode);
 	return -ENOSPC;
diff --git a/fs/bcachefs/util.h b/fs/bcachefs/util.h
index df67bf55fe2b..5ab765d056d6 100644
--- a/fs/bcachefs/util.h
+++ b/fs/bcachefs/util.h
@@ -554,7 +554,7 @@ int bch2_bio_alloc_pages(struct bio *, size_t, gfp_t);
 
 static inline sector_t bdev_sectors(struct block_device *bdev)
 {
-	return bdev->bd_inode->i_size >> 9;
+	return bdev_inode(bdev)->i_size >> 9;
 }
 
 #define closure_bio_submit(bio, cl)					\
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c6907d533fe8..7d5d022b0bde 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3639,7 +3639,7 @@ struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
 	struct btrfs_super_block *super;
 	struct page *page;
 	u64 bytenr, bytenr_orig;
-	struct address_space *mapping = bdev->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(bdev)->i_mapping;
 	int ret;
 
 	bytenr_orig = btrfs_sb_offset(copy_num);
@@ -3726,7 +3726,7 @@ static int write_dev_supers(struct btrfs_device *device,
 			    struct btrfs_super_block *sb, int max_mirrors)
 {
 	struct btrfs_fs_info *fs_info = device->fs_info;
-	struct address_space *mapping = device->bdev->bd_inode->i_mapping;
+	struct address_space *mapping = bdev_inode(device->bdev)->i_mapping;
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
 	int i;
 	int errors = 0;
@@ -3843,7 +3843,7 @@ static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
 		    device->commit_total_bytes)
 			break;
 
-		page = find_get_page(device->bdev->bd_inode->i_mapping,
+		page = find_get_page(bdev_inode(device->bdev)->i_mapping,
 				     bytenr >> PAGE_SHIFT);
 		if (!page) {
 			errors++;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 809b11472a80..449922af0a18 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7660,7 +7660,7 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
 		iomap->type = IOMAP_MAPPED;
 	}
 	iomap->offset = start;
-	iomap->bdev = fs_info->fs_devices->latest_dev->bdev;
+	iomap->bdev_file = fs_info->fs_devices->latest_dev->bdev_file;
 	iomap->length = len;
 	free_extent_map(em);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 769a1dc4b756..1f12122ae7ce 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1268,7 +1268,7 @@ static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev
 		return ERR_PTR(-EINVAL);
 
 	/* pull in the page with our super */
-	page = read_cache_page_gfp(bdev->bd_inode->i_mapping, index, GFP_KERNEL);
+	page = read_cache_page_gfp(bdev_inode(bdev)->i_mapping, index, GFP_KERNEL);
 
 	if (IS_ERR(page))
 		return ERR_CAST(page);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 5bd76813b23f..42893771532f 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -120,7 +120,7 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
 		return -ENOENT;
 	} else if (full[0] && full[1]) {
 		/* Compare two super blocks */
-		struct address_space *mapping = bdev->bd_inode->i_mapping;
+		struct address_space *mapping = bdev_inode(bdev)->i_mapping;
 		struct page *page[BTRFS_NR_SB_LOG_ZONES];
 		struct btrfs_super_block *super[BTRFS_NR_SB_LOG_ZONES];
 		int i;
diff --git a/fs/buffer.c b/fs/buffer.c
index d3bcf601d3e5..3f3677668a80 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -129,7 +129,7 @@ static void buffer_io_error(struct buffer_head *bh, char *msg)
 	if (!test_bit(BH_Quiet, &bh->b_state))
 		printk_ratelimited(KERN_ERR
 			"Buffer I/O error on dev %pg, logical block %llu%s\n",
-			bh->b_bdev, (unsigned long long)bh->b_blocknr, msg);
+			bh_bdev(bh), (unsigned long long)bh->b_blocknr, msg);
 }
 
 /*
@@ -187,9 +187,9 @@ EXPORT_SYMBOL(end_buffer_write_sync);
  * succeeds, there is no need to take i_private_lock.
  */
 static struct buffer_head *
-__find_get_block_slow(struct block_device *bdev, sector_t block)
+__find_get_block_slow(struct file *bdev_file, sector_t block)
 {
-	struct inode *bd_inode = bdev->bd_inode;
+	struct inode *bd_inode = bdev_file_inode(bdev_file);
 	struct address_space *bd_mapping = bd_inode->i_mapping;
 	struct buffer_head *ret = NULL;
 	pgoff_t index;
@@ -232,7 +232,7 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 		       "device %pg blocksize: %d\n",
 		       (unsigned long long)block,
 		       (unsigned long long)bh->b_blocknr,
-		       bh->b_state, bh->b_size, bdev,
+		       bh->b_state, bh->b_size, file_bdev(bdev_file),
 		       1 << bd_inode->i_blkbits);
 	}
 out_unlock:
@@ -655,10 +655,10 @@ EXPORT_SYMBOL(generic_buffers_fsync);
  * `bblock + 1' is probably a dirty indirect block.  Hunt it down and, if it's
  * dirty, schedule it for IO.  So that indirects merge nicely with their data.
  */
-void write_boundary_block(struct block_device *bdev,
+void write_boundary_block(struct file *bdev_file,
 			sector_t bblock, unsigned blocksize)
 {
-	struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize);
+	struct buffer_head *bh = __find_get_block(bdev_file, bblock + 1, blocksize);
 	if (bh) {
 		if (buffer_dirty(bh))
 			write_dirty_buffer(bh, 0);
@@ -994,8 +994,9 @@ static sector_t blkdev_max_block(struct block_device *bdev, unsigned int size)
  * Initialise the state of a blockdev folio's buffers.
  */ 
 static sector_t folio_init_buffers(struct folio *folio,
-		struct block_device *bdev, unsigned size)
+		struct file *bdev_file, unsigned size)
 {
+	struct block_device *bdev = file_bdev(bdev_file);
 	struct buffer_head *head = folio_buffers(folio);
 	struct buffer_head *bh = head;
 	bool uptodate = folio_test_uptodate(folio);
@@ -1006,7 +1007,7 @@ static sector_t folio_init_buffers(struct folio *folio,
 		if (!buffer_mapped(bh)) {
 			bh->b_end_io = NULL;
 			bh->b_private = NULL;
-			bh->b_bdev = bdev;
+			bh->b_bdev_file = bdev_file;
 			bh->b_blocknr = block;
 			if (uptodate)
 				set_buffer_uptodate(bh);
@@ -1031,10 +1032,10 @@ static sector_t folio_init_buffers(struct folio *folio,
  * Returns false if we have a failure which cannot be cured by retrying
  * without sleeping.  Returns true if we succeeded, or the caller should retry.
  */
-static bool grow_dev_folio(struct block_device *bdev, sector_t block,
+static bool grow_dev_folio(struct file *bdev_file, sector_t block,
 		pgoff_t index, unsigned size, gfp_t gfp)
 {
-	struct inode *inode = bdev->bd_inode;
+	struct inode *inode = bdev_file_inode(bdev_file);
 	struct folio *folio;
 	struct buffer_head *bh;
 	sector_t end_block = 0;
@@ -1047,7 +1048,7 @@ static bool grow_dev_folio(struct block_device *bdev, sector_t block,
 	bh = folio_buffers(folio);
 	if (bh) {
 		if (bh->b_size == size) {
-			end_block = folio_init_buffers(folio, bdev, size);
+			end_block = folio_init_buffers(folio, bdev_file, size);
 			goto unlock;
 		}
 
@@ -1075,7 +1076,7 @@ static bool grow_dev_folio(struct block_device *bdev, sector_t block,
 	 */
 	spin_lock(&inode->i_mapping->i_private_lock);
 	link_dev_buffers(folio, bh);
-	end_block = folio_init_buffers(folio, bdev, size);
+	end_block = folio_init_buffers(folio, bdev_file, size);
 	spin_unlock(&inode->i_mapping->i_private_lock);
 unlock:
 	folio_unlock(folio);
@@ -1088,7 +1089,8 @@ static bool grow_dev_folio(struct block_device *bdev, sector_t block,
  * that folio was dirty, the buffers are set dirty also.  Returns false
  * if we've hit a permanent error.
  */
-static bool grow_buffers(struct block_device *bdev, sector_t block,
+
+static bool grow_buffers(struct file *bdev_file, sector_t block,
 		unsigned size, gfp_t gfp)
 {
 	loff_t pos;
@@ -1100,25 +1102,25 @@ static bool grow_buffers(struct block_device *bdev, sector_t block,
 	if (check_mul_overflow(block, (sector_t)size, &pos) || pos > MAX_LFS_FILESIZE) {
 		printk(KERN_ERR "%s: requested out-of-range block %llu for device %pg\n",
 			__func__, (unsigned long long)block,
-			bdev);
+			file_bdev(bdev_file));
 		return false;
 	}
 
 	/* Create a folio with the proper size buffers */
-	return grow_dev_folio(bdev, block, pos / PAGE_SIZE, size, gfp);
+	return grow_dev_folio(bdev_file, block, pos / PAGE_SIZE, size, gfp);
 }
 
 static struct buffer_head *
-__getblk_slow(struct block_device *bdev, sector_t block,
+__getblk_slow(struct file *bdev_file, sector_t block,
 	     unsigned size, gfp_t gfp)
 {
 	/* Size must be multiple of hard sectorsize */
-	if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
+	if (unlikely(size & (bdev_logical_block_size(file_bdev(bdev_file))-1) ||
 			(size < 512 || size > PAGE_SIZE))) {
 		printk(KERN_ERR "getblk(): invalid block size %d requested\n",
 					size);
 		printk(KERN_ERR "logical block size: %d\n",
-					bdev_logical_block_size(bdev));
+					bdev_logical_block_size(file_bdev(bdev_file)));
 
 		dump_stack();
 		return NULL;
@@ -1127,11 +1129,11 @@ __getblk_slow(struct block_device *bdev, sector_t block,
 	for (;;) {
 		struct buffer_head *bh;
 
-		bh = __find_get_block(bdev, block, size);
+		bh = __find_get_block(bdev_file, block, size);
 		if (bh)
 			return bh;
 
-		if (!grow_buffers(bdev, block, size, gfp))
+		if (!grow_buffers(bdev_file, block, size, gfp))
 			return NULL;
 	}
 }
@@ -1367,7 +1369,7 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
 	for (i = 0; i < BH_LRU_SIZE; i++) {
 		struct buffer_head *bh = __this_cpu_read(bh_lrus.bhs[i]);
 
-		if (bh && bh->b_blocknr == block && bh->b_bdev == bdev &&
+		if (bh && bh->b_blocknr == block && bh_bdev(bh) == bdev &&
 		    bh->b_size == size) {
 			if (i) {
 				while (i) {
@@ -1392,13 +1394,13 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
  * NULL
  */
 struct buffer_head *
-__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
+__find_get_block(struct file *bdev_file, sector_t block, unsigned size)
 {
-	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);
+	struct buffer_head *bh = lookup_bh_lru(file_bdev(bdev_file), block, size);
 
 	if (bh == NULL) {
 		/* __find_get_block_slow will mark the page accessed */
-		bh = __find_get_block_slow(bdev, block);
+		bh = __find_get_block_slow(bdev_file, block);
 		if (bh)
 			bh_lru_install(bh);
 	} else
@@ -1410,32 +1412,32 @@ EXPORT_SYMBOL(__find_get_block);
 
 /**
  * bdev_getblk - Get a buffer_head in a block device's buffer cache.
- * @bdev: The block device.
+ * @bdev_file: The block device file.
  * @block: The block number.
- * @size: The size of buffer_heads for this @bdev.
+ * @size: The size of buffer_heads for this @bdev_file.
  * @gfp: The memory allocation flags to use.
  *
  * Return: The buffer head, or NULL if memory could not be allocated.
  */
-struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
+struct buffer_head *bdev_getblk(struct file *bdev_file, sector_t block,
 		unsigned size, gfp_t gfp)
 {
-	struct buffer_head *bh = __find_get_block(bdev, block, size);
+	struct buffer_head *bh = __find_get_block(bdev_file, block, size);
 
 	might_alloc(gfp);
 	if (bh)
 		return bh;
 
-	return __getblk_slow(bdev, block, size, gfp);
+	return __getblk_slow(bdev_file, block, size, gfp);
 }
 EXPORT_SYMBOL(bdev_getblk);
 
 /*
  * Do async read-ahead on a buffer..
  */
-void __breadahead(struct block_device *bdev, sector_t block, unsigned size)
+void __breadahead(struct file *bdev_file, sector_t block, unsigned size)
 {
-	struct buffer_head *bh = bdev_getblk(bdev, block, size,
+	struct buffer_head *bh = bdev_getblk(bdev_file, block, size,
 			GFP_NOWAIT | __GFP_MOVABLE);
 
 	if (likely(bh)) {
@@ -1447,7 +1449,7 @@ EXPORT_SYMBOL(__breadahead);
 
 /**
  *  __bread_gfp() - reads a specified block and returns the bh
- *  @bdev: the block_device to read from
+ *  @bdev_file: the block device file to read from
  *  @block: number of block
  *  @size: size (in bytes) to read
  *  @gfp: page allocation flag
@@ -1458,12 +1460,12 @@ EXPORT_SYMBOL(__breadahead);
  *  It returns NULL if the block was unreadable.
  */
 struct buffer_head *
-__bread_gfp(struct block_device *bdev, sector_t block,
+__bread_gfp(struct file *bdev_file, sector_t block,
 		   unsigned size, gfp_t gfp)
 {
 	struct buffer_head *bh;
 
-	gfp |= mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
+	gfp |= mapping_gfp_constraint(bdev_file_inode(bdev_file)->i_mapping, ~__GFP_FS);
 
 	/*
 	 * Prefer looping in the allocator rather than here, at least that
@@ -1471,7 +1473,7 @@ __bread_gfp(struct block_device *bdev, sector_t block,
 	 */
 	gfp |= __GFP_NOFAIL;
 
-	bh = bdev_getblk(bdev, block, size, gfp);
+	bh = bdev_getblk(bdev_file, block, size, gfp);
 
 	if (likely(bh) && !buffer_uptodate(bh))
 		bh = __bread_slow(bh);
@@ -1556,7 +1558,7 @@ EXPORT_SYMBOL(folio_set_bh);
 /* Bits that are cleared during an invalidate */
 #define BUFFER_FLAGS_DISCARD \
 	(1 << BH_Mapped | 1 << BH_New | 1 << BH_Req | \
-	 1 << BH_Delay | 1 << BH_Unwritten)
+	 1 << BH_Delay | 1 << BH_Unwritten | 1 << BH_Bdev)
 
 static void discard_buffer(struct buffer_head * bh)
 {
@@ -1564,7 +1566,7 @@ static void discard_buffer(struct buffer_head * bh)
 
 	lock_buffer(bh);
 	clear_buffer_dirty(bh);
-	bh->b_bdev = NULL;
+	bh->b_bdev_file = NULL;
 	b_state = READ_ONCE(bh->b_state);
 	do {
 	} while (!try_cmpxchg(&bh->b_state, &b_state,
@@ -1694,9 +1696,9 @@ EXPORT_SYMBOL(create_empty_buffers);
  * I/O in bforget() - it's more efficient to wait on the I/O only if we really
  * need to.  That happens here.
  */
-void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)
+void __clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)
 {
-	struct inode *bd_inode = bdev->bd_inode;
+	struct inode *bd_inode = bdev_inode(bdev);
 	struct address_space *bd_mapping = bd_inode->i_mapping;
 	struct folio_batch fbatch;
 	pgoff_t index = ((loff_t)block << bd_inode->i_blkbits) / PAGE_SIZE;
@@ -1746,7 +1748,6 @@ void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)
 			break;
 	}
 }
-EXPORT_SYMBOL(clean_bdev_aliases);
 
 static struct buffer_head *folio_create_buffers(struct folio *folio,
 						struct inode *inode,
@@ -2003,7 +2004,17 @@ iomap_to_bh(struct inode *inode, sector_t block, struct buffer_head *bh,
 {
 	loff_t offset = (loff_t)block << inode->i_blkbits;
 
-	bh->b_bdev = iomap->bdev;
+	if (iomap->flags & IOMAP_F_BDEV) {
+		/*
+		 * If this request originated directly from the block layer we
+		 * only have access to the plain block device. Mark the
+		 * buffer_head similarly.
+		 */
+		bh->b_bdev = iomap->bdev;
+		set_buffer_bdev(bh);
+	} else {
+		bh->b_bdev_file = iomap->bdev_file;
+	}
 
 	/*
 	 * Block points to offset in file we need to map, iomap contains
@@ -2778,7 +2789,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
 	if (buffer_prio(bh))
 		opf |= REQ_PRIO;
 
-	bio = bio_alloc(bh->b_bdev, 1, opf, GFP_NOIO);
+	bio = bio_alloc(bh_bdev(bh), 1, opf, GFP_NOIO);
 
 	fscrypt_set_bio_crypt_ctx_bh(bio, bh, GFP_NOIO);
 
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 39e75131fd5a..1df4dd89350e 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -183,7 +183,7 @@ static int next_buffer;
 static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
 				unsigned int len)
 {
-	struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
+	struct address_space *mapping = sb->s_bdev_file->f_mapping;
 	struct file_ra_state ra = {};
 	struct page *pages[BLKS_PER_BUF];
 	unsigned i, blocknr, buffer;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 60456263a338..cc77c86d17db 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -671,7 +671,7 @@ static inline int dio_new_bio(struct dio *dio, struct dio_submit *sdio,
 	sector = start_sector << (sdio->blkbits - 9);
 	nr_pages = bio_max_segs(sdio->pages_in_io);
 	BUG_ON(nr_pages <= 0);
-	dio_bio_alloc(dio, sdio, map_bh->b_bdev, sector, nr_pages);
+	dio_bio_alloc(dio, sdio, bh_bdev(map_bh), sector, nr_pages);
 	sdio->boundary = 0;
 out:
 	return ret;
@@ -946,7 +946,7 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 					map_bh->b_blocknr << sdio->blkfactor;
 				if (buffer_new(map_bh)) {
 					clean_bdev_aliases(
-						map_bh->b_bdev,
+						map_bh->b_bdev_file,
 						map_bh->b_blocknr,
 						map_bh->b_size >> i_blkbits);
 				}
@@ -1102,10 +1102,11 @@ static inline int drop_refcount(struct dio *dio)
  * for the whole file.
  */
 ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
-		struct block_device *bdev, struct iov_iter *iter,
+		struct file *bdev_file, struct iov_iter *iter,
 		get_block_t get_block, dio_iodone_t end_io,
 		int flags)
 {
+	struct block_device *bdev = file_bdev(bdev_file);
 	unsigned i_blkbits = READ_ONCE(inode->i_blkbits);
 	unsigned blkbits = i_blkbits;
 	unsigned blocksize_mask = (1 << blkbits) - 1;
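
The fs/buffer.c and fs/direct-io.c hunks rely on bh_bdev() and the BH_Bdev
state bit, whose definitions live in include/linux/buffer_head.h and are not
shown in this part of the diff. The sketch below is a userspace model of how
such an accessor could work, assuming b_bdev and b_bdev_file share storage
and BH_Bdev records which one is valid. The union layout, the bit number and
the mock file_bdev() are assumptions for illustration, not the actual header
contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock types; the real ones are far larger. */
struct block_device { int id; };
struct file { struct block_device *bdev; };

/* Assumed bit number, chosen only for this sketch. */
enum { BH_Bdev = 0 };

struct buffer_head {
	unsigned long b_state;
	union {					/* assumed layout */
		struct file *b_bdev_file;	/* normal case */
		struct block_device *b_bdev;	/* valid when BH_Bdev is set */
	};
};

/* Mock of file_bdev(): map a block device file to its device. */
static struct block_device *file_bdev(struct file *bdev_file)
{
	return bdev_file->bdev;
}

static void set_buffer_bdev(struct buffer_head *bh)
{
	bh->b_state |= 1UL << BH_Bdev;
}

static bool buffer_bdev(const struct buffer_head *bh)
{
	return bh->b_state & (1UL << BH_Bdev);
}

/* Return the block device regardless of which member was filled in. */
static struct block_device *bh_bdev(struct buffer_head *bh)
{
	if (buffer_bdev(bh))
		return bh->b_bdev;
	return file_bdev(bh->b_bdev_file);
}
```

Under these assumptions, callers such as submit_bh_wbc() and dio_new_bio()
can keep passing a struct block_device to bio allocation without caring
whether the buffer_head was set up by a filesystem (file) or by the block
layer itself (plain device).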
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 433fc39ba423..8b4780395b2f 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -70,7 +70,7 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
 	if (erofs_is_fscache_mode(sb))
 		buf->inode = EROFS_SB(sb)->s_fscache->inode;
 	else
-		buf->inode = sb->s_bdev->bd_inode;
+		buf->inode = bdev_file_inode(sb->s_bdev_file);
 }
 
 void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
@@ -204,6 +204,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 	int id;
 
 	map->m_bdev = sb->s_bdev;
+	map->f_m_bdev = sb->s_bdev_file;
 	map->m_daxdev = EROFS_SB(sb)->dax_dev;
 	map->m_dax_part_off = EROFS_SB(sb)->dax_part_off;
 	map->m_fscache = EROFS_SB(sb)->s_fscache;
@@ -278,7 +279,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (flags & IOMAP_DAX)
 		iomap->dax_dev = mdev.m_daxdev;
 	else
-		iomap->bdev = mdev.m_bdev;
+		iomap->bdev_file = mdev.f_m_bdev;
 	iomap->length = map.m_llen;
 	iomap->flags = 0;
 	iomap->private = NULL;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 0f0706325b7b..140188c28f9d 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -377,6 +377,7 @@ enum {
 
 struct erofs_map_dev {
 	struct erofs_fscache *m_fscache;
+	struct file *f_m_bdev;
 	struct block_device *m_bdev;
 	struct dax_device *m_daxdev;
 	u64 m_dax_part_off;
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index e313c936351d..6da3083e8252 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -739,7 +739,7 @@ static int z_erofs_iomap_begin_report(struct inode *inode, loff_t offset,
 	if (ret < 0)
 		return ret;
 
-	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->bdev_file = inode->i_sb->s_bdev_file;
 	iomap->offset = map.m_la;
 	iomap->length = map.m_llen;
 	if (map.m_flags & EROFS_MAP_MAPPED) {
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 5a4272b2c6b0..3dcd03b5bad6 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -744,7 +744,7 @@ static int ext2_get_blocks(struct inode *inode,
 		 * We must unmap blocks before zeroing so that writeback cannot
 		 * overwrite zeros with stale data from block device page cache.
 		 */
-		clean_bdev_aliases(inode->i_sb->s_bdev,
+		clean_bdev_aliases(inode->i_sb->s_bdev_file,
 				   le32_to_cpu(chain[depth-1].key),
 				   count);
 		/*
@@ -842,7 +842,7 @@ static int ext2_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (flags & IOMAP_DAX)
 		iomap->dax_dev = sbi->s_daxdev;
 	else
-		iomap->bdev = inode->i_sb->s_bdev;
+		iomap->bdev_file = inode->i_sb->s_bdev_file;
 
 	if (ret == 0) {
 		/*
diff --git a/fs/ext2/xattr.c b/fs/ext2/xattr.c
index e849241ebb8f..e4df3f82fbe1 100644
--- a/fs/ext2/xattr.c
+++ b/fs/ext2/xattr.c
@@ -80,7 +80,7 @@
 	} while (0)
 # define ea_bdebug(bh, f...) do { \
 		printk(KERN_DEBUG "block %pg:%lu: ", \
-			bh->b_bdev, (unsigned long) bh->b_blocknr); \
+			bh_bdev(bh), (unsigned long) bh->b_blocknr); \
 		printk(f); \
 		printk("\n"); \
 	} while (0)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5af1b0b8680e..4594af99be27 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1797,11 +1797,11 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
  * reserve space for a single block.
  *
  * For delayed buffer_head we have BH_Mapped, BH_New, BH_Delay set.
- * We also have b_blocknr = -1 and b_bdev initialized properly
+ * We also have b_blocknr = -1 and b_bdev_file initialized properly
  *
  * For unwritten buffer_head we have BH_Mapped, BH_New, BH_Unwritten set.
- * We also have b_blocknr = physicalblock mapping unwritten extent and b_bdev
- * initialized properly.
+ * We also have b_blocknr = physicalblock mapping unwritten extent and
+ * b_bdev_file initialized properly.
  */
 int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 			   struct buffer_head *bh, int create)
@@ -3241,7 +3241,7 @@ static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
 	if (flags & IOMAP_DAX)
 		iomap->dax_dev = EXT4_SB(inode->i_sb)->s_daxdev;
 	else
-		iomap->bdev = inode->i_sb->s_bdev;
+		iomap->bdev_file = inode->i_sb->s_bdev_file;
 	iomap->offset = (u64) map->m_lblk << blkbits;
 	iomap->length = (u64) map->m_len << blkbits;
 
diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index bd946d0c71b7..5641bd34d021 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -384,7 +384,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
 
 	BUILD_BUG_ON(sizeof(mmp->mmp_bdevname) < BDEVNAME_SIZE);
 	snprintf(mmp->mmp_bdevname, sizeof(mmp->mmp_bdevname),
-		 "%pg", bh->b_bdev);
+		 "%pg", bh_bdev(bh));
 
 	/*
 	 * Start a kernel thread to update the MMP block periodically.
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 312bc6813357..8317877d83ce 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -93,7 +93,7 @@ struct ext4_io_end_vec *ext4_last_io_end_vec(ext4_io_end_t *io_end)
 static void buffer_io_error(struct buffer_head *bh)
 {
 	printk_ratelimited(KERN_ERR "Buffer I/O error on device %pg, logical block %llu\n",
-		       bh->b_bdev,
+		       bh_bdev(bh),
 			(unsigned long long)bh->b_blocknr);
 }
 
@@ -397,7 +397,7 @@ static void io_submit_init_bio(struct ext4_io_submit *io,
 	 * bio_alloc will _always_ be able to allocate a bio if
 	 * __GFP_DIRECT_RECLAIM is set, see comments for bio_alloc_bioset().
 	 */
-	bio = bio_alloc(bh->b_bdev, BIO_MAX_VECS, REQ_OP_WRITE, GFP_NOIO);
+	bio = bio_alloc(bh_bdev(bh), BIO_MAX_VECS, REQ_OP_WRITE, GFP_NOIO);
 	fscrypt_set_bio_crypt_ctx_bh(bio, bh, GFP_NOIO);
 	bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
 	bio->bi_end_io = ext4_end_bio;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index edb7221dce18..6a0c2e15b48b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -261,7 +261,7 @@ struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
 
 void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
 {
-	struct buffer_head *bh = bdev_getblk(file_bdev(sb->s_bdev_file), block,
+	struct buffer_head *bh = bdev_getblk(sb->s_bdev_file, block,
 			sb->s_blocksize, GFP_NOWAIT | __GFP_NOWARN);
 
 	if (likely(bh)) {
@@ -5878,7 +5878,7 @@ static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 	sb_block = EXT4_MIN_BLOCK_SIZE / blocksize;
 	offset = EXT4_MIN_BLOCK_SIZE % blocksize;
 	set_blocksize(bdev, blocksize);
-	bh = __bread(bdev, sb_block, blocksize);
+	bh = __bread(bdev_file, sb_block, blocksize);
 	if (!bh) {
 		ext4_msg(sb, KERN_ERR, "couldn't read superblock of "
 		       "external journal");
@@ -5934,8 +5934,7 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 	if (IS_ERR(bdev_file))
 		return ERR_CAST(bdev_file);
 
-	journal = jbd2_journal_init_dev(file_bdev(bdev_file),
-					file_bdev(sb->s_bdev_file), j_start,
+	journal = jbd2_journal_init_dev(bdev_file, sb->s_bdev_file, j_start,
 					j_len, sb->s_blocksize);
 	if (IS_ERR(journal)) {
 		ext4_msg(sb, KERN_ERR, "failed to create device journal");
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 82dc5e673d5c..41128ccec2ec 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -68,7 +68,7 @@
 	       inode->i_sb->s_id, inode->i_ino, ##__VA_ARGS__)
 # define ea_bdebug(bh, fmt, ...)					\
 	printk(KERN_DEBUG "block %pg:%lu: " fmt "\n",			\
-	       bh->b_bdev, (unsigned long)bh->b_blocknr, ##__VA_ARGS__)
+	       bh_bdev(bh), (unsigned long)bh->b_blocknr, ##__VA_ARGS__)
 #else
 # define ea_idebug(inode, fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
 # define ea_bdebug(bh, fmt, ...)	no_printk(fmt, ##__VA_ARGS__)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 26e317696b33..fd2a6db57d67 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1605,6 +1605,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag)
 		goto out;
 
 	map->m_bdev = inode->i_sb->s_bdev;
+	map->f_m_bdev = inode->i_sb->s_bdev_file;
 	map->m_multidev_dio =
 		f2fs_allow_multi_device_dio(F2FS_I_SB(inode), flag);
 
@@ -1723,8 +1724,10 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag)
 		map->m_pblk = blkaddr;
 		map->m_len = 1;
 
-		if (map->m_multidev_dio)
+		if (map->m_multidev_dio) {
 			map->m_bdev = FDEV(bidx).bdev;
+			map->f_m_bdev = FDEV(bidx).bdev_file;
+		}
 	} else if ((map->m_pblk != NEW_ADDR &&
 			blkaddr == (map->m_pblk + ofs)) ||
 			(map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR) ||
@@ -4248,7 +4251,7 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 		iomap->length = blks_to_bytes(inode, map.m_len);
 		iomap->type = IOMAP_MAPPED;
 		iomap->flags |= IOMAP_F_MERGED;
-		iomap->bdev = map.m_bdev;
+		iomap->bdev_file = map.f_m_bdev;
 		iomap->addr = blks_to_bytes(inode, map.m_pblk);
 	} else {
 		if (flags & IOMAP_WRITE)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6fc172c99915..0e3a5b86276b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -691,6 +691,7 @@ struct extent_tree_info {
 				F2FS_MAP_DELALLOC)
 
 struct f2fs_map_blocks {
+	struct file *f_m_bdev;	/* for multi-device dio */
 	struct block_device *m_bdev;	/* for multi-device dio */
 	block_t m_pblk;
 	block_t m_lblk;
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 12ef91d170bb..24966e93a237 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -575,7 +575,7 @@ static int fuse_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
 
 	iomap->offset = pos;
 	iomap->flags = 0;
-	iomap->bdev = NULL;
+	iomap->bdev_file = NULL;
 	iomap->dax_dev = fc->dax->dev;
 
 	/*
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 974aca9c8ea8..0e4e295ebf49 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -622,7 +622,7 @@ static void gfs2_discard(struct gfs2_sbd *sdp, struct buffer_head *bh)
 			spin_unlock(&sdp->sd_ail_lock);
 		}
 	}
-	bh->b_bdev = NULL;
+	bh->b_bdev_file = NULL;
 	clear_buffer_mapped(bh);
 	clear_buffer_req(bh);
 	clear_buffer_new(bh);
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index d9ccfd27e4f1..e20627a2353d 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -926,7 +926,7 @@ static int __gfs2_iomap_get(struct inode *inode, loff_t pos, loff_t length,
 		iomap->flags |= IOMAP_F_GFS2_BOUNDARY;
 
 out:
-	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->bdev_file = inode->i_sb->s_bdev_file;
 unlock:
 	up_read(&ip->i_rw_mutex);
 	return ret;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 34540f9d011c..9ae09a48d83c 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1227,7 +1227,7 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
 	mapping = gfs2_glock2aspace(gl);
 	if (mapping) {
                 mapping->a_ops = &gfs2_meta_aops;
-		mapping->host = s->s_bdev->bd_inode;
+		mapping->host = bdev_file_inode(s->s_bdev_file);
 		mapping->flags = 0;
 		mapping_set_gfp_mask(mapping, GFP_NOFS);
 		mapping->i_private_data = NULL;
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index f814054c8cd0..2052d3fc2c24 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -218,7 +218,7 @@ static void gfs2_submit_bhs(blk_opf_t opf, struct buffer_head *bhs[], int num)
 		struct buffer_head *bh = *bhs;
 		struct bio *bio;
 
-		bio = bio_alloc(bh->b_bdev, num, opf, GFP_NOIO);
+		bio = bio_alloc(bh_bdev(bh), num, opf, GFP_NOIO);
 		bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
 		while (num > 0) {
 			bh = *bhs;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 1281e60be639..ca7324cfbad5 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -114,7 +114,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
 
 	address_space_init_once(mapping);
 	mapping->a_ops = &gfs2_rgrp_aops;
-	mapping->host = sb->s_bdev->bd_inode;
+	mapping->host = bdev_file_inode(sb->s_bdev_file);
 	mapping->flags = 0;
 	mapping_set_gfp_mask(mapping, GFP_NOFS);
 	mapping->i_private_data = NULL;
diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
index 1bb8d97cd9ae..7353d0e2f35a 100644
--- a/fs/hpfs/file.c
+++ b/fs/hpfs/file.c
@@ -128,7 +128,7 @@ static int hpfs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (WARN_ON_ONCE(flags & (IOMAP_WRITE | IOMAP_ZERO)))
 		return -EINVAL;
 
-	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->bdev_file = inode->i_sb->s_bdev_file;
 	iomap->offset = offset;
 
 	hpfs_lock(sb);
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 093c4515b22a..8b143676c4d2 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -377,7 +377,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 		if (ctx->rac) /* same as readahead_gfp_mask */
 			gfp |= __GFP_NORETRY | __GFP_NOWARN;
-		ctx->bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs),
+		ctx->bio = bio_alloc(iomap_bdev(iomap), bio_max_segs(nr_vecs),
 				     REQ_OP_READ, gfp);
 		/*
 		 * If the bio_alloc fails, try it again for a single page to
@@ -385,7 +385,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		 * what do_mpage_read_folio does.
 		 */
 		if (!ctx->bio) {
-			ctx->bio = bio_alloc(iomap->bdev, 1, REQ_OP_READ,
+			ctx->bio = bio_alloc(iomap_bdev(iomap), 1, REQ_OP_READ,
 					     orig_gfp);
 		}
 		if (ctx->rac)
@@ -624,7 +624,7 @@ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
 	struct bio_vec bvec;
 	struct bio bio;
 
-	bio_init(&bio, iomap->bdev, &bvec, 1, REQ_OP_READ);
+	bio_init(&bio, iomap_bdev(iomap), &bvec, 1, REQ_OP_READ);
 	bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
 	bio_add_folio_nofail(&bio, folio, plen, poff);
 	return submit_bio_wait(&bio);
@@ -1663,7 +1663,7 @@ iomap_alloc_ioend(struct inode *inode, struct iomap_writepage_ctx *wpc,
 	struct iomap_ioend *ioend;
 	struct bio *bio;
 
-	bio = bio_alloc_bioset(wpc->iomap.bdev, BIO_MAX_VECS,
+	bio = bio_alloc_bioset(iomap_bdev(&wpc->iomap), BIO_MAX_VECS,
 			       REQ_OP_WRITE | wbc_to_write_flags(wbc),
 			       GFP_NOFS, &iomap_ioend_bioset);
 	bio->bi_iter.bi_sector = sector;
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..9e875a4dde24 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -56,9 +56,9 @@ static struct bio *iomap_dio_alloc_bio(const struct iomap_iter *iter,
 		struct iomap_dio *dio, unsigned short nr_vecs, blk_opf_t opf)
 {
 	if (dio->dops && dio->dops->bio_set)
-		return bio_alloc_bioset(iter->iomap.bdev, nr_vecs, opf,
+		return bio_alloc_bioset(iomap_bdev(&iter->iomap), nr_vecs, opf,
 					GFP_KERNEL, dio->dops->bio_set);
-	return bio_alloc(iter->iomap.bdev, nr_vecs, opf, GFP_KERNEL);
+	return bio_alloc(iomap_bdev(&iter->iomap), nr_vecs, opf, GFP_KERNEL);
 }
 
 static void iomap_dio_submit_bio(const struct iomap_iter *iter,
@@ -288,8 +288,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	size_t copied = 0;
 	size_t orig_count;
 
-	if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
-	    !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
+	if ((pos | length) & (bdev_logical_block_size(iomap_bdev(iomap)) - 1) ||
+	    !bdev_iter_is_aligned(iomap_bdev(iomap), dio->submit.iter))
 		return -EINVAL;
 
 	if (iomap->type == IOMAP_UNWRITTEN) {
@@ -316,7 +316,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 		 */
 		if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) &&
 		    (dio->flags & IOMAP_DIO_WRITE_THROUGH) &&
-		    (bdev_fua(iomap->bdev) || !bdev_write_cache(iomap->bdev)))
+		    (bdev_fua(iomap_bdev(iomap)) || !bdev_write_cache(iomap_bdev(iomap))))
 			use_fua = true;
 		else if (dio->flags & IOMAP_DIO_NEED_SYNC)
 			dio->flags &= ~IOMAP_DIO_CALLER_COMP;
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index 5fc0ac36dee3..20bd67e85d15 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -116,7 +116,7 @@ static loff_t iomap_swapfile_iter(const struct iomap_iter *iter,
 		return iomap_swapfile_fail(isi, "has shared extents");
 
 	/* Only one bdev per swap file. */
-	if (iomap->bdev != isi->sis->bdev)
+	if (iomap_bdev(iomap) != isi->sis->bdev)
 		return iomap_swapfile_fail(isi, "outside the main device");
 
 	if (isi->iomap.length == 0) {
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index c16fd55f5595..43fb3ce21674 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -134,7 +134,7 @@ DECLARE_EVENT_CLASS(iomap_class,
 		__entry->length = iomap->length;
 		__entry->type = iomap->type;
 		__entry->flags = iomap->flags;
-		__entry->bdev = iomap->bdev ? iomap->bdev->bd_dev : 0;
+		__entry->bdev = iomap_bdev(iomap) ? iomap_bdev(iomap)->bd_dev : 0;
 	),
 	TP_printk("dev %d:%d ino 0x%llx bdev %d:%d addr 0x%llx offset 0x%llx "
 		  "length 0x%llx type %s flags %s",
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 5e122586e06e..fffb1b4e2068 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -1014,7 +1014,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 				clear_buffer_mapped(bh);
 				clear_buffer_new(bh);
 				clear_buffer_req(bh);
-				bh->b_bdev = NULL;
+				bh->b_bdev_file = NULL;
 			}
 		}
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b6c114c11b97..fd0d99b98c5f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -434,7 +434,7 @@ int jbd2_journal_write_metadata_buffer(transaction_t *transaction,
 
 	folio_set_bh(new_bh, new_folio, new_offset);
 	new_bh->b_size = bh_in->b_size;
-	new_bh->b_bdev = journal->j_dev;
+	new_bh->b_bdev_file = journal->j_bdev_file;
 	new_bh->b_blocknr = blocknr;
 	new_bh->b_private = bh_in;
 	set_buffer_mapped(new_bh);
@@ -880,7 +880,7 @@ int jbd2_fc_get_buf(journal_t *journal, struct buffer_head **bh_out)
 	if (ret)
 		return ret;
 
-	bh = __getblk(journal->j_dev, pblock, journal->j_blocksize);
+	bh = __getblk(journal->j_bdev_file, pblock, journal->j_blocksize);
 	if (!bh)
 		return -ENOMEM;
 
@@ -1007,7 +1007,7 @@ jbd2_journal_get_descriptor_buffer(transaction_t *transaction, int type)
 	if (err)
 		return NULL;
 
-	bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
+	bh = __getblk(journal->j_bdev_file, blocknr, journal->j_blocksize);
 	if (!bh)
 		return NULL;
 	atomic_dec(&transaction->t_outstanding_credits);
@@ -1461,7 +1461,7 @@ static int journal_load_superblock(journal_t *journal)
 	struct buffer_head *bh;
 	journal_superblock_t *sb;
 
-	bh = getblk_unmovable(journal->j_dev, journal->j_blk_offset,
+	bh = getblk_unmovable(journal->j_bdev_file, journal->j_blk_offset,
 			      journal->j_blocksize);
 	if (bh)
 		err = bh_read(bh, 0);
@@ -1516,11 +1516,12 @@ static int journal_load_superblock(journal_t *journal)
  * very few fields yet: that has to wait until we have created the
  * journal structures from from scratch, or loaded them from disk. */
 
-static journal_t *journal_init_common(struct block_device *bdev,
-			struct block_device *fs_dev,
+static journal_t *journal_init_common(struct file *bdev_file,
+			struct file *fs_dev,
 			unsigned long long start, int len, int blocksize)
 {
 	static struct lock_class_key jbd2_trans_commit_key;
+	struct block_device *bdev = file_bdev(bdev_file);
 	journal_t *journal;
 	int err;
 	int n;
@@ -1530,8 +1531,10 @@ static journal_t *journal_init_common(struct block_device *bdev,
 		return ERR_PTR(-ENOMEM);
 
 	journal->j_blocksize = blocksize;
-	journal->j_dev = bdev;
-	journal->j_fs_dev = fs_dev;
+	journal->j_bdev_file = bdev_file;
+	journal->j_fs_bdev_file = fs_dev;
+	journal->j_dev = bdev;
+	journal->j_fs_dev = file_bdev(fs_dev);
 	journal->j_blk_offset = start;
 	journal->j_total_len = len;
 	jbd2_init_fs_dev_write_error(journal);
@@ -1640,13 +1643,13 @@ static journal_t *journal_init_common(struct block_device *bdev,
  *  range of blocks on an arbitrary block device.
  *
  */
-journal_t *jbd2_journal_init_dev(struct block_device *bdev,
-			struct block_device *fs_dev,
+journal_t *jbd2_journal_init_dev(struct file *bdev_file,
+			struct file *fs_dev,
 			unsigned long long start, int len, int blocksize)
 {
 	journal_t *journal;
 
-	journal = journal_init_common(bdev, fs_dev, start, len, blocksize);
+	journal = journal_init_common(bdev_file, fs_dev, start, len, blocksize);
 	if (IS_ERR(journal))
 		return ERR_CAST(journal);
 
@@ -1683,7 +1686,7 @@ journal_t *jbd2_journal_init_inode(struct inode *inode)
 		  inode->i_sb->s_id, inode->i_ino, (long long) inode->i_size,
 		  inode->i_sb->s_blocksize_bits, inode->i_sb->s_blocksize);
 
-	journal = journal_init_common(inode->i_sb->s_bdev, inode->i_sb->s_bdev,
+	journal = journal_init_common(inode->i_sb->s_bdev_file, inode->i_sb->s_bdev_file,
 			blocknr, inode->i_size >> inode->i_sb->s_blocksize_bits,
 			inode->i_sb->s_blocksize);
 	if (IS_ERR(journal))
@@ -2009,7 +2012,7 @@ static int __jbd2_journal_erase(journal_t *journal, unsigned int flags)
 		byte_count = (block_stop - block_start + 1) *
 				journal->j_blocksize;
 
-		truncate_inode_pages_range(journal->j_dev->bd_inode->i_mapping,
+		truncate_inode_pages_range(journal->j_bdev_file->f_mapping,
 				byte_start, byte_stop);
 
 		if (flags & JBD2_JOURNAL_FLUSH_DISCARD) {
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 1f7664984d6e..0740ba6b5802 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -92,7 +92,7 @@ static int do_readahead(journal_t *journal, unsigned int start)
 			goto failed;
 		}
 
-		bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
+		bh = __getblk(journal->j_bdev_file, blocknr, journal->j_blocksize);
 		if (!bh) {
 			err = -ENOMEM;
 			goto failed;
@@ -148,7 +148,7 @@ static int jread(struct buffer_head **bhp, journal_t *journal,
 		return err;
 	}
 
-	bh = __getblk(journal->j_dev, blocknr, journal->j_blocksize);
+	bh = __getblk(journal->j_bdev_file, blocknr, journal->j_blocksize);
 	if (!bh)
 		return -ENOMEM;
 
@@ -672,7 +672,7 @@ static int do_one_pass(journal_t *journal,
 
 					/* Find a buffer for the new
 					 * data being restored */
-					nbh = __getblk(journal->j_fs_dev,
+					nbh = __getblk(journal->j_fs_bdev_file,
 							blocknr,
 							journal->j_blocksize);
 					if (nbh == NULL) {
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index 4556e4689024..cb0bfb0cd248 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -328,7 +328,6 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 {
 	struct buffer_head *bh = NULL;
 	journal_t *journal;
-	struct block_device *bdev;
 	int err;
 
 	might_sleep();
@@ -341,11 +340,10 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 		return -EINVAL;
 	}
 
-	bdev = journal->j_fs_dev;
 	bh = bh_in;
 
 	if (!bh) {
-		bh = __find_get_block(bdev, blocknr, journal->j_blocksize);
+		bh = __find_get_block(journal->j_fs_bdev_file, blocknr, journal->j_blocksize);
 		if (bh)
 			BUFFER_TRACE(bh, "found on hash");
 	}
@@ -355,7 +353,7 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr,
 
 		/* If there is a different buffer_head lying around in
 		 * memory anywhere... */
-		bh2 = __find_get_block(bdev, blocknr, journal->j_blocksize);
+		bh2 = __find_get_block(journal->j_fs_bdev_file, blocknr, journal->j_blocksize);
 		if (bh2) {
 			/* ... and it has RevokeValid status... */
 			if (bh2 != bh && buffer_revokevalid(bh2))
@@ -466,7 +464,7 @@ int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh)
 	 * state machine will get very upset later on. */
 	if (need_cancel) {
 		struct buffer_head *bh2;
-		bh2 = __find_get_block(bh->b_bdev, bh->b_blocknr, bh->b_size);
+		bh2 = __find_get_block(bh->b_bdev_file, bh->b_blocknr, bh->b_size);
 		if (bh2) {
 			if (bh2 != bh)
 				clear_buffer_revoked(bh2);
@@ -495,7 +493,7 @@ void jbd2_clear_buffer_revoked_flags(journal_t *journal)
 			struct jbd2_revoke_record_s *record;
 			struct buffer_head *bh;
 			record = (struct jbd2_revoke_record_s *)list_entry;
-			bh = __find_get_block(journal->j_fs_dev,
+			bh = __find_get_block(journal->j_fs_bdev_file,
 					      record->blocknr,
 					      journal->j_blocksize);
 			if (bh) {
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index cb0b8d6fc0c6..30ebc93dc430 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -929,7 +929,7 @@ static void warn_dirty_buffer(struct buffer_head *bh)
 	       "JBD2: Spotted dirty metadata buffer (dev = %pg, blocknr = %llu). "
 	       "There's a risk of filesystem corruption in case of system "
 	       "crash.\n",
-	       bh->b_bdev, (unsigned long long)bh->b_blocknr);
+	       bh_bdev(bh), (unsigned long long)bh->b_blocknr);
 }
 
 /* Call t_frozen trigger and copy buffer data into jh->b_frozen_data. */
@@ -990,7 +990,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
 	/* If it takes too long to lock the buffer, trace it */
 	time_lock = jbd2_time_diff(start_lock, jiffies);
 	if (time_lock > HZ/10)
-		trace_jbd2_lock_buffer_stall(bh->b_bdev->bd_dev,
+		trace_jbd2_lock_buffer_stall(bh_bdev(bh)->bd_dev,
 			jiffies_to_msecs(time_lock));
 
 	/* We now hold the buffer lock so it is safe to query the buffer
@@ -2374,7 +2374,7 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
 			write_unlock(&journal->j_state_lock);
 			jbd2_journal_put_journal_head(jh);
 			/* Already zapped buffer? Nothing to do... */
-			if (!bh->b_bdev)
+			if (!bh_bdev(bh))
 				return 0;
 			return -EBUSY;
 		}
@@ -2428,7 +2428,7 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
 	clear_buffer_new(bh);
 	clear_buffer_delay(bh);
 	clear_buffer_unwritten(bh);
-	bh->b_bdev = NULL;
+	bh->b_bdev_file = NULL;
 	return may_free;
 }
 
diff --git a/fs/mpage.c b/fs/mpage.c
index 738882e0766d..dea42c440373 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -126,7 +126,12 @@ static void map_buffer_to_folio(struct folio *folio, struct buffer_head *bh,
 	do {
 		if (block == page_block) {
 			page_bh->b_state = bh->b_state;
-			page_bh->b_bdev = bh->b_bdev;
+			if (buffer_bdev(bh)) {
+				page_bh->b_bdev = bh->b_bdev;
+				set_buffer_bdev(page_bh);
+			} else {
+				page_bh->b_bdev_file = bh->b_bdev_file;
+			}
 			page_bh->b_blocknr = bh->b_blocknr;
 			break;
 		}
@@ -216,7 +221,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 			page_block++;
 			block_in_file++;
 		}
-		bdev = map_bh->b_bdev;
+		bdev = bh_bdev(map_bh);
 	}
 
 	/*
@@ -272,7 +277,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 			page_block++;
 			block_in_file++;
 		}
-		bdev = map_bh->b_bdev;
+		bdev = bh_bdev(map_bh);
 	}
 
 	if (first_hole != blocks_per_page) {
@@ -472,7 +477,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 	struct block_device *bdev = NULL;
 	int boundary = 0;
 	sector_t boundary_block = 0;
-	struct block_device *boundary_bdev = NULL;
+	struct file *f_boundary_bdev = NULL;
 	size_t length;
 	struct buffer_head map_bh;
 	loff_t i_size = i_size_read(inode);
@@ -513,9 +518,9 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 			boundary = buffer_boundary(bh);
 			if (boundary) {
 				boundary_block = bh->b_blocknr;
-				boundary_bdev = bh->b_bdev;
+				f_boundary_bdev = bh->b_bdev_file;
 			}
-			bdev = bh->b_bdev;
+			bdev = bh_bdev(bh);
 		} while ((bh = bh->b_this_page) != head);
 
 		if (first_unmapped)
@@ -549,13 +554,16 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 		map_bh.b_size = 1 << blkbits;
 		if (mpd->get_block(inode, block_in_file, &map_bh, 1))
 			goto confused;
+		/* This helper cannot be used from the block layer directly. */
+		if (WARN_ON_ONCE(buffer_bdev(&map_bh)))
+			goto confused;
 		if (!buffer_mapped(&map_bh))
 			goto confused;
 		if (buffer_new(&map_bh))
 			clean_bdev_bh_alias(&map_bh);
 		if (buffer_boundary(&map_bh)) {
 			boundary_block = map_bh.b_blocknr;
-			boundary_bdev = map_bh.b_bdev;
+			f_boundary_bdev = map_bh.b_bdev_file;
 		}
 		if (page_block) {
 			if (map_bh.b_blocknr != first_block + page_block)
@@ -565,7 +573,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 		}
 		page_block++;
 		boundary = buffer_boundary(&map_bh);
-		bdev = map_bh.b_bdev;
+		bdev = bh_bdev(&map_bh);
 		if (block_in_file == last_block)
 			break;
 		block_in_file++;
@@ -627,7 +635,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 	if (boundary || (first_unmapped != blocks_per_page)) {
 		bio = mpage_bio_submit_write(bio);
 		if (boundary_block) {
-			write_boundary_block(boundary_bdev,
+			write_boundary_block(f_boundary_bdev,
 					boundary_block, 1 << blkbits);
 		}
 	} else {
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 0131d83b912d..0620bccbf6e0 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -59,7 +59,7 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr)
 		BUG();
 	}
 	memset(bh->b_data, 0, i_blocksize(inode));
-	bh->b_bdev = inode->i_sb->s_bdev;
+	bh->b_bdev_file = inode->i_sb->s_bdev_file;
 	bh->b_blocknr = blocknr;
 	set_buffer_mapped(bh);
 	set_buffer_uptodate(bh);
@@ -118,7 +118,7 @@ int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr,
 		goto found;
 	}
 	set_buffer_mapped(bh);
-	bh->b_bdev = inode->i_sb->s_bdev;
+	bh->b_bdev_file = inode->i_sb->s_bdev_file;
 	bh->b_blocknr = pblocknr; /* set block address for read */
 	bh->b_end_io = end_buffer_read_sync;
 	get_bh(bh);
diff --git a/fs/nilfs2/gcinode.c b/fs/nilfs2/gcinode.c
index bf9a11d58817..77d4b9275b87 100644
--- a/fs/nilfs2/gcinode.c
+++ b/fs/nilfs2/gcinode.c
@@ -84,7 +84,7 @@ int nilfs_gccache_submit_read_data(struct inode *inode, sector_t blkoff,
 	}
 
 	if (!buffer_mapped(bh)) {
-		bh->b_bdev = inode->i_sb->s_bdev;
+		bh->b_bdev_file = inode->i_sb->s_bdev_file;
 		set_buffer_mapped(bh);
 	}
 	bh->b_blocknr = pbn;
diff --git a/fs/nilfs2/mdt.c b/fs/nilfs2/mdt.c
index e45c01a559c0..8c2d32e9ba06 100644
--- a/fs/nilfs2/mdt.c
+++ b/fs/nilfs2/mdt.c
@@ -89,7 +89,7 @@ static int nilfs_mdt_create_block(struct inode *inode, unsigned long block,
 	if (buffer_uptodate(bh))
 		goto failed_bh;
 
-	bh->b_bdev = sb->s_bdev;
+	bh->b_bdev_file = sb->s_bdev_file;
 	err = nilfs_mdt_insert_new_block(inode, block, bh, init_block);
 	if (likely(!err)) {
 		get_bh(bh);
diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c
index 5c2eba1987bd..1bd4630ad5c5 100644
--- a/fs/nilfs2/page.c
+++ b/fs/nilfs2/page.c
@@ -111,7 +111,7 @@ void nilfs_copy_buffer(struct buffer_head *dbh, struct buffer_head *sbh)
 
 	dbh->b_state = sbh->b_state & NILFS_BUFFER_INHERENT_BITS;
 	dbh->b_blocknr = sbh->b_blocknr;
-	dbh->b_bdev = sbh->b_bdev;
+	dbh->b_bdev_file = sbh->b_bdev_file;
 
 	bh = dbh;
 	bits = sbh->b_state & (BIT(BH_Uptodate) | BIT(BH_Mapped));
@@ -216,7 +216,7 @@ static void nilfs_copy_folio(struct folio *dst, struct folio *src,
 		lock_buffer(dbh);
 		dbh->b_state = sbh->b_state & mask;
 		dbh->b_blocknr = sbh->b_blocknr;
-		dbh->b_bdev = sbh->b_bdev;
+		dbh->b_bdev_file = sbh->b_bdev_file;
 		sbh = sbh->b_this_page;
 		dbh = dbh->b_this_page;
 	} while (dbh != dbufs);
diff --git a/fs/nilfs2/recovery.c b/fs/nilfs2/recovery.c
index 0955b657938f..7d407dd63ff3 100644
--- a/fs/nilfs2/recovery.c
+++ b/fs/nilfs2/recovery.c
@@ -107,7 +107,7 @@ static int nilfs_compute_checksum(struct the_nilfs *nilfs,
 		do {
 			struct buffer_head *bh;
 
-			bh = __bread(nilfs->ns_bdev, ++start, blocksize);
+			bh = __bread(nilfs->ns_bdev_file, ++start, blocksize);
 			if (!bh)
 				return -EIO;
 			check_bytes -= size;
@@ -136,7 +136,7 @@ int nilfs_read_super_root_block(struct the_nilfs *nilfs, sector_t sr_block,
 	int ret;
 
 	*pbh = NULL;
-	bh_sr = __bread(nilfs->ns_bdev, sr_block, nilfs->ns_blocksize);
+	bh_sr = __bread(nilfs->ns_bdev_file, sr_block, nilfs->ns_blocksize);
 	if (unlikely(!bh_sr)) {
 		ret = NILFS_SEG_FAIL_IO;
 		goto failed;
@@ -183,7 +183,7 @@ nilfs_read_log_header(struct the_nilfs *nilfs, sector_t start_blocknr,
 {
 	struct buffer_head *bh_sum;
 
-	bh_sum = __bread(nilfs->ns_bdev, start_blocknr, nilfs->ns_blocksize);
+	bh_sum = __bread(nilfs->ns_bdev_file, start_blocknr, nilfs->ns_blocksize);
 	if (bh_sum)
 		*sum = (struct nilfs_segment_summary *)bh_sum->b_data;
 	return bh_sum;
@@ -250,7 +250,7 @@ static void *nilfs_read_summary_info(struct the_nilfs *nilfs,
 	if (bytes > (*pbh)->b_size - *offset) {
 		blocknr = (*pbh)->b_blocknr;
 		brelse(*pbh);
-		*pbh = __bread(nilfs->ns_bdev, blocknr + 1,
+		*pbh = __bread(nilfs->ns_bdev_file, blocknr + 1,
 			       nilfs->ns_blocksize);
 		if (unlikely(!*pbh))
 			return NULL;
@@ -289,7 +289,7 @@ static void nilfs_skip_summary_info(struct the_nilfs *nilfs,
 		*offset = bytes * (count - (bcnt - 1) * nitem_per_block);
 
 		brelse(*pbh);
-		*pbh = __bread(nilfs->ns_bdev, blocknr + bcnt,
+		*pbh = __bread(nilfs->ns_bdev_file, blocknr + bcnt,
 			       nilfs->ns_blocksize);
 	}
 }
@@ -318,7 +318,7 @@ static int nilfs_scan_dsync_log(struct the_nilfs *nilfs, sector_t start_blocknr,
 
 	sumbytes = le32_to_cpu(sum->ss_sumbytes);
 	blocknr = start_blocknr + DIV_ROUND_UP(sumbytes, nilfs->ns_blocksize);
-	bh = __bread(nilfs->ns_bdev, start_blocknr, nilfs->ns_blocksize);
+	bh = __bread(nilfs->ns_bdev_file, start_blocknr, nilfs->ns_blocksize);
 	if (unlikely(!bh))
 		goto out;
 
@@ -477,7 +477,7 @@ static int nilfs_recovery_copy_block(struct the_nilfs *nilfs,
 	struct buffer_head *bh_org;
 	void *kaddr;
 
-	bh_org = __bread(nilfs->ns_bdev, rb->blocknr, nilfs->ns_blocksize);
+	bh_org = __bread(nilfs->ns_bdev_file, rb->blocknr, nilfs->ns_blocksize);
 	if (unlikely(!bh_org))
 		return -EIO;
 
@@ -696,7 +696,7 @@ static void nilfs_finish_roll_forward(struct the_nilfs *nilfs,
 	    nilfs_get_segnum_of_block(nilfs, ri->ri_super_root))
 		return;
 
-	bh = __getblk(nilfs->ns_bdev, ri->ri_lsegs_start, nilfs->ns_blocksize);
+	bh = __getblk(nilfs->ns_bdev_file, ri->ri_lsegs_start, nilfs->ns_blocksize);
 	BUG_ON(!bh);
 	memset(bh->b_data, 0, bh->b_size);
 	set_buffer_dirty(bh);
@@ -822,7 +822,7 @@ int nilfs_search_super_root(struct the_nilfs *nilfs,
 	/* Read ahead segment */
 	b = seg_start;
 	while (b <= seg_end)
-		__breadahead(nilfs->ns_bdev, b++, nilfs->ns_blocksize);
+		__breadahead(nilfs->ns_bdev_file, b++, nilfs->ns_blocksize);
 
 	for (;;) {
 		brelse(bh_sum);
@@ -868,7 +868,7 @@ int nilfs_search_super_root(struct the_nilfs *nilfs,
 		if (pseg_start == seg_start) {
 			nilfs_get_segment_range(nilfs, nextnum, &b, &end);
 			while (b <= end)
-				__breadahead(nilfs->ns_bdev, b++,
+				__breadahead(nilfs->ns_bdev_file, b++,
 					     nilfs->ns_blocksize);
 		}
 		if (!(flags & NILFS_SS_SR)) {
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 2590a0860eab..642015cd6d1c 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2825,7 +2825,7 @@ int nilfs_attach_log_writer(struct super_block *sb, struct nilfs_root *root)
 	if (!nilfs->ns_writer)
 		return -ENOMEM;
 
-	inode_attach_wb(nilfs->ns_bdev->bd_inode, NULL);
+	inode_attach_wb(bdev_file_inode(nilfs->ns_bdev_file), NULL);
 
 	err = nilfs_segctor_start_thread(nilfs->ns_writer);
 	if (unlikely(err))
diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
index 71400496ed36..30776f67cb9b 100644
--- a/fs/nilfs2/the_nilfs.c
+++ b/fs/nilfs2/the_nilfs.c
@@ -63,6 +63,7 @@ struct the_nilfs *alloc_nilfs(struct super_block *sb)
 
 	nilfs->ns_sb = sb;
 	nilfs->ns_bdev = sb->s_bdev;
+	nilfs->ns_bdev_file = sb->s_bdev_file;
 	atomic_set(&nilfs->ns_ndirtyblks, 0);
 	init_rwsem(&nilfs->ns_sem);
 	mutex_init(&nilfs->ns_snapshot_mount_mutex);
diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
index cd4ae1b8ae16..d47243774181 100644
--- a/fs/nilfs2/the_nilfs.h
+++ b/fs/nilfs2/the_nilfs.h
@@ -97,6 +97,7 @@ struct the_nilfs {
 	int			ns_flushed_device;
 
 	struct super_block     *ns_sb;
+	struct file            *ns_bdev_file;
 	struct block_device    *ns_bdev;
 	struct rw_semaphore	ns_sem;
 	struct mutex		ns_snapshot_mount_mutex;
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 2d01517a2d59..1c56fd2cb0f3 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -227,7 +227,7 @@ static int ntfs_read_block(struct folio *folio)
 			arr[nr++] = bh;
 			continue;
 		}
-		bh->b_bdev = vol->sb->s_bdev;
+		bh->b_bdev_file = vol->sb->s_bdev_file;
 		/* Is the block within the allowed limits? */
 		if (iblock < lblock) {
 			bool is_retry = false;
@@ -678,7 +678,7 @@ static int ntfs_write_block(struct folio *folio, struct writeback_control *wbc)
 			continue;
 
 		/* Unmapped, dirty buffer. Need to map it. */
-		bh->b_bdev = vol->sb->s_bdev;
+		bh->b_bdev_file = vol->sb->s_bdev_file;
 
 		/* Convert block into corresponding vcn and offset. */
 		vcn = (VCN)block << blocksize_bits;
@@ -988,7 +988,7 @@ static int ntfs_write_mst_block(struct page *page,
 			LCN lcn;
 			unsigned int vcn_ofs;
 
-			bh->b_bdev = vol->sb->s_bdev;
+			bh->b_bdev_file = vol->sb->s_bdev_file;
 			/* Obtain the vcn and offset of the current block. */
 			vcn = (VCN)block << bh_size_bits;
 			vcn_ofs = vcn & vol->cluster_size_mask;
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index 297c0b9db621..894be07d2971 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -680,7 +680,7 @@ static int ntfs_prepare_pages_for_non_resident_write(struct page **pages,
 			continue;
 		}
 		/* Unmapped buffer.  Need to map it. */
-		bh->b_bdev = vol->sb->s_bdev;
+		bh->b_bdev_file = vol->sb->s_bdev_file;
 		/*
 		 * If the current buffer is in the same clusters as the map
 		 * cache, there is no need to check the runlist again.  The
diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c
index 6fd1dc4b08c8..88a61448522b 100644
--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -526,7 +526,7 @@ int ntfs_sync_mft_mirror(ntfs_volume *vol, const unsigned long mft_no,
 			LCN lcn;
 			unsigned int vcn_ofs;
 
-			bh->b_bdev = vol->sb->s_bdev;
+			bh->b_bdev_file = vol->sb->s_bdev_file;
 			/* Obtain the vcn and offset of the current block. */
 			vcn = ((VCN)mft_no << vol->mft_record_size_bits) +
 					(block_start - m_start);
@@ -719,7 +719,7 @@ int write_mft_record_nolock(ntfs_inode *ni, MFT_RECORD *m, int sync)
 			LCN lcn;
 			unsigned int vcn_ofs;
 
-			bh->b_bdev = vol->sb->s_bdev;
+			bh->b_bdev_file = vol->sb->s_bdev_file;
 			/* Obtain the vcn and offset of the current block. */
 			vcn = ((VCN)ni->mft_no << vol->mft_record_size_bits) +
 					(block_start - m_start);
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index fbfe21dbb425..b01b0e1f6990 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -1015,7 +1015,7 @@ int ntfs_sb_read(struct super_block *sb, u64 lbo, size_t bytes, void *buffer)
 	u32 op = blocksize - off;
 
 	for (; bytes; block += 1, off = 0, op = blocksize) {
-		struct buffer_head *bh = __bread(bdev, block, blocksize);
+		struct buffer_head *bh = __bread(sb->s_bdev_file, block, blocksize);
 
 		if (!bh)
 			return -EIO;
@@ -1052,14 +1052,14 @@ int ntfs_sb_write(struct super_block *sb, u64 lbo, size_t bytes,
 			op = bytes;
 
 		if (op < blocksize) {
-			bh = __bread(bdev, block, blocksize);
+			bh = __bread(sb->s_bdev_file, block, blocksize);
 			if (!bh) {
 				ntfs_err(sb, "failed to read block %llx",
 					 (u64)block);
 				return -EIO;
 			}
 		} else {
-			bh = __getblk(bdev, block, blocksize);
+			bh = __getblk(sb->s_bdev_file, block, blocksize);
 			if (!bh)
 				return -ENOMEM;
 		}
@@ -2673,4 +2673,4 @@ int ntfs_set_label(struct ntfs_sb_info *sbi, u8 *label, int len)
 out:
 	__putname(uni);
 	return err;
-}
\ No newline at end of file
+}
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index 5e3d71374918..8a39600b834d 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -611,7 +611,7 @@ static noinline int ntfs_get_block_vbo(struct inode *inode, u64 vbo,
 	lbo = ((u64)lcn << cluster_bits) + off;
 
 	set_buffer_mapped(bh);
-	bh->b_bdev = sb->s_bdev;
+	bh->b_bdev_file = sb->s_bdev_file;
 	bh->b_blocknr = lbo >> sb->s_blocksize_bits;
 
 	valid = ni->i_valid;
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 9153dffde950..fb426cf1887a 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1632,7 +1632,7 @@ void ntfs_unmap_meta(struct super_block *sb, CLST lcn, CLST len)
 		limit >>= 1;
 
 	while (blocks--) {
-		clean_bdev_aliases(bdev, devblock++, 1);
+		clean_bdev_aliases(sb->s_bdev_file, devblock++, 1);
 		if (cnt++ >= limit) {
 			sync_blockdev(bdev);
 			cnt = 0;
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index b82185075de7..e976950b0a2b 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2459,7 +2459,7 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	else
 		get_block = ocfs2_dio_wr_get_block;
 
-	return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev,
+	return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev_file,
 				    iter, get_block,
 				    ocfs2_dio_end_io, 0);
 }
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 604fea3a26ff..4ad64997f3c7 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -1209,7 +1209,7 @@ static int ocfs2_force_read_journal(struct inode *inode)
 		}
 
 		for (i = 0; i < p_blocks; i++, p_blkno++) {
-			bh = __find_get_block(osb->sb->s_bdev, p_blkno,
+			bh = __find_get_block(osb->sb->s_bdev_file, p_blkno,
 					osb->sb->s_blocksize);
 			/* block not cached. */
 			if (!bh)
diff --git a/fs/reiserfs/fix_node.c b/fs/reiserfs/fix_node.c
index 6c13a8d9a73c..2b288b1539d9 100644
--- a/fs/reiserfs/fix_node.c
+++ b/fs/reiserfs/fix_node.c
@@ -2332,7 +2332,7 @@ static void tb_buffer_sanity_check(struct super_block *sb,
 				       "in tree %s[%d] (%b)",
 				       descr, level, bh);
 
-		if (bh->b_bdev != sb->s_bdev)
+		if (bh_bdev(bh) != sb->s_bdev)
 			reiserfs_panic(sb, "jmacd-4", "buffer has wrong "
 				       "device %s[%d] (%b)",
 				       descr, level, bh);
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 6474529c4253..11652650264c 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -618,7 +618,7 @@ static void reiserfs_end_buffer_io_sync(struct buffer_head *bh, int uptodate)
 	if (buffer_journaled(bh)) {
 		reiserfs_warning(NULL, "clm-2084",
 				 "pinned buffer %lu:%pg sent to disk",
-				 bh->b_blocknr, bh->b_bdev);
+				 bh->b_blocknr, bh_bdev(bh));
 	}
 	if (uptodate)
 		set_buffer_uptodate(bh);
@@ -2315,7 +2315,7 @@ static int journal_read_transaction(struct super_block *sb,
  * from other places.
  * Note: Do not use journal_getblk/sb_getblk functions here!
  */
-static struct buffer_head *reiserfs_breada(struct block_device *dev,
+static struct buffer_head *reiserfs_breada(struct file *f_dev,
 					   b_blocknr_t block, int bufsize,
 					   b_blocknr_t max_block)
 {
@@ -2324,7 +2324,7 @@ static struct buffer_head *reiserfs_breada(struct block_device *dev,
 	struct buffer_head *bh;
 	int i, j;
 
-	bh = __getblk(dev, block, bufsize);
+	bh = __getblk(f_dev, block, bufsize);
 	if (!bh || buffer_uptodate(bh))
 		return (bh);
 
@@ -2334,7 +2334,7 @@ static struct buffer_head *reiserfs_breada(struct block_device *dev,
 	bhlist[0] = bh;
 	j = 1;
 	for (i = 1; i < blocks; i++) {
-		bh = __getblk(dev, block + i, bufsize);
+		bh = __getblk(f_dev, block + i, bufsize);
 		if (!bh)
 			break;
 		if (buffer_uptodate(bh)) {
@@ -2447,7 +2447,7 @@ static int journal_read(struct super_block *sb)
 		 * device and journal device to be the same
 		 */
 		d_bh =
-		    reiserfs_breada(file_bdev(journal->j_bdev_file), cur_dblock,
+		    reiserfs_breada(journal->j_bdev_file, cur_dblock,
 				    sb->s_blocksize,
 				    SB_ONDISK_JOURNAL_1st_BLOCK(sb) +
 				    SB_ONDISK_JOURNAL_SIZE(sb));
diff --git a/fs/reiserfs/prints.c b/fs/reiserfs/prints.c
index 84a194b77f19..249a458b6e28 100644
--- a/fs/reiserfs/prints.c
+++ b/fs/reiserfs/prints.c
@@ -156,7 +156,7 @@ static int scnprintf_buffer_head(char *buf, size_t size, struct buffer_head *bh)
 {
 	return scnprintf(buf, size,
 			 "dev %pg, size %zd, blocknr %llu, count %d, state 0x%lx, page %p, (%s, %s, %s)",
-			 bh->b_bdev, bh->b_size,
+			 bh_bdev(bh), bh->b_size,
 			 (unsigned long long)bh->b_blocknr,
 			 atomic_read(&(bh->b_count)),
 			 bh->b_state, bh->b_page,
@@ -561,7 +561,7 @@ static int print_super_block(struct buffer_head *bh)
 		return 1;
 	}
 
-	printk("%pg\'s super block is in block %llu\n", bh->b_bdev,
+	printk("%pg\'s super block is in block %llu\n", bh_bdev(bh),
 	       (unsigned long long)bh->b_blocknr);
 	printk("Reiserfs version %s\n", version);
 	printk("Block count %u\n", sb_block_count(rs));
diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
index 0554903f42a9..0bf515815b5d 100644
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -2810,10 +2810,10 @@ struct reiserfs_journal_header {
 
 /* We need these to make journal.c code more readable */
 #define journal_find_get_block(s, block) __find_get_block(\
-		file_bdev(SB_JOURNAL(s)->j_bdev_file), block, s->s_blocksize)
-#define journal_getblk(s, block) __getblk(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
+		SB_JOURNAL(s)->j_bdev_file, block, s->s_blocksize)
+#define journal_getblk(s, block) __getblk(SB_JOURNAL(s)->j_bdev_file,\
 		block, s->s_blocksize)
-#define journal_bread(s, block) __bread(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
+#define journal_bread(s, block) __bread(SB_JOURNAL(s)->j_bdev_file,\
 		block, s->s_blocksize)
 
 enum reiserfs_bh_state_bits {
diff --git a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c
index 5faf702f8d15..23998f071d9c 100644
--- a/fs/reiserfs/stree.c
+++ b/fs/reiserfs/stree.c
@@ -331,7 +331,7 @@ static inline int key_in_buffer(
 	       || chk_path->path_length > MAX_HEIGHT,
 	       "PAP-5050: pointer to the key(%p) is NULL or invalid path length(%d)",
 	       key, chk_path->path_length);
-	RFALSE(!PATH_PLAST_BUFFER(chk_path)->b_bdev,
+	RFALSE(!bh_bdev(PATH_PLAST_BUFFER(chk_path)),
 	       "PAP-5060: device must not be NODEV");
 
 	if (comp_keys(get_lkey(chk_path, sb), key) == 1)
diff --git a/fs/reiserfs/tail_conversion.c b/fs/reiserfs/tail_conversion.c
index 2cec61af2a9e..f38dfae74e32 100644
--- a/fs/reiserfs/tail_conversion.c
+++ b/fs/reiserfs/tail_conversion.c
@@ -187,7 +187,7 @@ void reiserfs_unmap_buffer(struct buffer_head *bh)
 	clear_buffer_mapped(bh);
 	clear_buffer_req(bh);
 	clear_buffer_new(bh);
-	bh->b_bdev = NULL;
+	bh->b_bdev_file = NULL;
 	unlock_buffer(bh);
 }
 
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 18c8f168b153..c06d41bbb919 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -125,7 +125,7 @@ xfs_bmbt_to_iomap(
 	if (mapping_flags & IOMAP_DAX)
 		iomap->dax_dev = target->bt_daxdev;
 	else
-		iomap->bdev = target->bt_bdev;
+		iomap->bdev_file = target->bt_bdev_file;
 	iomap->flags = iomap_flags;
 
 	if (xfs_ipincount(ip) &&
@@ -150,7 +150,7 @@ xfs_hole_to_iomap(
 	iomap->type = IOMAP_HOLE;
 	iomap->offset = XFS_FSB_TO_B(ip->i_mount, offset_fsb);
 	iomap->length = XFS_FSB_TO_B(ip->i_mount, end_fsb - offset_fsb);
-	iomap->bdev = target->bt_bdev;
+	iomap->bdev_file = target->bt_bdev_file;
 	iomap->dax_dev = target->bt_daxdev;
 }
 
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 6ab2318a9c8e..e8dd9125213a 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -38,7 +38,7 @@ static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset,
 	 * act as if there is a hole up to the file maximum size.
 	 */
 	mutex_lock(&zi->i_truncate_mutex);
-	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->bdev_file = inode->i_sb->s_bdev_file;
 	iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
 	isize = i_size_read(inode);
 	if (iomap->offset >= isize) {
@@ -88,7 +88,7 @@ static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset,
 	 * write pointer) and unwriten beyond.
 	 */
 	mutex_lock(&zi->i_truncate_mutex);
-	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->bdev_file = inode->i_sb->s_bdev_file;
 	iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
 	iomap->addr = (z->z_sector << SECTOR_SHIFT) + iomap->offset;
 	isize = i_size_read(inode);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index f288c94374b3..f51ff7261c4d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -49,7 +49,6 @@ struct block_device {
 	bool			bd_write_holder;
 	bool			bd_has_submit_bio;
 	dev_t			bd_dev;
-	struct inode		*bd_inode;	/* will die */
 
 	atomic_t		bd_openers;
 	spinlock_t		bd_size_lock; /* for bd_inode->i_size updates */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4b7080e56e44..b08289492e51 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -212,9 +212,11 @@ struct gendisk {
 	struct blk_independent_access_ranges *ia_ranges;
 };
 
+struct inode *bdev_inode(struct block_device *bdev);
+
 static inline bool disk_live(struct gendisk *disk)
 {
-	return !inode_unhashed(disk->part0->bd_inode);
+	return !inode_unhashed(bdev_inode(disk->part0));
 }
 
 /**
@@ -1319,7 +1321,7 @@ static inline unsigned int blksize_bits(unsigned int size)
 
 static inline unsigned int block_size(struct block_device *bdev)
 {
-	return 1 << bdev->bd_inode->i_blkbits;
+	return 1 << bdev_inode(bdev)->i_blkbits;
 }
 
 int kblockd_schedule_work(struct work_struct *work);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index d78454a4dd1f..d41a55005515 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -9,6 +9,7 @@
 #define _LINUX_BUFFER_HEAD_H
 
 #include <linux/types.h>
+#include <linux/blkdev.h>
 #include <linux/blk_types.h>
 #include <linux/fs.h>
 #include <linux/linkage.h>
@@ -34,6 +35,7 @@ enum bh_state_bits {
 	BH_Meta,	/* Buffer contains metadata */
 	BH_Prio,	/* Buffer should be submitted with REQ_PRIO */
 	BH_Defer_Completion, /* Defer AIO completion to workqueue */
+	BH_Bdev,
 
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities
@@ -68,7 +70,10 @@ struct buffer_head {
 	size_t b_size;			/* size of mapping */
 	char *b_data;			/* pointer to data within the page */
 
-	struct block_device *b_bdev;
+	union {
+		struct file *b_bdev_file;
+		struct block_device *b_bdev;
+	};
 	bh_end_io_t *b_end_io;		/* I/O completion */
  	void *b_private;		/* reserved for b_end_io */
 	struct list_head b_assoc_buffers; /* associated with another mapping */
@@ -135,6 +140,14 @@ BUFFER_FNS(Unwritten, unwritten)
 BUFFER_FNS(Meta, meta)
 BUFFER_FNS(Prio, prio)
 BUFFER_FNS(Defer_Completion, defer_completion)
+BUFFER_FNS(Bdev, bdev)
+
+static __always_inline struct block_device *bh_bdev(struct buffer_head *bh)
+{
+	if (buffer_bdev(bh))
+		return bh->b_bdev;
+	return file_bdev(bh->b_bdev_file);
+}
 
 static __always_inline void set_buffer_uptodate(struct buffer_head *bh)
 {
@@ -212,24 +225,31 @@ int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end,
 				  bool datasync);
 int generic_buffers_fsync(struct file *file, loff_t start, loff_t end,
 			  bool datasync);
-void clean_bdev_aliases(struct block_device *bdev, sector_t block,
-			sector_t len);
+void __clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len);
+static inline void clean_bdev_aliases(struct file *bdev_file, sector_t block,
+				      sector_t len)
+{
+	__clean_bdev_aliases(file_bdev(bdev_file), block, len);
+}
 static inline void clean_bdev_bh_alias(struct buffer_head *bh)
 {
-	clean_bdev_aliases(bh->b_bdev, bh->b_blocknr, 1);
+	if (buffer_bdev(bh))
+		__clean_bdev_aliases(bh->b_bdev, bh->b_blocknr, 1);
+	else
+		__clean_bdev_aliases(file_bdev(bh->b_bdev_file), bh->b_blocknr, 1);
 }
 
 void mark_buffer_async_write(struct buffer_head *bh);
 void __wait_on_buffer(struct buffer_head *);
 wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
-struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
+struct buffer_head *__find_get_block(struct file *bdev_file, sector_t block,
 			unsigned size);
-struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block,
+struct buffer_head *bdev_getblk(struct file *bdev_file, sector_t block,
 		unsigned size, gfp_t gfp);
 void __brelse(struct buffer_head *);
 void __bforget(struct buffer_head *);
-void __breadahead(struct block_device *, sector_t block, unsigned int size);
-struct buffer_head *__bread_gfp(struct block_device *,
+void __breadahead(struct file *, sector_t block, unsigned int size);
+struct buffer_head *__bread_gfp(struct file *,
 				sector_t block, unsigned size, gfp_t gfp);
 struct buffer_head *alloc_buffer_head(gfp_t gfp_flags);
 void free_buffer_head(struct buffer_head * bh);
@@ -239,7 +259,7 @@ int sync_dirty_buffer(struct buffer_head *bh);
 int __sync_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags);
 void write_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags);
 void submit_bh(blk_opf_t, struct buffer_head *);
-void write_boundary_block(struct block_device *bdev,
+void write_boundary_block(struct file *bdev_file,
 			sector_t bblock, unsigned blocksize);
 int bh_uptodate_or_lock(struct buffer_head *bh);
 int __bh_read(struct buffer_head *bh, blk_opf_t op_flags, bool wait);
@@ -318,66 +338,66 @@ static inline void bforget(struct buffer_head *bh)
 static inline struct buffer_head *
 sb_bread(struct super_block *sb, sector_t block)
 {
-	return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
+	return __bread_gfp(sb->s_bdev_file, block, sb->s_blocksize, __GFP_MOVABLE);
 }
 
 static inline struct buffer_head *
 sb_bread_unmovable(struct super_block *sb, sector_t block)
 {
-	return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, 0);
+	return __bread_gfp(sb->s_bdev_file, block, sb->s_blocksize, 0);
 }
 
 static inline void
 sb_breadahead(struct super_block *sb, sector_t block)
 {
-	__breadahead(sb->s_bdev, block, sb->s_blocksize);
+	__breadahead(sb->s_bdev_file, block, sb->s_blocksize);
 }
 
-static inline struct buffer_head *getblk_unmovable(struct block_device *bdev,
+static inline struct buffer_head *getblk_unmovable(struct file *bdev_file,
 		sector_t block, unsigned size)
 {
 	gfp_t gfp;
 
-	gfp = mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
+	gfp = mapping_gfp_constraint(bdev_file_inode(bdev_file)->i_mapping, ~__GFP_FS);
 	gfp |= __GFP_NOFAIL;
 
-	return bdev_getblk(bdev, block, size, gfp);
+	return bdev_getblk(bdev_file, block, size, gfp);
 }
 
-static inline struct buffer_head *__getblk(struct block_device *bdev,
+static inline struct buffer_head *__getblk(struct file *bdev_file,
 		sector_t block, unsigned size)
 {
 	gfp_t gfp;
 
-	gfp = mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
+	gfp = mapping_gfp_constraint(bdev_file_inode(bdev_file)->i_mapping, ~__GFP_FS);
 	gfp |= __GFP_MOVABLE | __GFP_NOFAIL;
 
-	return bdev_getblk(bdev, block, size, gfp);
+	return bdev_getblk(bdev_file, block, size, gfp);
 }
 
 static inline struct buffer_head *sb_getblk(struct super_block *sb,
 		sector_t block)
 {
-	return __getblk(sb->s_bdev, block, sb->s_blocksize);
+	return __getblk(sb->s_bdev_file, block, sb->s_blocksize);
 }
 
 static inline struct buffer_head *sb_getblk_gfp(struct super_block *sb,
 		sector_t block, gfp_t gfp)
 {
-	return bdev_getblk(sb->s_bdev, block, sb->s_blocksize, gfp);
+	return bdev_getblk(sb->s_bdev_file, block, sb->s_blocksize, gfp);
 }
 
 static inline struct buffer_head *
 sb_find_get_block(struct super_block *sb, sector_t block)
 {
-	return __find_get_block(sb->s_bdev, block, sb->s_blocksize);
+	return __find_get_block(sb->s_bdev_file, block, sb->s_blocksize);
 }
 
 static inline void
 map_bh(struct buffer_head *bh, struct super_block *sb, sector_t block)
 {
 	set_buffer_mapped(bh);
-	bh->b_bdev = sb->s_bdev;
+	bh->b_bdev_file = sb->s_bdev_file;
 	bh->b_blocknr = block;
 	bh->b_size = sb->s_blocksize;
 }
@@ -447,9 +467,9 @@ static inline void bh_readahead_batch(int nr, struct buffer_head *bhs[],
  *  It returns NULL if the block was unreadable.
  */
 static inline struct buffer_head *
-__bread(struct block_device *bdev, sector_t block, unsigned size)
+__bread(struct file *bdev_file, sector_t block, unsigned size)
 {
-	return __bread_gfp(bdev, block, size, __GFP_MOVABLE);
+	return __bread_gfp(bdev_file, block, size, __GFP_MOVABLE);
 }
 
 /**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e0714d35d9b..039df1dacf7d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3105,7 +3105,7 @@ enum {
 };
 
 ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
-			     struct block_device *bdev, struct iov_iter *iter,
+			     struct file *bdev_file, struct iov_iter *iter,
 			     get_block_t get_block,
 			     dio_iodone_t end_io,
 			     int flags);
@@ -3115,7 +3115,7 @@ static inline ssize_t blockdev_direct_IO(struct kiocb *iocb,
 					 struct iov_iter *iter,
 					 get_block_t get_block)
 {
-	return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev, iter,
+	return __blockdev_direct_IO(iocb, inode, inode->i_sb->s_bdev_file, iter,
 			get_block, NULL, DIO_LOCKING | DIO_SKIP_HOLES);
 }
 #endif
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 96dd0acbba44..a99b27b290e8 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -77,6 +77,7 @@ struct vm_fault;
  */
 #define IOMAP_F_SIZE_CHANGED	(1U << 8)
 #define IOMAP_F_STALE		(1U << 9)
+#define IOMAP_F_BDEV		(1U << 10)
 
 /*
  * Flags from 0x1000 up are for file system specific usage:
@@ -97,7 +98,10 @@ struct iomap {
 	u64			length;	/* length of mapping, bytes */
 	u16			type;	/* type of mapping */
 	u16			flags;	/* flags for mapping */
-	struct block_device	*bdev;	/* block device for I/O */
+	union {
+		struct file		*bdev_file;
+		struct block_device	*bdev;
+	};
 	struct dax_device	*dax_dev; /* dax_dev for dax operations */
 	void			*inline_data;
 	void			*private; /* filesystem private */
@@ -105,6 +109,13 @@ struct iomap {
 	u64			validity_cookie; /* used with .iomap_valid() */
 };
 
+static inline struct block_device *iomap_bdev(const struct iomap *iomap)
+{
+	if (iomap->flags & IOMAP_F_BDEV)
+		return iomap->bdev;
+	return file_bdev(iomap->bdev_file);
+}
+
 static inline sector_t iomap_sector(const struct iomap *iomap, loff_t pos)
 {
 	return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 971f3e826e15..3a68308674ad 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -967,6 +967,7 @@ struct journal_s
 	 * @j_dev: Device where we store the journal.
 	 */
 	struct block_device	*j_dev;
+	struct file		*j_bdev_file;
 
 	/**
 	 * @j_blocksize: Block size for the location where we store the journal.
@@ -992,6 +993,7 @@ struct journal_s
 	 * equal to j_dev.
 	 */
 	struct block_device	*j_fs_dev;
+	struct file		*j_fs_bdev_file;
 
 	/**
 	 * @j_fs_dev_wb_err:
@@ -1533,8 +1535,8 @@ extern void	 jbd2_journal_unlock_updates (journal_t *);
 
 void jbd2_journal_wait_updates(journal_t *);
 
-extern journal_t * jbd2_journal_init_dev(struct block_device *bdev,
-				struct block_device *fs_dev,
+extern journal_t * jbd2_journal_init_dev(struct file *bdev_file,
+				struct file *fs_dev,
 				unsigned long long start, int len, int bsize);
 extern journal_t * jbd2_journal_init_inode (struct inode *);
 extern int	   jbd2_journal_update_format (journal_t *);
@@ -1696,7 +1698,7 @@ static inline void jbd2_journal_abort_handle(handle_t *handle)
 
 static inline void jbd2_init_fs_dev_write_error(journal_t *journal)
 {
-	struct address_space *mapping = journal->j_fs_dev->bd_inode->i_mapping;
+	struct address_space *mapping = journal->j_fs_bdev_file->f_mapping;
 
 	/*
 	 * Save the original wb_err value of client fs's bdev mapping which
@@ -1707,7 +1709,7 @@ static inline void jbd2_init_fs_dev_write_error(journal_t *journal)
 
 static inline int jbd2_check_fs_dev_write_error(journal_t *journal)
 {
-	struct address_space *mapping = journal->j_fs_dev->bd_inode->i_mapping;
+	struct address_space *mapping = journal->j_fs_bdev_file->f_mapping;
 
 	return errseq_check(&mapping->wb_err,
 			    READ_ONCE(journal->j_fs_dev_wb_err));
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 0e128ad51460..95d3ed978864 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -26,7 +26,7 @@ DECLARE_EVENT_CLASS(block_buffer,
 	),
 
 	TP_fast_assign(
-		__entry->dev		= bh->b_bdev->bd_dev;
+		__entry->dev		= bh_bdev(bh)->bd_dev;
 		__entry->sector		= bh->b_blocknr;
 		__entry->size		= bh->b_size;
 	),

-- 
2.43.0
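
The buffer_head change above keeps b_bdev and b_bdev_file in one union and uses the new BH_Bdev state bit to say which member is valid. Below is a minimal user-space sketch of that accessor pattern. It is not the kernel code: the struct layouts are simplified stand-ins, the BH_BDEV mask is a made-up name for the bit, and file_bdev() resolving through private_data is an assumption, since its real body is not part of this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, reduced to what the sketch needs. */
struct block_device { unsigned int bd_dev; };
struct file { void *private_data; };	/* assumption: carries the block device */

enum { BH_BDEV = 1u << 0 };		/* models the BH_Bdev state bit */

struct buffer_head {
	unsigned long b_state;
	union {
		struct file *b_bdev_file;	/* valid while BH_BDEV is clear */
		struct block_device *b_bdev;	/* valid while BH_BDEV is set */
	};
};

/* Assumed resolution from a block device file to its block device. */
static struct block_device *file_bdev(struct file *f)
{
	assert(f != NULL);
	return f->private_data;
}

/* Models the buffer_bdev() test generated by BUFFER_FNS(Bdev, bdev). */
static bool buffer_bdev(const struct buffer_head *bh)
{
	return (bh->b_state & BH_BDEV) != 0;
}

/* Same shape as bh_bdev() in the patch: choose the union member by flag. */
static struct block_device *bh_bdev(struct buffer_head *bh)
{
	if (buffer_bdev(bh))
		return bh->b_bdev;
	return file_bdev(bh->b_bdev_file);
}
```

Callers that only need the block device go through bh_bdev() and never read the union directly, which is why the reiserfs and tracepoint hunks switch from bh->b_bdev to bh_bdev(bh). iomap_bdev() applies the same idea with IOMAP_F_BDEV as the discriminator.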



* [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 10:56 ` [PATCH RFC 0/2] fs & block: remove bd_inode Christian Brauner
  2024-01-29 10:56   ` [PATCH RFC 1/2] fs & block: remove bdev->bd_inode Christian Brauner
@ 2024-01-29 10:56   ` Christian Brauner
  2024-01-29 14:37     ` Christoph Hellwig
  1 sibling, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 10:56 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Christian Brauner, Darrick J. Wong, linux-fsdevel, linux-block

There are a few places that still use bdev->bd_inode. They no longer
need to, as they can use the bdev file and bdev_file_inode() instead.
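
As a rough illustration of the conversion, here is a user-space sketch of the two ways to reach the page cache mapping. It assumes, as the cover letter states, that file->f_mapping->host is the block device inode. The struct layouts and the bd_inode_model field are hypothetical stand-ins, and mapping_via_bdev() and mapping_via_file() are made-up names for the old and new call patterns.

```c
#include <assert.h>
#include <stddef.h>

struct inode;
struct address_space { struct inode *host; };
struct inode { struct address_space *i_mapping; long long i_size; };

/* Hypothetical: how the real block_device finds its inode is not shown here. */
struct block_device { struct inode *bd_inode_model; };
struct file { struct address_space *f_mapping; };

/* Stand-in for bdev_inode(), whose real body is not part of this patch. */
static struct inode *bdev_inode(struct block_device *bdev)
{
	return bdev->bd_inode_model;
}

/* Assumed behaviour of bdev_file_inode(): the inode hosting the file's mapping. */
static struct inode *bdev_file_inode(struct file *bdev_file)
{
	assert(bdev_file != NULL && bdev_file->f_mapping != NULL);
	return bdev_file->f_mapping->host;
}

/* Old pattern: go from the block device to its inode, then to the mapping. */
static struct address_space *mapping_via_bdev(struct block_device *bdev)
{
	return bdev_inode(bdev)->i_mapping;
}

/* New pattern: the opened block device file already carries the mapping. */
static struct address_space *mapping_via_file(struct file *bdev_file)
{
	return bdev_file->f_mapping;
}
```

Under these assumptions both patterns yield the same address_space, so a caller that already holds the bdev file can drop the detour through the block device.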

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 drivers/md/bcache/super.c       |  7 ++++---
 drivers/mtd/devices/block2mtd.c |  4 ++--
 fs/bcachefs/util.h              |  5 -----
 fs/btrfs/dev-replace.c          |  2 +-
 fs/btrfs/disk-io.c              | 17 +++++++++--------
 fs/btrfs/disk-io.h              |  4 ++--
 fs/btrfs/super.c                |  2 +-
 fs/btrfs/volumes.c              | 26 ++++++++++++++------------
 fs/btrfs/volumes.h              |  2 +-
 fs/btrfs/zoned.c                | 18 ++++++++++--------
 fs/btrfs/zoned.h                |  4 ++--
 11 files changed, 46 insertions(+), 45 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 8971e769d5e7..48af785d8cd7 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -163,15 +163,16 @@ static const char *read_super_common(struct cache_sb *sb,  struct block_device *
 }
 
 
-static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
+static const char *read_super(struct cache_sb *sb, struct file *bdev_file,
 			      struct cache_sb_disk **res)
 {
 	const char *err;
 	struct cache_sb_disk *s;
 	struct page *page;
 	unsigned int i;
+	struct block_device *bdev = file_bdev(bdev_file);
 
-	page = read_cache_page_gfp(bdev_inode(bdev)->i_mapping,
+	page = read_cache_page_gfp(bdev_file->f_mapping,
 				   SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
 	if (IS_ERR(page))
 		return "IO error";
@@ -2557,7 +2558,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 	if (set_blocksize(file_bdev(bdev_file), 4096))
 		goto out_blkdev_put;
 
-	err = read_super(sb, file_bdev(bdev_file), &sb_disk);
+	err = read_super(sb, bdev_file, &sb_disk);
 	if (err)
 		goto out_blkdev_put;
 
diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index dc3df3a600cf..b8c224bf4b66 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -291,7 +291,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 		goto err_free_block2mtd;
 	}
 
-	if ((long)bdev_inode(bdev)->i_size % erase_size) {
+	if ((long)bdev_file_inode(bdev_file)->i_size % erase_size) {
 		pr_err("erasesize must be a divisor of device size\n");
 		goto err_free_block2mtd;
 	}
@@ -309,7 +309,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 
 	dev->mtd.name = name;
 
-	dev->mtd.size = bdev_inode(bdev)->i_size & PAGE_MASK;
+	dev->mtd.size = bdev_file_inode(bdev_file)->i_size & PAGE_MASK;
 	dev->mtd.erasesize = erase_size;
 	dev->mtd.writesize = 1;
 	dev->mtd.writebufsize = PAGE_SIZE;
diff --git a/fs/bcachefs/util.h b/fs/bcachefs/util.h
index 5ab765d056d6..ed869d67bd85 100644
--- a/fs/bcachefs/util.h
+++ b/fs/bcachefs/util.h
@@ -552,11 +552,6 @@ static inline unsigned fract_exp_two(unsigned x, unsigned fract_bits)
 void bch2_bio_map(struct bio *bio, void *base, size_t);
 int bch2_bio_alloc_pages(struct bio *, size_t, gfp_t);
 
-static inline sector_t bdev_sectors(struct block_device *bdev)
-{
-	return bdev_inode(bdev)->i_size >> 9;
-}
-
 #define closure_bio_submit(bio, cl)					\
 do {									\
 	closure_get(cl);						\
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 2eb11fe4bd05..bd5498d2a187 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -984,7 +984,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	btrfs_sysfs_remove_device(src_device);
 	btrfs_sysfs_update_devid(tgt_device);
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &src_device->dev_state))
-		btrfs_scratch_superblocks(fs_info, src_device->bdev,
+		btrfs_scratch_superblocks(fs_info, src_device->bdev_file,
 					  src_device->name->str);
 
 	/* write back the superblocks */
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7d5d022b0bde..8a652374fa51 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3222,7 +3222,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	/*
 	 * Read super block and check the signature bytes only
 	 */
-	disk_super = btrfs_read_dev_super(fs_devices->latest_dev->bdev);
+	disk_super = btrfs_read_dev_super(fs_devices->latest_dev->bdev_file);
 	if (IS_ERR(disk_super)) {
 		ret = PTR_ERR(disk_super);
 		goto fail_alloc;
@@ -3633,17 +3633,18 @@ static void btrfs_end_super_write(struct bio *bio)
 	bio_put(bio);
 }
 
-struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
+struct btrfs_super_block *btrfs_read_dev_one_super(struct file *bdev_file,
 						   int copy_num, bool drop_cache)
 {
 	struct btrfs_super_block *super;
+	struct block_device *bdev = file_bdev(bdev_file);
 	struct page *page;
 	u64 bytenr, bytenr_orig;
-	struct address_space *mapping = bdev_inode(bdev)->i_mapping;
+	struct address_space *mapping = bdev_file->f_mapping;
 	int ret;
 
 	bytenr_orig = btrfs_sb_offset(copy_num);
-	ret = btrfs_sb_log_location_bdev(bdev, copy_num, READ, &bytenr);
+	ret = btrfs_sb_log_location_bdev(bdev_file, copy_num, READ, &bytenr);
 	if (ret == -ENOENT)
 		return ERR_PTR(-EINVAL);
 	else if (ret)
@@ -3684,7 +3685,7 @@ struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
 }
 
 
-struct btrfs_super_block *btrfs_read_dev_super(struct block_device *bdev)
+struct btrfs_super_block *btrfs_read_dev_super(struct file *bdev_file)
 {
 	struct btrfs_super_block *super, *latest = NULL;
 	int i;
@@ -3696,7 +3697,7 @@ struct btrfs_super_block *btrfs_read_dev_super(struct block_device *bdev)
 	 * later supers, using BTRFS_SUPER_MIRROR_MAX instead
 	 */
 	for (i = 0; i < 1; i++) {
-		super = btrfs_read_dev_one_super(bdev, i, false);
+		super = btrfs_read_dev_one_super(bdev_file, i, false);
 		if (IS_ERR(super))
 			continue;
 
@@ -3726,7 +3727,7 @@ static int write_dev_supers(struct btrfs_device *device,
 			    struct btrfs_super_block *sb, int max_mirrors)
 {
 	struct btrfs_fs_info *fs_info = device->fs_info;
-	struct address_space *mapping = bdev_inode(device->bdev)->i_mapping;
+	struct address_space *mapping = device->bdev_file->f_mapping;
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
 	int i;
 	int errors = 0;
@@ -3843,7 +3844,7 @@ static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
 		    device->commit_total_bytes)
 			break;
 
-		page = find_get_page(bdev_inode(device->bdev)->i_mapping,
+		page = find_get_page(device->bdev_file->f_mapping,
 				     bytenr >> PAGE_SHIFT);
 		if (!page) {
 			errors++;
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 9413726b329b..0e4494ffd7a1 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -48,8 +48,8 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info,
 			 struct btrfs_super_block *sb, int mirror_num);
 int btrfs_check_features(struct btrfs_fs_info *fs_info, bool is_rw_mount);
 int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors);
-struct btrfs_super_block *btrfs_read_dev_super(struct block_device *bdev);
-struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
+struct btrfs_super_block *btrfs_read_dev_super(struct file *bdev_file);
+struct btrfs_super_block *btrfs_read_dev_one_super(struct file *bdev_file,
 						   int copy_num, bool drop_cache);
 int btrfs_commit_super(struct btrfs_fs_info *fs_info);
 struct btrfs_root *btrfs_read_tree_root(struct btrfs_root *tree_root,
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 896acfda1789..ffa4d0ea6b62 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2280,7 +2280,7 @@ static int check_dev_super(struct btrfs_device *dev)
 		return 0;
 
 	/* Only need to check the primary super block. */
-	sb = btrfs_read_dev_one_super(dev->bdev, 0, true);
+	sb = btrfs_read_dev_one_super(dev->bdev_file, 0, true);
 	if (IS_ERR(sb))
 		return PTR_ERR(sb);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1f12122ae7ce..50d43a0deafe 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -490,7 +490,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 		goto error;
 	}
 	invalidate_bdev(bdev);
-	*disk_super = btrfs_read_dev_super(bdev);
+	*disk_super = btrfs_read_dev_super(*bdev_file);
 	if (IS_ERR(*disk_super)) {
 		ret = PTR_ERR(*disk_super);
 		fput(*bdev_file);
@@ -1246,10 +1246,11 @@ void btrfs_release_disk_super(struct btrfs_super_block *super)
 	put_page(page);
 }
 
-static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
+static struct btrfs_super_block *btrfs_read_disk_super(struct file *bdev_file,
 						       u64 bytenr, u64 bytenr_orig)
 {
 	struct btrfs_super_block *disk_super;
+	struct block_device *bdev = file_bdev(bdev_file);
 	struct page *page;
 	void *p;
 	pgoff_t index;
@@ -1268,7 +1269,7 @@ static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev
 		return ERR_PTR(-EINVAL);
 
 	/* pull in the page with our super */
-	page = read_cache_page_gfp(bdev_inode(bdev)->i_mapping, index, GFP_KERNEL);
+	page = read_cache_page_gfp(bdev_file->f_mapping, index, GFP_KERNEL);
 
 	if (IS_ERR(page))
 		return ERR_CAST(page);
@@ -1344,14 +1345,13 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 		return ERR_CAST(bdev_file);
 
 	bytenr_orig = btrfs_sb_offset(0);
-	ret = btrfs_sb_log_location_bdev(file_bdev(bdev_file), 0, READ, &bytenr);
+	ret = btrfs_sb_log_location_bdev(bdev_file, 0, READ, &bytenr);
 	if (ret) {
 		device = ERR_PTR(ret);
 		goto error_bdev_put;
 	}
 
-	disk_super = btrfs_read_disk_super(file_bdev(bdev_file), bytenr,
-					   bytenr_orig);
+	disk_super = btrfs_read_disk_super(bdev_file, bytenr, bytenr_orig);
 	if (IS_ERR(disk_super)) {
 		device = ERR_CAST(disk_super);
 		goto error_bdev_put;
@@ -2011,14 +2011,15 @@ static u64 btrfs_num_devices(struct btrfs_fs_info *fs_info)
 }
 
 static void btrfs_scratch_superblock(struct btrfs_fs_info *fs_info,
-				     struct block_device *bdev, int copy_num)
+				     struct file *bdev_file, int copy_num)
 {
+	struct block_device *bdev = file_bdev(bdev_file);
 	struct btrfs_super_block *disk_super;
 	const size_t len = sizeof(disk_super->magic);
 	const u64 bytenr = btrfs_sb_offset(copy_num);
 	int ret;
 
-	disk_super = btrfs_read_disk_super(bdev, bytenr, bytenr);
+	disk_super = btrfs_read_disk_super(bdev_file, bytenr, bytenr);
 	if (IS_ERR(disk_super))
 		return;
 
@@ -2033,10 +2034,11 @@ static void btrfs_scratch_superblock(struct btrfs_fs_info *fs_info,
 }
 
 void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info,
-			       struct block_device *bdev,
+			       struct file *bdev_file,
 			       const char *device_path)
 {
 	int copy_num;
+	struct block_device *bdev = file_bdev(bdev_file);
 
 	if (!bdev)
 		return;
@@ -2045,7 +2047,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info,
 		if (bdev_is_zoned(bdev))
 			btrfs_reset_sb_log_zones(bdev, copy_num);
 		else
-			btrfs_scratch_superblock(fs_info, bdev, copy_num);
+			btrfs_scratch_superblock(fs_info, bdev_file, copy_num);
 	}
 
 	/* Notify udev that device has changed */
@@ -2187,7 +2189,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
 	 *  just flush the device and let the caller do the final bdev_release.
 	 */
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
-		btrfs_scratch_superblocks(fs_info, device->bdev,
+		btrfs_scratch_superblocks(fs_info, device->bdev_file,
 					  device->name->str);
 		if (device->bdev) {
 			sync_blockdev(device->bdev);
@@ -2301,7 +2303,7 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_device *tgtdev)
 
 	mutex_unlock(&fs_devices->device_list_mutex);
 
-	btrfs_scratch_superblocks(tgtdev->fs_info, tgtdev->bdev,
+	btrfs_scratch_superblocks(tgtdev->fs_info, tgtdev->bdev_file,
 				  tgtdev->name->str);
 
 	btrfs_close_bdev(tgtdev);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index a11854912d53..8b2a98a0459f 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -781,7 +781,7 @@ struct list_head * __attribute_const__ btrfs_get_fs_uuids(void);
 bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info,
 					struct btrfs_device *failing_dev);
 void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info,
-			       struct block_device *bdev,
+			       struct file *bdev_file,
 			       const char *device_path);
 
 enum btrfs_raid_types __attribute_const__ btrfs_bg_flags_to_raid_index(u64 flags);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 42893771532f..d5d200f1a078 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -83,7 +83,7 @@ static int copy_zone_info_cb(struct blk_zone *zone, unsigned int idx, void *data
 	return 0;
 }
 
-static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
+static int sb_write_pointer(struct file *bdev_file, struct blk_zone *zones,
 			    u64 *wp_ret)
 {
 	bool empty[BTRFS_NR_SB_LOG_ZONES];
@@ -120,7 +120,7 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
 		return -ENOENT;
 	} else if (full[0] && full[1]) {
 		/* Compare two super blocks */
-		struct address_space *mapping = bdev_inode(bdev)->i_mapping;
+		struct address_space *mapping = bdev_file->f_mapping;
 		struct page *page[BTRFS_NR_SB_LOG_ZONES];
 		struct btrfs_super_block *super[BTRFS_NR_SB_LOG_ZONES];
 		int i;
@@ -564,7 +564,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
 		    BLK_ZONE_TYPE_CONVENTIONAL)
 			continue;
 
-		ret = sb_write_pointer(device->bdev,
+		ret = sb_write_pointer(device->bdev_file,
 				       &zone_info->sb_zones[sb_pos], &sb_wp);
 		if (ret != -ENOENT && ret) {
 			btrfs_err_in_rcu(device->fs_info,
@@ -800,18 +800,19 @@ int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info, unsigned long *mount
 	return 0;
 }
 
-static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
+static int sb_log_location(struct file *bdev_file, struct blk_zone *zones,
 			   int rw, u64 *bytenr_ret)
 {
 	u64 wp;
 	int ret;
+	struct block_device *bdev = file_bdev(bdev_file);
 
 	if (zones[0].type == BLK_ZONE_TYPE_CONVENTIONAL) {
 		*bytenr_ret = zones[0].start << SECTOR_SHIFT;
 		return 0;
 	}
 
-	ret = sb_write_pointer(bdev, zones, &wp);
+	ret = sb_write_pointer(bdev_file, zones, &wp);
 	if (ret != -ENOENT && ret < 0)
 		return ret;
 
@@ -858,10 +859,11 @@ static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
 
 }
 
-int btrfs_sb_log_location_bdev(struct block_device *bdev, int mirror, int rw,
+int btrfs_sb_log_location_bdev(struct file *bdev_file, int mirror, int rw,
 			       u64 *bytenr_ret)
 {
 	struct blk_zone zones[BTRFS_NR_SB_LOG_ZONES];
+	struct block_device *bdev = file_bdev(bdev_file);
 	sector_t zone_sectors;
 	u32 sb_zone;
 	int ret;
@@ -895,7 +897,7 @@ int btrfs_sb_log_location_bdev(struct block_device *bdev, int mirror, int rw,
 	if (ret != BTRFS_NR_SB_LOG_ZONES)
 		return -EIO;
 
-	return sb_log_location(bdev, zones, rw, bytenr_ret);
+	return sb_log_location(bdev_file, zones, rw, bytenr_ret);
 }
 
 int btrfs_sb_log_location(struct btrfs_device *device, int mirror, int rw,
@@ -919,7 +921,7 @@ int btrfs_sb_log_location(struct btrfs_device *device, int mirror, int rw,
 	if (zone_num + 1 >= zinfo->nr_zones)
 		return -ENOENT;
 
-	return sb_log_location(device->bdev,
+	return sb_log_location(device->bdev_file,
 			       &zinfo->sb_zones[BTRFS_NR_SB_LOG_ZONES * mirror],
 			       rw, bytenr_ret);
 }
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index f573bda496fb..225d1c26d955 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -46,7 +46,7 @@ void btrfs_destroy_dev_zone_info(struct btrfs_device *device);
 struct btrfs_zoned_device_info *btrfs_clone_dev_zone_info(struct btrfs_device *orig_dev);
 int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info);
 int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info, unsigned long *mount_opt);
-int btrfs_sb_log_location_bdev(struct block_device *bdev, int mirror, int rw,
+int btrfs_sb_log_location_bdev(struct file *bdev_file, int mirror, int rw,
 			       u64 *bytenr_ret);
 int btrfs_sb_log_location(struct btrfs_device *device, int mirror, int rw,
 			  u64 *bytenr_ret);
@@ -127,7 +127,7 @@ static inline int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info,
 	return 0;
 }
 
-static inline int btrfs_sb_log_location_bdev(struct block_device *bdev,
+static inline int btrfs_sb_log_location_bdev(struct file *bdev_file,
 					     int mirror, int rw, u64 *bytenr_ret)
 {
 	*bytenr_ret = btrfs_sb_offset(mirror);

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 10:56   ` [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers Christian Brauner
@ 2024-01-29 14:37     ` Christoph Hellwig
  2024-01-29 15:29       ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 14:37 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Most of these really should be using proper high level APIs.  The
last round of work on this is here:

https://lore.kernel.org/linux-nilfs/4b11a311-c121-1f44-0ccf-a3966a396994@huaweicloud.com/


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 14:37     ` Christoph Hellwig
@ 2024-01-29 15:29       ` Christian Brauner
  2024-01-29 15:36         ` Christoph Hellwig
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 15:29 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 03:37:09PM +0100, Christoph Hellwig wrote:
> Most of these really should be using proper high level APIs.  The
> last round of work on this is here:
> 
> https://lore.kernel.org/linux-nilfs/4b11a311-c121-1f44-0ccf-a3966a396994@huaweicloud.com/

Are you saying that I should just drop this patch here?

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 15:29       ` Christian Brauner
@ 2024-01-29 15:36         ` Christoph Hellwig
  2024-02-19 13:34           ` Yu Kuai
  2024-02-19 13:42           ` Yu Kuai
  0 siblings, 2 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 15:36 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Jan Kara, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 04:29:32PM +0100, Christian Brauner wrote:
> On Mon, Jan 29, 2024 at 03:37:09PM +0100, Christoph Hellwig wrote:
> > Most of these really should be using proper high level APIs.  The
> > last round of work on this is here:
> > 
> > https://lore.kernel.org/linux-nilfs/4b11a311-c121-1f44-0ccf-a3966a396994@huaweicloud.com/
> 
> Are you saying that I should just drop this patch here?

I think we need to order the work:

 - get your use struct file as bdev handle series in
 - rebase the above series on top of that, including some bigger changes
   like block2mtd which can then use normal file read/write APIs
 - rebase what is left of this series on top of that, and hopefully not
   much of this patch and a lot less of patch 1 will be left at that
   point.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
@ 2024-01-29 16:02   ` Christoph Hellwig
  2024-02-01 17:08     ` Christian Brauner
  2024-03-13  2:32   ` Christoph Hellwig
  1 sibling, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:02 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

> +static unsigned blk_to_file_flags(blk_mode_t mode)
> +{
> +	unsigned int flags = 0;
> +

...

> +	/*
> +	 * O_EXCL is one of those flags that the VFS clears once it's done with
> +	 * the operation. So don't raise it here either.
> +	 */
> +	if (mode & BLK_OPEN_NDELAY)
> +		flags |= O_NDELAY;

O_EXCL isn't dealt with in this helper at all.

> +	/*
> +	 * If BLK_OPEN_WRITE_IOCTL is set then this is a historical quirk
> +	 * associated with the floppy driver where it has allowed ioctls if the
> +	 * file was opened for writing, but does not allow reads or writes.
> +	 * Make sure that this quirk is reflected in @f_flags.
> +	 */
> +	if (mode & BLK_OPEN_WRITE_IOCTL)
> +		flags |= O_RDWR | O_WRONLY;

.. and BLK_OPEN_WRITE_IOCTL will never be passed to it.  It only comes
from opening block device nodes.

That being said, passing BLK_OPEN_* to bdev_file_open_by_* actually
feels wrong.  They deal with files and should just take normal
O_* flags instead of translating from BLK_OPEN_* to O_* back to
BLK_OPEN_* for the driver (where they make sense as the driver
flags are pretty different from what is passed to open).

Now of course changing that would make a mess of the whole series,
so maybe that can go into a new patch at the end?
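
To make the round trip concrete, here is a minimal userspace sketch.  The
BLK_OPEN_* values are made-up stand-ins rather than the kernel's
definitions, and both helpers are illustrative models, not the code from
the patch (file_flags_to_blk_mode() and check_roundtrip() are names
chosen for this sketch only):

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical stand-ins for the kernel's blk_mode_t bits. */
typedef unsigned int blk_mode_t;
#define BLK_OPEN_READ	(1u << 0)
#define BLK_OPEN_WRITE	(1u << 1)
#define BLK_OPEN_NDELAY	(1u << 2)

/* Model of the BLK_OPEN_* -> O_* direction. */
static int blk_to_file_flags(blk_mode_t mode)
{
	int flags;

	if ((mode & BLK_OPEN_READ) && (mode & BLK_OPEN_WRITE))
		flags = O_RDWR;
	else if (mode & BLK_OPEN_WRITE)
		flags = O_WRONLY;
	else
		flags = O_RDONLY;

	/* O_NDELAY is an alias for O_NONBLOCK on most Linux architectures. */
	if (mode & BLK_OPEN_NDELAY)
		flags |= O_NONBLOCK;

	return flags;
}

/* Model of the O_* -> BLK_OPEN_* direction used for the driver. */
static blk_mode_t file_flags_to_blk_mode(int flags)
{
	blk_mode_t mode = 0;

	switch (flags & O_ACCMODE) {
	case O_RDONLY:
		mode = BLK_OPEN_READ;
		break;
	case O_WRONLY:
		mode = BLK_OPEN_WRITE;
		break;
	case O_RDWR:
		mode = BLK_OPEN_READ | BLK_OPEN_WRITE;
		break;
	}

	if (flags & O_NONBLOCK)
		mode |= BLK_OPEN_NDELAY;

	return mode;
}

/* Aborts if translating there and back does not preserve @mode. */
static void check_roundtrip(blk_mode_t mode)
{
	assert(file_flags_to_blk_mode(blk_to_file_flags(mode)) == mode);
}
```

If callers passed O_* flags directly, only the second helper would be
needed in this model.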

> + * @noaccount: whether this is an internal open that shouldn't be counted
>   */
>  static struct file *alloc_file(const struct path *path, int flags,
> -		const struct file_operations *fop)
> +		const struct file_operations *fop, bool noaccount)

Just a suggestion as you are the maintainer here, but I always find
it hard to follow when infrastructure in subsystem A is changed in
a patch primarily changing subsystem B.  Can the file_table.c
changes go into a separate patch or patches with commit logs
documenting their semantics?

And while we're at the semantics, I find this area already a bit of a
mess and this doesn't make it any better.

How about the following:

 - alloc_file loses the actual file allocation and gets a new name
   (unfortunately init_file is already taken), callers call
   alloc_empty_file_noaccount or alloc_empty_file plus the
   new helper.
 - similarly __alloc_file_pseudo is split into a helper creating
   a path for mnt and inode, and callers call that plus the
   file allocation

?

> +extern struct file *alloc_file_pseudo_noaccount(struct inode *, struct vfsmount *,

no need for the extern here.

> +	struct block_device	*s_bdev;	/* can go away once we use an accessor for @s_bdev_file */

Can you put the comment on a separate line to make it readable?

But I'm not even sure it should go away.  s_bdev is used all over the
data and metadata I/O path, so caching it and avoiding multiple levels
of pointer chasing would seem useful.
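
A rough userspace model of that trade-off, assuming heavily simplified
structure layouts that are not the kernel's (file_bdev_model(),
sb_set_bdev_file() and sb_bdev_uncached() are hypothetical names used
only here):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified layouts; not the kernel's structures. */
struct block_device { int id; };
struct inode { struct block_device *i_bdev; };
struct address_space { struct inode *host; };
struct file { struct address_space *f_mapping; };

struct super_block {
	struct file *s_bdev_file;
	struct block_device *s_bdev;	/* cached at setup time */
};

/* Model accessor: three dependent loads to reach the device. */
static struct block_device *file_bdev_model(const struct file *file)
{
	return file->f_mapping->host->i_bdev;
}

/* Store the file and cache the device pointer next to it. */
static void sb_set_bdev_file(struct super_block *sb, struct file *file)
{
	assert(file != NULL);
	sb->s_bdev_file = file;
	sb->s_bdev = file_bdev_model(file);
}

/* Accessor-only variant: chases pointers on every call. */
static struct block_device *sb_bdev_uncached(const struct super_block *sb)
{
	return file_bdev_model(sb->s_bdev_file);
}
```

With the cached field, an I/O path reads sb->s_bdev with a single load;
the accessor-only variant repeats the whole chain on each call.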


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file
  2024-01-23 13:26 ` [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file Christian Brauner
@ 2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:10   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 03/34] block/genhd: port disk_scan_partitions() to file
  2024-01-23 13:26 ` [PATCH v2 03/34] block/genhd: port disk_scan_partitions() " Christian Brauner
@ 2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:13   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
@ 2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:15   ` Jan Kara
  2024-04-15  9:26   ` Ming Lei
  2 siblings, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 05/34] swap: port block device usage to file
  2024-01-23 13:26 ` [PATCH v2 05/34] swap: port block device usage " Christian Brauner
@ 2024-01-29 16:15   ` Christoph Hellwig
  2024-01-31 18:16   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:15 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 06/34] power: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 06/34] power: port block device access " Christian Brauner
@ 2024-01-29 16:15   ` Christoph Hellwig
  2024-01-31 18:17   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:15 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 07/34] xfs: port block device access to files
  2024-01-23 13:26 ` [PATCH v2 07/34] xfs: port block device access to files Christian Brauner
@ 2024-01-29 16:17   ` Christoph Hellwig
  2024-02-01 14:33     ` Christian Brauner
  2024-01-31 18:19   ` Jan Kara
  1 sibling, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

>  	if (mp->m_rtname) {
> -		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_handle);
> +		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_file);
>  		if (error)
>  			goto out_close_logdev;
>  
> -		if (rtdev_handle->bdev == ddev ||
> -		    (logdev_handle &&
> -		     rtdev_handle->bdev == logdev_handle->bdev)) {
> +		if (file_bdev(rtdev_file) == ddev ||
> +		    (logdev_file && file_bdev(rtdev_file) == file_bdev(logdev_file))) {

Please avoid the overly long line here.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

(note that this will probably have some not too bad merge conflict
with in-flight xfs work)

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 27/34] bdev: remove bdev_open_by_path()
  2024-01-23 13:26 ` [PATCH v2 27/34] bdev: remove bdev_open_by_path() Christian Brauner
@ 2024-01-29 16:17   ` Christoph Hellwig
  2024-02-01 10:24   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 28/34] bdev: make bdev_release() private to block layer
  2024-01-23 13:26 ` [PATCH v2 28/34] bdev: make bdev_release() private to block layer Christian Brauner
@ 2024-01-29 16:19   ` Christoph Hellwig
  2024-02-01 10:26   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Jan 23, 2024 at 02:26:45PM +0100, Christian Brauner wrote:
> and move both of them to the private block header. There's no caller in
> the tree anymore that uses them directly.

The subject only talks about a single helper, but then the commit
message mentions "both".  Seems like the subject is missing a
"bdev_open_by_dev and".

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
@ 2024-01-29 16:22   ` Christoph Hellwig
  2024-02-01 14:50     ` Christian Brauner
  2024-02-01 10:54   ` Jan Kara
  2024-02-01 11:23   ` Jan Kara
  2 siblings, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

> +	ret = devcgroup_check_permission(
> +		DEVCG_DEV_BLOCK, MAJOR(dev), MINOR(dev),
> +		((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
> +			((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));

Somewhat weird formatting here with DEVCG_DEV_BLOCK not on the
same line as the opening parenthesis and the extra indentation after
the |.  I would have expected something like:

	ret = devcgroup_check_permission(DEVCG_DEV_BLOCK,
		MAJOR(dev), MINOR(dev),
		((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
		((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle
  2024-01-23 13:26 ` [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle Christian Brauner
@ 2024-01-29 16:22   ` Christoph Hellwig
  2024-02-01 10:57   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-23 13:26 ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
@ 2024-01-29 16:49   ` Christoph Hellwig
  2024-01-29 17:09     ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
  2024-02-01 11:08   ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Jan Kara
  1 sibling, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:49 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Jan 23, 2024 at 02:26:48PM +0100, Christian Brauner wrote:
> Make it possible to detect a block device that was opened with
> restricted write access solely based on its file operations that it was
> opened with. This avoids wasting an FMODE_* flag.
> 
> def_blk_fops isn't needed to check whether something is a block device;
> checking the inode type is enough for that. And def_blk_fops_restricted
> can be kept private to the block layer.

I agree with not wasting a FMODE_* flag, but I also really hate
duplicating the file operations.

I went to search for a good place to stash this information and ended up
at the f_version field in struct file.  That one is never touched by the 
VFS proper but just the file system and a few lseek helpers.  The
latter currently happen to be used by block devices unfortunately,
but it seems like moving the f_version clearing into the few filesystems
actually using it would be good code hygiene anyway.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 32/34] block: remove bdev_handle completely
  2024-01-23 13:26 ` [PATCH v2 32/34] block: remove bdev_handle completely Christian Brauner
@ 2024-01-29 16:50   ` Christoph Hellwig
  2024-02-01 11:20   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-29 16:50 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-29 16:49   ` Christoph Hellwig
@ 2024-01-29 17:09     ` Christian Brauner
  2024-01-30  8:32       ` Christoph Hellwig
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-01-29 17:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 05:49:34PM +0100, Christoph Hellwig wrote:
> On Tue, Jan 23, 2024 at 02:26:48PM +0100, Christian Brauner wrote:
> > Make it possible to detect a block device that was opened with
> > restricted write access solely based on its file operations that it was
> > opened with. This avoids wasting an FMODE_* flag.
> > 
> > def_blk_fops isn't needed to check whether something is a block device;
> > checking the inode type is enough for that. And def_blk_fops_restricted
> > can be kept private to the block layer.
> 
> I agree with not wasting a FMODE_* flag, but I also really hate
> duplicating the file operations.

I don't think it's that bad and is temporary until we can
unconditionally disable writing to mounted block devices. Until then we
can place all of this under #if IS_ENABLED(CONFIG_BLK_DEV_WRITE_MOUNTED)
in a single location in block/fops.c so it's nicely encapsulated and
confined.

> I went to search for a good place to stash this information and ended up
> at the f_version field in struct file.  That one is never touched by the 
> VFS proper but just the file system and a few lseek helpers.  The
> latter currently happen to be used by block devices unfortunately,
> but it seems like moving the f_version clearing into the few filesystems
> actually using it would be good code hygiene anyway.

Kinda like choosing between pest and cholera. I think that the f_op
solution is nicer. Overloading f_version is not something I feel I have
the stomach for. The cleanup itself might still be worth it ofc.
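
For illustration, the f_op approach boils down to a pointer comparison.
This is a userspace sketch only: the structures are reduced to one hook,
and bdev_file_init_model() and bdev_file_writes_restricted() are
hypothetical names, not helpers from the series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file;

/* Reduced to one hook for illustration. */
struct file_operations {
	long (*write)(struct file *file, long len);
};

struct file {
	const struct file_operations *f_op;
};

static long blkdev_write_model(struct file *file, long len)
{
	(void)file;
	return len;	/* pretend everything was written */
}

/* Two tables with identical hooks; only their addresses differ. */
static const struct file_operations def_blk_fops = {
	.write = blkdev_write_model,
};
static const struct file_operations def_blk_fops_restricted = {
	.write = blkdev_write_model,
};

/* Pick the table at open time instead of setting an FMODE_* bit. */
static void bdev_file_init_model(struct file *file, bool restrict_writes)
{
	assert(file != NULL);
	file->f_op = restrict_writes ? &def_blk_fops_restricted : &def_blk_fops;
}

/* The restriction is recovered purely from the f_op pointer. */
static bool bdev_file_writes_restricted(const struct file *file)
{
	return file->f_op == &def_blk_fops_restricted;
}
```

The cost is the duplicated table; the gain is that no per-file flag bit
is consumed.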

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-29 17:09     ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
@ 2024-01-30  8:32       ` Christoph Hellwig
  2024-01-30  9:11         ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-01-30  8:32 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Jan Kara, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 06:09:37PM +0100, Christian Brauner wrote:
> I don't think it's that bad and is temporary until we can
> unconditionally disable writing to mounted block devices. Until then we
> can place all of this under #if IS_ENABLED(CONFIG_BLK_DEV_WRITE_MOUNTED)
> in a single location in block/fops.c so it's nicely encapsulated and
> confined.

Oh well.  If Jens is fine with this I can live with it even if I don't
like it too much.  I'll probably just clean it up as a follow up.

OTOH I fear we won't be able to unconditionally disable writing to
mounted block devices anytime soon if ever.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-30  8:32       ` Christoph Hellwig
@ 2024-01-30  9:11         ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-01-30  9:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Tue, Jan 30, 2024 at 09:32:13AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 29, 2024 at 06:09:37PM +0100, Christian Brauner wrote:
> > I don't think it's that bad and is temporary until we can
> > unconditionally disable writing to mounted block devices. Until then we
> > can place all of this under #if IS_ENABLED(CONFIG_BLK_DEV_WRITE_MOUNTED)
> > in a single location in block/fops.c so it's nicely encapsulated and
> > confined.
> 
> Oh well.  If Jens is fine with this I can live with it even if I don't
> like it too much.  I'll probably just clean it up as a follow up.
> 
> OTOH I fear we won't be able to unconditionally disable writing to
> mounted block devices anytime soon if ever.

One may dream. Put another way, if we don't even allow ourselves to think
that we can remove insecure functionality in the future then we have to
accept that we'll be piling on #ifdefs and mostly unused code
forever which is just sad. :/

I'm hopeful that writing to mounted block devices is something that we
can make all major distros move away from. We should start just because
we need to figure out what tools do actually try to do stuff like that.
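
For readers following the f_op discussion, the idea can be modelled in
userspace roughly as below. This is only a sketch under assumptions: every
name (model_file, model_blk_fops_restricted, model_open, ...) is hypothetical
and none of it is the kernel's code. It shows two operation tables with
identical behaviour, where the address stored in f_op is the only thing
that records whether the open asked for restricted writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the kernel's definitions. */
struct model_file;

struct model_file_operations {
	long (*write)(struct model_file *f, const char *buf, size_t len);
};

struct model_file {
	const struct model_file_operations *f_op;
};

/* Shared write path: both tables below point at this same function. */
static long model_write(struct model_file *f, const char *buf, size_t len)
{
	assert(f && f->f_op);
	(void)buf;
	return (long)len;	/* pretend every byte was written */
}

/* Two tables with identical behaviour; only their addresses differ. */
static const struct model_file_operations model_blk_fops = {
	.write = model_write,
};
static const struct model_file_operations model_blk_fops_restricted = {
	.write = model_write,
};

/* "Open": pick the table according to the requested mode. */
static void model_open(struct model_file *f, bool restrict_writes)
{
	f->f_op = restrict_writes ? &model_blk_fops_restricted
				  : &model_blk_fops;
}

/* Later code recovers the mode by comparing the f_op pointer. */
static bool model_file_restricts_writes(const struct model_file *f)
{
	return f->f_op == &model_blk_fops_restricted;
}
```

The appeal of this shape is that no extra field or flag bit is needed in
the file; the cost is a second, otherwise redundant, operations table.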

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file
  2024-01-23 13:26 ` [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
@ 2024-01-31 18:10   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:19, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  block/ioctl.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 9c73a763ef88..5d0619e02e4c 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -471,7 +471,7 @@ static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
>  		int __user *argp)
>  {
>  	int ret, n;
> -	struct bdev_handle *handle;
> +	struct file *file;
>  
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EACCES;
> @@ -483,12 +483,11 @@ static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
>  	if (mode & BLK_OPEN_EXCL)
>  		return set_blocksize(bdev, n);
>  
> -	handle = bdev_open_by_dev(bdev->bd_dev, mode, &bdev, NULL);
> -	if (IS_ERR(handle))
> +	file = bdev_file_open_by_dev(bdev->bd_dev, mode, &bdev, NULL);
> +	if (IS_ERR(file))
>  		return -EBUSY;
>  	ret = set_blocksize(bdev, n);
> -	bdev_release(handle);
> -
> +	fput(file);
>  	return ret;
>  }
>  
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
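
The conversion pattern that repeats through this series (open returns a
file, file_bdev() reaches the block device, fput() drops the reference)
can be illustrated with a small userspace model. This is a sketch under
assumptions: the toy_* names are hypothetical, the model returns NULL on
failure where the kernel helpers return an ERR_PTR(), and the real fput()
may defer the final release, which this model does not do.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy block device: only tracks how many opens are outstanding. */
struct toy_bdev {
	int open_count;
};

/* Toy file: a refcounted object that leads back to the device. */
struct toy_file {
	int refcount;
	struct toy_bdev *bdev;
};

/* Stand-in for the open helper: the caller gets a file, not a handle. */
static struct toy_file *toy_bdev_file_open(struct toy_bdev *bdev)
{
	struct toy_file *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	f->bdev = bdev;
	bdev->open_count++;
	return f;
}

/* Stand-in for the accessor: get from the file to the block device. */
static struct toy_bdev *toy_file_bdev(const struct toy_file *f)
{
	return f->bdev;
}

/* Stand-in for fput(): the last reference closes the device. */
static void toy_fput(struct toy_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		f->bdev->open_count--;
		free(f);
	}
}
```

A caller keeps only the file pointer and derives the device on demand,
which is why the separate handle type becomes unnecessary.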

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 03/34] block/genhd: port disk_scan_partitions() to file
  2024-01-23 13:26 ` [PATCH v2 03/34] block/genhd: port disk_scan_partitions() " Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
@ 2024-01-31 18:13   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:13 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:20, Christian Brauner wrote:
> This may run from a kernel thread via device_add_disk(). So this could
> also use __fput_sync() if we were worried about EBUSY. But when it is
> called from a kernel thread it's always BLK_OPEN_READ so EBUSY can't
> really happen even if we do BLK_OPEN_RESTRICT_WRITES or BLK_OPEN_EXCL.
> 
> Otherwise it's called from an ioctl on the block device which is only
> called from userspace and can rely on task work.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  block/genhd.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/block/genhd.c b/block/genhd.c
> index d74fb5b4ae68..a911d2969c07 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -342,7 +342,7 @@ EXPORT_SYMBOL_GPL(disk_uevent);
>  
>  int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
>  {
> -	struct bdev_handle *handle;
> +	struct file *file;
>  	int ret = 0;
>  
>  	if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
> @@ -366,12 +366,12 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
>  	}
>  
>  	set_bit(GD_NEED_PART_SCAN, &disk->state);
> -	handle = bdev_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL, NULL,
> -				  NULL);
> -	if (IS_ERR(handle))
> -		ret = PTR_ERR(handle);
> +	file = bdev_file_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL,
> +				     NULL, NULL);
> +	if (IS_ERR(file))
> +		ret = PTR_ERR(file);
>  	else
> -		bdev_release(handle);
> +		fput(file);
>  
>  	/*
>  	 * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
@ 2024-01-31 18:15   ` Jan Kara
  2024-04-15  9:26   ` Ming Lei
  2 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:15 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:21, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/md/dm.c               | 23 +++++++++++++----------
>  drivers/md/md.c               | 12 ++++++------
>  drivers/md/md.h               |  2 +-
>  include/linux/device-mapper.h |  2 +-
>  4 files changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 8dcabf84d866..87de5b5682ad 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -726,7 +726,8 @@ static struct table_device *open_table_device(struct mapped_device *md,
>  		dev_t dev, blk_mode_t mode)
>  {
>  	struct table_device *td;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
> +	struct block_device *bdev;
>  	u64 part_off;
>  	int r;
>  
> @@ -735,34 +736,36 @@ static struct table_device *open_table_device(struct mapped_device *md,
>  		return ERR_PTR(-ENOMEM);
>  	refcount_set(&td->count, 1);
>  
> -	bdev_handle = bdev_open_by_dev(dev, mode, _dm_claim_ptr, NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		r = PTR_ERR(bdev_handle);
> +	bdev_file = bdev_file_open_by_dev(dev, mode, _dm_claim_ptr, NULL);
> +	if (IS_ERR(bdev_file)) {
> +		r = PTR_ERR(bdev_file);
>  		goto out_free_td;
>  	}
>  
> +	bdev = file_bdev(bdev_file);
> +
>  	/*
>  	 * We can be called before the dm disk is added.  In that case we can't
>  	 * register the holder relation here.  It will be done once add_disk was
>  	 * called.
>  	 */
>  	if (md->disk->slave_dir) {
> -		r = bd_link_disk_holder(bdev_handle->bdev, md->disk);
> +		r = bd_link_disk_holder(bdev, md->disk);
>  		if (r)
>  			goto out_blkdev_put;
>  	}
>  
>  	td->dm_dev.mode = mode;
> -	td->dm_dev.bdev = bdev_handle->bdev;
> -	td->dm_dev.bdev_handle = bdev_handle;
> -	td->dm_dev.dax_dev = fs_dax_get_by_bdev(bdev_handle->bdev, &part_off,
> +	td->dm_dev.bdev = bdev;
> +	td->dm_dev.bdev_file = bdev_file;
> +	td->dm_dev.dax_dev = fs_dax_get_by_bdev(bdev, &part_off,
>  						NULL, NULL);
>  	format_dev_t(td->dm_dev.name, dev);
>  	list_add(&td->list, &md->table_devices);
>  	return td;
>  
>  out_blkdev_put:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  out_free_td:
>  	kfree(td);
>  	return ERR_PTR(r);
> @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
>  {
>  	if (md->disk->slave_dir)
>  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> -	bdev_release(td->dm_dev.bdev_handle);
> +	fput(td->dm_dev.bdev_file);
>  	put_dax(td->dm_dev.dax_dev);
>  	list_del(&td->list);
>  	kfree(td);
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 2266358d8074..0653584db63b 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2578,7 +2578,7 @@ static void export_rdev(struct md_rdev *rdev, struct mddev *mddev)
>  	if (test_bit(AutoDetected, &rdev->flags))
>  		md_autodetect_dev(rdev->bdev->bd_dev);
>  #endif
> -	bdev_release(rdev->bdev_handle);
> +	fput(rdev->bdev_file);
>  	rdev->bdev = NULL;
>  	kobject_put(&rdev->kobj);
>  }
> @@ -3773,16 +3773,16 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
>  	if (err)
>  		goto out_clear_rdev;
>  
> -	rdev->bdev_handle = bdev_open_by_dev(newdev,
> +	rdev->bdev_file = bdev_file_open_by_dev(newdev,
>  			BLK_OPEN_READ | BLK_OPEN_WRITE,
>  			super_format == -2 ? &claim_rdev : rdev, NULL);
> -	if (IS_ERR(rdev->bdev_handle)) {
> +	if (IS_ERR(rdev->bdev_file)) {
>  		pr_warn("md: could not open device unknown-block(%u,%u).\n",
>  			MAJOR(newdev), MINOR(newdev));
> -		err = PTR_ERR(rdev->bdev_handle);
> +		err = PTR_ERR(rdev->bdev_file);
>  		goto out_clear_rdev;
>  	}
> -	rdev->bdev = rdev->bdev_handle->bdev;
> +	rdev->bdev = file_bdev(rdev->bdev_file);
>  
>  	kobject_init(&rdev->kobj, &rdev_ktype);
>  
> @@ -3813,7 +3813,7 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
>  	return rdev;
>  
>  out_blkdev_put:
> -	bdev_release(rdev->bdev_handle);
> +	fput(rdev->bdev_file);
>  out_clear_rdev:
>  	md_rdev_clear(rdev);
>  out_free_rdev:
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 8d881cc59799..a079ee9b6190 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -59,7 +59,7 @@ struct md_rdev {
>  	 */
>  	struct block_device *meta_bdev;
>  	struct block_device *bdev;	/* block device handle */
> -	struct bdev_handle *bdev_handle;	/* Handle from open for bdev */
> +	struct file *bdev_file;		/* Handle from open for bdev */
>  
>  	struct page	*sb_page, *bb_page;
>  	int		sb_loaded;
> diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
> index 772ab4d74d94..82b2195efaca 100644
> --- a/include/linux/device-mapper.h
> +++ b/include/linux/device-mapper.h
> @@ -165,7 +165,7 @@ void dm_error(const char *message);
>  
>  struct dm_dev {
>  	struct block_device *bdev;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct dax_device *dax_dev;
>  	blk_mode_t mode;
>  	char name[16];
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 05/34] swap: port block device usage to file
  2024-01-23 13:26 ` [PATCH v2 05/34] swap: port block device usage " Christian Brauner
  2024-01-29 16:15   ` Christoph Hellwig
@ 2024-01-31 18:16   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:16 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:22, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  include/linux/swap.h |  2 +-
>  mm/swapfile.c        | 22 +++++++++++-----------
>  2 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 4db00ddad261..e5b82bc05e60 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -298,7 +298,7 @@ struct swap_info_struct {
>  	unsigned int __percpu *cluster_next_cpu; /*percpu index for next allocation */
>  	struct percpu_cluster __percpu *percpu_cluster; /* per cpu's swap location */
>  	struct rb_root swap_extent_root;/* root of the swap extent rbtree */
> -	struct bdev_handle *bdev_handle;/* open handle of the bdev */
> +	struct file *bdev_file;		/* open handle of the bdev */
>  	struct block_device *bdev;	/* swap device or bdev of swap file */
>  	struct file *swap_file;		/* seldom referenced */
>  	unsigned int old_block_size;	/* seldom referenced */
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 556ff7347d5f..73edd6fed6a2 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2532,10 +2532,10 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
>  	exit_swap_address_space(p->type);
>  
>  	inode = mapping->host;
> -	if (p->bdev_handle) {
> +	if (p->bdev_file) {
>  		set_blocksize(p->bdev, old_block_size);
> -		bdev_release(p->bdev_handle);
> -		p->bdev_handle = NULL;
> +		fput(p->bdev_file);
> +		p->bdev_file = NULL;
>  	}
>  
>  	inode_lock(inode);
> @@ -2765,14 +2765,14 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
>  	int error;
>  
>  	if (S_ISBLK(inode->i_mode)) {
> -		p->bdev_handle = bdev_open_by_dev(inode->i_rdev,
> +		p->bdev_file = bdev_file_open_by_dev(inode->i_rdev,
>  				BLK_OPEN_READ | BLK_OPEN_WRITE, p, NULL);
> -		if (IS_ERR(p->bdev_handle)) {
> -			error = PTR_ERR(p->bdev_handle);
> -			p->bdev_handle = NULL;
> +		if (IS_ERR(p->bdev_file)) {
> +			error = PTR_ERR(p->bdev_file);
> +			p->bdev_file = NULL;
>  			return error;
>  		}
> -		p->bdev = p->bdev_handle->bdev;
> +		p->bdev = file_bdev(p->bdev_file);
>  		p->old_block_size = block_size(p->bdev);
>  		error = set_blocksize(p->bdev, PAGE_SIZE);
>  		if (error < 0)
> @@ -3208,10 +3208,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>  	p->percpu_cluster = NULL;
>  	free_percpu(p->cluster_next_cpu);
>  	p->cluster_next_cpu = NULL;
> -	if (p->bdev_handle) {
> +	if (p->bdev_file) {
>  		set_blocksize(p->bdev, p->old_block_size);
> -		bdev_release(p->bdev_handle);
> -		p->bdev_handle = NULL;
> +		fput(p->bdev_file);
> +		p->bdev_file = NULL;
>  	}
>  	inode = NULL;
>  	destroy_swap_extents(p);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 06/34] power: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 06/34] power: port block device access " Christian Brauner
  2024-01-29 16:15   ` Christoph Hellwig
@ 2024-01-31 18:17   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:23, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  kernel/power/swap.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/power/swap.c b/kernel/power/swap.c
> index 6053ddddaf65..692f12fe60c1 100644
> --- a/kernel/power/swap.c
> +++ b/kernel/power/swap.c
> @@ -222,7 +222,7 @@ int swsusp_swap_in_use(void)
>   */
>  
>  static unsigned short root_swap = 0xffff;
> -static struct bdev_handle *hib_resume_bdev_handle;
> +static struct file *hib_resume_bdev_file;
>  
>  struct hib_bio_batch {
>  	atomic_t		count;
> @@ -276,7 +276,7 @@ static int hib_submit_io(blk_opf_t opf, pgoff_t page_off, void *addr,
>  	struct bio *bio;
>  	int error = 0;
>  
> -	bio = bio_alloc(hib_resume_bdev_handle->bdev, 1, opf,
> +	bio = bio_alloc(file_bdev(hib_resume_bdev_file), 1, opf,
>  			GFP_NOIO | __GFP_HIGH);
>  	bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
>  
> @@ -357,14 +357,14 @@ static int swsusp_swap_check(void)
>  		return res;
>  	root_swap = res;
>  
> -	hib_resume_bdev_handle = bdev_open_by_dev(swsusp_resume_device,
> +	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
>  			BLK_OPEN_WRITE, NULL, NULL);
> -	if (IS_ERR(hib_resume_bdev_handle))
> -		return PTR_ERR(hib_resume_bdev_handle);
> +	if (IS_ERR(hib_resume_bdev_file))
> +		return PTR_ERR(hib_resume_bdev_file);
>  
> -	res = set_blocksize(hib_resume_bdev_handle->bdev, PAGE_SIZE);
> +	res = set_blocksize(file_bdev(hib_resume_bdev_file), PAGE_SIZE);
>  	if (res < 0)
> -		bdev_release(hib_resume_bdev_handle);
> +		fput(hib_resume_bdev_file);
>  
>  	return res;
>  }
> @@ -1523,10 +1523,10 @@ int swsusp_check(bool exclusive)
>  	void *holder = exclusive ? &swsusp_holder : NULL;
>  	int error;
>  
> -	hib_resume_bdev_handle = bdev_open_by_dev(swsusp_resume_device,
> +	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
>  				BLK_OPEN_READ, holder, NULL);
> -	if (!IS_ERR(hib_resume_bdev_handle)) {
> -		set_blocksize(hib_resume_bdev_handle->bdev, PAGE_SIZE);
> +	if (!IS_ERR(hib_resume_bdev_file)) {
> +		set_blocksize(file_bdev(hib_resume_bdev_file), PAGE_SIZE);
>  		clear_page(swsusp_header);
>  		error = hib_submit_io(REQ_OP_READ, swsusp_resume_block,
>  					swsusp_header, NULL);
> @@ -1551,11 +1551,11 @@ int swsusp_check(bool exclusive)
>  
>  put:
>  		if (error)
> -			bdev_release(hib_resume_bdev_handle);
> +			fput(hib_resume_bdev_file);
>  		else
>  			pr_debug("Image signature found, resuming\n");
>  	} else {
> -		error = PTR_ERR(hib_resume_bdev_handle);
> +		error = PTR_ERR(hib_resume_bdev_file);
>  	}
>  
>  	if (error)
> @@ -1570,12 +1570,12 @@ int swsusp_check(bool exclusive)
>  
>  void swsusp_close(void)
>  {
> -	if (IS_ERR(hib_resume_bdev_handle)) {
> +	if (IS_ERR(hib_resume_bdev_file)) {
>  		pr_debug("Image device not initialised\n");
>  		return;
>  	}
>  
> -	bdev_release(hib_resume_bdev_handle);
> +	fput(hib_resume_bdev_file);
>  }
>  
>  /**
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 07/34] xfs: port block device access to files
  2024-01-23 13:26 ` [PATCH v2 07/34] xfs: port block device access to files Christian Brauner
  2024-01-29 16:17   ` Christoph Hellwig
@ 2024-01-31 18:19   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:24, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/xfs/xfs_buf.c   | 10 +++++-----
>  fs/xfs/xfs_buf.h   |  4 ++--
>  fs/xfs/xfs_super.c | 43 +++++++++++++++++++++----------------------
>  3 files changed, 28 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 8e5bd50d29fe..01b41fabbe3c 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1951,7 +1951,7 @@ xfs_free_buftarg(
>  	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
>  	/* the main block device is closed by kill_block_super */
>  	if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
> -		bdev_release(btp->bt_bdev_handle);
> +		fput(btp->bt_bdev_file);
>  
>  	kmem_free(btp);
>  }
> @@ -1994,7 +1994,7 @@ xfs_setsize_buftarg_early(
>  struct xfs_buftarg *
>  xfs_alloc_buftarg(
>  	struct xfs_mount	*mp,
> -	struct bdev_handle	*bdev_handle)
> +	struct file		*bdev_file)
>  {
>  	xfs_buftarg_t		*btp;
>  	const struct dax_holder_operations *ops = NULL;
> @@ -2005,9 +2005,9 @@ xfs_alloc_buftarg(
>  	btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
>  
>  	btp->bt_mount = mp;
> -	btp->bt_bdev_handle = bdev_handle;
> -	btp->bt_dev = bdev_handle->bdev->bd_dev;
> -	btp->bt_bdev = bdev_handle->bdev;
> +	btp->bt_bdev_file = bdev_file;
> +	btp->bt_bdev = file_bdev(bdev_file);
> +	btp->bt_dev = btp->bt_bdev->bd_dev;
>  	btp->bt_daxdev = fs_dax_get_by_bdev(btp->bt_bdev, &btp->bt_dax_part_off,
>  					    mp, ops);
>  
> diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> index b470de08a46c..304e858d04fb 100644
> --- a/fs/xfs/xfs_buf.h
> +++ b/fs/xfs/xfs_buf.h
> @@ -98,7 +98,7 @@ typedef unsigned int xfs_buf_flags_t;
>   */
>  typedef struct xfs_buftarg {
>  	dev_t			bt_dev;
> -	struct bdev_handle	*bt_bdev_handle;
> +	struct file		*bt_bdev_file;
>  	struct block_device	*bt_bdev;
>  	struct dax_device	*bt_daxdev;
>  	u64			bt_dax_part_off;
> @@ -366,7 +366,7 @@ xfs_buf_update_cksum(struct xfs_buf *bp, unsigned long cksum_offset)
>   *	Handling of buftargs.
>   */
>  struct xfs_buftarg *xfs_alloc_buftarg(struct xfs_mount *mp,
> -		struct bdev_handle *bdev_handle);
> +		struct file *bdev_file);
>  extern void xfs_free_buftarg(struct xfs_buftarg *);
>  extern void xfs_buftarg_wait(struct xfs_buftarg *);
>  extern void xfs_buftarg_drain(struct xfs_buftarg *);
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index e5ac0e59ede9..4a076c464177 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -362,16 +362,16 @@ STATIC int
>  xfs_blkdev_get(
>  	xfs_mount_t		*mp,
>  	const char		*name,
> -	struct bdev_handle	**handlep)
> +	struct file		**bdev_filep)
>  {
>  	int			error = 0;
>  
> -	*handlep = bdev_open_by_path(name,
> +	*bdev_filep = bdev_file_open_by_path(name,
>  		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
>  		mp->m_super, &fs_holder_ops);
> -	if (IS_ERR(*handlep)) {
> -		error = PTR_ERR(*handlep);
> -		*handlep = NULL;
> +	if (IS_ERR(*bdev_filep)) {
> +		error = PTR_ERR(*bdev_filep);
> +		*bdev_filep = NULL;
>  		xfs_warn(mp, "Invalid device [%s], error=%d", name, error);
>  	}
>  
> @@ -436,26 +436,25 @@ xfs_open_devices(
>  {
>  	struct super_block	*sb = mp->m_super;
>  	struct block_device	*ddev = sb->s_bdev;
> -	struct bdev_handle	*logdev_handle = NULL, *rtdev_handle = NULL;
> +	struct file		*logdev_file = NULL, *rtdev_file = NULL;
>  	int			error;
>  
>  	/*
>  	 * Open real time and log devices - order is important.
>  	 */
>  	if (mp->m_logname) {
> -		error = xfs_blkdev_get(mp, mp->m_logname, &logdev_handle);
> +		error = xfs_blkdev_get(mp, mp->m_logname, &logdev_file);
>  		if (error)
>  			return error;
>  	}
>  
>  	if (mp->m_rtname) {
> -		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_handle);
> +		error = xfs_blkdev_get(mp, mp->m_rtname, &rtdev_file);
>  		if (error)
>  			goto out_close_logdev;
>  
> -		if (rtdev_handle->bdev == ddev ||
> -		    (logdev_handle &&
> -		     rtdev_handle->bdev == logdev_handle->bdev)) {
> +		if (file_bdev(rtdev_file) == ddev ||
> +		    (logdev_file && file_bdev(rtdev_file) == file_bdev(logdev_file))) {
>  			xfs_warn(mp,
>  	"Cannot mount filesystem with identical rtdev and ddev/logdev.");
>  			error = -EINVAL;
> @@ -467,25 +466,25 @@ xfs_open_devices(
>  	 * Setup xfs_mount buffer target pointers
>  	 */
>  	error = -ENOMEM;
> -	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb_bdev_handle(sb));
> +	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_file);
>  	if (!mp->m_ddev_targp)
>  		goto out_close_rtdev;
>  
> -	if (rtdev_handle) {
> -		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_handle);
> +	if (rtdev_file) {
> +		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_file);
>  		if (!mp->m_rtdev_targp)
>  			goto out_free_ddev_targ;
>  	}
>  
> -	if (logdev_handle && logdev_handle->bdev != ddev) {
> -		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_handle);
> +	if (logdev_file && file_bdev(logdev_file) != ddev) {
> +		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
>  		if (!mp->m_logdev_targp)
>  			goto out_free_rtdev_targ;
>  	} else {
>  		mp->m_logdev_targp = mp->m_ddev_targp;
>  		/* Handle won't be used, drop it */
> -		if (logdev_handle)
> -			bdev_release(logdev_handle);
> +		if (logdev_file)
> +			fput(logdev_file);
>  	}
>  
>  	return 0;
> @@ -496,11 +495,11 @@ xfs_open_devices(
>   out_free_ddev_targ:
>  	xfs_free_buftarg(mp->m_ddev_targp);
>   out_close_rtdev:
> -	 if (rtdev_handle)
> -		bdev_release(rtdev_handle);
> +	 if (rtdev_file)
> +		fput(rtdev_file);
>   out_close_logdev:
> -	if (logdev_handle)
> -		bdev_release(logdev_handle);
> +	if (logdev_file)
> +		fput(logdev_file);
>  	return error;
>  }
>  
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 08/34] drbd: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 08/34] drbd: port block device access to file Christian Brauner
@ 2024-01-31 18:22   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:25, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/block/drbd/drbd_int.h |  4 +--
>  drivers/block/drbd/drbd_nl.c  | 58 +++++++++++++++++++++----------------------
>  2 files changed, 31 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
> index c21e3732759e..94dc0a235919 100644
> --- a/drivers/block/drbd/drbd_int.h
> +++ b/drivers/block/drbd/drbd_int.h
> @@ -524,9 +524,9 @@ struct drbd_md {
>  
>  struct drbd_backing_dev {
>  	struct block_device *backing_bdev;
> -	struct bdev_handle *backing_bdev_handle;
> +	struct file *backing_bdev_file;
>  	struct block_device *md_bdev;
> -	struct bdev_handle *md_bdev_handle;
> +	struct file *f_md_bdev;
>  	struct drbd_md md;
>  	struct disk_conf *disk_conf; /* RCU, for updates: resource->conf_update */
>  	sector_t known_size; /* last known size of that backing device */
> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index 43747a1aae43..6aed67278e8b 100644
> --- a/drivers/block/drbd/drbd_nl.c
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -1635,45 +1635,45 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info)
>  	return 0;
>  }
>  
> -static struct bdev_handle *open_backing_dev(struct drbd_device *device,
> +static struct file *open_backing_dev(struct drbd_device *device,
>  		const char *bdev_path, void *claim_ptr, bool do_bd_link)
>  {
> -	struct bdev_handle *handle;
> +	struct file *file;
>  	int err = 0;
>  
> -	handle = bdev_open_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
> -				   claim_ptr, NULL);
> -	if (IS_ERR(handle)) {
> +	file = bdev_file_open_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
> +				      claim_ptr, NULL);
> +	if (IS_ERR(file)) {
>  		drbd_err(device, "open(\"%s\") failed with %ld\n",
> -				bdev_path, PTR_ERR(handle));
> -		return handle;
> +				bdev_path, PTR_ERR(file));
> +		return file;
>  	}
>  
>  	if (!do_bd_link)
> -		return handle;
> +		return file;
>  
> -	err = bd_link_disk_holder(handle->bdev, device->vdisk);
> +	err = bd_link_disk_holder(file_bdev(file), device->vdisk);
>  	if (err) {
> -		bdev_release(handle);
> +		fput(file);
>  		drbd_err(device, "bd_link_disk_holder(\"%s\", ...) failed with %d\n",
>  				bdev_path, err);
> -		handle = ERR_PTR(err);
> +		file = ERR_PTR(err);
>  	}
> -	return handle;
> +	return file;
>  }
>  
>  static int open_backing_devices(struct drbd_device *device,
>  		struct disk_conf *new_disk_conf,
>  		struct drbd_backing_dev *nbc)
>  {
> -	struct bdev_handle *handle;
> +	struct file *file;
>  
> -	handle = open_backing_dev(device, new_disk_conf->backing_dev, device,
> +	file = open_backing_dev(device, new_disk_conf->backing_dev, device,
>  				  true);
> -	if (IS_ERR(handle))
> +	if (IS_ERR(file))
>  		return ERR_OPEN_DISK;
> -	nbc->backing_bdev = handle->bdev;
> -	nbc->backing_bdev_handle = handle;
> +	nbc->backing_bdev = file_bdev(file);
> +	nbc->backing_bdev_file = file;
>  
>  	/*
>  	 * meta_dev_idx >= 0: external fixed size, possibly multiple
> @@ -1683,7 +1683,7 @@ static int open_backing_devices(struct drbd_device *device,
>  	 * should check it for you already; but if you don't, or
>  	 * someone fooled it, we need to double check here)
>  	 */
> -	handle = open_backing_dev(device, new_disk_conf->meta_dev,
> +	file = open_backing_dev(device, new_disk_conf->meta_dev,
>  		/* claim ptr: device, if claimed exclusively; shared drbd_m_holder,
>  		 * if potentially shared with other drbd minors */
>  			(new_disk_conf->meta_dev_idx < 0) ? (void*)device : (void*)drbd_m_holder,
> @@ -1691,21 +1691,21 @@ static int open_backing_devices(struct drbd_device *device,
>  		 * as would happen with internal metadata. */
>  			(new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_FLEX_INT &&
>  			 new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_INTERNAL));
> -	if (IS_ERR(handle))
> +	if (IS_ERR(file))
>  		return ERR_OPEN_MD_DISK;
> -	nbc->md_bdev = handle->bdev;
> -	nbc->md_bdev_handle = handle;
> +	nbc->md_bdev = file_bdev(file);
> +	nbc->f_md_bdev = file;
>  	return NO_ERROR;
>  }
>  
>  static void close_backing_dev(struct drbd_device *device,
> -		struct bdev_handle *handle, bool do_bd_unlink)
> +		struct file *bdev_file, bool do_bd_unlink)
>  {
> -	if (!handle)
> +	if (!bdev_file)
>  		return;
>  	if (do_bd_unlink)
> -		bd_unlink_disk_holder(handle->bdev, device->vdisk);
> -	bdev_release(handle);
> +		bd_unlink_disk_holder(file_bdev(bdev_file), device->vdisk);
> +	fput(bdev_file);
>  }
>  
>  void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *ldev)
> @@ -1713,9 +1713,9 @@ void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *
>  	if (ldev == NULL)
>  		return;
>  
> -	close_backing_dev(device, ldev->md_bdev_handle,
> +	close_backing_dev(device, ldev->f_md_bdev,
>  			  ldev->md_bdev != ldev->backing_bdev);
> -	close_backing_dev(device, ldev->backing_bdev_handle, true);
> +	close_backing_dev(device, ldev->backing_bdev_file, true);
>  
>  	kfree(ldev->disk_conf);
>  	kfree(ldev);
> @@ -2131,9 +2131,9 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
>   fail:
>  	conn_reconfig_done(connection);
>  	if (nbc) {
> -		close_backing_dev(device, nbc->md_bdev_handle,
> +		close_backing_dev(device, nbc->f_md_bdev,
>  			  nbc->md_bdev != nbc->backing_bdev);
> -		close_backing_dev(device, nbc->backing_bdev_handle, true);
> +		close_backing_dev(device, nbc->backing_bdev_file, true);
>  		kfree(nbc);
>  	}
>  	kfree(new_disk_conf);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 09/34] pktcdvd: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 09/34] pktcdvd: " Christian Brauner
@ 2024-01-31 18:26   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:26 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:26, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/block/pktcdvd.c | 68 ++++++++++++++++++++++++-------------------------
>  include/linux/pktcdvd.h |  4 +--
>  2 files changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
> index d56d972aadb3..c21444716e43 100644
> --- a/drivers/block/pktcdvd.c
> +++ b/drivers/block/pktcdvd.c
> @@ -340,8 +340,8 @@ static ssize_t device_map_show(const struct class *c, const struct class_attribu
>  		n += sysfs_emit_at(data, n, "%s %u:%u %u:%u\n",
>  			pd->disk->disk_name,
>  			MAJOR(pd->pkt_dev), MINOR(pd->pkt_dev),
> -			MAJOR(pd->bdev_handle->bdev->bd_dev),
> -			MINOR(pd->bdev_handle->bdev->bd_dev));
> +			MAJOR(file_bdev(pd->bdev_file)->bd_dev),
> +			MINOR(file_bdev(pd->bdev_file)->bd_dev));
>  	}
>  	mutex_unlock(&ctl_mutex);
>  	return n;
> @@ -438,7 +438,7 @@ static int pkt_seq_show(struct seq_file *m, void *p)
>  	int states[PACKET_NUM_STATES];
>  
>  	seq_printf(m, "Writer %s mapped to %pg:\n", pd->disk->disk_name,
> -		   pd->bdev_handle->bdev);
> +		   file_bdev(pd->bdev_file));
>  
>  	seq_printf(m, "\nSettings:\n");
>  	seq_printf(m, "\tpacket size:\t\t%dkB\n", pd->settings.size / 2);
> @@ -715,7 +715,7 @@ static void pkt_rbtree_insert(struct pktcdvd_device *pd, struct pkt_rb_node *nod
>   */
>  static int pkt_generic_packet(struct pktcdvd_device *pd, struct packet_command *cgc)
>  {
> -	struct request_queue *q = bdev_get_queue(pd->bdev_handle->bdev);
> +	struct request_queue *q = bdev_get_queue(file_bdev(pd->bdev_file));
>  	struct scsi_cmnd *scmd;
>  	struct request *rq;
>  	int ret = 0;
> @@ -1048,7 +1048,7 @@ static void pkt_gather_data(struct pktcdvd_device *pd, struct packet_data *pkt)
>  			continue;
>  
>  		bio = pkt->r_bios[f];
> -		bio_init(bio, pd->bdev_handle->bdev, bio->bi_inline_vecs, 1,
> +		bio_init(bio, file_bdev(pd->bdev_file), bio->bi_inline_vecs, 1,
>  			 REQ_OP_READ);
>  		bio->bi_iter.bi_sector = pkt->sector + f * (CD_FRAMESIZE >> 9);
>  		bio->bi_end_io = pkt_end_io_read;
> @@ -1264,7 +1264,7 @@ static void pkt_start_write(struct pktcdvd_device *pd, struct packet_data *pkt)
>  	struct device *ddev = disk_to_dev(pd->disk);
>  	int f;
>  
> -	bio_init(pkt->w_bio, pd->bdev_handle->bdev, pkt->w_bio->bi_inline_vecs,
> +	bio_init(pkt->w_bio, file_bdev(pd->bdev_file), pkt->w_bio->bi_inline_vecs,
>  		 pkt->frames, REQ_OP_WRITE);
>  	pkt->w_bio->bi_iter.bi_sector = pkt->sector;
>  	pkt->w_bio->bi_end_io = pkt_end_io_packet_write;
> @@ -2162,20 +2162,20 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
>  	int ret;
>  	long lba;
>  	struct request_queue *q;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  
>  	/*
>  	 * We need to re-open the cdrom device without O_NONBLOCK to be able
>  	 * to read/write from/to it. It is already opened in O_NONBLOCK mode
>  	 * so open should not fail.
>  	 */
> -	bdev_handle = bdev_open_by_dev(pd->bdev_handle->bdev->bd_dev,
> +	bdev_file = bdev_file_open_by_dev(file_bdev(pd->bdev_file)->bd_dev,
>  				       BLK_OPEN_READ, pd, NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		ret = PTR_ERR(bdev_handle);
> +	if (IS_ERR(bdev_file)) {
> +		ret = PTR_ERR(bdev_file);
>  		goto out;
>  	}
> -	pd->open_bdev_handle = bdev_handle;
> +	pd->f_open_bdev = bdev_file;
>  
>  	ret = pkt_get_last_written(pd, &lba);
>  	if (ret) {
> @@ -2184,9 +2184,9 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
>  	}
>  
>  	set_capacity(pd->disk, lba << 2);
> -	set_capacity_and_notify(pd->bdev_handle->bdev->bd_disk, lba << 2);
> +	set_capacity_and_notify(file_bdev(pd->bdev_file)->bd_disk, lba << 2);
>  
> -	q = bdev_get_queue(pd->bdev_handle->bdev);
> +	q = bdev_get_queue(file_bdev(pd->bdev_file));
>  	if (write) {
>  		ret = pkt_open_write(pd);
>  		if (ret)
> @@ -2218,7 +2218,7 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
>  	return 0;
>  
>  out_putdev:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  out:
>  	return ret;
>  }
> @@ -2237,8 +2237,8 @@ static void pkt_release_dev(struct pktcdvd_device *pd, int flush)
>  	pkt_lock_door(pd, 0);
>  
>  	pkt_set_speed(pd, MAX_SPEED, MAX_SPEED);
> -	bdev_release(pd->open_bdev_handle);
> -	pd->open_bdev_handle = NULL;
> +	fput(pd->f_open_bdev);
> +	pd->f_open_bdev = NULL;
>  
>  	pkt_shrink_pktlist(pd);
>  }
> @@ -2326,7 +2326,7 @@ static void pkt_end_io_read_cloned(struct bio *bio)
>  
>  static void pkt_make_request_read(struct pktcdvd_device *pd, struct bio *bio)
>  {
> -	struct bio *cloned_bio = bio_alloc_clone(pd->bdev_handle->bdev, bio,
> +	struct bio *cloned_bio = bio_alloc_clone(file_bdev(pd->bdev_file), bio,
>  		GFP_NOIO, &pkt_bio_set);
>  	struct packet_stacked_data *psd = mempool_alloc(&psd_pool, GFP_NOIO);
>  
> @@ -2497,7 +2497,7 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
>  {
>  	struct device *ddev = disk_to_dev(pd->disk);
>  	int i;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct scsi_device *sdev;
>  
>  	if (pd->pkt_dev == dev) {
> @@ -2508,9 +2508,9 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
>  		struct pktcdvd_device *pd2 = pkt_devs[i];
>  		if (!pd2)
>  			continue;
> -		if (pd2->bdev_handle->bdev->bd_dev == dev) {
> +		if (file_bdev(pd2->bdev_file)->bd_dev == dev) {
>  			dev_err(ddev, "%pg already setup\n",
> -				pd2->bdev_handle->bdev);
> +				file_bdev(pd2->bdev_file));
>  			return -EBUSY;
>  		}
>  		if (pd2->pkt_dev == dev) {
> @@ -2519,13 +2519,13 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
>  		}
>  	}
>  
> -	bdev_handle = bdev_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_NDELAY,
> +	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_NDELAY,
>  				       NULL, NULL);
> -	if (IS_ERR(bdev_handle))
> -		return PTR_ERR(bdev_handle);
> -	sdev = scsi_device_from_queue(bdev_handle->bdev->bd_disk->queue);
> +	if (IS_ERR(bdev_file))
> +		return PTR_ERR(bdev_file);
> +	sdev = scsi_device_from_queue(file_bdev(bdev_file)->bd_disk->queue);
>  	if (!sdev) {
> -		bdev_release(bdev_handle);
> +		fput(bdev_file);
>  		return -EINVAL;
>  	}
>  	put_device(&sdev->sdev_gendev);
> @@ -2533,8 +2533,8 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
>  	/* This is safe, since we have a reference from open(). */
>  	__module_get(THIS_MODULE);
>  
> -	pd->bdev_handle = bdev_handle;
> -	set_blocksize(bdev_handle->bdev, CD_FRAMESIZE);
> +	pd->bdev_file = bdev_file;
> +	set_blocksize(file_bdev(bdev_file), CD_FRAMESIZE);
>  
>  	pkt_init_queue(pd);
>  
> @@ -2546,11 +2546,11 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
>  	}
>  
>  	proc_create_single_data(pd->disk->disk_name, 0, pkt_proc, pkt_seq_show, pd);
> -	dev_notice(ddev, "writer mapped to %pg\n", bdev_handle->bdev);
> +	dev_notice(ddev, "writer mapped to %pg\n", file_bdev(bdev_file));
>  	return 0;
>  
>  out_mem:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	/* This is safe: open() is still holding a reference. */
>  	module_put(THIS_MODULE);
>  	return -ENOMEM;
> @@ -2605,9 +2605,9 @@ static unsigned int pkt_check_events(struct gendisk *disk,
>  
>  	if (!pd)
>  		return 0;
> -	if (!pd->bdev_handle)
> +	if (!pd->bdev_file)
>  		return 0;
> -	attached_disk = pd->bdev_handle->bdev->bd_disk;
> +	attached_disk = file_bdev(pd->bdev_file)->bd_disk;
>  	if (!attached_disk || !attached_disk->fops->check_events)
>  		return 0;
>  	return attached_disk->fops->check_events(attached_disk, clearing);
> @@ -2692,7 +2692,7 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
>  		goto out_mem2;
>  
>  	/* inherit events of the host device */
> -	disk->events = pd->bdev_handle->bdev->bd_disk->events;
> +	disk->events = file_bdev(pd->bdev_file)->bd_disk->events;
>  
>  	ret = add_disk(disk);
>  	if (ret)
> @@ -2757,7 +2757,7 @@ static int pkt_remove_dev(dev_t pkt_dev)
>  	pkt_debugfs_dev_remove(pd);
>  	pkt_sysfs_dev_remove(pd);
>  
> -	bdev_release(pd->bdev_handle);
> +	fput(pd->bdev_file);
>  
>  	remove_proc_entry(pd->disk->disk_name, pkt_proc);
>  	dev_notice(ddev, "writer unmapped\n");
> @@ -2784,7 +2784,7 @@ static void pkt_get_status(struct pkt_ctrl_command *ctrl_cmd)
>  
>  	pd = pkt_find_dev_from_minor(ctrl_cmd->dev_index);
>  	if (pd) {
> -		ctrl_cmd->dev = new_encode_dev(pd->bdev_handle->bdev->bd_dev);
> +		ctrl_cmd->dev = new_encode_dev(file_bdev(pd->bdev_file)->bd_dev);
>  		ctrl_cmd->pkt_dev = new_encode_dev(pd->pkt_dev);
>  	} else {
>  		ctrl_cmd->dev = 0;
> diff --git a/include/linux/pktcdvd.h b/include/linux/pktcdvd.h
> index 79594aeb160d..2f1b952d596a 100644
> --- a/include/linux/pktcdvd.h
> +++ b/include/linux/pktcdvd.h
> @@ -154,9 +154,9 @@ struct packet_stacked_data
>  
>  struct pktcdvd_device
>  {
> -	struct bdev_handle	*bdev_handle;	/* dev attached */
> +	struct file		*bdev_file;	/* dev attached */
>  	/* handle acquired for bdev during pkt_open_dev() */
> -	struct bdev_handle	*open_bdev_handle;
> +	struct file		*f_open_bdev;
>  	dev_t			pkt_dev;	/* our dev */
>  	struct packet_settings	settings;
>  	struct packet_stats	stats;
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
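
The conversion quoted above follows one mechanical pattern: the stored
struct bdev_handle pointer becomes a struct file pointer, every
handle->bdev dereference becomes file_bdev(file), and bdev_release()
becomes fput(). The following is a minimal userspace sketch of that
pattern, not kernel code. Every type and every mock_* function is a
hypothetical stand-in invented for illustration, and the real struct file
does not carry a field named bdev.

```c
/* Userspace mock, NOT kernel code: all types are simplified stand-ins. */
#include <assert.h>
#include <stdlib.h>

struct block_device {
	unsigned int bd_dev;	/* device number */
	int openers;		/* how many opens currently pin the device */
};

/* Stand-in for struct file; the "bdev" field is hypothetical. */
struct file {
	struct block_device *bdev;
	int refcount;
};

/* Mock of bdev_file_open_by_dev(): returns a file that pins the device. */
static struct file *mock_bdev_file_open(struct block_device *bdev)
{
	struct file *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->bdev = bdev;
	f->refcount = 1;
	bdev->openers++;
	return f;
}

/* Mock of file_bdev(): callers reach the device only through this. */
static struct block_device *mock_file_bdev(struct file *f)
{
	return f->bdev;
}

/* Mock of fput(): dropping the last reference closes the device. */
static void mock_fput(struct file *f)
{
	if (--f->refcount == 0) {
		f->bdev->openers--;
		free(f);
	}
}
```

A caller would open with mock_bdev_file_open(), read fields such as
mock_file_bdev(f)->bd_dev, and finish with mock_fput(f), which mirrors
the open, access and release sites touched by the patch.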

* Re: [PATCH v2 10/34] rnbd: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 10/34] rnbd: " Christian Brauner
@ 2024-01-31 18:28   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:27, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/block/rnbd/rnbd-srv.c | 28 ++++++++++++++--------------
>  drivers/block/rnbd/rnbd-srv.h |  2 +-
>  2 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
> index 3a0d5dcec6f2..f6e3a3c4b76c 100644
> --- a/drivers/block/rnbd/rnbd-srv.c
> +++ b/drivers/block/rnbd/rnbd-srv.c
> @@ -145,7 +145,7 @@ static int process_rdma(struct rnbd_srv_session *srv_sess,
>  	priv->sess_dev = sess_dev;
>  	priv->id = id;
>  
> -	bio = bio_alloc(sess_dev->bdev_handle->bdev, 1,
> +	bio = bio_alloc(file_bdev(sess_dev->bdev_file), 1,
>  			rnbd_to_bio_flags(le32_to_cpu(msg->rw)), GFP_KERNEL);
>  	if (bio_add_page(bio, virt_to_page(data), datalen,
>  			offset_in_page(data)) != datalen) {
> @@ -219,7 +219,7 @@ void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev, bool keep_id)
>  	rnbd_put_sess_dev(sess_dev);
>  	wait_for_completion(&dc); /* wait for inflights to drop to zero */
>  
> -	bdev_release(sess_dev->bdev_handle);
> +	fput(sess_dev->bdev_file);
>  	mutex_lock(&sess_dev->dev->lock);
>  	list_del(&sess_dev->dev_list);
>  	if (!sess_dev->readonly)
> @@ -534,7 +534,7 @@ rnbd_srv_get_or_create_srv_dev(struct block_device *bdev,
>  static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
>  					struct rnbd_srv_sess_dev *sess_dev)
>  {
> -	struct block_device *bdev = sess_dev->bdev_handle->bdev;
> +	struct block_device *bdev = file_bdev(sess_dev->bdev_file);
>  
>  	rsp->hdr.type = cpu_to_le16(RNBD_MSG_OPEN_RSP);
>  	rsp->device_id = cpu_to_le32(sess_dev->device_id);
> @@ -560,7 +560,7 @@ static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
>  static struct rnbd_srv_sess_dev *
>  rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
>  			      const struct rnbd_msg_open *open_msg,
> -			      struct bdev_handle *handle, bool readonly,
> +			      struct file *bdev_file, bool readonly,
>  			      struct rnbd_srv_dev *srv_dev)
>  {
>  	struct rnbd_srv_sess_dev *sdev = rnbd_sess_dev_alloc(srv_sess);
> @@ -572,7 +572,7 @@ rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
>  
>  	strscpy(sdev->pathname, open_msg->dev_name, sizeof(sdev->pathname));
>  
> -	sdev->bdev_handle	= handle;
> +	sdev->bdev_file		= bdev_file;
>  	sdev->sess		= srv_sess;
>  	sdev->dev		= srv_dev;
>  	sdev->readonly		= readonly;
> @@ -678,7 +678,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
>  	struct rnbd_srv_dev *srv_dev;
>  	struct rnbd_srv_sess_dev *srv_sess_dev;
>  	const struct rnbd_msg_open *open_msg = msg;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	blk_mode_t open_flags = BLK_OPEN_READ;
>  	char *full_path;
>  	struct rnbd_msg_open_rsp *rsp = data;
> @@ -716,15 +716,15 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
>  		goto reject;
>  	}
>  
> -	bdev_handle = bdev_open_by_path(full_path, open_flags, NULL, NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		ret = PTR_ERR(bdev_handle);
> +	bdev_file = bdev_file_open_by_path(full_path, open_flags, NULL, NULL);
> +	if (IS_ERR(bdev_file)) {
> +		ret = PTR_ERR(bdev_file);
>  		pr_err("Opening device '%s' on session %s failed, failed to open the block device, err: %pe\n",
> -		       full_path, srv_sess->sessname, bdev_handle);
> +		       full_path, srv_sess->sessname, bdev_file);
>  		goto free_path;
>  	}
>  
> -	srv_dev = rnbd_srv_get_or_create_srv_dev(bdev_handle->bdev, srv_sess,
> +	srv_dev = rnbd_srv_get_or_create_srv_dev(file_bdev(bdev_file), srv_sess,
>  						  open_msg->access_mode);
>  	if (IS_ERR(srv_dev)) {
>  		pr_err("Opening device '%s' on session %s failed, creating srv_dev failed, err: %pe\n",
> @@ -734,7 +734,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
>  	}
>  
>  	srv_sess_dev = rnbd_srv_create_set_sess_dev(srv_sess, open_msg,
> -				bdev_handle,
> +				bdev_file,
>  				open_msg->access_mode == RNBD_ACCESS_RO,
>  				srv_dev);
>  	if (IS_ERR(srv_sess_dev)) {
> @@ -750,7 +750,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
>  	 */
>  	mutex_lock(&srv_dev->lock);
>  	if (!srv_dev->dev_kobj.state_in_sysfs) {
> -		ret = rnbd_srv_create_dev_sysfs(srv_dev, bdev_handle->bdev);
> +		ret = rnbd_srv_create_dev_sysfs(srv_dev, file_bdev(bdev_file));
>  		if (ret) {
>  			mutex_unlock(&srv_dev->lock);
>  			rnbd_srv_err(srv_sess_dev,
> @@ -793,7 +793,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
>  	}
>  	rnbd_put_srv_dev(srv_dev);
>  blkdev_put:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  free_path:
>  	kfree(full_path);
>  reject:
> diff --git a/drivers/block/rnbd/rnbd-srv.h b/drivers/block/rnbd/rnbd-srv.h
> index 343cc682b617..18d873808b8d 100644
> --- a/drivers/block/rnbd/rnbd-srv.h
> +++ b/drivers/block/rnbd/rnbd-srv.h
> @@ -46,7 +46,7 @@ struct rnbd_srv_dev {
>  struct rnbd_srv_sess_dev {
>  	/* Entry inside rnbd_srv_dev struct */
>  	struct list_head		dev_list;
> -	struct bdev_handle		*bdev_handle;
> +	struct file			*bdev_file;
>  	struct rnbd_srv_session		*sess;
>  	struct rnbd_srv_dev		*dev;
>  	struct kobject                  kobj;
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 11/34] xen: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 11/34] xen: " Christian Brauner
@ 2024-01-31 18:31   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:31 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:28, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/block/xen-blkback/blkback.c |  4 ++--
>  drivers/block/xen-blkback/common.h  |  4 ++--
>  drivers/block/xen-blkback/xenbus.c  | 37 ++++++++++++++++++-------------------
>  3 files changed, 22 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index 4defd7f387c7..944576d582fb 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -465,7 +465,7 @@ static int xen_vbd_translate(struct phys_req *req, struct xen_blkif *blkif,
>  	}
>  
>  	req->dev  = vbd->pdevice;
> -	req->bdev = vbd->bdev_handle->bdev;
> +	req->bdev = file_bdev(vbd->bdev_file);
>  	rc = 0;
>  
>   out:
> @@ -969,7 +969,7 @@ static int dispatch_discard_io(struct xen_blkif_ring *ring,
>  	int err = 0;
>  	int status = BLKIF_RSP_OKAY;
>  	struct xen_blkif *blkif = ring->blkif;
> -	struct block_device *bdev = blkif->vbd.bdev_handle->bdev;
> +	struct block_device *bdev = file_bdev(blkif->vbd.bdev_file);
>  	struct phys_req preq;
>  
>  	xen_blkif_get(blkif);
> diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
> index 1432c83183d0..b427d54bc120 100644
> --- a/drivers/block/xen-blkback/common.h
> +++ b/drivers/block/xen-blkback/common.h
> @@ -221,7 +221,7 @@ struct xen_vbd {
>  	unsigned char		type;
>  	/* phys device that this vbd maps to. */
>  	u32			pdevice;
> -	struct bdev_handle	*bdev_handle;
> +	struct file		*bdev_file;
>  	/* Cached size parameter. */
>  	sector_t		size;
>  	unsigned int		flush_support:1;
> @@ -360,7 +360,7 @@ struct pending_req {
>  };
>  
>  
> -#define vbd_sz(_v)	bdev_nr_sectors((_v)->bdev_handle->bdev)
> +#define vbd_sz(_v)	bdev_nr_sectors(file_bdev((_v)->bdev_file))
>  
>  #define xen_blkif_get(_b) (atomic_inc(&(_b)->refcnt))
>  #define xen_blkif_put(_b)				\
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index e34219ea2b05..0621878940ae 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -81,7 +81,7 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
>  	int i;
>  
>  	/* Not ready to connect? */
> -	if (!blkif->rings || !blkif->rings[0].irq || !blkif->vbd.bdev_handle)
> +	if (!blkif->rings || !blkif->rings[0].irq || !blkif->vbd.bdev_file)
>  		return;
>  
>  	/* Already connected? */
> @@ -99,13 +99,12 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
>  		return;
>  	}
>  
> -	err = sync_blockdev(blkif->vbd.bdev_handle->bdev);
> +	err = sync_blockdev(file_bdev(blkif->vbd.bdev_file));
>  	if (err) {
>  		xenbus_dev_error(blkif->be->dev, err, "block flush");
>  		return;
>  	}
> -	invalidate_inode_pages2(
> -			blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
> +	invalidate_inode_pages2(blkif->vbd.bdev_file->f_mapping);
>  
>  	for (i = 0; i < blkif->nr_rings; i++) {
>  		ring = &blkif->rings[i];
> @@ -473,9 +472,9 @@ static void xenvbd_sysfs_delif(struct xenbus_device *dev)
>  
>  static void xen_vbd_free(struct xen_vbd *vbd)
>  {
> -	if (vbd->bdev_handle)
> -		bdev_release(vbd->bdev_handle);
> -	vbd->bdev_handle = NULL;
> +	if (vbd->bdev_file)
> +		fput(vbd->bdev_file);
> +	vbd->bdev_file = NULL;
>  }
>  
>  static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
> @@ -483,7 +482,7 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
>  			  int cdrom)
>  {
>  	struct xen_vbd *vbd;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  
>  	vbd = &blkif->vbd;
>  	vbd->handle   = handle;
> @@ -492,17 +491,17 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
>  
>  	vbd->pdevice  = MKDEV(major, minor);
>  
> -	bdev_handle = bdev_open_by_dev(vbd->pdevice, vbd->readonly ?
> +	bdev_file = bdev_file_open_by_dev(vbd->pdevice, vbd->readonly ?
>  				 BLK_OPEN_READ : BLK_OPEN_WRITE, NULL, NULL);
>  
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		pr_warn("xen_vbd_create: device %08x could not be opened\n",
>  			vbd->pdevice);
>  		return -ENOENT;
>  	}
>  
> -	vbd->bdev_handle = bdev_handle;
> -	if (vbd->bdev_handle->bdev->bd_disk == NULL) {
> +	vbd->bdev_file = bdev_file;
> +	if (file_bdev(vbd->bdev_file)->bd_disk == NULL) {
>  		pr_warn("xen_vbd_create: device %08x doesn't exist\n",
>  			vbd->pdevice);
>  		xen_vbd_free(vbd);
> @@ -510,14 +509,14 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
>  	}
>  	vbd->size = vbd_sz(vbd);
>  
> -	if (cdrom || disk_to_cdi(vbd->bdev_handle->bdev->bd_disk))
> +	if (cdrom || disk_to_cdi(file_bdev(vbd->bdev_file)->bd_disk))
>  		vbd->type |= VDISK_CDROM;
> -	if (vbd->bdev_handle->bdev->bd_disk->flags & GENHD_FL_REMOVABLE)
> +	if (file_bdev(vbd->bdev_file)->bd_disk->flags & GENHD_FL_REMOVABLE)
>  		vbd->type |= VDISK_REMOVABLE;
>  
> -	if (bdev_write_cache(bdev_handle->bdev))
> +	if (bdev_write_cache(file_bdev(bdev_file)))
>  		vbd->flush_support = true;
> -	if (bdev_max_secure_erase_sectors(bdev_handle->bdev))
> +	if (bdev_max_secure_erase_sectors(file_bdev(bdev_file)))
>  		vbd->discard_secure = true;
>  
>  	pr_debug("Successful creation of handle=%04x (dom=%u)\n",
> @@ -570,7 +569,7 @@ static void xen_blkbk_discard(struct xenbus_transaction xbt, struct backend_info
>  	struct xen_blkif *blkif = be->blkif;
>  	int err;
>  	int state = 0;
> -	struct block_device *bdev = be->blkif->vbd.bdev_handle->bdev;
> +	struct block_device *bdev = file_bdev(be->blkif->vbd.bdev_file);
>  
>  	if (!xenbus_read_unsigned(dev->nodename, "discard-enable", 1))
>  		return;
> @@ -932,7 +931,7 @@ static void connect(struct backend_info *be)
>  	}
>  	err = xenbus_printf(xbt, dev->nodename, "sector-size", "%lu",
>  			    (unsigned long)bdev_logical_block_size(
> -					be->blkif->vbd.bdev_handle->bdev));
> +					file_bdev(be->blkif->vbd.bdev_file)));
>  	if (err) {
>  		xenbus_dev_fatal(dev, err, "writing %s/sector-size",
>  				 dev->nodename);
> @@ -940,7 +939,7 @@ static void connect(struct backend_info *be)
>  	}
>  	err = xenbus_printf(xbt, dev->nodename, "physical-sector-size", "%u",
>  			    bdev_physical_block_size(
> -					be->blkif->vbd.bdev_handle->bdev));
> +					file_bdev(be->blkif->vbd.bdev_file)));
>  	if (err)
>  		xenbus_dev_error(dev, err, "writing %s/physical-sector-size",
>  				 dev->nodename);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
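
One hunk in the patch quoted above replaces
bdev_handle->bdev->bd_inode->i_mapping with bdev_file->f_mapping. The
sketch below is a userspace mock, not kernel code, that shows why the two
expressions can be interchangeable: it assumes the open path copies the
device inode's mapping pointer into the file. The struct layouts and the
mock_* helpers are hypothetical simplifications for illustration.

```c
/* Userspace mock, NOT kernel code: layouts are simplified stand-ins. */
#include <assert.h>
#include <stddef.h>

struct address_space {
	unsigned long nrpages;	/* number of cached pages */
};

struct inode {
	struct address_space *i_mapping;
	struct address_space i_data;
};

struct block_device {
	struct inode *bd_inode;
};

/* Stand-in for struct file; the "bdev" field is hypothetical. */
struct file {
	struct address_space *f_mapping;
	struct block_device *bdev;
};

/* Mock open: cache the device's page-cache mapping in the file. */
static void mock_open(struct file *f, struct block_device *bdev)
{
	f->bdev = bdev;
	f->f_mapping = bdev->bd_inode->i_mapping;
}

/* Mock invalidation: pretend to drop every cached page, return 0. */
static int mock_invalidate(struct address_space *mapping)
{
	mapping->nrpages = 0;
	return 0;
}
```

Under that assumption, passing f->f_mapping to the invalidation helper
acts on the same object as the longer bd_inode chain, without the caller
touching bd_inode at all.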

* Re: [PATCH v2 12/34] zram: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 12/34] zram: " Christian Brauner
@ 2024-01-31 18:32   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-01-31 18:32 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:29, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/block/zram/zram_drv.c | 26 +++++++++++++-------------
>  drivers/block/zram/zram_drv.h |  2 +-
>  2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 6772e0c654fa..d96b3851b5d3 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -426,11 +426,11 @@ static void reset_bdev(struct zram *zram)
>  	if (!zram->backing_dev)
>  		return;
>  
> -	bdev_release(zram->bdev_handle);
> +	fput(zram->bdev_file);
>  	/* hope filp_close flush all of IO */
>  	filp_close(zram->backing_dev, NULL);
>  	zram->backing_dev = NULL;
> -	zram->bdev_handle = NULL;
> +	zram->bdev_file = NULL;
>  	zram->disk->fops = &zram_devops;
>  	kvfree(zram->bitmap);
>  	zram->bitmap = NULL;
> @@ -476,7 +476,7 @@ static ssize_t backing_dev_store(struct device *dev,
>  	struct address_space *mapping;
>  	unsigned int bitmap_sz;
>  	unsigned long nr_pages, *bitmap = NULL;
> -	struct bdev_handle *bdev_handle = NULL;
> +	struct file *bdev_file = NULL;
>  	int err;
>  	struct zram *zram = dev_to_zram(dev);
>  
> @@ -513,11 +513,11 @@ static ssize_t backing_dev_store(struct device *dev,
>  		goto out;
>  	}
>  
> -	bdev_handle = bdev_open_by_dev(inode->i_rdev,
> +	bdev_file = bdev_file_open_by_dev(inode->i_rdev,
>  				BLK_OPEN_READ | BLK_OPEN_WRITE, zram, NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		err = PTR_ERR(bdev_handle);
> -		bdev_handle = NULL;
> +	if (IS_ERR(bdev_file)) {
> +		err = PTR_ERR(bdev_file);
> +		bdev_file = NULL;
>  		goto out;
>  	}
>  
> @@ -531,7 +531,7 @@ static ssize_t backing_dev_store(struct device *dev,
>  
>  	reset_bdev(zram);
>  
> -	zram->bdev_handle = bdev_handle;
> +	zram->bdev_file = bdev_file;
>  	zram->backing_dev = backing_dev;
>  	zram->bitmap = bitmap;
>  	zram->nr_pages = nr_pages;
> @@ -544,8 +544,8 @@ static ssize_t backing_dev_store(struct device *dev,
>  out:
>  	kvfree(bitmap);
>  
> -	if (bdev_handle)
> -		bdev_release(bdev_handle);
> +	if (bdev_file)
> +		fput(bdev_file);
>  
>  	if (backing_dev)
>  		filp_close(backing_dev, NULL);
> @@ -587,7 +587,7 @@ static void read_from_bdev_async(struct zram *zram, struct page *page,
>  {
>  	struct bio *bio;
>  
> -	bio = bio_alloc(zram->bdev_handle->bdev, 1, parent->bi_opf, GFP_NOIO);
> +	bio = bio_alloc(file_bdev(zram->bdev_file), 1, parent->bi_opf, GFP_NOIO);
>  	bio->bi_iter.bi_sector = entry * (PAGE_SIZE >> 9);
>  	__bio_add_page(bio, page, PAGE_SIZE, 0);
>  	bio_chain(bio, parent);
> @@ -703,7 +703,7 @@ static ssize_t writeback_store(struct device *dev,
>  			continue;
>  		}
>  
> -		bio_init(&bio, zram->bdev_handle->bdev, &bio_vec, 1,
> +		bio_init(&bio, file_bdev(zram->bdev_file), &bio_vec, 1,
>  			 REQ_OP_WRITE | REQ_SYNC);
>  		bio.bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
>  		__bio_add_page(&bio, page, PAGE_SIZE, 0);
> @@ -785,7 +785,7 @@ static void zram_sync_read(struct work_struct *work)
>  	struct bio_vec bv;
>  	struct bio bio;
>  
> -	bio_init(&bio, zw->zram->bdev_handle->bdev, &bv, 1, REQ_OP_READ);
> +	bio_init(&bio, file_bdev(zw->zram->bdev_file), &bv, 1, REQ_OP_READ);
>  	bio.bi_iter.bi_sector = zw->entry * (PAGE_SIZE >> 9);
>  	__bio_add_page(&bio, zw->page, PAGE_SIZE, 0);
>  	zw->error = submit_bio_wait(&bio);
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 3b94d12f41b4..37bf29f34d26 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -132,7 +132,7 @@ struct zram {
>  	spinlock_t wb_limit_lock;
>  	bool wb_limit_enable;
>  	u64 bd_wb_limit;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	unsigned long *bitmap;
>  	unsigned long nr_pages;
>  #endif
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
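
The backing_dev_store() hunk quoted above keeps a subtle idiom intact:
when the open returns an error pointer, the local variable is reset to
NULL before jumping to the shared cleanup label, so the
"if (bdev_file) fput(bdev_file)" check never sees an error pointer. The
following userspace sketch illustrates that shape only. The ERR_PTR(),
PTR_ERR() and IS_ERR() helpers are re-implemented here in the style of
the kernel's err.h, and struct file, mock_open(), mock_fput() and setup()
are hypothetical names invented for illustration.

```c
/* Userspace mock, NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-implementations in the style of the kernel's err.h. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file { int refs; };	/* hypothetical stand-in */

static int puts_seen;		/* counts mock_fput() calls */

static void mock_fput(struct file *f)
{
	f->refs--;
	puts_seen++;
}

/* Mock open: returns an error pointer when fail_errno is non-zero. */
static struct file *mock_open(struct file *backing, int fail_errno)
{
	if (fail_errno)
		return ERR_PTR(-fail_errno);
	backing->refs++;
	return backing;
}

static int setup(struct file *backing, int fail_errno, int fail_later)
{
	struct file *bdev_file;
	int err;

	bdev_file = mock_open(backing, fail_errno);
	if (IS_ERR(bdev_file)) {
		err = PTR_ERR(bdev_file);
		bdev_file = NULL;	/* keep the error pointer away from fput */
		goto out;
	}
	if (fail_later) {
		err = -ENOMEM;		/* a later step failed after the open */
		goto out;
	}
	return 0;			/* success: the caller keeps the reference */
out:
	if (bdev_file)
		mock_fput(bdev_file);
	return err;
}
```

Calling setup() with a failing open returns the negative errno without
touching mock_fput(), while a failure after a successful open drops the
reference exactly once.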

* Re: [PATCH v2 13/34] bcache: port block device access to files
  2024-01-23 13:26 ` [PATCH v2 13/34] bcache: port block device access to files Christian Brauner
@ 2024-02-01  9:45   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01  9:45 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:30, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/md/bcache/bcache.h |  4 +--
>  drivers/md/bcache/super.c  | 74 +++++++++++++++++++++++-----------------------
>  2 files changed, 39 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
> index 6ae2329052c9..4e6afa89921f 100644
> --- a/drivers/md/bcache/bcache.h
> +++ b/drivers/md/bcache/bcache.h
> @@ -300,7 +300,7 @@ struct cached_dev {
>  	struct list_head	list;
>  	struct bcache_device	disk;
>  	struct block_device	*bdev;
> -	struct bdev_handle	*bdev_handle;
> +	struct file		*bdev_file;
>  
>  	struct cache_sb		sb;
>  	struct cache_sb_disk	*sb_disk;
> @@ -423,7 +423,7 @@ struct cache {
>  
>  	struct kobject		kobj;
>  	struct block_device	*bdev;
> -	struct bdev_handle	*bdev_handle;
> +	struct file		*bdev_file;
>  
>  	struct task_struct	*alloc_thread;
>  
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index dc3f50f69714..d00b3abab133 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -1369,8 +1369,8 @@ static CLOSURE_CALLBACK(cached_dev_free)
>  	if (dc->sb_disk)
>  		put_page(virt_to_page(dc->sb_disk));
>  
> -	if (dc->bdev_handle)
> -		bdev_release(dc->bdev_handle);
> +	if (dc->bdev_file)
> +		fput(dc->bdev_file);
>  
>  	wake_up(&unregister_wait);
>  
> @@ -1445,7 +1445,7 @@ static int cached_dev_init(struct cached_dev *dc, unsigned int block_size)
>  /* Cached device - bcache superblock */
>  
>  static int register_bdev(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
> -				 struct bdev_handle *bdev_handle,
> +				 struct file *bdev_file,
>  				 struct cached_dev *dc)
>  {
>  	const char *err = "cannot allocate memory";
> @@ -1453,8 +1453,8 @@ static int register_bdev(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
>  	int ret = -ENOMEM;
>  
>  	memcpy(&dc->sb, sb, sizeof(struct cache_sb));
> -	dc->bdev_handle = bdev_handle;
> -	dc->bdev = bdev_handle->bdev;
> +	dc->bdev_file = bdev_file;
> +	dc->bdev = file_bdev(bdev_file);
>  	dc->sb_disk = sb_disk;
>  
>  	if (cached_dev_init(dc, sb->block_size << 9))
> @@ -2218,8 +2218,8 @@ void bch_cache_release(struct kobject *kobj)
>  	if (ca->sb_disk)
>  		put_page(virt_to_page(ca->sb_disk));
>  
> -	if (ca->bdev_handle)
> -		bdev_release(ca->bdev_handle);
> +	if (ca->bdev_file)
> +		fput(ca->bdev_file);
>  
>  	kfree(ca);
>  	module_put(THIS_MODULE);
> @@ -2339,18 +2339,18 @@ static int cache_alloc(struct cache *ca)
>  }
>  
>  static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
> -				struct bdev_handle *bdev_handle,
> +				struct file *bdev_file,
>  				struct cache *ca)
>  {
>  	const char *err = NULL; /* must be set for any error case */
>  	int ret = 0;
>  
>  	memcpy(&ca->sb, sb, sizeof(struct cache_sb));
> -	ca->bdev_handle = bdev_handle;
> -	ca->bdev = bdev_handle->bdev;
> +	ca->bdev_file = bdev_file;
> +	ca->bdev = file_bdev(bdev_file);
>  	ca->sb_disk = sb_disk;
>  
> -	if (bdev_max_discard_sectors((bdev_handle->bdev)))
> +	if (bdev_max_discard_sectors(file_bdev(bdev_file)))
>  		ca->discard = CACHE_DISCARD(&ca->sb);
>  
>  	ret = cache_alloc(ca);
> @@ -2361,20 +2361,20 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
>  			err = "cache_alloc(): cache device is too small";
>  		else
>  			err = "cache_alloc(): unknown error";
> -		pr_notice("error %pg: %s\n", bdev_handle->bdev, err);
> +		pr_notice("error %pg: %s\n", file_bdev(bdev_file), err);
>  		/*
>  		 * If we failed here, it means ca->kobj is not initialized yet,
>  		 * kobject_put() won't be called and there is no chance to
> -		 * call bdev_release() to bdev in bch_cache_release(). So
> -		 * we explicitly call bdev_release() here.
> +		 * call fput() to bdev in bch_cache_release(). So
> +		 * we explicitly call fput() on the block device here.
>  		 */
> -		bdev_release(bdev_handle);
> +		fput(bdev_file);
>  		return ret;
>  	}
>  
> -	if (kobject_add(&ca->kobj, bdev_kobj(bdev_handle->bdev), "bcache")) {
> +	if (kobject_add(&ca->kobj, bdev_kobj(file_bdev(bdev_file)), "bcache")) {
>  		pr_notice("error %pg: error calling kobject_add\n",
> -			  bdev_handle->bdev);
> +			  file_bdev(bdev_file));
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> @@ -2388,7 +2388,7 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
>  		goto out;
>  	}
>  
> -	pr_info("registered cache device %pg\n", ca->bdev_handle->bdev);
> +	pr_info("registered cache device %pg\n", file_bdev(ca->bdev_file));
>  
>  out:
>  	kobject_put(&ca->kobj);
> @@ -2446,7 +2446,7 @@ struct async_reg_args {
>  	char *path;
>  	struct cache_sb *sb;
>  	struct cache_sb_disk *sb_disk;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	void *holder;
>  };
>  
> @@ -2457,7 +2457,7 @@ static void register_bdev_worker(struct work_struct *work)
>  		container_of(work, struct async_reg_args, reg_work.work);
>  
>  	mutex_lock(&bch_register_lock);
> -	if (register_bdev(args->sb, args->sb_disk, args->bdev_handle,
> +	if (register_bdev(args->sb, args->sb_disk, args->bdev_file,
>  			  args->holder) < 0)
>  		fail = true;
>  	mutex_unlock(&bch_register_lock);
> @@ -2478,7 +2478,7 @@ static void register_cache_worker(struct work_struct *work)
>  		container_of(work, struct async_reg_args, reg_work.work);
>  
>  	/* blkdev_put() will be called in bch_cache_release() */
> -	if (register_cache(args->sb, args->sb_disk, args->bdev_handle,
> +	if (register_cache(args->sb, args->sb_disk, args->bdev_file,
>  			   args->holder))
>  		fail = true;
>  
> @@ -2516,7 +2516,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  	char *path = NULL;
>  	struct cache_sb *sb;
>  	struct cache_sb_disk *sb_disk;
> -	struct bdev_handle *bdev_handle, *bdev_handle2;
> +	struct file *bdev_file, *bdev_file2;
>  	void *holder = NULL;
>  	ssize_t ret;
>  	bool async_registration = false;
> @@ -2549,15 +2549,15 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  
>  	ret = -EINVAL;
>  	err = "failed to open device";
> -	bdev_handle = bdev_open_by_path(strim(path), BLK_OPEN_READ, NULL, NULL);
> -	if (IS_ERR(bdev_handle))
> +	bdev_file = bdev_file_open_by_path(strim(path), BLK_OPEN_READ, NULL, NULL);
> +	if (IS_ERR(bdev_file))
>  		goto out_free_sb;
>  
>  	err = "failed to set blocksize";
> -	if (set_blocksize(bdev_handle->bdev, 4096))
> +	if (set_blocksize(file_bdev(bdev_file), 4096))
>  		goto out_blkdev_put;
>  
> -	err = read_super(sb, bdev_handle->bdev, &sb_disk);
> +	err = read_super(sb, file_bdev(bdev_file), &sb_disk);
>  	if (err)
>  		goto out_blkdev_put;
>  
> @@ -2569,13 +2569,13 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  	}
>  
>  	/* Now reopen in exclusive mode with proper holder */
> -	bdev_handle2 = bdev_open_by_dev(bdev_handle->bdev->bd_dev,
> +	bdev_file2 = bdev_file_open_by_dev(file_bdev(bdev_file)->bd_dev,
>  			BLK_OPEN_READ | BLK_OPEN_WRITE, holder, NULL);
> -	bdev_release(bdev_handle);
> -	bdev_handle = bdev_handle2;
> -	if (IS_ERR(bdev_handle)) {
> -		ret = PTR_ERR(bdev_handle);
> -		bdev_handle = NULL;
> +	fput(bdev_file);
> +	bdev_file = bdev_file2;
> +	if (IS_ERR(bdev_file)) {
> +		ret = PTR_ERR(bdev_file);
> +		bdev_file = NULL;
>  		if (ret == -EBUSY) {
>  			dev_t dev;
>  
> @@ -2610,7 +2610,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  		args->path	= path;
>  		args->sb	= sb;
>  		args->sb_disk	= sb_disk;
> -		args->bdev_handle	= bdev_handle;
> +		args->bdev_file	= bdev_file;
>  		args->holder	= holder;
>  		register_device_async(args);
>  		/* No wait and returns to user space */
> @@ -2619,14 +2619,14 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  
>  	if (SB_IS_BDEV(sb)) {
>  		mutex_lock(&bch_register_lock);
> -		ret = register_bdev(sb, sb_disk, bdev_handle, holder);
> +		ret = register_bdev(sb, sb_disk, bdev_file, holder);
>  		mutex_unlock(&bch_register_lock);
>  		/* blkdev_put() will be called in cached_dev_free() */
>  		if (ret < 0)
>  			goto out_free_sb;
>  	} else {
>  		/* blkdev_put() will be called in bch_cache_release() */
> -		ret = register_cache(sb, sb_disk, bdev_handle, holder);
> +		ret = register_cache(sb, sb_disk, bdev_file, holder);
>  		if (ret)
>  			goto out_free_sb;
>  	}
> @@ -2642,8 +2642,8 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
>  out_put_sb_page:
>  	put_page(virt_to_page(sb_disk));
>  out_blkdev_put:
> -	if (bdev_handle)
> -		bdev_release(bdev_handle);
> +	if (bdev_file)
> +		fput(bdev_file);
>  out_free_sb:
>  	kfree(sb);
>  out_free_path:
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
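
The reopen sequence in register_bcache() quoted above (open without a
holder, then reopen exclusively, drop the first reference, and only then
check the second open for an error) can be sketched in userspace roughly
as follows. This is a minimal mock, not kernel code: mock_bdev, mock_file,
mock_open(), mock_fput() and reopen_exclusive() are hypothetical names,
the ERR_PTR helpers are simplified stand-ins for the kernel's, and the
holder tracking assumes a single exclusive opener.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mock device: one exclusive holder, a count of open files. */
struct mock_bdev { void *holder; int refs; };
struct mock_file { struct mock_bdev *bdev; void *holder; };

/* Mock open: a non-NULL holder requests exclusive access. */
static struct mock_file *mock_open(struct mock_bdev *bdev, void *holder)
{
	struct mock_file *file;

	if (holder && bdev->holder && bdev->holder != holder)
		return ERR_PTR(-EBUSY);
	file = malloc(sizeof(*file));
	if (!file)
		return ERR_PTR(-ENOMEM);
	file->bdev = bdev;
	file->holder = holder;
	if (holder)
		bdev->holder = holder;
	bdev->refs++;
	return file;
}

/* Mock fput(): drops the reference and any claim this file held. */
static void mock_fput(struct mock_file *file)
{
	assert(file && !IS_ERR(file) && file->bdev->refs > 0);
	if (file->holder)
		file->bdev->holder = NULL;
	file->bdev->refs--;
	free(file);
}

/*
 * Same ordering as the quoted hunk: open the second file first, release
 * the first one unconditionally, then inspect the result. On failure the
 * caller's pointer is reset to NULL so a later cleanup path skips it.
 */
static int reopen_exclusive(struct mock_file **filep, void *holder)
{
	struct mock_file *file2 = mock_open((*filep)->bdev, holder);

	mock_fput(*filep);
	*filep = file2;
	if (IS_ERR(*filep)) {
		int ret = (int)PTR_ERR(*filep);

		*filep = NULL;
		return ret;
	}
	return 0;
}
```

Resetting the pointer to NULL on failure is what lets a cleanup label such
as out_blkdev_put guard its fput() with a plain NULL check.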


* Re: [PATCH v2 14/34] block2mtd: port device access to files
  2024-01-23 13:26 ` [PATCH v2 14/34] block2mtd: port " Christian Brauner
@ 2024-02-01  9:47   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01  9:47 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:31, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/mtd/devices/block2mtd.c | 46 +++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
> index aa44a23ec045..97a00ec9a4d4 100644
> --- a/drivers/mtd/devices/block2mtd.c
> +++ b/drivers/mtd/devices/block2mtd.c
> @@ -37,7 +37,7 @@
>  /* Info for the block device */
>  struct block2mtd_dev {
>  	struct list_head list;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct mtd_info mtd;
>  	struct mutex write_mutex;
>  };
> @@ -55,8 +55,7 @@ static struct page *page_read(struct address_space *mapping, pgoff_t index)
>  /* erase a specified part of the device */
>  static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len)
>  {
> -	struct address_space *mapping =
> -				dev->bdev_handle->bdev->bd_inode->i_mapping;
> +	struct address_space *mapping = dev->bdev_file->f_mapping;
>  	struct page *page;
>  	pgoff_t index = to >> PAGE_SHIFT;	// page index
>  	int pages = len >> PAGE_SHIFT;
> @@ -106,8 +105,7 @@ static int block2mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
>  		size_t *retlen, u_char *buf)
>  {
>  	struct block2mtd_dev *dev = mtd->priv;
> -	struct address_space *mapping =
> -				dev->bdev_handle->bdev->bd_inode->i_mapping;
> +	struct address_space *mapping = dev->bdev_file->f_mapping;
>  	struct page *page;
>  	pgoff_t index = from >> PAGE_SHIFT;
>  	int offset = from & (PAGE_SIZE-1);
> @@ -142,8 +140,7 @@ static int _block2mtd_write(struct block2mtd_dev *dev, const u_char *buf,
>  		loff_t to, size_t len, size_t *retlen)
>  {
>  	struct page *page;
> -	struct address_space *mapping =
> -				dev->bdev_handle->bdev->bd_inode->i_mapping;
> +	struct address_space *mapping = dev->bdev_file->f_mapping;
>  	pgoff_t index = to >> PAGE_SHIFT;	// page index
>  	int offset = to & ~PAGE_MASK;	// page offset
>  	int cpylen;
> @@ -198,7 +195,7 @@ static int block2mtd_write(struct mtd_info *mtd, loff_t to, size_t len,
>  static void block2mtd_sync(struct mtd_info *mtd)
>  {
>  	struct block2mtd_dev *dev = mtd->priv;
> -	sync_blockdev(dev->bdev_handle->bdev);
> +	sync_blockdev(file_bdev(dev->bdev_file));
>  	return;
>  }
>  
> @@ -210,10 +207,9 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
>  
>  	kfree(dev->mtd.name);
>  
> -	if (dev->bdev_handle) {
> -		invalidate_mapping_pages(
> -			dev->bdev_handle->bdev->bd_inode->i_mapping, 0, -1);
> -		bdev_release(dev->bdev_handle);
> +	if (dev->bdev_file) {
> +		invalidate_mapping_pages(dev->bdev_file->f_mapping, 0, -1);
> +		fput(dev->bdev_file);
>  	}
>  
>  	kfree(dev);
> @@ -223,10 +219,10 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
>   * This function is marked __ref because it calls the __init marked
>   * early_lookup_bdev when called from the early boot code.
>   */
> -static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
> +static struct file __ref *mdtblock_early_get_bdev(const char *devname,
>  		blk_mode_t mode, int timeout, struct block2mtd_dev *dev)
>  {
> -	struct bdev_handle *bdev_handle = ERR_PTR(-ENODEV);
> +	struct file *bdev_file = ERR_PTR(-ENODEV);
>  #ifndef MODULE
>  	int i;
>  
> @@ -234,7 +230,7 @@ static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
>  	 * We can't use early_lookup_bdev from a running system.
>  	 */
>  	if (system_state >= SYSTEM_RUNNING)
> -		return bdev_handle;
> +		return bdev_file;
>  
>  	/*
>  	 * We might not have the root device mounted at this point.
> @@ -253,20 +249,20 @@ static struct bdev_handle __ref *mdtblock_early_get_bdev(const char *devname,
>  		wait_for_device_probe();
>  
>  		if (!early_lookup_bdev(devname, &devt)) {
> -			bdev_handle = bdev_open_by_dev(devt, mode, dev, NULL);
> -			if (!IS_ERR(bdev_handle))
> +			bdev_file = bdev_file_open_by_dev(devt, mode, dev, NULL);
> +			if (!IS_ERR(bdev_file))
>  				break;
>  		}
>  	}
>  #endif
> -	return bdev_handle;
> +	return bdev_file;
>  }
>  
>  static struct block2mtd_dev *add_device(char *devname, int erase_size,
>  		char *label, int timeout)
>  {
>  	const blk_mode_t mode = BLK_OPEN_READ | BLK_OPEN_WRITE;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct block_device *bdev;
>  	struct block2mtd_dev *dev;
>  	char *name;
> @@ -279,16 +275,16 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
>  		return NULL;
>  
>  	/* Get a handle on the device */
> -	bdev_handle = bdev_open_by_path(devname, mode, dev, NULL);
> -	if (IS_ERR(bdev_handle))
> -		bdev_handle = mdtblock_early_get_bdev(devname, mode, timeout,
> +	bdev_file = bdev_file_open_by_path(devname, mode, dev, NULL);
> +	if (IS_ERR(bdev_file))
> +		bdev_file = mdtblock_early_get_bdev(devname, mode, timeout,
>  						      dev);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		pr_err("error: cannot open device %s\n", devname);
>  		goto err_free_block2mtd;
>  	}
> -	dev->bdev_handle = bdev_handle;
> -	bdev = bdev_handle->bdev;
> +	dev->bdev_file = bdev_file;
> +	bdev = file_bdev(bdev_file);
>  
>  	if (MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) {
>  		pr_err("attempting to use an MTD device as a block device\n");
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 15/34] nvme: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 15/34] nvme: port block device access to file Christian Brauner
@ 2024-02-01  9:48   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01  9:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:32, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/nvme/target/io-cmd-bdev.c | 16 ++++++++--------
>  drivers/nvme/target/nvmet.h       |  2 +-
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
> index f11400a908f2..6426aac2634a 100644
> --- a/drivers/nvme/target/io-cmd-bdev.c
> +++ b/drivers/nvme/target/io-cmd-bdev.c
> @@ -50,10 +50,10 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
>  
>  void nvmet_bdev_ns_disable(struct nvmet_ns *ns)
>  {
> -	if (ns->bdev_handle) {
> -		bdev_release(ns->bdev_handle);
> +	if (ns->bdev_file) {
> +		fput(ns->bdev_file);
>  		ns->bdev = NULL;
> -		ns->bdev_handle = NULL;
> +		ns->bdev_file = NULL;
>  	}
>  }
>  
> @@ -85,18 +85,18 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
>  	if (ns->buffered_io)
>  		return -ENOTBLK;
>  
> -	ns->bdev_handle = bdev_open_by_path(ns->device_path,
> +	ns->bdev_file = bdev_file_open_by_path(ns->device_path,
>  				BLK_OPEN_READ | BLK_OPEN_WRITE, NULL, NULL);
> -	if (IS_ERR(ns->bdev_handle)) {
> -		ret = PTR_ERR(ns->bdev_handle);
> +	if (IS_ERR(ns->bdev_file)) {
> +		ret = PTR_ERR(ns->bdev_file);
>  		if (ret != -ENOTBLK) {
>  			pr_err("failed to open block device %s: (%d)\n",
>  					ns->device_path, ret);
>  		}
> -		ns->bdev_handle = NULL;
> +		ns->bdev_file = NULL;
>  		return ret;
>  	}
> -	ns->bdev = ns->bdev_handle->bdev;
> +	ns->bdev = file_bdev(ns->bdev_file);
>  	ns->size = bdev_nr_bytes(ns->bdev);
>  	ns->blksize_shift = blksize_bits(bdev_logical_block_size(ns->bdev));
>  
> diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
> index 6c8acebe1a1a..33e61b4f478b 100644
> --- a/drivers/nvme/target/nvmet.h
> +++ b/drivers/nvme/target/nvmet.h
> @@ -58,7 +58,7 @@
>  
>  struct nvmet_ns {
>  	struct percpu_ref	ref;
> -	struct bdev_handle	*bdev_handle;
> +	struct file		*bdev_file;
>  	struct block_device	*bdev;
>  	struct file		*file;
>  	bool			readonly;
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 33/34] block: expose bdev_file_inode()
  2024-01-23 13:26 ` [PATCH v2 33/34] block: expose bdev_file_inode() Christian Brauner
@ 2024-02-01 10:09   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:09 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:50, Christian Brauner wrote:
> Now that we open block devices as files we don't need to rely on
> bd_inode to get to the correct inode. Use the helper.
> 
> We could use bdev_file->f_inode directly here since we know that
> @f_inode refers to a bdev fs inode but it is generically correct to use
> bdev_file->f_mapping->host since that will also work for bdev_files
> opened from userspace.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

I think you wouldn't need this patch if you picked up this patch from
Yu Kuai's previous series:

https://lore.kernel.org/all/20231211140833.975935-1-yukuai1@huaweicloud.com

The only user of bdev_file_inode() in ext4 is actually dead code...

								Honza

> ---
>  block/bdev.c           | 2 +-
>  block/fops.c           | 5 -----
>  include/linux/blkdev.h | 5 +++++
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 4b47003d8082..185c43ebeea5 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -51,7 +51,7 @@ EXPORT_SYMBOL(I_BDEV);
>  
>  struct block_device *file_bdev(struct file *bdev_file)
>  {
> -	return I_BDEV(bdev_file->f_mapping->host);
> +	return I_BDEV(bdev_file_inode(bdev_file));
>  }
>  EXPORT_SYMBOL(file_bdev);
>  
> diff --git a/block/fops.c b/block/fops.c
> index a0bff2c0d88d..240d968c281c 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -19,11 +19,6 @@
>  #include <linux/module.h>
>  #include "blk.h"
>  
> -static inline struct inode *bdev_file_inode(struct file *file)
> -{
> -	return file->f_mapping->host;
> -}
> -
>  static blk_opf_t dio_bio_write_op(struct kiocb *iocb)
>  {
>  	blk_opf_t opf = REQ_OP_WRITE | REQ_SYNC | REQ_IDLE;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 2f5dbde23094..4b7080e56e44 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1490,6 +1490,11 @@ void blkdev_put_no_open(struct block_device *bdev);
>  struct block_device *I_BDEV(struct inode *inode);
>  struct block_device *file_bdev(struct file *bdev_file);
>  
> +static inline struct inode *bdev_file_inode(struct file *file)
> +{
> +	return file->f_mapping->host;
> +}
> +
>  #ifdef CONFIG_BLOCK
>  void invalidate_bdev(struct block_device *bdev);
>  int sync_blockdev(struct block_device *bdev);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
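
The commit message quoted above distinguishes bdev_file->f_inode from
bdev_file->f_mapping->host. A minimal userspace sketch of that difference
follows. The struct layouts are heavily reduced mocks (only the fields
needed here), and the assumption that a file opened through a device node
keeps the node's inode in f_inode while f_mapping points at the bdev
inode's mapping is taken from the commit message, not verified against a
particular tree.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced mock types; the real kernel structs have many more fields. */
struct inode;
struct address_space { struct inode *host; };
struct inode {
	struct address_space i_data;
	struct address_space *i_mapping;
};
struct block_device { int bd_dev; };
struct bdev_inode {
	struct block_device bdev;
	struct inode vfs_inode;
};
struct file {
	struct inode *f_inode;
	struct address_space *f_mapping;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Point an inode's mapping at its own embedded address_space. */
static void mock_inode_init(struct inode *inode)
{
	inode->i_data.host = inode;
	inode->i_mapping = &inode->i_data;
}

/* Only valid for an inode embedded in a struct bdev_inode. */
static struct block_device *I_BDEV(struct inode *inode)
{
	return &container_of(inode, struct bdev_inode, vfs_inode)->bdev;
}

/* Follows the mapping, so it works however the file was opened. */
static struct inode *bdev_file_inode(struct file *file)
{
	assert(file && file->f_mapping);
	return file->f_mapping->host;
}

static struct block_device *file_bdev(struct file *bdev_file)
{
	return I_BDEV(bdev_file_inode(bdev_file));
}
```

With an in-kernel open, f_inode and f_mapping->host are the same bdev
inode. With a file opened through a device node they differ, and passing
f_inode to I_BDEV() would be wrong, which is why the helper goes through
f_mapping->host.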


* Re: [PATCH v2 16/34] s390: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 16/34] s390: " Christian Brauner
@ 2024-02-01 10:11   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:33, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/s390/block/dasd.c       | 10 +++++-----
>  drivers/s390/block/dasd_genhd.c | 36 ++++++++++++++++++------------------
>  drivers/s390/block/dasd_int.h   |  2 +-
>  drivers/s390/block/dasd_ioctl.c |  2 +-
>  4 files changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
> index 7327e81352e9..c833a7c7d7b2 100644
> --- a/drivers/s390/block/dasd.c
> +++ b/drivers/s390/block/dasd.c
> @@ -412,7 +412,7 @@ dasd_state_ready_to_online(struct dasd_device * device)
>  					KOBJ_CHANGE);
>  			return 0;
>  		}
> -		disk_uevent(device->block->bdev_handle->bdev->bd_disk,
> +		disk_uevent(file_bdev(device->block->bdev_file)->bd_disk,
>  			    KOBJ_CHANGE);
>  	}
>  	return 0;
> @@ -433,7 +433,7 @@ static int dasd_state_online_to_ready(struct dasd_device *device)
>  
>  	device->state = DASD_STATE_READY;
>  	if (device->block && !(device->features & DASD_FEATURE_USERAW))
> -		disk_uevent(device->block->bdev_handle->bdev->bd_disk,
> +		disk_uevent(file_bdev(device->block->bdev_file)->bd_disk,
>  			    KOBJ_CHANGE);
>  	return 0;
>  }
> @@ -3588,7 +3588,7 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
>  	 * in the other openers.
>  	 */
>  	if (device->block) {
> -		max_count = device->block->bdev_handle ? 0 : -1;
> +		max_count = device->block->bdev_file ? 0 : -1;
>  		open_count = atomic_read(&device->block->open_count);
>  		if (open_count > max_count) {
>  			if (open_count > 0)
> @@ -3634,8 +3634,8 @@ int dasd_generic_set_offline(struct ccw_device *cdev)
>  		 * so sync bdev first and then wait for our queues to become
>  		 * empty
>  		 */
> -		if (device->block && device->block->bdev_handle)
> -			bdev_mark_dead(device->block->bdev_handle->bdev, false);
> +		if (device->block && device->block->bdev_file)
> +			bdev_mark_dead(file_bdev(device->block->bdev_file), false);
>  		dasd_schedule_device_bh(device);
>  		rc = wait_event_interruptible(shutdown_waitq,
>  					      _wait_for_empty_queues(device));
> diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
> index 55e3abe94cde..8bf2cf0ccc15 100644
> --- a/drivers/s390/block/dasd_genhd.c
> +++ b/drivers/s390/block/dasd_genhd.c
> @@ -127,15 +127,15 @@ void dasd_gendisk_free(struct dasd_block *block)
>   */
>  int dasd_scan_partitions(struct dasd_block *block)
>  {
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int rc;
>  
> -	bdev_handle = bdev_open_by_dev(disk_devt(block->gdp), BLK_OPEN_READ,
> +	bdev_file = bdev_file_open_by_dev(disk_devt(block->gdp), BLK_OPEN_READ,
>  				       NULL, NULL);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		DBF_DEV_EVENT(DBF_ERR, block->base,
>  			      "scan partitions error, blkdev_get returned %ld",
> -			      PTR_ERR(bdev_handle));
> +			      PTR_ERR(bdev_file));
>  		return -ENODEV;
>  	}
>  
> @@ -147,15 +147,15 @@ int dasd_scan_partitions(struct dasd_block *block)
>  				"scan partitions error, rc %d", rc);
>  
>  	/*
> -	 * Since the matching bdev_release() call to the
> -	 * bdev_open_by_path() in this function is not called before
> +	 * Since the matching fput() call to the
> +	 * bdev_file_open_by_path() in this function is not called before
>  	 * dasd_destroy_partitions the offline open_count limit needs to be
> -	 * increased from 0 to 1. This is done by setting device->bdev_handle
> +	 * increased from 0 to 1. This is done by setting device->bdev_file
>  	 * (see dasd_generic_set_offline). As long as the partition detection
>  	 * is running no offline should be allowed. That is why the assignment
> -	 * to block->bdev_handle is done AFTER the BLKRRPART ioctl.
> +	 * to block->bdev_file is done AFTER the BLKRRPART ioctl.
>  	 */
> -	block->bdev_handle = bdev_handle;
> +	block->bdev_file = bdev_file;
>  	return 0;
>  }
>  
> @@ -165,21 +165,21 @@ int dasd_scan_partitions(struct dasd_block *block)
>   */
>  void dasd_destroy_partitions(struct dasd_block *block)
>  {
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  
>  	/*
> -	 * Get the bdev_handle pointer from the device structure and clear
> -	 * device->bdev_handle to lower the offline open_count limit again.
> +	 * Get the bdev_file pointer from the device structure and clear
> +	 * device->bdev_file to lower the offline open_count limit again.
>  	 */
> -	bdev_handle = block->bdev_handle;
> -	block->bdev_handle = NULL;
> +	bdev_file = block->bdev_file;
> +	block->bdev_file = NULL;
>  
> -	mutex_lock(&bdev_handle->bdev->bd_disk->open_mutex);
> -	bdev_disk_changed(bdev_handle->bdev->bd_disk, true);
> -	mutex_unlock(&bdev_handle->bdev->bd_disk->open_mutex);
> +	mutex_lock(&file_bdev(bdev_file)->bd_disk->open_mutex);
> +	bdev_disk_changed(file_bdev(bdev_file)->bd_disk, true);
> +	mutex_unlock(&file_bdev(bdev_file)->bd_disk->open_mutex);
>  
>  	/* Matching blkdev_put to the blkdev_get in dasd_scan_partitions. */
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  }
>  
>  int dasd_gendisk_init(void)
> diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
> index 1b1b8a41c4d4..aecd502aec51 100644
> --- a/drivers/s390/block/dasd_int.h
> +++ b/drivers/s390/block/dasd_int.h
> @@ -650,7 +650,7 @@ struct dasd_block {
>  	struct gendisk *gdp;
>  	spinlock_t request_queue_lock;
>  	struct blk_mq_tag_set tag_set;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	atomic_t open_count;
>  
>  	unsigned long blocks;	   /* size of volume in blocks */
> diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
> index 61b9675e2a67..de85a5e4e21b 100644
> --- a/drivers/s390/block/dasd_ioctl.c
> +++ b/drivers/s390/block/dasd_ioctl.c
> @@ -537,7 +537,7 @@ static int __dasd_ioctl_information(struct dasd_block *block,
>  	 * This must be hidden from user-space.
>  	 */
>  	dasd_info->open_count = atomic_read(&block->open_count);
> -	if (!block->bdev_handle)
> +	if (!block->bdev_file)
>  		dasd_info->open_count++;
>  
>  	/*
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
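
The open_count handling that the quoted dasd hunks touch can be reduced
to two small predicates. The sketch below is a userspace mock under
stated assumptions: mock_dasd_block, offline_allowed() and
reported_open_count() are hypothetical names, and the reading that an
idle block without a stashed file sits at open_count == -1 is inferred
from the "? 0 : -1" limit in the hunk rather than confirmed elsewhere in
the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced mock of struct dasd_block: only the two fields used here. */
struct mock_dasd_block {
	const void *bdev_file;	/* set once partition scanning has finished */
	int open_count;
};

/*
 * Mirrors the limit check in dasd_generic_set_offline(): the driver's own
 * stashed open raises the tolerated open_count from -1 to 0.
 */
static bool offline_allowed(const struct mock_dasd_block *block)
{
	int max_count;

	assert(block);
	max_count = block->bdev_file ? 0 : -1;
	return block->open_count <= max_count;
}

/*
 * Mirrors __dasd_ioctl_information(): the value shown to user space is
 * adjusted so the driver's internal open stays hidden.
 */
static int reported_open_count(const struct mock_dasd_block *block)
{
	int count;

	assert(block);
	count = block->open_count;
	if (!block->bdev_file)
		count++;
	return count;
}
```

Both helpers only test whether bdev_file is set, which is why the
conversion from struct bdev_handle to struct file leaves this logic
unchanged.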


* Re: [PATCH v2 17/34] target: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 17/34] target: " Christian Brauner
@ 2024-02-01 10:12   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:34, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  drivers/target/target_core_iblock.c | 18 +++++++++---------
>  drivers/target/target_core_iblock.h |  2 +-
>  drivers/target/target_core_pscsi.c  | 22 +++++++++++-----------
>  drivers/target/target_core_pscsi.h  |  2 +-
>  4 files changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
> index 8eb9eb7ce5df..7f6ca8177845 100644
> --- a/drivers/target/target_core_iblock.c
> +++ b/drivers/target/target_core_iblock.c
> @@ -91,7 +91,7 @@ static int iblock_configure_device(struct se_device *dev)
>  {
>  	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
>  	struct request_queue *q;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct block_device *bd;
>  	struct blk_integrity *bi;
>  	blk_mode_t mode = BLK_OPEN_READ;
> @@ -117,14 +117,14 @@ static int iblock_configure_device(struct se_device *dev)
>  	else
>  		dev->dev_flags |= DF_READ_ONLY;
>  
> -	bdev_handle = bdev_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev,
> +	bdev_file = bdev_file_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev,
>  					NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		ret = PTR_ERR(bdev_handle);
> +	if (IS_ERR(bdev_file)) {
> +		ret = PTR_ERR(bdev_file);
>  		goto out_free_bioset;
>  	}
> -	ib_dev->ibd_bdev_handle = bdev_handle;
> -	ib_dev->ibd_bd = bd = bdev_handle->bdev;
> +	ib_dev->ibd_bdev_file = bdev_file;
> +	ib_dev->ibd_bd = bd = file_bdev(bdev_file);
>  
>  	q = bdev_get_queue(bd);
>  
> @@ -180,7 +180,7 @@ static int iblock_configure_device(struct se_device *dev)
>  	return 0;
>  
>  out_blkdev_put:
> -	bdev_release(ib_dev->ibd_bdev_handle);
> +	fput(ib_dev->ibd_bdev_file);
>  out_free_bioset:
>  	bioset_exit(&ib_dev->ibd_bio_set);
>  out:
> @@ -205,8 +205,8 @@ static void iblock_destroy_device(struct se_device *dev)
>  {
>  	struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
>  
> -	if (ib_dev->ibd_bdev_handle)
> -		bdev_release(ib_dev->ibd_bdev_handle);
> +	if (ib_dev->ibd_bdev_file)
> +		fput(ib_dev->ibd_bdev_file);
>  	bioset_exit(&ib_dev->ibd_bio_set);
>  }
>  
> diff --git a/drivers/target/target_core_iblock.h b/drivers/target/target_core_iblock.h
> index 683f9a55945b..91f6f4280666 100644
> --- a/drivers/target/target_core_iblock.h
> +++ b/drivers/target/target_core_iblock.h
> @@ -32,7 +32,7 @@ struct iblock_dev {
>  	u32	ibd_flags;
>  	struct bio_set	ibd_bio_set;
>  	struct block_device *ibd_bd;
> -	struct bdev_handle *ibd_bdev_handle;
> +	struct file *ibd_bdev_file;
>  	bool ibd_readonly;
>  	struct iblock_dev_plug *ibd_plug;
>  } ____cacheline_aligned;
> diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
> index 41b7489d37ce..9aedd682d10c 100644
> --- a/drivers/target/target_core_pscsi.c
> +++ b/drivers/target/target_core_pscsi.c
> @@ -352,7 +352,7 @@ static int pscsi_create_type_disk(struct se_device *dev, struct scsi_device *sd)
>  	struct pscsi_hba_virt *phv = dev->se_hba->hba_ptr;
>  	struct pscsi_dev_virt *pdv = PSCSI_DEV(dev);
>  	struct Scsi_Host *sh = sd->host;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int ret;
>  
>  	if (scsi_device_get(sd)) {
> @@ -366,18 +366,18 @@ static int pscsi_create_type_disk(struct se_device *dev, struct scsi_device *sd)
>  	 * Claim exclusive struct block_device access to struct scsi_device
>  	 * for TYPE_DISK and TYPE_ZBC using supplied udev_path
>  	 */
> -	bdev_handle = bdev_open_by_path(dev->udev_path,
> +	bdev_file = bdev_file_open_by_path(dev->udev_path,
>  				BLK_OPEN_WRITE | BLK_OPEN_READ, pdv, NULL);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		pr_err("pSCSI: bdev_open_by_path() failed\n");
>  		scsi_device_put(sd);
> -		return PTR_ERR(bdev_handle);
> +		return PTR_ERR(bdev_file);
>  	}
> -	pdv->pdv_bdev_handle = bdev_handle;
> +	pdv->pdv_bdev_file = bdev_file;
>  
>  	ret = pscsi_add_device_to_list(dev, sd);
>  	if (ret) {
> -		bdev_release(bdev_handle);
> +		fput(bdev_file);
>  		scsi_device_put(sd);
>  		return ret;
>  	}
> @@ -564,9 +564,9 @@ static void pscsi_destroy_device(struct se_device *dev)
>  		 * from pscsi_create_type_disk()
>  		 */
>  		if ((sd->type == TYPE_DISK || sd->type == TYPE_ZBC) &&
> -		    pdv->pdv_bdev_handle) {
> -			bdev_release(pdv->pdv_bdev_handle);
> -			pdv->pdv_bdev_handle = NULL;
> +		    pdv->pdv_bdev_file) {
> +			fput(pdv->pdv_bdev_file);
> +			pdv->pdv_bdev_file = NULL;
>  		}
>  		/*
>  		 * For HBA mode PHV_LLD_SCSI_HOST_NO, release the reference
> @@ -994,8 +994,8 @@ static sector_t pscsi_get_blocks(struct se_device *dev)
>  {
>  	struct pscsi_dev_virt *pdv = PSCSI_DEV(dev);
>  
> -	if (pdv->pdv_bdev_handle)
> -		return bdev_nr_sectors(pdv->pdv_bdev_handle->bdev);
> +	if (pdv->pdv_bdev_file)
> +		return bdev_nr_sectors(file_bdev(pdv->pdv_bdev_file));
>  	return 0;
>  }
>  
> diff --git a/drivers/target/target_core_pscsi.h b/drivers/target/target_core_pscsi.h
> index b0a3ef136592..9acaa21e4c78 100644
> --- a/drivers/target/target_core_pscsi.h
> +++ b/drivers/target/target_core_pscsi.h
> @@ -37,7 +37,7 @@ struct pscsi_dev_virt {
>  	int	pdv_channel_id;
>  	int	pdv_target_id;
>  	int	pdv_lun_id;
> -	struct bdev_handle *pdv_bdev_handle;
> +	struct file *pdv_bdev_file;
>  	struct scsi_device *pdv_sd;
>  	struct Scsi_Host *pdv_lld_host;
>  } ____cacheline_aligned;
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 18/34] bcachefs: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 18/34] bcachefs: " Christian Brauner
@ 2024-02-01 10:13   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:13 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:35, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/bcachefs/super-io.c    | 20 ++++++++++----------
>  fs/bcachefs/super_types.h |  2 +-
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/bcachefs/super-io.c b/fs/bcachefs/super-io.c
> index d60c7d27a047..ce8cf2d91f84 100644
> --- a/fs/bcachefs/super-io.c
> +++ b/fs/bcachefs/super-io.c
> @@ -142,8 +142,8 @@ void bch2_sb_field_delete(struct bch_sb_handle *sb,
>  void bch2_free_super(struct bch_sb_handle *sb)
>  {
>  	kfree(sb->bio);
> -	if (!IS_ERR_OR_NULL(sb->bdev_handle))
> -		bdev_release(sb->bdev_handle);
> +	if (!IS_ERR_OR_NULL(sb->s_bdev_file))
> +		fput(sb->s_bdev_file);
>  	kfree(sb->holder);
>  	kfree(sb->sb_name);
>  
> @@ -704,22 +704,22 @@ static int __bch2_read_super(const char *path, struct bch_opts *opts,
>  	if (!opt_get(*opts, nochanges))
>  		sb->mode |= BLK_OPEN_WRITE;
>  
> -	sb->bdev_handle = bdev_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
> -	if (IS_ERR(sb->bdev_handle) &&
> -	    PTR_ERR(sb->bdev_handle) == -EACCES &&
> +	sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
> +	if (IS_ERR(sb->s_bdev_file) &&
> +	    PTR_ERR(sb->s_bdev_file) == -EACCES &&
>  	    opt_get(*opts, read_only)) {
>  		sb->mode &= ~BLK_OPEN_WRITE;
>  
> -		sb->bdev_handle = bdev_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
> -		if (!IS_ERR(sb->bdev_handle))
> +		sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
> +		if (!IS_ERR(sb->s_bdev_file))
>  			opt_set(*opts, nochanges, true);
>  	}
>  
> -	if (IS_ERR(sb->bdev_handle)) {
> -		ret = PTR_ERR(sb->bdev_handle);
> +	if (IS_ERR(sb->s_bdev_file)) {
> +		ret = PTR_ERR(sb->s_bdev_file);
>  		goto out;
>  	}
> -	sb->bdev = sb->bdev_handle->bdev;
> +	sb->bdev = file_bdev(sb->s_bdev_file);
>  
>  	ret = bch2_sb_realloc(sb, 0);
>  	if (ret) {
> diff --git a/fs/bcachefs/super_types.h b/fs/bcachefs/super_types.h
> index 0e5a14fc8e7f..ec784d975f66 100644
> --- a/fs/bcachefs/super_types.h
> +++ b/fs/bcachefs/super_types.h
> @@ -4,7 +4,7 @@
>  
>  struct bch_sb_handle {
>  	struct bch_sb		*sb;
> -	struct bdev_handle	*bdev_handle;
> +	struct file		*s_bdev_file;
>  	struct block_device	*bdev;
>  	char			*sb_name;
>  	struct bio		*bio;
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
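
The open logic in the quoted __bch2_read_super() hunk (try a writable
open, fall back to read-only on -EACCES when the read_only option is set,
and record that with nochanges) can be sketched in userspace as below.
Everything here is a mock under assumptions: mock_dev, mock_file,
mock_opts, mock_open() and open_with_ro_fallback() are hypothetical
names, the BLK_OPEN_* values are illustrative, and starting from
BLK_OPEN_READ is assumed because the initial value of sb->mode is not
visible in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Illustrative flag values for this mock only. */
#define BLK_OPEN_READ	(1u << 0)
#define BLK_OPEN_WRITE	(1u << 1)

struct mock_file { unsigned int mode; };
struct mock_dev { bool writable; struct mock_file file; };
struct mock_opts { bool read_only; bool nochanges; };

/* Mock open: a write request on a non-writable device fails with -EACCES. */
static struct mock_file *mock_open(struct mock_dev *dev, unsigned int mode)
{
	if ((mode & BLK_OPEN_WRITE) && !dev->writable)
		return ERR_PTR(-EACCES);
	dev->file.mode = mode;
	return &dev->file;
}

/* Same control flow as the quoted hunk, with the mock types above. */
static struct mock_file *open_with_ro_fallback(struct mock_dev *dev,
					       struct mock_opts *opts)
{
	unsigned int mode = BLK_OPEN_READ;	/* assumed starting mode */
	struct mock_file *file;

	assert(dev && opts);
	if (!opts->nochanges)
		mode |= BLK_OPEN_WRITE;

	file = mock_open(dev, mode);
	if (IS_ERR(file) && PTR_ERR(file) == -EACCES && opts->read_only) {
		mode &= ~BLK_OPEN_WRITE;
		file = mock_open(dev, mode);
		if (!IS_ERR(file))
			opts->nochanges = true;
	}
	return file;	/* the caller still has to check IS_ERR() */
}
```

Because the result is either a valid pointer or an encoded errno, the
same IS_ERR()/PTR_ERR() checks work for struct file as they did for
struct bdev_handle.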


* Re: [PATCH v2 19/34] btrfs: port device access to file
  2024-01-23 13:26 ` [PATCH v2 19/34] btrfs: port " Christian Brauner
@ 2024-02-01 10:16   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:16 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:36, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/btrfs/dev-replace.c | 14 ++++----
>  fs/btrfs/ioctl.c       | 16 ++++-----
>  fs/btrfs/volumes.c     | 92 +++++++++++++++++++++++++-------------------------
>  fs/btrfs/volumes.h     |  4 +--
>  4 files changed, 63 insertions(+), 63 deletions(-)
> 
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 1502d664c892..2eb11fe4bd05 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -246,7 +246,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
>  {
>  	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
>  	struct btrfs_device *device;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct block_device *bdev;
>  	u64 devid = BTRFS_DEV_REPLACE_DEVID;
>  	int ret = 0;
> @@ -257,13 +257,13 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
>  		return -EINVAL;
>  	}
>  
> -	bdev_handle = bdev_open_by_path(device_path, BLK_OPEN_WRITE,
> +	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
>  					fs_info->bdev_holder, NULL);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		btrfs_err(fs_info, "target device %s is invalid!", device_path);
> -		return PTR_ERR(bdev_handle);
> +		return PTR_ERR(bdev_file);
>  	}
> -	bdev = bdev_handle->bdev;
> +	bdev = file_bdev(bdev_file);
>  
>  	if (!btrfs_check_device_zone_type(fs_info, bdev)) {
>  		btrfs_err(fs_info,
> @@ -314,7 +314,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
>  	device->commit_bytes_used = device->bytes_used;
>  	device->fs_info = fs_info;
>  	device->bdev = bdev;
> -	device->bdev_handle = bdev_handle;
> +	device->bdev_file = bdev_file;
>  	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
>  	set_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
>  	device->dev_stats_valid = 1;
> @@ -335,7 +335,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
>  	return 0;
>  
>  error:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	return ret;
>  }
>  
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 41b479861b3c..9e0b3932d90c 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2691,7 +2691,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>  	struct inode *inode = file_inode(file);
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  	struct btrfs_ioctl_vol_args_v2 *vol_args;
> -	struct bdev_handle *bdev_handle = NULL;
> +	struct file *bdev_file = NULL;
>  	int ret;
>  	bool cancel = false;
>  
> @@ -2728,7 +2728,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>  		goto err_drop;
>  
>  	/* Exclusive operation is now claimed */
> -	ret = btrfs_rm_device(fs_info, &args, &bdev_handle);
> +	ret = btrfs_rm_device(fs_info, &args, &bdev_file);
>  
>  	btrfs_exclop_finish(fs_info);
>  
> @@ -2742,8 +2742,8 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
>  	}
>  err_drop:
>  	mnt_drop_write_file(file);
> -	if (bdev_handle)
> -		bdev_release(bdev_handle);
> +	if (bdev_file)
> +		fput(bdev_file);
>  out:
>  	btrfs_put_dev_args_from_path(&args);
>  	kfree(vol_args);
> @@ -2756,7 +2756,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
>  	struct inode *inode = file_inode(file);
>  	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>  	struct btrfs_ioctl_vol_args *vol_args;
> -	struct bdev_handle *bdev_handle = NULL;
> +	struct file *bdev_file = NULL;
>  	int ret;
>  	bool cancel = false;
>  
> @@ -2783,15 +2783,15 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
>  	ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE,
>  					   cancel);
>  	if (ret == 0) {
> -		ret = btrfs_rm_device(fs_info, &args, &bdev_handle);
> +		ret = btrfs_rm_device(fs_info, &args, &bdev_file);
>  		if (!ret)
>  			btrfs_info(fs_info, "disk deleted %s", vol_args->name);
>  		btrfs_exclop_finish(fs_info);
>  	}
>  
>  	mnt_drop_write_file(file);
> -	if (bdev_handle)
> -		bdev_release(bdev_handle);
> +	if (bdev_file)
> +		fput(bdev_file);
>  out:
>  	btrfs_put_dev_args_from_path(&args);
>  	kfree(vol_args);
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 4c32497311d2..769a1dc4b756 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -468,39 +468,39 @@ static noinline struct btrfs_fs_devices *find_fsid(
>  
>  static int
>  btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
> -		      int flush, struct bdev_handle **bdev_handle,
> +		      int flush, struct file **bdev_file,
>  		      struct btrfs_super_block **disk_super)
>  {
>  	struct block_device *bdev;
>  	int ret;
>  
> -	*bdev_handle = bdev_open_by_path(device_path, flags, holder, NULL);
> +	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, NULL);
>  
> -	if (IS_ERR(*bdev_handle)) {
> -		ret = PTR_ERR(*bdev_handle);
> +	if (IS_ERR(*bdev_file)) {
> +		ret = PTR_ERR(*bdev_file);
>  		goto error;
>  	}
> -	bdev = (*bdev_handle)->bdev;
> +	bdev = file_bdev(*bdev_file);
>  
>  	if (flush)
>  		sync_blockdev(bdev);
>  	ret = set_blocksize(bdev, BTRFS_BDEV_BLOCKSIZE);
>  	if (ret) {
> -		bdev_release(*bdev_handle);
> +		fput(*bdev_file);
>  		goto error;
>  	}
>  	invalidate_bdev(bdev);
>  	*disk_super = btrfs_read_dev_super(bdev);
>  	if (IS_ERR(*disk_super)) {
>  		ret = PTR_ERR(*disk_super);
> -		bdev_release(*bdev_handle);
> +		fput(*bdev_file);
>  		goto error;
>  	}
>  
>  	return 0;
>  
>  error:
> -	*bdev_handle = NULL;
> +	*bdev_file = NULL;
>  	return ret;
>  }
>  
> @@ -643,7 +643,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  			struct btrfs_device *device, blk_mode_t flags,
>  			void *holder)
>  {
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct btrfs_super_block *disk_super;
>  	u64 devid;
>  	int ret;
> @@ -654,7 +654,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  		return -EINVAL;
>  
>  	ret = btrfs_get_bdev_and_sb(device->name->str, flags, holder, 1,
> -				    &bdev_handle, &disk_super);
> +				    &bdev_file, &disk_super);
>  	if (ret)
>  		return ret;
>  
> @@ -678,20 +678,20 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  		clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
>  		fs_devices->seeding = true;
>  	} else {
> -		if (bdev_read_only(bdev_handle->bdev))
> +		if (bdev_read_only(file_bdev(bdev_file)))
>  			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
>  		else
>  			set_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
>  	}
>  
> -	if (!bdev_nonrot(bdev_handle->bdev))
> +	if (!bdev_nonrot(file_bdev(bdev_file)))
>  		fs_devices->rotating = true;
>  
> -	if (bdev_max_discard_sectors(bdev_handle->bdev))
> +	if (bdev_max_discard_sectors(file_bdev(bdev_file)))
>  		fs_devices->discardable = true;
>  
> -	device->bdev_handle = bdev_handle;
> -	device->bdev = bdev_handle->bdev;
> +	device->bdev_file = bdev_file;
> +	device->bdev = file_bdev(bdev_file);
>  	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
>  
>  	fs_devices->open_devices++;
> @@ -706,7 +706,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  
>  error_free_page:
>  	btrfs_release_disk_super(disk_super);
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  
>  	return -EINVAL;
>  }
> @@ -1015,10 +1015,10 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
>  		if (device->devid == BTRFS_DEV_REPLACE_DEVID)
>  			continue;
>  
> -		if (device->bdev_handle) {
> -			bdev_release(device->bdev_handle);
> +		if (device->bdev_file) {
> +			fput(device->bdev_file);
>  			device->bdev = NULL;
> -			device->bdev_handle = NULL;
> +			device->bdev_file = NULL;
>  			fs_devices->open_devices--;
>  		}
>  		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
> @@ -1063,7 +1063,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>  		invalidate_bdev(device->bdev);
>  	}
>  
> -	bdev_release(device->bdev_handle);
> +	fput(device->bdev_file);
>  }
>  
>  static void btrfs_close_one_device(struct btrfs_device *device)
> @@ -1316,7 +1316,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>  	struct btrfs_super_block *disk_super;
>  	bool new_device_added = false;
>  	struct btrfs_device *device = NULL;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	u64 bytenr, bytenr_orig;
>  	int ret;
>  
> @@ -1339,18 +1339,18 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>  	 * values temporarily, as the device paths of the fsid are the only
>  	 * required information for assembling the volume.
>  	 */
> -	bdev_handle = bdev_open_by_path(path, flags, NULL, NULL);
> -	if (IS_ERR(bdev_handle))
> -		return ERR_CAST(bdev_handle);
> +	bdev_file = bdev_file_open_by_path(path, flags, NULL, NULL);
> +	if (IS_ERR(bdev_file))
> +		return ERR_CAST(bdev_file);
>  
>  	bytenr_orig = btrfs_sb_offset(0);
> -	ret = btrfs_sb_log_location_bdev(bdev_handle->bdev, 0, READ, &bytenr);
> +	ret = btrfs_sb_log_location_bdev(file_bdev(bdev_file), 0, READ, &bytenr);
>  	if (ret) {
>  		device = ERR_PTR(ret);
>  		goto error_bdev_put;
>  	}
>  
> -	disk_super = btrfs_read_disk_super(bdev_handle->bdev, bytenr,
> +	disk_super = btrfs_read_disk_super(file_bdev(bdev_file), bytenr,
>  					   bytenr_orig);
>  	if (IS_ERR(disk_super)) {
>  		device = ERR_CAST(disk_super);
> @@ -1381,7 +1381,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>  	btrfs_release_disk_super(disk_super);
>  
>  error_bdev_put:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  
>  	return device;
>  }
> @@ -2057,7 +2057,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info,
>  
>  int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>  		    struct btrfs_dev_lookup_args *args,
> -		    struct bdev_handle **bdev_handle)
> +		    struct file **bdev_file)
>  {
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_device *device;
> @@ -2166,7 +2166,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>  
>  	btrfs_assign_next_active_device(device, NULL);
>  
> -	if (device->bdev_handle) {
> +	if (device->bdev_file) {
>  		cur_devices->open_devices--;
>  		/* remove sysfs entry */
>  		btrfs_sysfs_remove_device(device);
> @@ -2182,9 +2182,9 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>  	 * free the device.
>  	 *
>  	 * We cannot call btrfs_close_bdev() here because we're holding the sb
> -	 * write lock, and bdev_release() will pull in the ->open_mutex on
> -	 * the block device and it's dependencies.  Instead just flush the
> -	 * device and let the caller do the final bdev_release.
> +	 * write lock, and fput() on the block device will pull in the
> +	 * ->open_mutex on the block device and its dependencies.  Instead
> +	 * just flush the device and let the caller do the final fput().
>  	 */
>  	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
>  		btrfs_scratch_superblocks(fs_info, device->bdev,
> @@ -2195,7 +2195,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>  		}
>  	}
>  
> -	*bdev_handle = device->bdev_handle;
> +	*bdev_file = device->bdev_file;
>  	synchronize_rcu();
>  	btrfs_free_device(device);
>  
> @@ -2332,7 +2332,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
>  				 const char *path)
>  {
>  	struct btrfs_super_block *disk_super;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int ret;
>  
>  	if (!path || !path[0])
> @@ -2350,7 +2350,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
>  	}
>  
>  	ret = btrfs_get_bdev_and_sb(path, BLK_OPEN_READ, NULL, 0,
> -				    &bdev_handle, &disk_super);
> +				    &bdev_file, &disk_super);
>  	if (ret) {
>  		btrfs_put_dev_args_from_path(args);
>  		return ret;
> @@ -2363,7 +2363,7 @@ int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info,
>  	else
>  		memcpy(args->fsid, disk_super->fsid, BTRFS_FSID_SIZE);
>  	btrfs_release_disk_super(disk_super);
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	return 0;
>  }
>  
> @@ -2583,7 +2583,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>  	struct btrfs_root *root = fs_info->dev_root;
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_device *device;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct super_block *sb = fs_info->sb;
>  	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
>  	struct btrfs_fs_devices *seed_devices = NULL;
> @@ -2596,12 +2596,12 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>  	if (sb_rdonly(sb) && !fs_devices->seeding)
>  		return -EROFS;
>  
> -	bdev_handle = bdev_open_by_path(device_path, BLK_OPEN_WRITE,
> +	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
>  					fs_info->bdev_holder, NULL);
> -	if (IS_ERR(bdev_handle))
> -		return PTR_ERR(bdev_handle);
> +	if (IS_ERR(bdev_file))
> +		return PTR_ERR(bdev_file);
>  
> -	if (!btrfs_check_device_zone_type(fs_info, bdev_handle->bdev)) {
> +	if (!btrfs_check_device_zone_type(fs_info, file_bdev(bdev_file))) {
>  		ret = -EINVAL;
>  		goto error;
>  	}
> @@ -2613,11 +2613,11 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>  		locked = true;
>  	}
>  
> -	sync_blockdev(bdev_handle->bdev);
> +	sync_blockdev(file_bdev(bdev_file));
>  
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
> -		if (device->bdev == bdev_handle->bdev) {
> +		if (device->bdev == file_bdev(bdev_file)) {
>  			ret = -EEXIST;
>  			rcu_read_unlock();
>  			goto error;
> @@ -2633,8 +2633,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>  	}
>  
>  	device->fs_info = fs_info;
> -	device->bdev_handle = bdev_handle;
> -	device->bdev = bdev_handle->bdev;
> +	device->bdev_file = bdev_file;
> +	device->bdev = file_bdev(bdev_file);
>  	ret = lookup_bdev(device_path, &device->devt);
>  	if (ret)
>  		goto error_free_device;
> @@ -2817,7 +2817,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>  error_free_device:
>  	btrfs_free_device(device);
>  error:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	if (locked) {
>  		mutex_unlock(&uuid_mutex);
>  		up_write(&sb->s_umount);
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 53f87f398da7..a11854912d53 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -90,7 +90,7 @@ struct btrfs_device {
>  
>  	u64 generation;
>  
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct block_device *bdev;
>  
>  	struct btrfs_zoned_device_info *zone_info;
> @@ -661,7 +661,7 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
>  void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args);
>  int btrfs_rm_device(struct btrfs_fs_info *fs_info,
>  		    struct btrfs_dev_lookup_args *args,
> -		    struct bdev_handle **bdev_handle);
> +		    struct file **bdev_file);
>  void __exit btrfs_cleanup_fs_uuids(void);
>  int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len);
>  int btrfs_grow_device(struct btrfs_trans_handle *trans,
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
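
For readers following the conversion, the shape that btrfs_get_bdev_and_sb()
takes in the patch above (open, IS_ERR() check, file_bdev() accessor, fput()
on each failure path, out-parameter reset to NULL on error) can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions:
the struct layouts, mock_open_by_path(), get_bdev_and_check() and the
refcounting are stand-ins invented for illustration and are not the kernel's
definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins; these layouts are invented for illustration. */
struct block_device { int blocksize; };
struct file { struct block_device *bdev; int refcount; };

/* Minimal error-pointer helpers modelled on the kernel's ERR_PTR family. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static struct block_device mock_disk = { .blocksize = 512 };
static int mock_open_files;	/* files opened and not yet released */

/* Hypothetical stand-in for bdev_file_open_by_path(). */
static struct file *mock_open_by_path(const char *path)
{
	struct file *f;

	if (!path || strcmp(path, "/dev/mock0") != 0)
		return ERR_PTR(-ENOENT);
	f = malloc(sizeof(*f));
	if (!f)
		return ERR_PTR(-ENOMEM);
	f->bdev = &mock_disk;
	f->refcount = 1;
	mock_open_files++;
	return f;
}

/* Accessor: callers reach the device through the file, never directly. */
static struct block_device *file_bdev(struct file *f)
{
	return f->bdev;
}

/* Dropping the last reference closes the mock device. */
static void fput(struct file *f)
{
	if (--f->refcount == 0) {
		mock_open_files--;
		free(f);
	}
}

/* Same control flow as the converted helper: every failure after a
 * successful open drops the file, and *bdev_file is NULL on error. */
static int get_bdev_and_check(const char *path, int want_blocksize,
			      struct file **bdev_file)
{
	struct block_device *bdev;
	int ret;

	*bdev_file = mock_open_by_path(path);
	if (IS_ERR(*bdev_file)) {
		ret = (int)PTR_ERR(*bdev_file);
		goto error;
	}
	bdev = file_bdev(*bdev_file);

	if (bdev->blocksize != want_blocksize) {
		ret = -EINVAL;
		fput(*bdev_file);
		goto error;
	}
	return 0;

error:
	*bdev_file = NULL;
	return ret;
}
```

A caller that gets 0 back owns the file and is expected to fput() it once it
is done; on any error there is nothing left to release.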


* Re: [PATCH v2 20/34] erofs: port device access to file
  2024-01-23 13:26 ` [PATCH v2 20/34] erofs: " Christian Brauner
@ 2024-02-01 10:16   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:16 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:37, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/erofs/data.c     |  6 +++---
>  fs/erofs/internal.h |  2 +-
>  fs/erofs/super.c    | 16 ++++++++--------
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> index c98aeda8abb2..433fc39ba423 100644
> --- a/fs/erofs/data.c
> +++ b/fs/erofs/data.c
> @@ -220,7 +220,7 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
>  			up_read(&devs->rwsem);
>  			return 0;
>  		}
> -		map->m_bdev = dif->bdev_handle ? dif->bdev_handle->bdev : NULL;
> +		map->m_bdev = dif->bdev_file ? file_bdev(dif->bdev_file) : NULL;
>  		map->m_daxdev = dif->dax_dev;
>  		map->m_dax_part_off = dif->dax_part_off;
>  		map->m_fscache = dif->fscache;
> @@ -238,8 +238,8 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
>  			if (map->m_pa >= startoff &&
>  			    map->m_pa < startoff + length) {
>  				map->m_pa -= startoff;
> -				map->m_bdev = dif->bdev_handle ?
> -					      dif->bdev_handle->bdev : NULL;
> +				map->m_bdev = dif->bdev_file ?
> +					      file_bdev(dif->bdev_file) : NULL;
>  				map->m_daxdev = dif->dax_dev;
>  				map->m_dax_part_off = dif->dax_part_off;
>  				map->m_fscache = dif->fscache;
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index b0409badb017..0f0706325b7b 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -49,7 +49,7 @@ typedef u32 erofs_blk_t;
>  struct erofs_device_info {
>  	char *path;
>  	struct erofs_fscache *fscache;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct dax_device *dax_dev;
>  	u64 dax_part_off;
>  
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 5f60f163bd56..9b4b66dcdd4f 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -177,7 +177,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>  	struct erofs_sb_info *sbi = EROFS_SB(sb);
>  	struct erofs_fscache *fscache;
>  	struct erofs_deviceslot *dis;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	void *ptr;
>  
>  	ptr = erofs_read_metabuf(buf, sb, erofs_blknr(sb, *pos), EROFS_KMAP);
> @@ -201,12 +201,12 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
>  			return PTR_ERR(fscache);
>  		dif->fscache = fscache;
>  	} else if (!sbi->devs->flatdev) {
> -		bdev_handle = bdev_open_by_path(dif->path, BLK_OPEN_READ,
> +		bdev_file = bdev_file_open_by_path(dif->path, BLK_OPEN_READ,
>  						sb->s_type, NULL);
> -		if (IS_ERR(bdev_handle))
> -			return PTR_ERR(bdev_handle);
> -		dif->bdev_handle = bdev_handle;
> -		dif->dax_dev = fs_dax_get_by_bdev(bdev_handle->bdev,
> +		if (IS_ERR(bdev_file))
> +			return PTR_ERR(bdev_file);
> +		dif->bdev_file = bdev_file;
> +		dif->dax_dev = fs_dax_get_by_bdev(file_bdev(bdev_file),
>  				&dif->dax_part_off, NULL, NULL);
>  	}
>  
> @@ -754,8 +754,8 @@ static int erofs_release_device_info(int id, void *ptr, void *data)
>  	struct erofs_device_info *dif = ptr;
>  
>  	fs_put_dax(dif->dax_dev, NULL);
> -	if (dif->bdev_handle)
> -		bdev_release(dif->bdev_handle);
> +	if (dif->bdev_file)
> +		fput(dif->bdev_file);
>  	erofs_fscache_unregister_cookie(dif->fscache);
>  	dif->fscache = NULL;
>  	kfree(dif->path);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 21/34] ext4: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 21/34] ext4: port block " Christian Brauner
@ 2024-02-01 10:18   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:18 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:38, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/ext4.h  |  2 +-
>  fs/ext4/fsmap.c |  8 ++++----
>  fs/ext4/super.c | 52 ++++++++++++++++++++++++++--------------------------
>  3 files changed, 31 insertions(+), 31 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index a5d784872303..dcdad5da419e 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1548,7 +1548,7 @@ struct ext4_sb_info {
>  	unsigned long s_commit_interval;
>  	u32 s_max_batch_time;
>  	u32 s_min_batch_time;
> -	struct bdev_handle *s_journal_bdev_handle;
> +	struct file *s_journal_bdev_file;
>  #ifdef CONFIG_QUOTA
>  	/* Names of quota files with journalled quota */
>  	char __rcu *s_qf_names[EXT4_MAXQUOTAS];
> diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
> index 11e6f33677a2..df853c4d3a8c 100644
> --- a/fs/ext4/fsmap.c
> +++ b/fs/ext4/fsmap.c
> @@ -576,9 +576,9 @@ static bool ext4_getfsmap_is_valid_device(struct super_block *sb,
>  	if (fm->fmr_device == 0 || fm->fmr_device == UINT_MAX ||
>  	    fm->fmr_device == new_encode_dev(sb->s_bdev->bd_dev))
>  		return true;
> -	if (EXT4_SB(sb)->s_journal_bdev_handle &&
> +	if (EXT4_SB(sb)->s_journal_bdev_file &&
>  	    fm->fmr_device ==
> -	    new_encode_dev(EXT4_SB(sb)->s_journal_bdev_handle->bdev->bd_dev))
> +	    new_encode_dev(file_bdev(EXT4_SB(sb)->s_journal_bdev_file)->bd_dev))
>  		return true;
>  	return false;
>  }
> @@ -648,9 +648,9 @@ int ext4_getfsmap(struct super_block *sb, struct ext4_fsmap_head *head,
>  	memset(handlers, 0, sizeof(handlers));
>  	handlers[0].gfd_dev = new_encode_dev(sb->s_bdev->bd_dev);
>  	handlers[0].gfd_fn = ext4_getfsmap_datadev;
> -	if (EXT4_SB(sb)->s_journal_bdev_handle) {
> +	if (EXT4_SB(sb)->s_journal_bdev_file) {
>  		handlers[1].gfd_dev = new_encode_dev(
> -			EXT4_SB(sb)->s_journal_bdev_handle->bdev->bd_dev);
> +			file_bdev(EXT4_SB(sb)->s_journal_bdev_file)->bd_dev);
>  		handlers[1].gfd_fn = ext4_getfsmap_logdev;
>  	}
>  
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index dcba0f85dfe2..aa007710cfc3 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1359,14 +1359,14 @@ static void ext4_put_super(struct super_block *sb)
>  
>  	sync_blockdev(sb->s_bdev);
>  	invalidate_bdev(sb->s_bdev);
> -	if (sbi->s_journal_bdev_handle) {
> +	if (sbi->s_journal_bdev_file) {
>  		/*
>  		 * Invalidate the journal device's buffers.  We don't want them
>  		 * floating about in memory - the physical journal device may
>  		 * hotswapped, and it breaks the `ro-after' testing code.
>  		 */
> -		sync_blockdev(sbi->s_journal_bdev_handle->bdev);
> -		invalidate_bdev(sbi->s_journal_bdev_handle->bdev);
> +		sync_blockdev(file_bdev(sbi->s_journal_bdev_file));
> +		invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
>  	}
>  
>  	ext4_xattr_destroy_cache(sbi->s_ea_inode_cache);
> @@ -4233,7 +4233,7 @@ int ext4_calculate_overhead(struct super_block *sb)
>  	 * Add the internal journal blocks whether the journal has been
>  	 * loaded or not
>  	 */
> -	if (sbi->s_journal && !sbi->s_journal_bdev_handle)
> +	if (sbi->s_journal && !sbi->s_journal_bdev_file)
>  		overhead += EXT4_NUM_B2C(sbi, sbi->s_journal->j_total_len);
>  	else if (ext4_has_feature_journal(sb) && !sbi->s_journal && j_inum) {
>  		/* j_inum for internal journal is non-zero */
> @@ -5670,9 +5670,9 @@ failed_mount9: __maybe_unused
>  #endif
>  	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
>  	brelse(sbi->s_sbh);
> -	if (sbi->s_journal_bdev_handle) {
> -		invalidate_bdev(sbi->s_journal_bdev_handle->bdev);
> -		bdev_release(sbi->s_journal_bdev_handle);
> +	if (sbi->s_journal_bdev_file) {
> +		invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
> +		fput(sbi->s_journal_bdev_file);
>  	}
>  out_fail:
>  	invalidate_bdev(sb->s_bdev);
> @@ -5842,30 +5842,30 @@ static journal_t *ext4_open_inode_journal(struct super_block *sb,
>  	return journal;
>  }
>  
> -static struct bdev_handle *ext4_get_journal_blkdev(struct super_block *sb,
> +static struct file *ext4_get_journal_blkdev(struct super_block *sb,
>  					dev_t j_dev, ext4_fsblk_t *j_start,
>  					ext4_fsblk_t *j_len)
>  {
>  	struct buffer_head *bh;
>  	struct block_device *bdev;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int hblock, blocksize;
>  	ext4_fsblk_t sb_block;
>  	unsigned long offset;
>  	struct ext4_super_block *es;
>  	int errno;
>  
> -	bdev_handle = bdev_open_by_dev(j_dev,
> +	bdev_file = bdev_file_open_by_dev(j_dev,
>  		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
>  		sb, &fs_holder_ops);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		ext4_msg(sb, KERN_ERR,
>  			 "failed to open journal device unknown-block(%u,%u) %ld",
> -			 MAJOR(j_dev), MINOR(j_dev), PTR_ERR(bdev_handle));
> -		return bdev_handle;
> +			 MAJOR(j_dev), MINOR(j_dev), PTR_ERR(bdev_file));
> +		return bdev_file;
>  	}
>  
> -	bdev = bdev_handle->bdev;
> +	bdev = file_bdev(bdev_file);
>  	blocksize = sb->s_blocksize;
>  	hblock = bdev_logical_block_size(bdev);
>  	if (blocksize < hblock) {
> @@ -5912,12 +5912,12 @@ static struct bdev_handle *ext4_get_journal_blkdev(struct super_block *sb,
>  	*j_start = sb_block + 1;
>  	*j_len = ext4_blocks_count(es);
>  	brelse(bh);
> -	return bdev_handle;
> +	return bdev_file;
>  
>  out_bh:
>  	brelse(bh);
>  out_bdev:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	return ERR_PTR(errno);
>  }
>  
> @@ -5927,14 +5927,14 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
>  	journal_t *journal;
>  	ext4_fsblk_t j_start;
>  	ext4_fsblk_t j_len;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int errno = 0;
>  
> -	bdev_handle = ext4_get_journal_blkdev(sb, j_dev, &j_start, &j_len);
> -	if (IS_ERR(bdev_handle))
> -		return ERR_CAST(bdev_handle);
> +	bdev_file = ext4_get_journal_blkdev(sb, j_dev, &j_start, &j_len);
> +	if (IS_ERR(bdev_file))
> +		return ERR_CAST(bdev_file);
>  
> -	journal = jbd2_journal_init_dev(bdev_handle->bdev, sb->s_bdev, j_start,
> +	journal = jbd2_journal_init_dev(file_bdev(bdev_file), sb->s_bdev, j_start,
>  					j_len, sb->s_blocksize);
>  	if (IS_ERR(journal)) {
>  		ext4_msg(sb, KERN_ERR, "failed to create device journal");
> @@ -5949,14 +5949,14 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
>  		goto out_journal;
>  	}
>  	journal->j_private = sb;
> -	EXT4_SB(sb)->s_journal_bdev_handle = bdev_handle;
> +	EXT4_SB(sb)->s_journal_bdev_file = bdev_file;
>  	ext4_init_journal_params(sb, journal);
>  	return journal;
>  
>  out_journal:
>  	jbd2_journal_destroy(journal);
>  out_bdev:
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  	return ERR_PTR(errno);
>  }
>  
> @@ -7314,12 +7314,12 @@ static inline int ext3_feature_set_ok(struct super_block *sb)
>  static void ext4_kill_sb(struct super_block *sb)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> -	struct bdev_handle *handle = sbi ? sbi->s_journal_bdev_handle : NULL;
> +	struct file *bdev_file = sbi ? sbi->s_journal_bdev_file : NULL;
>  
>  	kill_block_super(sb);
>  
> -	if (handle)
> -		bdev_release(handle);
> +	if (bdev_file)
> +		fput(bdev_file);
>  }
>  
>  static struct file_system_type ext4_fs_type = {
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
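
The ordering in ext4_kill_sb() above (copy the journal file pointer out of
sbi, call kill_block_super(), then fput()) is worth spelling out. A plausible
reading is that the per-superblock info may no longer be valid once teardown
has run, so the pointer has to be saved first. The userspace sketch below
models that ordering only; struct sb_info, mock_mount(), mock_kill_super() and
mock_kill_sb() are hypothetical names, and the assumption that teardown frees
the info structure is mine rather than something stated in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented stand-ins, not the kernel's structures. */
struct file { int refcount; };
struct sb_info { struct file *journal_bdev_file; };
struct super_block { struct sb_info *fs_info; };

static int released_files;	/* how many files saw their final fput() */

static void fput(struct file *f)
{
	if (--f->refcount == 0) {
		released_files++;
		free(f);
	}
}

/* Build a mock superblock, optionally with an external journal file. */
static struct super_block *mock_mount(int with_journal)
{
	struct super_block *sb = calloc(1, sizeof(*sb));
	struct sb_info *sbi = calloc(1, sizeof(*sbi));

	if (!sb || !sbi)
		abort();
	if (with_journal) {
		sbi->journal_bdev_file = calloc(1, sizeof(struct file));
		if (!sbi->journal_bdev_file)
			abort();
		sbi->journal_bdev_file->refcount = 1;
	}
	sb->fs_info = sbi;
	return sb;
}

/* Stand-in for generic teardown: assumed to free the per-sb info. */
static void mock_kill_super(struct super_block *sb)
{
	free(sb->fs_info);
	sb->fs_info = NULL;
}

/* Same ordering as the converted ext4_kill_sb(). */
static void mock_kill_sb(struct super_block *sb)
{
	struct sb_info *sbi = sb->fs_info;
	/* Save the pointer while sbi is still valid. */
	struct file *bdev_file = sbi ? sbi->journal_bdev_file : NULL;

	mock_kill_super(sb);

	/* sbi must not be touched here; only the saved pointer is used. */
	if (bdev_file)
		fput(bdev_file);
}
```

Reading the journal file through sbi after mock_kill_super() would be a
use-after-free in this model, which is the mistake the saved local avoids.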


* Re: [PATCH v2 22/34] f2fs: port block device access to files
  2024-01-23 13:26 ` [PATCH v2 22/34] f2fs: port block device access to files Christian Brauner
@ 2024-02-01 10:19   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:39, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/f2fs/f2fs.h  |  2 +-
>  fs/f2fs/super.c | 12 ++++++------
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 65294e3b0bef..6fc172c99915 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1239,7 +1239,7 @@ struct f2fs_bio_info {
>  #define FDEV(i)				(sbi->devs[i])
>  #define RDEV(i)				(raw_super->devs[i])
>  struct f2fs_dev_info {
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct block_device *bdev;
>  	char path[MAX_PATH_LEN];
>  	unsigned int total_segments;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index ea94c148fee5..557ea5c6c926 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1605,7 +1605,7 @@ static void destroy_device_list(struct f2fs_sb_info *sbi)
>  
>  	for (i = 0; i < sbi->s_ndevs; i++) {
>  		if (i > 0)
> -			bdev_release(FDEV(i).bdev_handle);
> +			fput(FDEV(i).bdev_file);
>  #ifdef CONFIG_BLK_DEV_ZONED
>  		kvfree(FDEV(i).blkz_seq);
>  #endif
> @@ -4247,7 +4247,7 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
>  
>  	for (i = 0; i < max_devices; i++) {
>  		if (i == 0)
> -			FDEV(0).bdev_handle = sb_bdev_handle(sbi->sb);
> +			FDEV(0).bdev_file = sbi->sb->s_bdev_file;
>  		else if (!RDEV(i).path[0])
>  			break;
>  
> @@ -4267,14 +4267,14 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
>  				FDEV(i).end_blk = FDEV(i).start_blk +
>  					(FDEV(i).total_segments <<
>  					sbi->log_blocks_per_seg) - 1;
> -				FDEV(i).bdev_handle = bdev_open_by_path(
> +				FDEV(i).bdev_file = bdev_file_open_by_path(
>  					FDEV(i).path, mode, sbi->sb, NULL);
>  			}
>  		}
> -		if (IS_ERR(FDEV(i).bdev_handle))
> -			return PTR_ERR(FDEV(i).bdev_handle);
> +		if (IS_ERR(FDEV(i).bdev_file))
> +			return PTR_ERR(FDEV(i).bdev_file);
>  
> -		FDEV(i).bdev = FDEV(i).bdev_handle->bdev;
> +		FDEV(i).bdev = file_bdev(FDEV(i).bdev_file);
>  		/* to release errored devices */
>  		sbi->s_ndevs = i + 1;
>  
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
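
One detail of the f2fs hunks above is that device 0 reuses the superblock's
file (sbi->sb->s_bdev_file) while destroy_device_list() only calls fput() for
i > 0. Read together, that suggests device 0 is a borrowed reference and only
the additional devices are owned by the list. The userspace sketch below
illustrates that split; the static pool, mock_open(), and the simplified
struct sb_info are assumptions made up for the example, and error handling is
reduced to plain NULL checks instead of IS_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Invented stand-ins, not the kernel's structures. */
struct file { int refcount; };
struct dev_info { struct file *bdev_file; };
struct sb_info {
	struct file *sb_bdev_file;	/* opened and owned elsewhere */
	struct dev_info devs[MAX_DEVS];
	int ndevs;
};

/* Fixed pool so the refcounts stay inspectable after release. */
static struct file pool[MAX_DEVS];
static int pool_used;

/* Hypothetical open: hands out a pool entry holding one reference. */
static struct file *mock_open(void)
{
	if (pool_used >= MAX_DEVS)
		return NULL;
	pool[pool_used].refcount = 1;
	return &pool[pool_used++];
}

static void fput(struct file *f)
{
	f->refcount--;
}

/* Device 0 borrows the superblock's file; the rest are opened here. */
static int scan_devices(struct sb_info *sbi, int ndevs)
{
	if (ndevs < 1 || ndevs > MAX_DEVS)
		return -EINVAL;

	for (int i = 0; i < ndevs; i++) {
		if (i == 0)
			sbi->devs[0].bdev_file = sbi->sb_bdev_file;
		else
			sbi->devs[i].bdev_file = mock_open();
		if (!sbi->devs[i].bdev_file)
			return -ENODEV;
		/* Record progress so a later failure can still clean up. */
		sbi->ndevs = i + 1;
	}
	return 0;
}

/* Release only what this list opened itself, i.e. skip index 0. */
static void destroy_device_list(struct sb_info *sbi)
{
	for (int i = 0; i < sbi->ndevs; i++) {
		if (i > 0)
			fput(sbi->devs[i].bdev_file);
		sbi->devs[i].bdev_file = NULL;
	}
	sbi->ndevs = 0;
}
```

Dropping the reference for index 0 here as well would release a file the list
never took a reference on, which is why the i > 0 guard matters in the model.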


* Re: [PATCH v2 23/34] jfs: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 23/34] jfs: port block device access to file Christian Brauner
@ 2024-02-01 10:19   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:40, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jfs/jfs_logmgr.c | 26 +++++++++++++-------------
>  fs/jfs/jfs_logmgr.h |  2 +-
>  fs/jfs/jfs_mount.c  |  2 +-
>  3 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
> index 8691463956d1..73389c68e251 100644
> --- a/fs/jfs/jfs_logmgr.c
> +++ b/fs/jfs/jfs_logmgr.c
> @@ -1058,7 +1058,7 @@ void jfs_syncpt(struct jfs_log *log, int hard_sync)
>  int lmLogOpen(struct super_block *sb)
>  {
>  	int rc;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	struct jfs_log *log;
>  	struct jfs_sb_info *sbi = JFS_SBI(sb);
>  
> @@ -1070,7 +1070,7 @@ int lmLogOpen(struct super_block *sb)
>  
>  	mutex_lock(&jfs_log_mutex);
>  	list_for_each_entry(log, &jfs_external_logs, journal_list) {
> -		if (log->bdev_handle->bdev->bd_dev == sbi->logdev) {
> +		if (file_bdev(log->bdev_file)->bd_dev == sbi->logdev) {
>  			if (!uuid_equal(&log->uuid, &sbi->loguuid)) {
>  				jfs_warn("wrong uuid on JFS journal");
>  				mutex_unlock(&jfs_log_mutex);
> @@ -1100,14 +1100,14 @@ int lmLogOpen(struct super_block *sb)
>  	 * file systems to log may have n-to-1 relationship;
>  	 */
>  
> -	bdev_handle = bdev_open_by_dev(sbi->logdev,
> +	bdev_file = bdev_file_open_by_dev(sbi->logdev,
>  			BLK_OPEN_READ | BLK_OPEN_WRITE, log, NULL);
> -	if (IS_ERR(bdev_handle)) {
> -		rc = PTR_ERR(bdev_handle);
> +	if (IS_ERR(bdev_file)) {
> +		rc = PTR_ERR(bdev_file);
>  		goto free;
>  	}
>  
> -	log->bdev_handle = bdev_handle;
> +	log->bdev_file = bdev_file;
>  	uuid_copy(&log->uuid, &sbi->loguuid);
>  
>  	/*
> @@ -1141,7 +1141,7 @@ int lmLogOpen(struct super_block *sb)
>  	lbmLogShutdown(log);
>  
>        close:		/* close external log device */
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  
>        free:		/* free log descriptor */
>  	mutex_unlock(&jfs_log_mutex);
> @@ -1162,7 +1162,7 @@ static int open_inline_log(struct super_block *sb)
>  	init_waitqueue_head(&log->syncwait);
>  
>  	set_bit(log_INLINELOG, &log->flag);
> -	log->bdev_handle = sb_bdev_handle(sb);
> +	log->bdev_file = sb->s_bdev_file;
>  	log->base = addressPXD(&JFS_SBI(sb)->logpxd);
>  	log->size = lengthPXD(&JFS_SBI(sb)->logpxd) >>
>  	    (L2LOGPSIZE - sb->s_blocksize_bits);
> @@ -1436,7 +1436,7 @@ int lmLogClose(struct super_block *sb)
>  {
>  	struct jfs_sb_info *sbi = JFS_SBI(sb);
>  	struct jfs_log *log = sbi->log;
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	int rc = 0;
>  
>  	jfs_info("lmLogClose: log:0x%p", log);
> @@ -1482,10 +1482,10 @@ int lmLogClose(struct super_block *sb)
>  	 *	external log as separate logical volume
>  	 */
>  	list_del(&log->journal_list);
> -	bdev_handle = log->bdev_handle;
> +	bdev_file = log->bdev_file;
>  	rc = lmLogShutdown(log);
>  
> -	bdev_release(bdev_handle);
> +	fput(bdev_file);
>  
>  	kfree(log);
>  
> @@ -1972,7 +1972,7 @@ static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp)
>  
>  	bp->l_flag |= lbmREAD;
>  
> -	bio = bio_alloc(log->bdev_handle->bdev, 1, REQ_OP_READ, GFP_NOFS);
> +	bio = bio_alloc(file_bdev(log->bdev_file), 1, REQ_OP_READ, GFP_NOFS);
>  	bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9);
>  	__bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset);
>  	BUG_ON(bio->bi_iter.bi_size != LOGPSIZE);
> @@ -2115,7 +2115,7 @@ static void lbmStartIO(struct lbuf * bp)
>  	jfs_info("lbmStartIO");
>  
>  	if (!log->no_integrity)
> -		bdev = log->bdev_handle->bdev;
> +		bdev = file_bdev(log->bdev_file);
>  
>  	bio = bio_alloc(bdev, 1, REQ_OP_WRITE | REQ_SYNC,
>  			GFP_NOFS);
> diff --git a/fs/jfs/jfs_logmgr.h b/fs/jfs/jfs_logmgr.h
> index 84aa2d253907..8b8994e48cd0 100644
> --- a/fs/jfs/jfs_logmgr.h
> +++ b/fs/jfs/jfs_logmgr.h
> @@ -356,7 +356,7 @@ struct jfs_log {
>  				 *    before writing syncpt.
>  				 */
>  	struct list_head journal_list; /* Global list */
> -	struct bdev_handle *bdev_handle; /* 4: log lv pointer */
> +	struct file *bdev_file;	/* 4: log lv pointer */
>  	int serial;		/* 4: log mount serial number */
>  
>  	s64 base;		/* @8: log extent address (inline log ) */
> diff --git a/fs/jfs/jfs_mount.c b/fs/jfs/jfs_mount.c
> index 9b5c6a20b30c..98f9a432c336 100644
> --- a/fs/jfs/jfs_mount.c
> +++ b/fs/jfs/jfs_mount.c
> @@ -431,7 +431,7 @@ int updateSuper(struct super_block *sb, uint state)
>  	if (state == FM_MOUNT) {
>  		/* record log's dev_t and mount serial number */
>  		j_sb->s_logdev = cpu_to_le32(
> -			new_encode_dev(sbi->log->bdev_handle->bdev->bd_dev));
> +			new_encode_dev(file_bdev(sbi->log->bdev_file)->bd_dev));
>  		j_sb->s_logserial = cpu_to_le32(sbi->log->serial);
>  	} else if (state == FM_CLEAN) {
>  		/*
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 24/34] nfs: port block device access to files
  2024-01-23 13:26 ` [PATCH v2 24/34] nfs: port block device access to files Christian Brauner
@ 2024-02-01 10:22   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:41, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza
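
The conversions in this series all follow the same caller pattern: open the device, check the result with IS_ERR(), reach the block device through file_bdev(), and drop the reference with fput(). A minimal user-space sketch of that pattern follows. Everything prefixed with mock_ or demo_ is a hypothetical stand-in, the two structs are tiny models rather than the kernel's definitions, and the error-pointer helpers only imitate the kernel convention of encoding a negative errno in the top 4095 pointer values.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature types; the real structs are far larger. */
struct block_device { unsigned long long nr_bytes; };
struct file { struct block_device *bdev; int refcount; };

static struct block_device demo_bdev = { .nr_bytes = 4096 };
static struct file demo_file;

/* Mock opener: device number 0 fails with -ENXIO, anything else succeeds. */
static struct file *mock_bdev_file_open_by_dev(unsigned int dev)
{
	if (dev == 0)
		return ERR_PTR(-ENXIO);
	demo_file.bdev = &demo_bdev;
	demo_file.refcount = 1;
	return &demo_file;
}

/* Mock accessor and release, standing in for file_bdev() and fput(). */
static struct block_device *mock_file_bdev(struct file *f) { return f->bdev; }
static void mock_fput(struct file *f) { f->refcount--; }

/* Caller pattern after the conversion: open, check, use, release. */
static long mock_device_len(unsigned int dev)
{
	struct file *bdev_file = mock_bdev_file_open_by_dev(dev);
	long len;

	if (IS_ERR(bdev_file))
		return PTR_ERR(bdev_file);

	len = (long)mock_file_bdev(bdev_file)->nr_bytes;
	mock_fput(bdev_file);
	return len;
}
```

The caller only ever stores the file pointer; the block device is looked up at the point of use, which is what lets the separate handle type go away.
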
> ---
>  fs/nfs/blocklayout/blocklayout.h |  2 +-
>  fs/nfs/blocklayout/dev.c         | 68 ++++++++++++++++++++--------------------
>  2 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/fs/nfs/blocklayout/blocklayout.h b/fs/nfs/blocklayout/blocklayout.h
> index b4294a8aa2d4..f1eeb4914199 100644
> --- a/fs/nfs/blocklayout/blocklayout.h
> +++ b/fs/nfs/blocklayout/blocklayout.h
> @@ -108,7 +108,7 @@ struct pnfs_block_dev {
>  	struct pnfs_block_dev		*children;
>  	u64				chunk_size;
>  
> -	struct bdev_handle		*bdev_handle;
> +	struct file			*bdev_file;
>  	u64				disk_offset;
>  
>  	u64				pr_key;
> diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
> index c97ebc42ec0f..93ef7f864980 100644
> --- a/fs/nfs/blocklayout/dev.c
> +++ b/fs/nfs/blocklayout/dev.c
> @@ -25,17 +25,17 @@ bl_free_device(struct pnfs_block_dev *dev)
>  	} else {
>  		if (dev->pr_registered) {
>  			const struct pr_ops *ops =
> -				dev->bdev_handle->bdev->bd_disk->fops->pr_ops;
> +				file_bdev(dev->bdev_file)->bd_disk->fops->pr_ops;
>  			int error;
>  
> -			error = ops->pr_register(dev->bdev_handle->bdev,
> +			error = ops->pr_register(file_bdev(dev->bdev_file),
>  				dev->pr_key, 0, false);
>  			if (error)
>  				pr_err("failed to unregister PR key.\n");
>  		}
>  
> -		if (dev->bdev_handle)
> -			bdev_release(dev->bdev_handle);
> +		if (dev->bdev_file)
> +			fput(dev->bdev_file);
>  	}
>  }
>  
> @@ -169,7 +169,7 @@ static bool bl_map_simple(struct pnfs_block_dev *dev, u64 offset,
>  	map->start = dev->start;
>  	map->len = dev->len;
>  	map->disk_offset = dev->disk_offset;
> -	map->bdev = dev->bdev_handle->bdev;
> +	map->bdev = file_bdev(dev->bdev_file);
>  	return true;
>  }
>  
> @@ -236,26 +236,26 @@ bl_parse_simple(struct nfs_server *server, struct pnfs_block_dev *d,
>  		struct pnfs_block_volume *volumes, int idx, gfp_t gfp_mask)
>  {
>  	struct pnfs_block_volume *v = &volumes[idx];
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	dev_t dev;
>  
>  	dev = bl_resolve_deviceid(server, v, gfp_mask);
>  	if (!dev)
>  		return -EIO;
>  
> -	bdev_handle = bdev_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
> +	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
>  				       NULL, NULL);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		printk(KERN_WARNING "pNFS: failed to open device %d:%d (%ld)\n",
> -			MAJOR(dev), MINOR(dev), PTR_ERR(bdev_handle));
> -		return PTR_ERR(bdev_handle);
> +			MAJOR(dev), MINOR(dev), PTR_ERR(bdev_file));
> +		return PTR_ERR(bdev_file);
>  	}
> -	d->bdev_handle = bdev_handle;
> -	d->len = bdev_nr_bytes(bdev_handle->bdev);
> +	d->bdev_file = bdev_file;
> +	d->len = bdev_nr_bytes(file_bdev(bdev_file));
>  	d->map = bl_map_simple;
>  
>  	printk(KERN_INFO "pNFS: using block device %s\n",
> -		bdev_handle->bdev->bd_disk->disk_name);
> +		file_bdev(bdev_file)->bd_disk->disk_name);
>  	return 0;
>  }
>  
> @@ -300,10 +300,10 @@ bl_validate_designator(struct pnfs_block_volume *v)
>  	}
>  }
>  
> -static struct bdev_handle *
> +static struct file *
>  bl_open_path(struct pnfs_block_volume *v, const char *prefix)
>  {
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	const char *devname;
>  
>  	devname = kasprintf(GFP_KERNEL, "/dev/disk/by-id/%s%*phN",
> @@ -311,15 +311,15 @@ bl_open_path(struct pnfs_block_volume *v, const char *prefix)
>  	if (!devname)
>  		return ERR_PTR(-ENOMEM);
>  
> -	bdev_handle = bdev_open_by_path(devname, BLK_OPEN_READ | BLK_OPEN_WRITE,
> +	bdev_file = bdev_file_open_by_path(devname, BLK_OPEN_READ | BLK_OPEN_WRITE,
>  					NULL, NULL);
> -	if (IS_ERR(bdev_handle)) {
> +	if (IS_ERR(bdev_file)) {
>  		pr_warn("pNFS: failed to open device %s (%ld)\n",
> -			devname, PTR_ERR(bdev_handle));
> +			devname, PTR_ERR(bdev_file));
>  	}
>  
>  	kfree(devname);
> -	return bdev_handle;
> +	return bdev_file;
>  }
>  
>  static int
> @@ -327,7 +327,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
>  		struct pnfs_block_volume *volumes, int idx, gfp_t gfp_mask)
>  {
>  	struct pnfs_block_volume *v = &volumes[idx];
> -	struct bdev_handle *bdev_handle;
> +	struct file *bdev_file;
>  	const struct pr_ops *ops;
>  	int error;
>  
> @@ -340,14 +340,14 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
>  	 * On other distributions like Debian, the default SCSI by-id path will
>  	 * point to the dm-multipath device if one exists.
>  	 */
> -	bdev_handle = bl_open_path(v, "dm-uuid-mpath-0x");
> -	if (IS_ERR(bdev_handle))
> -		bdev_handle = bl_open_path(v, "wwn-0x");
> -	if (IS_ERR(bdev_handle))
> -		return PTR_ERR(bdev_handle);
> -	d->bdev_handle = bdev_handle;
> -
> -	d->len = bdev_nr_bytes(d->bdev_handle->bdev);
> +	bdev_file = bl_open_path(v, "dm-uuid-mpath-0x");
> +	if (IS_ERR(bdev_file))
> +		bdev_file = bl_open_path(v, "wwn-0x");
> +	if (IS_ERR(bdev_file))
> +		return PTR_ERR(bdev_file);
> +	d->bdev_file = bdev_file;
> +
> +	d->len = bdev_nr_bytes(file_bdev(d->bdev_file));
>  	d->map = bl_map_simple;
>  	d->pr_key = v->scsi.pr_key;
>  
> @@ -355,20 +355,20 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
>  		return -ENODEV;
>  
>  	pr_info("pNFS: using block device %s (reservation key 0x%llx)\n",
> -		d->bdev_handle->bdev->bd_disk->disk_name, d->pr_key);
> +		file_bdev(d->bdev_file)->bd_disk->disk_name, d->pr_key);
>  
> -	ops = d->bdev_handle->bdev->bd_disk->fops->pr_ops;
> +	ops = file_bdev(d->bdev_file)->bd_disk->fops->pr_ops;
>  	if (!ops) {
>  		pr_err("pNFS: block device %s does not support reservations.",
> -				d->bdev_handle->bdev->bd_disk->disk_name);
> +				file_bdev(d->bdev_file)->bd_disk->disk_name);
>  		error = -EINVAL;
>  		goto out_blkdev_put;
>  	}
>  
> -	error = ops->pr_register(d->bdev_handle->bdev, 0, d->pr_key, true);
> +	error = ops->pr_register(file_bdev(d->bdev_file), 0, d->pr_key, true);
>  	if (error) {
>  		pr_err("pNFS: failed to register key for block device %s.",
> -				d->bdev_handle->bdev->bd_disk->disk_name);
> +				file_bdev(d->bdev_file)->bd_disk->disk_name);
>  		goto out_blkdev_put;
>  	}
>  
> @@ -376,7 +376,7 @@ bl_parse_scsi(struct nfs_server *server, struct pnfs_block_dev *d,
>  	return 0;
>  
>  out_blkdev_put:
> -	bdev_release(d->bdev_handle);
> +	fput(d->bdev_file);
>  	return error;
>  }
>  
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 25/34] ocfs2: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 25/34] ocfs2: port block device access to file Christian Brauner
@ 2024-02-01 10:22   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:42, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ocfs2/cluster/heartbeat.c | 32 ++++++++++++++++----------------
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
> index 4d7efefa98c5..1bde1281d514 100644
> --- a/fs/ocfs2/cluster/heartbeat.c
> +++ b/fs/ocfs2/cluster/heartbeat.c
> @@ -213,7 +213,7 @@ struct o2hb_region {
>  	unsigned int		hr_num_pages;
>  
>  	struct page             **hr_slot_data;
> -	struct bdev_handle	*hr_bdev_handle;
> +	struct file		*hr_bdev_file;
>  	struct o2hb_disk_slot	*hr_slots;
>  
>  	/* live node map of this region */
> @@ -263,7 +263,7 @@ struct o2hb_region {
>  
>  static inline struct block_device *reg_bdev(struct o2hb_region *reg)
>  {
> -	return reg->hr_bdev_handle ? reg->hr_bdev_handle->bdev : NULL;
> +	return reg->hr_bdev_file ? file_bdev(reg->hr_bdev_file) : NULL;
>  }
>  
>  struct o2hb_bio_wait_ctxt {
> @@ -1509,8 +1509,8 @@ static void o2hb_region_release(struct config_item *item)
>  		kfree(reg->hr_slot_data);
>  	}
>  
> -	if (reg->hr_bdev_handle)
> -		bdev_release(reg->hr_bdev_handle);
> +	if (reg->hr_bdev_file)
> +		fput(reg->hr_bdev_file);
>  
>  	kfree(reg->hr_slots);
>  
> @@ -1569,7 +1569,7 @@ static ssize_t o2hb_region_block_bytes_store(struct config_item *item,
>  	unsigned long block_bytes;
>  	unsigned int block_bits;
>  
> -	if (reg->hr_bdev_handle)
> +	if (reg->hr_bdev_file)
>  		return -EINVAL;
>  
>  	status = o2hb_read_block_input(reg, page, &block_bytes,
> @@ -1598,7 +1598,7 @@ static ssize_t o2hb_region_start_block_store(struct config_item *item,
>  	char *p = (char *)page;
>  	ssize_t ret;
>  
> -	if (reg->hr_bdev_handle)
> +	if (reg->hr_bdev_file)
>  		return -EINVAL;
>  
>  	ret = kstrtoull(p, 0, &tmp);
> @@ -1623,7 +1623,7 @@ static ssize_t o2hb_region_blocks_store(struct config_item *item,
>  	unsigned long tmp;
>  	char *p = (char *)page;
>  
> -	if (reg->hr_bdev_handle)
> +	if (reg->hr_bdev_file)
>  		return -EINVAL;
>  
>  	tmp = simple_strtoul(p, &p, 0);
> @@ -1642,7 +1642,7 @@ static ssize_t o2hb_region_dev_show(struct config_item *item, char *page)
>  {
>  	unsigned int ret = 0;
>  
> -	if (to_o2hb_region(item)->hr_bdev_handle)
> +	if (to_o2hb_region(item)->hr_bdev_file)
>  		ret = sprintf(page, "%pg\n", reg_bdev(to_o2hb_region(item)));
>  
>  	return ret;
> @@ -1753,7 +1753,7 @@ static int o2hb_populate_slot_data(struct o2hb_region *reg)
>  }
>  
>  /*
> - * this is acting as commit; we set up all of hr_bdev_handle and hr_task or
> + * this is acting as commit; we set up all of hr_bdev_file and hr_task or
>   * nothing
>   */
>  static ssize_t o2hb_region_dev_store(struct config_item *item,
> @@ -1769,7 +1769,7 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
>  	ssize_t ret = -EINVAL;
>  	int live_threshold;
>  
> -	if (reg->hr_bdev_handle)
> +	if (reg->hr_bdev_file)
>  		goto out;
>  
>  	/* We can't heartbeat without having had our node number
> @@ -1795,11 +1795,11 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
>  	if (!S_ISBLK(f.file->f_mapping->host->i_mode))
>  		goto out2;
>  
> -	reg->hr_bdev_handle = bdev_open_by_dev(f.file->f_mapping->host->i_rdev,
> +	reg->hr_bdev_file = bdev_file_open_by_dev(f.file->f_mapping->host->i_rdev,
>  			BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
> -	if (IS_ERR(reg->hr_bdev_handle)) {
> -		ret = PTR_ERR(reg->hr_bdev_handle);
> -		reg->hr_bdev_handle = NULL;
> +	if (IS_ERR(reg->hr_bdev_file)) {
> +		ret = PTR_ERR(reg->hr_bdev_file);
> +		reg->hr_bdev_file = NULL;
>  		goto out2;
>  	}
>  
> @@ -1903,8 +1903,8 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
>  
>  out3:
>  	if (ret < 0) {
> -		bdev_release(reg->hr_bdev_handle);
> -		reg->hr_bdev_handle = NULL;
> +		fput(reg->hr_bdev_file);
> +		reg->hr_bdev_file = NULL;
>  	}
>  out2:
>  	fdput(f);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 26/34] reiserfs: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 26/34] reiserfs: " Christian Brauner
@ 2024-02-01 10:24   ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:24 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:43, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/reiserfs/journal.c  | 38 +++++++++++++++++++-------------------
>  fs/reiserfs/procfs.c   |  2 +-
>  fs/reiserfs/reiserfs.h |  8 ++++----
>  3 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
> index 171c912af50f..6474529c4253 100644
> --- a/fs/reiserfs/journal.c
> +++ b/fs/reiserfs/journal.c
> @@ -2386,7 +2386,7 @@ static int journal_read(struct super_block *sb)
>  
>  	cur_dblock = SB_ONDISK_JOURNAL_1st_BLOCK(sb);
>  	reiserfs_info(sb, "checking transaction log (%pg)\n",
> -		      journal->j_bdev_handle->bdev);
> +		      file_bdev(journal->j_bdev_file));
>  	start = ktime_get_seconds();
>  
>  	/*
> @@ -2447,7 +2447,7 @@ static int journal_read(struct super_block *sb)
>  		 * device and journal device to be the same
>  		 */
>  		d_bh =
> -		    reiserfs_breada(journal->j_bdev_handle->bdev, cur_dblock,
> +		    reiserfs_breada(file_bdev(journal->j_bdev_file), cur_dblock,
>  				    sb->s_blocksize,
>  				    SB_ONDISK_JOURNAL_1st_BLOCK(sb) +
>  				    SB_ONDISK_JOURNAL_SIZE(sb));
> @@ -2588,9 +2588,9 @@ static void journal_list_init(struct super_block *sb)
>  
>  static void release_journal_dev(struct reiserfs_journal *journal)
>  {
> -	if (journal->j_bdev_handle) {
> -		bdev_release(journal->j_bdev_handle);
> -		journal->j_bdev_handle = NULL;
> +	if (journal->j_bdev_file) {
> +		fput(journal->j_bdev_file);
> +		journal->j_bdev_file = NULL;
>  	}
>  }
>  
> @@ -2605,7 +2605,7 @@ static int journal_init_dev(struct super_block *super,
>  
>  	result = 0;
>  
> -	journal->j_bdev_handle = NULL;
> +	journal->j_bdev_file = NULL;
>  	jdev = SB_ONDISK_JOURNAL_DEVICE(super) ?
>  	    new_decode_dev(SB_ONDISK_JOURNAL_DEVICE(super)) : super->s_dev;
>  
> @@ -2616,37 +2616,37 @@ static int journal_init_dev(struct super_block *super,
>  	if ((!jdev_name || !jdev_name[0])) {
>  		if (jdev == super->s_dev)
>  			holder = NULL;
> -		journal->j_bdev_handle = bdev_open_by_dev(jdev, blkdev_mode,
> +		journal->j_bdev_file = bdev_file_open_by_dev(jdev, blkdev_mode,
>  							  holder, NULL);
> -		if (IS_ERR(journal->j_bdev_handle)) {
> -			result = PTR_ERR(journal->j_bdev_handle);
> -			journal->j_bdev_handle = NULL;
> +		if (IS_ERR(journal->j_bdev_file)) {
> +			result = PTR_ERR(journal->j_bdev_file);
> +			journal->j_bdev_file = NULL;
>  			reiserfs_warning(super, "sh-458",
>  					 "cannot init journal device unknown-block(%u,%u): %i",
>  					 MAJOR(jdev), MINOR(jdev), result);
>  			return result;
>  		} else if (jdev != super->s_dev)
> -			set_blocksize(journal->j_bdev_handle->bdev,
> +			set_blocksize(file_bdev(journal->j_bdev_file),
>  				      super->s_blocksize);
>  
>  		return 0;
>  	}
>  
> -	journal->j_bdev_handle = bdev_open_by_path(jdev_name, blkdev_mode,
> +	journal->j_bdev_file = bdev_file_open_by_path(jdev_name, blkdev_mode,
>  						   holder, NULL);
> -	if (IS_ERR(journal->j_bdev_handle)) {
> -		result = PTR_ERR(journal->j_bdev_handle);
> -		journal->j_bdev_handle = NULL;
> +	if (IS_ERR(journal->j_bdev_file)) {
> +		result = PTR_ERR(journal->j_bdev_file);
> +		journal->j_bdev_file = NULL;
>  		reiserfs_warning(super, "sh-457",
>  				 "journal_init_dev: Cannot open '%s': %i",
>  				 jdev_name, result);
>  		return result;
>  	}
>  
> -	set_blocksize(journal->j_bdev_handle->bdev, super->s_blocksize);
> +	set_blocksize(file_bdev(journal->j_bdev_file), super->s_blocksize);
>  	reiserfs_info(super,
>  		      "journal_init_dev: journal device: %pg\n",
> -		      journal->j_bdev_handle->bdev);
> +		      file_bdev(journal->j_bdev_file));
>  	return 0;
>  }
>  
> @@ -2804,7 +2804,7 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
>  				 "journal header magic %x (device %pg) does "
>  				 "not match to magic found in super block %x",
>  				 jh->jh_journal.jp_journal_magic,
> -				 journal->j_bdev_handle->bdev,
> +				 file_bdev(journal->j_bdev_file),
>  				 sb_jp_journal_magic(rs));
>  		brelse(bhjh);
>  		goto free_and_return;
> @@ -2828,7 +2828,7 @@ int journal_init(struct super_block *sb, const char *j_dev_name,
>  	reiserfs_info(sb, "journal params: device %pg, size %u, "
>  		      "journal first block %u, max trans len %u, max batch %u, "
>  		      "max commit age %u, max trans age %u\n",
> -		      journal->j_bdev_handle->bdev,
> +		      file_bdev(journal->j_bdev_file),
>  		      SB_ONDISK_JOURNAL_SIZE(sb),
>  		      SB_ONDISK_JOURNAL_1st_BLOCK(sb),
>  		      journal->j_trans_max,
> diff --git a/fs/reiserfs/procfs.c b/fs/reiserfs/procfs.c
> index 83cb9402e0f9..5c68a4a52d78 100644
> --- a/fs/reiserfs/procfs.c
> +++ b/fs/reiserfs/procfs.c
> @@ -354,7 +354,7 @@ static int show_journal(struct seq_file *m, void *unused)
>  		   "prepare: \t%12lu\n"
>  		   "prepare_retry: \t%12lu\n",
>  		   DJP(jp_journal_1st_block),
> -		   SB_JOURNAL(sb)->j_bdev_handle->bdev,
> +		   file_bdev(SB_JOURNAL(sb)->j_bdev_file),
>  		   DJP(jp_journal_dev),
>  		   DJP(jp_journal_size),
>  		   DJP(jp_journal_trans_max),
> diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
> index 725667880e62..0554903f42a9 100644
> --- a/fs/reiserfs/reiserfs.h
> +++ b/fs/reiserfs/reiserfs.h
> @@ -299,7 +299,7 @@ struct reiserfs_journal {
>  	/* oldest journal block.  start here for traverse */
>  	struct reiserfs_journal_cnode *j_first;
>  
> -	struct bdev_handle *j_bdev_handle;
> +	struct file *j_bdev_file;
>  
>  	/* first block on s_dev of reserved area journal */
>  	int j_1st_reserved_block;
> @@ -2810,10 +2810,10 @@ struct reiserfs_journal_header {
>  
>  /* We need these to make journal.c code more readable */
>  #define journal_find_get_block(s, block) __find_get_block(\
> -		SB_JOURNAL(s)->j_bdev_handle->bdev, block, s->s_blocksize)
> -#define journal_getblk(s, block) __getblk(SB_JOURNAL(s)->j_bdev_handle->bdev,\
> +		file_bdev(SB_JOURNAL(s)->j_bdev_file), block, s->s_blocksize)
> +#define journal_getblk(s, block) __getblk(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
>  		block, s->s_blocksize)
> -#define journal_bread(s, block) __bread(SB_JOURNAL(s)->j_bdev_handle->bdev,\
> +#define journal_bread(s, block) __bread(file_bdev(SB_JOURNAL(s)->j_bdev_file),\
>  		block, s->s_blocksize)
>  
>  enum reiserfs_bh_state_bits {
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 27/34] bdev: remove bdev_open_by_path()
  2024-01-23 13:26 ` [PATCH v2 27/34] bdev: remove bdev_open_by_path() Christian Brauner
  2024-01-29 16:17   ` Christoph Hellwig
@ 2024-02-01 10:24   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:24 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:44, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Sure. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  block/bdev.c           | 40 ----------------------------------------
>  include/linux/blkdev.h |  2 --
>  2 files changed, 42 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 4246a57a7c5a..eb5607af6ec5 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -1007,46 +1007,6 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
>  }
>  EXPORT_SYMBOL(bdev_file_open_by_path);
>  
> -/**
> - * bdev_open_by_path - open a block device by name
> - * @path: path to the block device to open
> - * @mode: open mode (BLK_OPEN_*)
> - * @holder: exclusive holder identifier
> - * @hops: holder operations
> - *
> - * Open the block device described by the device file at @path.  If @holder is
> - * not %NULL, the block device is opened with exclusive access.  Exclusive opens
> - * may nest for the same @holder.
> - *
> - * CONTEXT:
> - * Might sleep.
> - *
> - * RETURNS:
> - * Handle with a reference to the block_device on success, ERR_PTR(-errno) on
> - * failure.
> - */
> -struct bdev_handle *bdev_open_by_path(const char *path, blk_mode_t mode,
> -		void *holder, const struct blk_holder_ops *hops)
> -{
> -	struct bdev_handle *handle;
> -	dev_t dev;
> -	int error;
> -
> -	error = lookup_bdev(path, &dev);
> -	if (error)
> -		return ERR_PTR(error);
> -
> -	handle = bdev_open_by_dev(dev, mode, holder, hops);
> -	if (!IS_ERR(handle) && (mode & BLK_OPEN_WRITE) &&
> -	    bdev_read_only(handle->bdev)) {
> -		bdev_release(handle);
> -		return ERR_PTR(-EACCES);
> -	}
> -
> -	return handle;
> -}
> -EXPORT_SYMBOL(bdev_open_by_path);
> -
>  void bdev_release(struct bdev_handle *handle)
>  {
>  	struct block_device *bdev = handle->bdev;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 76706aa47316..5880d5abfebe 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1484,8 +1484,6 @@ struct bdev_handle {
>  
>  struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		const struct blk_holder_ops *hops);
> -struct bdev_handle *bdev_open_by_path(const char *path, blk_mode_t mode,
> -		void *holder, const struct blk_holder_ops *hops);
>  struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		const struct blk_holder_ops *hops);
>  struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 28/34] bdev: make bdev_release() private to block layer
  2024-01-23 13:26 ` [PATCH v2 28/34] bdev: make bdev_release() private to block layer Christian Brauner
  2024-01-29 16:19   ` Christoph Hellwig
@ 2024-02-01 10:26   ` Jan Kara
  2024-02-01 14:48     ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:26 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:45, Christian Brauner wrote:
> and move both of them to the private block header. There's no caller in
> the tree anymore that uses them directly.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

As Christoph noticed, the changelog needs a bit of work but otherwise feel
free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
> ---
>  block/bdev.c           | 2 --
>  block/blk.h            | 4 ++++
>  include/linux/blkdev.h | 3 ---
>  3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index eb5607af6ec5..1f64f213c5fa 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -916,7 +916,6 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  	kfree(handle);
>  	return ERR_PTR(ret);
>  }
> -EXPORT_SYMBOL(bdev_open_by_dev);
>  
>  static unsigned blk_to_file_flags(blk_mode_t mode)
>  {
> @@ -1045,7 +1044,6 @@ void bdev_release(struct bdev_handle *handle)
>  	blkdev_put_no_open(bdev);
>  	kfree(handle);
>  }
> -EXPORT_SYMBOL(bdev_release);
>  
>  /**
>   * lookup_bdev() - Look up a struct block_device by name.
> diff --git a/block/blk.h b/block/blk.h
> index 1ef920f72e0f..c9630774767d 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -516,4 +516,8 @@ static inline int req_ref_read(struct request *req)
>  	return atomic_read(&req->ref);
>  }
>  
> +void bdev_release(struct bdev_handle *handle);
> +struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
> +		const struct blk_holder_ops *hops);
> +
>  #endif /* BLK_INTERNAL_H */
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 5880d5abfebe..495f55587207 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1482,8 +1482,6 @@ struct bdev_handle {
>  	blk_mode_t mode;
>  };
>  
> -struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
> -		const struct blk_holder_ops *hops);
>  struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		const struct blk_holder_ops *hops);
>  struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
> @@ -1491,7 +1489,6 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
>  int bd_prepare_to_claim(struct block_device *bdev, void *holder,
>  		const struct blk_holder_ops *hops);
>  void bd_abort_claiming(struct block_device *bdev, void *holder);
> -void bdev_release(struct bdev_handle *handle);
>  
>  /* just for blk-cgroup, don't use elsewhere */
>  struct block_device *blkdev_get_no_open(dev_t dev);
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
  2024-01-29 16:22   ` Christoph Hellwig
@ 2024-02-01 10:54   ` Jan Kara
  2024-02-01 15:07     ` Christian Brauner
  2024-02-01 11:23   ` Jan Kara
  2 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:54 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:46, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Some comments below...

> @@ -795,15 +813,15 @@ static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
>  }
>  
>  /**
> - * bdev_open_by_dev - open a block device by device number
> - * @dev: device number of block device to open
> + * bdev_open - open a block device
> + * @bdev: block device to open
>   * @mode: open mode (BLK_OPEN_*)
>   * @holder: exclusive holder identifier
>   * @hops: holder operations
> + * @bdev_file: file for the block device
>   *
> - * Open the block device described by device number @dev. If @holder is not
> - * %NULL, the block device is opened with exclusive access.  Exclusive opens may
> - * nest for the same @holder.
> + * Open the block device. If @holder is not %NULL, the block device is opened
> + * with exclusive access.  Exclusive opens may nest for the same @holder.
>   *
>   * Use this interface ONLY if you really do not have anything better - i.e. when
>   * you are behind a truly sucky interface and all you are given is a device
      ^^^
I guess this part of the comment is stale now?

> @@ -902,7 +897,22 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  	handle->bdev = bdev;
>  	handle->holder = holder;
>  	handle->mode = mode;
> -	return handle;
> +
> +	/*
> +	 * Preserve backwards compatibility and allow large file access
> +	 * even if userspace doesn't ask for it explicitly. Some mkfs
> +	 * binary needs it. We might want to drop this workaround
> +	 * during an unstable branch.
> +	 */

Heh, I think the sentence "We might want to drop this workaround during an
unstable branch." does not need to be moved as well :)

> +	bdev_file->f_flags |= O_LARGEFILE;
> +	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
> +	if (bdev_nowait(bdev))
> +		bdev_file->f_mode |= FMODE_NOWAIT;
> +	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
> +	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
> +	bdev_file->private_data = handle;
> +
> +	return 0;
>  put_module:
>  	module_put(disk->fops->owner);
>  abort_claiming:
> @@ -910,11 +920,8 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		bd_abort_claiming(bdev, holder);
>  	mutex_unlock(&disk->open_mutex);
>  	disk_unblock_events(disk);
> -put_blkdev:
> -	blkdev_put_no_open(bdev);
> -free_handle:
>  	kfree(handle);
> -	return ERR_PTR(ret);
> +	return ret;
>  }
>  
>  static unsigned blk_to_file_flags(blk_mode_t mode)
> @@ -954,29 +961,35 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  				   const struct blk_holder_ops *hops)
>  {
>  	struct file *bdev_file;
> -	struct bdev_handle *handle;
> +	struct block_device *bdev;
>  	unsigned int flags;
> +	int ret;
>  
> -	handle = bdev_open_by_dev(dev, mode, holder, hops);
> -	if (IS_ERR(handle))
> -		return ERR_CAST(handle);
> +	ret = bdev_permission(dev, 0, holder);
				   ^^ Maybe I'm missing something but why
do you pass 0 mode here?


> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	bdev = blkdev_get_no_open(dev);
> +	if (!bdev)
> +		return ERR_PTR(-ENXIO);
>  
>  	flags = blk_to_file_flags(mode);
> -	bdev_file = alloc_file_pseudo_noaccount(handle->bdev->bd_inode,
> +	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
>  			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
>  	if (IS_ERR(bdev_file)) {
> -		bdev_release(handle);
> +		blkdev_put_no_open(bdev);
>  		return bdev_file;
>  	}
> -	ihold(handle->bdev->bd_inode);
> +	bdev_file->f_mode &= ~FMODE_OPENED;

Hum, why do you need these games with FMODE_OPENED? I suspect you want to
influence fput() behavior but then AFAICT we will leak dentry, mnt, etc. on
error? If this is indeed needed, it deserves a comment...
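
For reference, one possible reading of the FMODE_OPENED handling in this hunk, as a user-space toy model and not kernel code: the bit is cleared before the open attempt so that a failing open followed by fput() does not run the release hook for a device that was never opened, and it is set again once the open succeeds. The names, the flag value, and the assumption that the allocator returns a file already marked opened are all hypothetical. The model deliberately does not cover what happens to the dentry and mount references on the error path, which is the open question above.

```c
#include <stdbool.h>

#define MOCK_FMODE_OPENED 0x1u	/* hypothetical bit value */

struct mock_file {
	unsigned int f_mode;
	int release_calls;	/* how often the release hook ran */
	bool freed;		/* set once the final put has happened */
};

/* Stand-in for the final fput(): run release only if marked opened. */
static void mock_fput(struct mock_file *f)
{
	if (f->f_mode & MOCK_FMODE_OPENED)
		f->release_calls++;
	f->freed = true;
}

/* Stand-in for bdev_open(): 0 on success, negative value on failure. */
static int mock_bdev_open(bool should_fail)
{
	return should_fail ? -1 : 0;
}

static int mock_open_file(struct mock_file *f, bool should_fail)
{
	int ret;

	/* Assumption: the allocator hands back a file marked as opened. */
	f->f_mode = MOCK_FMODE_OPENED;
	f->release_calls = 0;
	f->freed = false;

	/* Hide the file from the release path until the open succeeded. */
	f->f_mode &= ~MOCK_FMODE_OPENED;
	ret = mock_bdev_open(should_fail);
	if (ret) {
		mock_fput(f);	/* no release call: nothing was opened */
		return ret;
	}

	/* Now the device really is open; later puts must release it. */
	f->f_mode |= MOCK_FMODE_OPENED;
	return 0;
}
```
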

> -	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
> -	if (bdev_nowait(handle->bdev))
> -		bdev_file->f_mode |= FMODE_NOWAIT;
> -
> -	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
> -	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
> -	bdev_file->private_data = handle;
> +	ihold(bdev->bd_inode);
> +	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
> +	if (ret) {
> +		fput(bdev_file);
> +		return ERR_PTR(ret);
> +	}
> +	/* Now that thing is opened. */
> +	bdev_file->f_mode |= FMODE_OPENED;
>  	return bdev_file;
>  }
>  EXPORT_SYMBOL(bdev_file_open_by_dev);

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle
  2024-01-23 13:26 ` [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle Christian Brauner
  2024-01-29 16:22   ` Christoph Hellwig
@ 2024-02-01 10:57   ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 10:57 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:47, Christian Brauner wrote:
> We can always go directly via:
> 
> * I_BDEV(bdev_file->f_inode)
> * I_BDEV(bdev_file->f_mapping->host)
> 
> So keeping struct bdev in struct bdev_handle is redundant.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  block/bdev.c | 26 ++++++++++++--------------
>  block/blk.h  |  3 +--
>  block/fops.c |  2 +-
>  3 files changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 34b9a16edb6e..71eaa1b5b7eb 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -51,8 +51,7 @@ EXPORT_SYMBOL(I_BDEV);
>  
>  struct block_device *file_bdev(struct file *bdev_file)
>  {
> -	struct bdev_handle *handle = bdev_file->private_data;
> -	return handle->bdev;
> +	return I_BDEV(bdev_file->f_mapping->host);
>  }
>  EXPORT_SYMBOL(file_bdev);
>  
> @@ -894,7 +893,6 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>  
>  	if (unblock_events)
>  		disk_unblock_events(disk);
> -	handle->bdev = bdev;
>  	handle->holder = holder;
>  	handle->mode = mode;
>  
> @@ -908,7 +906,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>  	bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
>  	if (bdev_nowait(bdev))
>  		bdev_file->f_mode |= FMODE_NOWAIT;
> -	bdev_file->f_mapping = handle->bdev->bd_inode->i_mapping;
> +	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
>  	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
>  	bdev_file->private_data = handle;
>  
> @@ -998,7 +996,7 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
>  				    void *holder,
>  				    const struct blk_holder_ops *hops)
>  {
> -	struct file *bdev_file;
> +	struct file *file;
>  	dev_t dev;
>  	int error;
>  
> @@ -1006,22 +1004,22 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
>  	if (error)
>  		return ERR_PTR(error);
>  
> -	bdev_file = bdev_file_open_by_dev(dev, mode, holder, hops);
> -	if (!IS_ERR(bdev_file) && (mode & BLK_OPEN_WRITE)) {
> -		struct bdev_handle *handle = bdev_file->private_data;
> -		if (bdev_read_only(handle->bdev)) {
> -			fput(bdev_file);
> -			bdev_file = ERR_PTR(-EACCES);
> +	file = bdev_file_open_by_dev(dev, mode, holder, hops);
> +	if (!IS_ERR(file) && (mode & BLK_OPEN_WRITE)) {
> +		if (bdev_read_only(file_bdev(file))) {
> +			fput(file);
> +			file = ERR_PTR(-EACCES);
>  		}
>  	}
>  
> -	return bdev_file;
> +	return file;
>  }
>  EXPORT_SYMBOL(bdev_file_open_by_path);
>  
> -void bdev_release(struct bdev_handle *handle)
> +void bdev_release(struct file *bdev_file)
>  {
> -	struct block_device *bdev = handle->bdev;
> +	struct block_device *bdev = file_bdev(bdev_file);
> +	struct bdev_handle *handle = bdev_file->private_data;
>  	struct gendisk *disk = bdev->bd_disk;
>  
>  	/*
> diff --git a/block/blk.h b/block/blk.h
> index 19b15870284f..7ca24814f3a0 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -26,7 +26,6 @@ struct blk_flush_queue {
>  };
>  
>  struct bdev_handle {
> -	struct block_device *bdev;
>  	void *holder;
>  	blk_mode_t mode;
>  };
> @@ -522,7 +521,7 @@ static inline int req_ref_read(struct request *req)
>  	return atomic_read(&req->ref);
>  }
>  
> -void bdev_release(struct bdev_handle *handle);
> +void bdev_release(struct file *bdev_file);
>  int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>  	      const struct blk_holder_ops *hops, struct file *bdev_file);
>  int bdev_permission(dev_t dev, blk_mode_t mode, void *holder);
> diff --git a/block/fops.c b/block/fops.c
> index 81ff8c0ce32f..5589bf9c3822 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -622,7 +622,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
>  
>  static int blkdev_release(struct inode *inode, struct file *filp)
>  {
> -	bdev_release(filp->private_data);
> +	bdev_release(filp);
>  	return 0;
>  }
>  
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-01-23 13:26 ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
  2024-01-29 16:49   ` Christoph Hellwig
@ 2024-02-01 11:08   ` Jan Kara
  2024-02-01 16:16     ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 11:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:48, Christian Brauner wrote:
> Make it possible to detect a block device that was opened with
> restricted write access solely based on the file operations it was
> opened with. This avoids wasting an FMODE_* flag.
> 
> def_blk_fops isn't needed to check whether something is a block device;
> checking the inode type is enough for that. And def_blk_fops_restricted
> can be kept private to the block layer.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

I don't think we need def_blk_fops_restricted. If we have BLK_OPEN_WRITE
file against a bdev with bdev_writes_blocked() == true, we are sure this is
the handle blocking other writes so we can unblock them in
bdev_yield_write_access()...
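
The suggestion above can be sketched as a small userspace model. This is
not kernel code: the toy_* names and the flag values are made up for
illustration, and the only thing borrowed from the patch is the idea that
bd_writers is a writer count with a special "writes blocked" state. The
sketch assumes the opener that blocks writes also passed BLK_OPEN_WRITE;
a restricting open without write access is not covered here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flag values for illustration; they are not the kernel's. */
#define TOY_BLK_OPEN_WRITE           (1u << 0)
#define TOY_BLK_OPEN_RESTRICT_WRITES (1u << 1)

struct toy_bdev {
	int bd_writers;	/* -1: writes blocked, otherwise number of writers */
};

static bool toy_writes_blocked(const struct toy_bdev *bdev)
{
	return bdev->bd_writers == -1;
}

/* Returns false if this open conflicts with the current openers. */
static bool toy_claim_write_access(struct toy_bdev *bdev, unsigned int mode)
{
	if ((mode & TOY_BLK_OPEN_WRITE) && toy_writes_blocked(bdev))
		return false;
	if ((mode & TOY_BLK_OPEN_RESTRICT_WRITES) && bdev->bd_writers > 0)
		return false;

	if (mode & TOY_BLK_OPEN_RESTRICT_WRITES)
		bdev->bd_writers = -1;
	else if (mode & TOY_BLK_OPEN_WRITE)
		bdev->bd_writers++;
	return true;
}

/* Yield using only the write bit and the blocked state, no extra flag. */
static void toy_yield_write_access(struct toy_bdev *bdev, unsigned int mode)
{
	if (!(mode & TOY_BLK_OPEN_WRITE))
		return;

	if (toy_writes_blocked(bdev)) {
		/* A writer exists while writes are blocked: it is the blocker. */
		bdev->bd_writers = 0;
	} else {
		assert(bdev->bd_writers > 0);
		bdev->bd_writers--;
	}
}
```

In this model a second writer can never get in while writes are blocked,
so a write-mode file being released in the blocked state has to be the
one that blocked them, which is the argument made above.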

								Honza

> ---
>  block/bdev.c | 16 ++++++++++++----
>  block/blk.h  |  2 ++
>  block/fops.c |  3 +++
>  3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index 71eaa1b5b7eb..9d96a43f198d 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -799,13 +799,16 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
>  		bdev->bd_writers++;
>  }
>  
> -static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
> +static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
>  {
> +	struct block_device *bdev;
> +
>  	if (bdev_allow_write_mounted)
>  		return;
>  
> +	bdev = file_bdev(bdev_file);
>  	/* Yield exclusive or shared write access. */
> -	if (mode & BLK_OPEN_RESTRICT_WRITES)
> +	if (bdev_file->f_op == &def_blk_fops_restricted)
>  		bdev_unblock_writes(bdev);
>  	else if (mode & BLK_OPEN_WRITE)
>  		bdev->bd_writers--;
> @@ -959,6 +962,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  				   const struct blk_holder_ops *hops)
>  {
>  	struct file *bdev_file;
> +	const struct file_operations *blk_fops;
>  	struct block_device *bdev;
>  	unsigned int flags;
>  	int ret;
> @@ -972,8 +976,12 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		return ERR_PTR(-ENXIO);
>  
>  	flags = blk_to_file_flags(mode);
> +	if (mode & BLK_OPEN_RESTRICT_WRITES)
> +		blk_fops = &def_blk_fops_restricted;
> +	else
> +		blk_fops = &def_blk_fops;
>  	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
> -			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
> +			blockdev_mnt, "", flags | O_LARGEFILE, blk_fops);
>  	if (IS_ERR(bdev_file)) {
>  		blkdev_put_no_open(bdev);
>  		return bdev_file;
> @@ -1033,7 +1041,7 @@ void bdev_release(struct file *bdev_file)
>  		sync_blockdev(bdev);
>  
>  	mutex_lock(&disk->open_mutex);
> -	bdev_yield_write_access(bdev, handle->mode);
> +	bdev_yield_write_access(bdev_file, handle->mode);
>  
>  	if (handle->holder)
>  		bd_end_claim(bdev, handle->holder);
> diff --git a/block/blk.h b/block/blk.h
> index 7ca24814f3a0..dfa958909c54 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -9,6 +9,8 @@
>  
>  struct elevator_type;
>  
> +extern const struct file_operations def_blk_fops_restricted;
> +
>  /* Max future timer expiry for timeouts */
>  #define BLK_MAX_TIMEOUT		(5 * HZ)
>  
> diff --git a/block/fops.c b/block/fops.c
> index 5589bf9c3822..f56bdfe459de 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -862,6 +862,9 @@ const struct file_operations def_blk_fops = {
>  	.fallocate	= blkdev_fallocate,
>  };
>  
> +/* Indicator that this block device is opened with restricted write access. */
> +const struct file_operations def_blk_fops_restricted = def_blk_fops;
> +
>  static __init int blkdev_init(void)
>  {
>  	return bioset_init(&blkdev_dio_pool, 4,
> 
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 32/34] block: remove bdev_handle completely
  2024-01-23 13:26 ` [PATCH v2 32/34] block: remove bdev_handle completely Christian Brauner
  2024-01-29 16:50   ` Christoph Hellwig
@ 2024-02-01 11:20   ` Jan Kara
  2024-02-01 16:18     ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 11:20 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:49, Christian Brauner wrote:
> We just need to use the holder to indicate whether a block device open
> was exclusive or not. We did use to do that before but had to give that
> up once we switched to struct bdev_handle. Before struct bdev_handle we
> only stashed stuff in file->private_data if this was an exclusive open
> but after struct bdev_handle we always set file->private_data to a
> struct bdev_handle and so we had to use bdev_handle->mode or
> bdev_handle->holder. Now that we don't use struct bdev_handle anymore we
> can revert back to the old behavior.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Two small comments below.

> diff --git a/block/fops.c b/block/fops.c
> index f56bdfe459de..a0bff2c0d88d 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -569,7 +569,6 @@ static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
>  blk_mode_t file_to_blk_mode(struct file *file)
>  {
>  	blk_mode_t mode = 0;
> -	struct bdev_handle *handle = file->private_data;
>  
>  	if (file->f_mode & FMODE_READ)
>  		mode |= BLK_OPEN_READ;
> @@ -579,8 +578,8 @@ blk_mode_t file_to_blk_mode(struct file *file)
>  	 * do_dentry_open() clears O_EXCL from f_flags, use handle->mode to
>  	 * determine whether the open was exclusive for already open files.
>  	 */
^^^ This comment needs update now...

> -	if (handle)
> -		mode |= handle->mode & BLK_OPEN_EXCL;
> +	if (file->private_data)
> +		mode |= BLK_OPEN_EXCL;
>  	else if (file->f_flags & O_EXCL)
>  		mode |= BLK_OPEN_EXCL;
>  	if (file->f_flags & O_NDELAY)
> @@ -601,12 +600,17 @@ static int blkdev_open(struct inode *inode, struct file *filp)
>  {
>  	struct block_device *bdev;
>  	blk_mode_t mode;
> -	void *holder;
>  	int ret;
>  
> +	/*
> +	 * Use the file private data to store the holder for exclusive opens.
> +	 * file_to_blk_mode relies on it being present to set BLK_OPEN_EXCL.
> +	 */
> +	if (filp->f_flags & O_EXCL)
> +		filp->private_data = filp;

Well, if we have O_EXCL in f_flags here, then file_to_blk_mode() on the
next line is going to do the right thing and set BLK_OPEN_EXCL even
without filp->private_data. So this shouldn't be needed AFAICT.
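
A minimal userspace sketch of just the exclusivity branch of the patched
file_to_blk_mode() illustrates the point. The toy_* names and constants
are hypothetical stand-ins, not the kernel's definitions, and the sketch
only covers how the mode is computed; it says nothing about which holder
is later passed to bdev_permission().

```c
#include <assert.h>
#include <stddef.h>

/* Toy constants for illustration only; the kernel's values differ. */
#define TOY_O_EXCL        (1u << 0)
#define TOY_BLK_OPEN_EXCL (1u << 1)

struct toy_file {
	unsigned int f_flags;
	void *private_data;	/* holder for exclusive opens, else NULL */
};

/* Mirrors only the O_EXCL handling of the patched helper. */
static unsigned int toy_file_to_blk_mode(const struct toy_file *file)
{
	unsigned int mode = 0;

	assert(file != NULL);
	if (file->private_data)
		mode |= TOY_BLK_OPEN_EXCL;
	else if (file->f_flags & TOY_O_EXCL)
		mode |= TOY_BLK_OPEN_EXCL;
	return mode;
}
```

With O_EXCL still present in f_flags and private_data unset, the second
branch already yields the exclusive mode. The private_data branch only
matters once O_EXCL has been cleared from f_flags on an already open file.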

>  	mode = file_to_blk_mode(filp);
> -	holder = mode & BLK_OPEN_EXCL ? filp : NULL;
> -	ret = bdev_permission(inode->i_rdev, mode, holder);
> +	ret = bdev_permission(inode->i_rdev, mode, filp->private_data);
>  	if (ret)
>  		return ret;

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
  2024-01-29 16:22   ` Christoph Hellwig
  2024-02-01 10:54   ` Jan Kara
@ 2024-02-01 11:23   ` Jan Kara
  2024-02-01 14:52     ` Christian Brauner
  2 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 11:23 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:46, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>

One more thing I've noticed:

> -struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
> -				     const struct blk_holder_ops *hops)
> +int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
> +	      const struct blk_holder_ops *hops, struct file *bdev_file)
>  {
>  	struct bdev_handle *handle = kmalloc(sizeof(struct bdev_handle),
>  					     GFP_KERNEL);
> -	struct block_device *bdev;
>  	bool unblock_events = true;
> -	struct gendisk *disk;
> +	struct gendisk *disk = bdev->bd_disk;
>  	int ret;
>  
> +	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);

You are leaking handle here. It gets fixed up later in the series but
still...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 34/34] ext4: rely on sb->f_bdev only
  2024-01-23 13:26 ` [PATCH v2 34/34] ext4: rely on sb->f_bdev only Christian Brauner
@ 2024-02-01 11:34   ` Jan Kara
  2024-02-01 13:40     ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 11:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 23-01-24 14:26:51, Christian Brauner wrote:
> (1) Instead of bdev->bd_inode->i_mapping we do f_bdev->f_mapping
> (2) Instead of bdev->bd_inode we could do f_bdev->f_inode
> 
> I mention this explicitly because (2) is dependent on how the block
> device is opened while (1) isn't. Consider:
> 
> mount -t tmpfs tmpfs /mnt
> mknod /mnt/foo b <major> <minor>
> open("/mnt/foo", O_RDWR);
> 
> then (2) doesn't work because f_bdev->f_inode is a tmpfs inode _not_ the
> actual bdev filesystem inode. But (1) is still the bd_inode->i_mapping
> as that's set up during bdev_open().
> 
> IOW, I'm explicitly _not_ going via f_bdev->f_inode but via
> f_bdev->f_mapping->host aka bdev_file_inode(f_bdev). Currently this
> isn't a problem because sb->s_bdev_file stashes a block device file
> opened via bdev_open_by_*() which is always a file on the bdev
> filesystem.
> 
> _If_ we ever wanted to allow userspace to pass a block device file
> descriptor during superblock creation. Say:
> 
> fsconfig(fs_fd, FSCONFIG_CMD_CREATE_EXCL, "source", bdev_fd);
> 
> then using f_bdev->f_inode would be very wrong. Another thing to keep in
> mind there would be that this would implicitly pin another filesystem.
> Say:
> 
> mount -t ext4 /dev/sda /mnt
> mknod /mnt/foo b <major> <minor>
> bdev_fd = open("/mnt/foo", O_RDWR);
> 
> fd_fs = fsopen("xfs")
> fsconfig(fd_fs, FSCONFIG_CMD_CREATE, "source", bdev_fd);
> fd_mnt = fsmount(fd_fs);
> move_mount(fd_mnt, "/mnt2");
> 
> umount /mnt # EBUSY
> 
> Because the xfs filesystem now pins the ext4 filesystem via the
> bdev_file we're keeping. In other words, this is probably a bad idea and
> if we allow userspace to do this then we should only use the provided fd
> to lookup the block device and open our own handle to it.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>

I suppose this is more or less a sample of how to get rid of sb->s_bdev /
bd_inode dereferences AFAICT? Because otherwise I'm not sure why it was
included in this series...
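
The difference between f_inode and f_mapping->host that the changelog
describes can be modelled in userspace roughly as follows. All toy_*
types and helpers are hypothetical and only mimic the pointer layout; the
one behaviour taken from the changelog is that opening a block device
node redirects the file's mapping to the bdev inode's mapping.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode;

struct toy_mapping {
	struct toy_inode *host;		/* inode owning this page cache */
};

struct toy_inode {
	const char *fs_name;		/* filesystem the inode lives on */
	struct toy_mapping *i_mapping;
	struct toy_mapping i_data;
};

struct toy_file {
	struct toy_inode *f_inode;
	struct toy_mapping *f_mapping;
};

static void toy_inode_init(struct toy_inode *inode, const char *fs_name)
{
	inode->fs_name = fs_name;
	inode->i_data.host = inode;
	inode->i_mapping = &inode->i_data;
}

/*
 * Model of opening a device node: f_inode stays the node's own inode,
 * while f_mapping is pointed at the bdev inode's mapping.
 */
static void toy_open_bdev_node(struct toy_file *file, struct toy_inode *node,
			       struct toy_inode *bd_inode)
{
	assert(bd_inode->i_mapping->host == bd_inode);
	file->f_inode = node;
	file->f_mapping = bd_inode->i_mapping;
}

/* Stand-in for the bdev_file_inode() accessor named in the changelog. */
static struct toy_inode *toy_bdev_file_inode(const struct toy_file *file)
{
	return file->f_mapping->host;
}
```

For a node created on tmpfs, f_inode in this model is the tmpfs inode
while f_mapping->host is the bdev inode, which is why going through the
mapping is the safer accessor.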

> @@ -5576,7 +5576,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>  	 * used to detect the metadata async write error.
>  	 */
>  	spin_lock_init(&sbi->s_bdev_wb_lock);
> -	errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
> +	errseq_check_and_advance(&sb->s_bdev_file->f_mapping->wb_err,
>  				 &sbi->s_bdev_wb_err);

So when we have struct file, it would be actually nicer to drop
EXT4_SB(sb)->s_bdev_wb_err completely and instead use
file_check_and_advance_wb_err(sb->s_bdev_file). But that's a separate
cleanup I suppose.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 34/34] ext4: rely on sb->f_bdev only
  2024-02-01 11:34   ` Jan Kara
@ 2024-02-01 13:40     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 13:40 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

On Thu, Feb 01, 2024 at 12:34:24PM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:51, Christian Brauner wrote:
> > (1) Instead of bdev->bd_inode->i_mapping we do f_bdev->f_mapping
> > (2) Instead of bdev->bd_inode we could do f_bdev->f_inode
> > 
> > I mention this explicitly because (2) is dependent on how the block
> > device is opened while (1) isn't. Consider:
> > 
> > mount -t tmpfs tmpfs /mnt
> > mknod /mnt/foo b <major> <minor>
> > open("/mnt/foo", O_RDWR);
> > 
> > then (2) doesn't work because f_bdev->f_inode is a tmpfs inode _not_ the
> > actual bdev filesystem inode. But (1) is still the bd_inode->i_mapping
> > as that's set up during bdev_open().
> > 
> > IOW, I'm explicitly _not_ going via f_bdev->f_inode but via
> > f_bdev->f_mapping->host aka bdev_file_inode(f_bdev). Currently this
> > isn't a problem because sb->s_bdev_file stashes a block device file
> > opened via bdev_open_by_*() which is always a file on the bdev
> > filesystem.
> > 
> > _If_ we ever wanted to allow userspace to pass a block device file
> > descriptor during superblock creation. Say:
> > 
> > fsconfig(fs_fd, FSCONFIG_CMD_CREATE_EXCL, "source", bdev_fd);
> > 
> > then using f_bdev->f_inode would be very wrong. Another thing to keep in
> > mind there would be that this would implicitly pin another filesystem.
> > Say:
> > 
> > mount -t ext4 /dev/sda /mnt
> > mknod /mnt/foo b <major> <minor>
> > bdev_fd = open("/mnt/foo", O_RDWR);
> > 
> > fd_fs = fsopen("xfs")
> > fsconfig(fd_fs, FSCONFIG_CMD_CREATE, "source", bdev_fd);
> > fd_mnt = fsmount(fd_fs);
> > move_mount(fd_mnt, "/mnt2");
> > 
> > umount /mnt # EBUSY
> > 
> > Because the xfs filesystem now pins the ext4 filesystem via the
> > bdev_file we're keeping. In other words, this is probably a bad idea and
> > if we allow userspace to do this then we should only use the provided fd
> > to lookup the block device and open our own handle to it.
> > 
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> I suppose this is more or less a sample of how to get rid of sb->s_bdev /
> bd_inode dereferences AFAICT? Because otherwise I'm not sure why it was

Yes, correct. That was just a way to show that you don't need that
anymore.

> included in this series...

Yes, that would likely go separately.

> 
> > @@ -5576,7 +5576,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> >  	 * used to detect the metadata async write error.
> >  	 */
> >  	spin_lock_init(&sbi->s_bdev_wb_lock);
> > -	errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
> > +	errseq_check_and_advance(&sb->s_bdev_file->f_mapping->wb_err,
> >  				 &sbi->s_bdev_wb_err);
> 
> So when we have struct file, it would be actually nicer to drop
> EXT4_SB(sb)->s_bdev_wb_err completely and instead use
> file_check_and_advance_wb_err(sb->s_bdev_file). But that's a separate
> cleanup I suppose.

Yep. I forgot about that helper.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 07/34] xfs: port block device access to files
  2024-01-29 16:17   ` Christoph Hellwig
@ 2024-02-01 14:33     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 14:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

> Please avoid the overly long line here.

Done.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 28/34] bdev: make bdev_release() private to block layer
  2024-02-01 10:26   ` Jan Kara
@ 2024-02-01 14:48     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 14:48 UTC (permalink / raw)
  To: Christoph Hellwig, Jan Kara
  Cc: Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 05:19:33PM +0100, Christoph Hellwig wrote:
> On Tue, Jan 23, 2024 at 02:26:45PM +0100, Christian Brauner wrote:
> > and move both of them to the private block header. There's no caller in
> > the tree anymore that uses them directly.
> 
> the subject only talks about a single helper, but then the commit
> message mentions "both".  Seems like the subject is missing a
> "bdev_open_by_dev and".

On Thu, Feb 01, 2024 at 11:26:16AM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:45, Christian Brauner wrote:
> > and move both of them to the private block header. There's no caller in
> > the tree anymore that uses them directly.
> > 
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> As Christoph noticed, the changelog needs a bit of work but otherwise feel

Fixed, thanks!

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-01-29 16:22   ` Christoph Hellwig
@ 2024-02-01 14:50     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 14:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 05:22:03PM +0100, Christoph Hellwig wrote:
> > +	ret = devcgroup_check_permission(
> > +		DEVCG_DEV_BLOCK, MAJOR(dev), MINOR(dev),
> > +		((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
> > +			((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));
> 
> Somewhat weird formatting here with DEVCG_DEV_BLOCK not on the
> same line as the opening brace and the extra indentation after
> the |.  I would have expected something like:
> 
> 	ret = devcgroup_check_permission(DEVCG_DEV_BLOCK,
> 		MAJOR(dev), MINOR(dev),
> 		((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
> 		((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));

Fixed. (Fwiw, this is due to clang-format.)
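
For readers following along, the access mask built in the quoted call can
be sketched in userspace like this. The TOY_* constants are placeholders
and not the kernel's BLK_OPEN_* or DEVCG_ACC_* values; only the shape of
the expression is taken from the quoted hunk.

```c
#include <assert.h>

/* Placeholder values for illustration; not the kernel's definitions. */
#define TOY_BLK_OPEN_READ  (1u << 0)
#define TOY_BLK_OPEN_WRITE (1u << 1)
#define TOY_ACC_READ       (1u << 1)
#define TOY_ACC_WRITE      (1u << 2)

/* Translate an open mode into the access mask handed to the cgroup check. */
static unsigned int toy_devcg_access(unsigned int mode)
{
	unsigned int access =
		((mode & TOY_BLK_OPEN_READ) ? TOY_ACC_READ : 0) |
		((mode & TOY_BLK_OPEN_WRITE) ? TOY_ACC_WRITE : 0);

	/* Only the read and write bits can ever be requested. */
	assert((access & ~(TOY_ACC_READ | TOY_ACC_WRITE)) == 0);
	return access;
}
```

One consequence visible in the sketch is that a mode of 0 produces an
empty mask, so a caller passing 0 would ask the device cgroup check for
no access rights at all.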

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-02-01 11:23   ` Jan Kara
@ 2024-02-01 14:52     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 14:52 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

On Thu, Feb 01, 2024 at 12:23:47PM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:46, Christian Brauner wrote:
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> One more thing I've noticed:
> 
> > -struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
> > -				     const struct blk_holder_ops *hops)
> > +int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
> > +	      const struct blk_holder_ops *hops, struct file *bdev_file)
> >  {
> >  	struct bdev_handle *handle = kmalloc(sizeof(struct bdev_handle),
> >  					     GFP_KERNEL);
> > -	struct block_device *bdev;
> >  	bool unblock_events = true;
> > -	struct gendisk *disk;
> > +	struct gendisk *disk = bdev->bd_disk;
> >  	int ret;
> >  
> > +	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);
> 
> You are leaking handle here. It gets fixed up later in the series but
> still...

Bah, I called it twice instead of removing it. Fixed.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-02-01 10:54   ` Jan Kara
@ 2024-02-01 15:07     ` Christian Brauner
  2024-02-01 17:42       ` Jan Kara
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 15:07 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

> >   * Use this interface ONLY if you really do not have anything better - i.e. when
> >   * you are behind a truly sucky interface and all you are given is a device
>       ^^^
> I guess this part of comment is stale now?

Indeed, I removed that.

> 
> > @@ -902,7 +897,22 @@ struct bdev_handle *bdev_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
> >  	handle->bdev = bdev;
> >  	handle->holder = holder;
> >  	handle->mode = mode;
> > -	return handle;
> > +
> > +	/*
> > +	 * Preserve backwards compatibility and allow large file access
> > +	 * even if userspace doesn't ask for it explicitly. Some mkfs
> > +	 * binary needs it. We might want to drop this workaround
> > +	 * during an unstable branch.
> > +	 */
> 
> Heh, I think the sentence "We might want to drop this workaround during an
> unstable branch." does not need to be moved as well :)

Dropped.

> > -	handle = bdev_open_by_dev(dev, mode, holder, hops);
> > -	if (IS_ERR(handle))
> > -		return ERR_CAST(handle);
> > +	ret = bdev_permission(dev, 0, holder);
> 				   ^^ Maybe I'm missing something but why
> do you pass 0 mode here?

Lack of caffeine? Fixed. Thanks for catching that.

> 
> 
> > +	if (ret)
> > +		return ERR_PTR(ret);
> > +
> > +	bdev = blkdev_get_no_open(dev);
> > +	if (!bdev)
> > +		return ERR_PTR(-ENXIO);
> >  
> >  	flags = blk_to_file_flags(mode);
> > -	bdev_file = alloc_file_pseudo_noaccount(handle->bdev->bd_inode,
> > +	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
> >  			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
> >  	if (IS_ERR(bdev_file)) {
> > -		bdev_release(handle);
> > +		blkdev_put_no_open(bdev);
> >  		return bdev_file;
> >  	}
> > -	ihold(handle->bdev->bd_inode);
> > +	bdev_file->f_mode &= ~FMODE_OPENED;
> 
> Hum, why do you need these games with FMODE_OPENED? I suspect you want to
> influence fput() behavior but then AFAICT we will leak dentry, mnt, etc. on
> error? If this is indeed needed, it deserves a comment...

I rewrote this.

Total diff I applied is:

diff --git a/block/bdev.c b/block/bdev.c
index 0e8984884236..ba9dfa4648ca 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -893,12 +893,6 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
        handle->holder = holder;
        handle->mode = mode;

-       /*
-        * Preserve backwards compatibility and allow large file access
-        * even if userspace doesn't ask for it explicitly. Some mkfs
-        * binary needs it. We might want to drop this workaround
-        * during an unstable branch.
-        */
        bdev_file->f_flags |= O_LARGEFILE;
        bdev_file->f_mode |= FMODE_BUF_RASYNC | FMODE_CAN_ODIRECT;
        if (bdev_nowait(bdev))
@@ -960,7 +954,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
        unsigned int flags;
        int ret;

-       ret = bdev_permission(dev, 0, holder);
+       ret = bdev_permission(dev, mode, holder);
        if (ret)
                return ERR_PTR(ret);

@@ -975,16 +969,14 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
                blkdev_put_no_open(bdev);
                return bdev_file;
        }
-       bdev_file->f_mode &= ~FMODE_OPENED;
-
        ihold(bdev->bd_inode);
+
        ret = bdev_open(bdev, mode, holder, hops, bdev_file);
        if (ret) {
+               blkdev_put_no_open(bdev);
                fput(bdev_file);
                return ERR_PTR(ret);
        }
-       /* Now that thing is opened. */
-       bdev_file->f_mode |= FMODE_OPENED;
        return bdev_file;
 }
 EXPORT_SYMBOL(bdev_file_open_by_dev);
diff --git a/block/fops.c b/block/fops.c
index 81ff8c0ce32f..a1ba1a50ae77 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -622,7 +622,8 @@ static int blkdev_open(struct inode *inode, struct file *filp)

 static int blkdev_release(struct inode *inode, struct file *filp)
 {
-       bdev_release(filp->private_data);
+       if (filp->private_data)
+               bdev_release(filp->private_data);
        return 0;
 }

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-02-01 11:08   ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Jan Kara
@ 2024-02-01 16:16     ` Christian Brauner
  2024-02-01 17:36       ` Jan Kara
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 16:16 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

On Thu, Feb 01, 2024 at 12:08:58PM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:48, Christian Brauner wrote:
> > Make it possible to detect a block device that was opened with
> > restricted write access solely based on the file operations it was
> > opened with. This avoids wasting an FMODE_* flag.
> > 
> > def_blk_fops isn't needed to check whether something is a block device;
> > checking the inode type is enough for that. And def_blk_fops_restricted
> > can be kept private to the block layer.
> > 
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> I don't think we need def_blk_fops_restricted. If we have BLK_OPEN_WRITE
> file against a bdev with bdev_writes_blocked() == true, we are sure this is
> the handle blocking other writes so we can unblock them in
> bdev_yield_write_access()...

Excellent:

commit e2dd15e4c32ad66d938d35e1acd26375a7f355fb
Author:     Christian Brauner <brauner@kernel.org>
AuthorDate: Tue Jan 23 14:26:48 2024 +0100
Commit:     Christian Brauner <brauner@kernel.org>
CommitDate: Thu Feb 1 17:13:16 2024 +0100

    block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access

    Make it possible to detect a block device that was opened with
    restricted write access based only on BLK_OPEN_WRITE and
    bdev->bd_writers < 0 so we won't have to claim another FMODE_* flag.

    Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-31-adbd023e19cc@kernel.org
    base-commit: 0bd1bf95a554f5f877724c27dbe33d4db0af4d0c
    change-id: 20240129-vfs-bdev-file-bd_inode-385a56c57a51
    Signed-off-by: Christian Brauner <brauner@kernel.org>

diff --git a/block/bdev.c b/block/bdev.c
index 9be8c3c683ae..0edeb073e4d8 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -799,13 +799,22 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
                bdev->bd_writers++;
 }

-static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
+static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
 {
+       struct block_device *bdev;
+
        if (bdev_allow_write_mounted)
                return;

+       bdev = file_bdev(bdev_file);
        /* Yield exclusive or shared write access. */
-       if (mode & BLK_OPEN_RESTRICT_WRITES)
+       if (mode & BLK_OPEN_WRITE) {
+               if (bdev_writes_blocked(bdev))
+                       bdev_unblock_writes(bdev);
+               else
+                       bdev->bd_writers--;
+       }
+       if (bdev_file->f_op == &def_blk_fops_restricted)
                bdev_unblock_writes(bdev);
        else if (mode & BLK_OPEN_WRITE)
                bdev->bd_writers--;
@@ -1020,7 +1029,7 @@ void bdev_release(struct file *bdev_file)
                sync_blockdev(bdev);

        mutex_lock(&disk->open_mutex);
-       bdev_yield_write_access(bdev, handle->mode);
+       bdev_yield_write_access(bdev_file, handle->mode);

        if (handle->holder)
                bd_end_claim(bdev, handle->holder);

^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 32/34] block: remove bdev_handle completely
  2024-02-01 11:20   ` Jan Kara
@ 2024-02-01 16:18     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 16:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

On Thu, Feb 01, 2024 at 12:20:08PM +0100, Jan Kara wrote:
> On Tue 23-01-24 14:26:49, Christian Brauner wrote:
> > We just need to use the holder to indicate whether a block device open
> > was exclusive or not. We did use to do that before but had to give that
> > up once we switched to struct bdev_handle. Before struct bdev_handle we
> > only stashed stuff in file->private_data if this was an exclusive open
> > but after struct bdev_handle we always set file->private_data to a
> > struct bdev_handle and so we had to use bdev_handle->mode or
> > bdev_handle->holder. Now that we don't use struct bdev_handle anymore we
> > can revert back to the old behavior.
> > 
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> 
> Two small comments below.
> 
> > diff --git a/block/fops.c b/block/fops.c
> > index f56bdfe459de..a0bff2c0d88d 100644
> > --- a/block/fops.c
> > +++ b/block/fops.c
> > @@ -569,7 +569,6 @@ static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
> >  blk_mode_t file_to_blk_mode(struct file *file)
> >  {
> >  	blk_mode_t mode = 0;
> > -	struct bdev_handle *handle = file->private_data;
> >  
> >  	if (file->f_mode & FMODE_READ)
> >  		mode |= BLK_OPEN_READ;
> > @@ -579,8 +578,8 @@ blk_mode_t file_to_blk_mode(struct file *file)
> >  	 * do_dentry_open() clears O_EXCL from f_flags, use handle->mode to
> >  	 * determine whether the open was exclusive for already open files.
> >  	 */
> ^^^ This comment needs update now...
> 
> > -	if (handle)
> > -		mode |= handle->mode & BLK_OPEN_EXCL;
> > +	if (file->private_data)
> > +		mode |= BLK_OPEN_EXCL;
> >  	else if (file->f_flags & O_EXCL)
> >  		mode |= BLK_OPEN_EXCL;
> >  	if (file->f_flags & O_NDELAY)
> > @@ -601,12 +600,17 @@ static int blkdev_open(struct inode *inode, struct file *filp)
> >  {
> >  	struct block_device *bdev;
> >  	blk_mode_t mode;
> > -	void *holder;
> >  	int ret;
> >  
> > +	/*
> > +	 * Use the file private data to store the holder for exclusive opens.
> > +	 * file_to_blk_mode relies on it being present to set BLK_OPEN_EXCL.
> > +	 */
> > +	if (filp->f_flags & O_EXCL)
> > +		filp->private_data = filp;
> 
> Well, if we have O_EXCL in f_flags here, then file_to_blk_mode() on the
> next line is going to do the right thing and set BLK_OPEN_EXCL even
> without filp->private_data. So this shouldn't be needed AFAICT.

Fixed.
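
As an aside for readers of the archive, the exclusive-open detection
discussed above can be modeled in user space. This is only a sketch
under stated assumptions: struct model_file, the MODEL_* constants and
model_excl_mode() are made-up stand-ins, not the kernel's definitions,
and the flag values are arbitrary.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flags; the values are arbitrary. */
#define MODEL_O_EXCL        0x1u
#define MODEL_BLK_OPEN_EXCL 0x4u

/* Hypothetical, stripped-down stand-in for struct file. */
struct model_file {
	unsigned int f_flags;
	void *private_data;	/* holder; only set for exclusive opens */
};

/* Model of the exclusive-open part of file_to_blk_mode(). */
static unsigned int model_excl_mode(const struct model_file *file)
{
	unsigned int mode = 0;

	/* An already open file: the stashed holder marks it exclusive. */
	if (file->private_data)
		mode |= MODEL_BLK_OPEN_EXCL;
	/* A file that is being opened: O_EXCL is still in f_flags. */
	else if (file->f_flags & MODEL_O_EXCL)
		mode |= MODEL_BLK_OPEN_EXCL;

	return mode;
}
```

In this model the second branch already yields the exclusive bit while
O_EXCL is still present in f_flags, which matches Jan's remark that
pre-setting private_data in blkdev_open() is not needed.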


* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-01-29 16:02   ` Christoph Hellwig
@ 2024-02-01 17:08     ` Christian Brauner
  2024-02-02  6:43       ` Christoph Hellwig
  2024-02-09 11:39       ` Christian Brauner
  0 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-01 17:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Mon, Jan 29, 2024 at 05:02:41PM +0100, Christoph Hellwig wrote:
> > +static unsigned blk_to_file_flags(blk_mode_t mode)
> > +{
> > +	unsigned int flags = 0;
> > +
> 
> ...
> 
> > +	/*
> > +	 * O_EXCL is one of those flags that the VFS clears once it's done with
> > +	 * the operation. So don't raise it here either.
> > +	 */
> > +	if (mode & BLK_OPEN_NDELAY)
> > +		flags |= O_NDELAY;
> 
> O_EXCL isn't dealt with in this helper at all.

Yeah, on purpose was my point bc we can just rely on @holder and passing
_EXCL without holder is invalid. But I could add it.

> 
> > +	/*
> > +	 * If BLK_OPEN_WRITE_IOCTL is set then this is a historical quirk
> > +	 * associated with the floppy driver where it has allowed ioctls if the
> > +	 * file was opened for writing, but does not allow reads or writes.
> > +	 * Make sure that this quirk is reflected in @f_flags.
> > +	 */
> > +	if (mode & BLK_OPEN_WRITE_IOCTL)
> > +		flags |= O_RDWR | O_WRONLY;
> 
> .. and BLK_OPEN_WRITE_IOCTL will never be passed to it.  It only comes
> from opening block device nodes.
> 
> That being said, passing BLK_OPEN_* to bdev_file_open_by_* actually
> feels wrong.  They deal with files and should just take normal
> O_* flags instead of translating from BLK_OPEN_* to O_* back to
> BLK_OPEN_* for the driver (where they make sense as the driver
> flags are pretty different from what is passed to open).
> 
> Now of course changing that would make a mess of the whole series,
> so maybe that can go into a new patch at the end?

Yes, I had considered that and it would work but there's the issue that
we need to figure out how to handle BLK_OPEN_RESTRICT_WRITES. It has no
corresponding O_* flag that would let us indicate this. So I had
considered:

1/ Expose bdev_file_open_excl() so callers don't need to pass any
   specific flags. Nearly all filesystems would effectively use this
   helper as sb_open_mode() adds it implicitly. That would have the
   side-effect of introducing another open helper ofc; possibly two if
   we take _by_dev() and _by_path() into account.

2/ Abuse an O_* flag to mean BLK_OPEN_RESTRICT_WRITES. For example,
   O_TRUNC or O_NOCTTY which is pretty yucky.

3/ Introduce an internal O_* flag which is also ugly. Vomitorious and my
   co-maintainers would likely chop off my hands so I can't go near a
   computer again.

3/ Make O_EXCL when passed together with bdev_file_open_by_*() always
   imply BLK_OPEN_RESTRICT_WRITES.

The 3/ option would probably be the cleanest one and I think that all
filesystems now pass at least a holder and holder ops so this _should_
work.

Thoughts?
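
To make the last option in the list above concrete (O_EXCL always
implying BLK_OPEN_RESTRICT_WRITES), here is a rough user-space sketch.
The MODEL_BLK_OPEN_* constants and model_flags_to_blk_mode() are
hypothetical names with arbitrary values; only the O_* flags from
<fcntl.h> are real. It illustrates the idea and is not what the series
implements.

```c
#include <fcntl.h>

/* Hypothetical stand-ins for the BLK_OPEN_* bits; values are arbitrary. */
#define MODEL_BLK_OPEN_READ            (1u << 0)
#define MODEL_BLK_OPEN_WRITE           (1u << 1)
#define MODEL_BLK_OPEN_EXCL            (1u << 2)
#define MODEL_BLK_OPEN_RESTRICT_WRITES (1u << 3)

/* Translate open(2)-style flags into a model block open mode. */
static unsigned int model_flags_to_blk_mode(int flags)
{
	unsigned int mode = 0;

	switch (flags & O_ACCMODE) {
	case O_RDONLY:
		mode |= MODEL_BLK_OPEN_READ;
		break;
	case O_WRONLY:
		mode |= MODEL_BLK_OPEN_WRITE;
		break;
	case O_RDWR:
		mode |= MODEL_BLK_OPEN_READ | MODEL_BLK_OPEN_WRITE;
		break;
	}

	/* The proposal: an exclusive open also restricts other writers. */
	if (flags & O_EXCL)
		mode |= MODEL_BLK_OPEN_EXCL | MODEL_BLK_OPEN_RESTRICT_WRITES;

	return mode;
}
```

The drawback is visible in the sketch: a caller can no longer ask for
an exclusive open without also restricting writes, so a non-exclusive
or unrestricted variant would need a separate way to express itself.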

> 
> > + * @noaccount: whether this is an internal open that shouldn't be counted
> >   */
> >  static struct file *alloc_file(const struct path *path, int flags,
> > -		const struct file_operations *fop)
> > +		const struct file_operations *fop, bool noaccount)
> 
> Just a suggestion as you are the maintainer here, but I always find
> it hard to follow when infrastructure in subsystem A is changed in
> a patch primarily changing subsystem B.  Can the file_table.c
> changes go into a separate patch or patches with commit logs
> documenting their semantics?
> 
> And while we're at the semantics, I find this area already a bit of a
> mess and this doesn't make it any better..
> 
> How about the following:
> 
>  - alloc_file loses the actual file allocation and gets a new name
>    (unfortunately init_file is already taken), callers call
>    alloc_empty_file_noaccount or alloc_empty_file plus the
>    new helper.
>  - similarly __alloc_file_pseudo is split into a helper creating
>    a path for mnt and inode, and callers call that plus the
>    file allocation
> 
> ?

Ok, let me see how far I get.

> 
> > +extern struct file *alloc_file_pseudo_noaccount(struct inode *, struct vfsmount *,
> 
> no need for the extern here.
> 
> > +	struct block_device	*s_bdev;	/* can go away once we use an accessor for @s_bdev_file */
> 
> can you put the comment into a separate line to make it readable.
> 
> But I'm not even sure it should go away.  s_bdev is used all over the
> data and metadata I/O path, so caching it and avoiding multiple levels
> of pointer chasing would seem useful.

Fair.


* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-02-01 16:16     ` Christian Brauner
@ 2024-02-01 17:36       ` Jan Kara
  2024-02-02 11:45         ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-01 17:36 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Thu 01-02-24 17:16:02, Christian Brauner wrote:
> On Thu, Feb 01, 2024 at 12:08:58PM +0100, Jan Kara wrote:
> > On Tue 23-01-24 14:26:48, Christian Brauner wrote:
> > > Make it possible to detect a block device that was opened with
> > > restricted write access solely based on the file operations it was
> > > opened with. This avoids wasting an FMODE_* flag.
> > > 
> > > def_blk_fops isn't needed to check whether something is a block device;
> > > checking the inode type is enough for that. And def_blk_fops_restricted
> > > can be kept private to the block layer.
> > > 
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > 
> > I don't think we need def_blk_fops_restricted. If we have BLK_OPEN_WRITE
> > file against a bdev with bdev_writes_blocked() == true, we are sure this is
> > the handle blocking other writes so we can unblock them in
> > bdev_yield_write_access()...

...

> -       if (mode & BLK_OPEN_RESTRICT_WRITES)
> +       if (mode & BLK_OPEN_WRITE) {
> +               if (bdev_writes_blocked(bdev))
> +                       bdev_unblock_writes(bdev);
> +               else
> +                       bdev->bd_writers--;
> +       }
> +       if (bdev_file->f_op == &def_blk_fops_restricted)

Uh, why are you leaving def_blk_fops_restricted check here? I'd expect you
can delete def_blk_fops_restricted completely...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 29/34] bdev: make struct bdev_handle private to the block layer
  2024-02-01 15:07     ` Christian Brauner
@ 2024-02-01 17:42       ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-01 17:42 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Thu 01-02-24 16:07:59, Christian Brauner wrote:
> > > +	if (ret)
> > > +		return ERR_PTR(ret);
> > > +
> > > +	bdev = blkdev_get_no_open(dev);
> > > +	if (!bdev)
> > > +		return ERR_PTR(-ENXIO);
> > >  
> > >  	flags = blk_to_file_flags(mode);
> > > -	bdev_file = alloc_file_pseudo_noaccount(handle->bdev->bd_inode,
> > > +	bdev_file = alloc_file_pseudo_noaccount(bdev->bd_inode,
> > >  			blockdev_mnt, "", flags | O_LARGEFILE, &def_blk_fops);
> > >  	if (IS_ERR(bdev_file)) {
> > > -		bdev_release(handle);
> > > +		blkdev_put_no_open(bdev);
> > >  		return bdev_file;
> > >  	}
> > > -	ihold(handle->bdev->bd_inode);
> > > +	bdev_file->f_mode &= ~FMODE_OPENED;
> > 
> > Hum, why do you need these games with FMODE_OPENED? I suspect you want to
> > influence fput() behavior but then AFAICT we will leak dentry, mnt, etc. on
> > error? If this is indeed needed, it deserves a comment...
> 
> I rewrote this.
> 
> Total diff I applied is:

Looks good now.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-02-01 17:08     ` Christian Brauner
@ 2024-02-02  6:43       ` Christoph Hellwig
  2024-02-02 11:46         ` Christian Brauner
  2024-02-09 11:39       ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-02-02  6:43 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Jan Kara, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Thu, Feb 01, 2024 at 06:08:29PM +0100, Christian Brauner wrote:
> > > +	/*
> > > +	 * O_EXCL is one of those flags that the VFS clears once it's done with
> > > +	 * the operation. So don't raise it here either.
> > > +	 */
> > > +	if (mode & BLK_OPEN_NDELAY)
> > > +		flags |= O_NDELAY;
> > 
> > O_EXCL isn't dealt with in this helper at all.
> 
> Yeah, on purpose was my point bc we can just rely on @holder and passing
> _EXCL without holder is invalid. But I could add it.

Ok.  I found it weird to have the comment next to BLK_OPEN_NDELAY
as it looked like it sneaked through.  Especially as BLK_OPEN_EXCL
has literally nothing to do with O_EXCL at all as the latter is a
namespace operation flag.  So even if the comment was intentional
I think we're probably better off without it.

> Yes, I had considered that and it would work but there's the issue that
> we need to figure out how to handle BLK_OPEN_RESTRICT_WRITES. It has no
> corresponding O_* flag that would let us indicate this.

Oh, indeed.

>
> So I had
> considered:
> 
> 1/ Expose bdev_file_open_excl() so callers don't need to pass any
>    specific flags. Nearly all filesystems would effectively use this
>    helper as sb_open_mode() adds it implicitly. That would have the
>    side-effect of introducing another open helper ofc; possibly two if
>    we take _by_dev() and _by_path() into account.
> 
> 2/ Abuse an O_* flag to mean BLK_OPEN_RESTRICT_WRITES. For example,
>    O_TRUNC or O_NOCTTY which is pretty yucky.
> 
> 3/ Introduce an internal O_* flag which is also ugly. Vomitorious and my
>    co-maintainers would likely chop off my hands so I can't go near a
>    computer again.
> 
> 3/ Make O_EXCL when passed together with bdev_file_open_by_*() always
>    imply BLK_OPEN_RESTRICT_WRITES.
> 
> The 3/ option would probably be the cleanest one and I think that all
> filesystems now pass at least a holder and holder ops so this _should_
> work.

2 and 3 sound pretty horrible.  3 would work and look clean for the
block side, but O_ flags are a mess so I wouldn't go there.

Maybe a variant of 1 that allows for a non-exclusive open and clearly
marks that?

Or just leave it as-is for now and look into that later.


* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-02-01 17:36       ` Jan Kara
@ 2024-02-02 11:45         ` Christian Brauner
  2024-02-02 11:51           ` Jan Kara
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-02-02 11:45 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

[-- Attachment #1: Type: text/plain, Size: 1553 bytes --]

On Thu, Feb 01, 2024 at 06:36:31PM +0100, Jan Kara wrote:
> On Thu 01-02-24 17:16:02, Christian Brauner wrote:
> > On Thu, Feb 01, 2024 at 12:08:58PM +0100, Jan Kara wrote:
> > > On Tue 23-01-24 14:26:48, Christian Brauner wrote:
> > > > Make it possible to detect a block device that was opened with
> > > > restricted write access solely based on the file operations it was
> > > > opened with. This avoids wasting an FMODE_* flag.
> > > > 
> > > > def_blk_fops isn't needed to check whether something is a block device;
> > > > checking the inode type is enough for that. And def_blk_fops_restricted
> > > > can be kept private to the block layer.
> > > > 
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > 
> > > I don't think we need def_blk_fops_restricted. If we have BLK_OPEN_WRITE
> > > file against a bdev with bdev_writes_blocked() == true, we are sure this is
> > > the handle blocking other writes so we can unblock them in
> > > bdev_yield_write_access()...
> 
> ...
> 
> > -       if (mode & BLK_OPEN_RESTRICT_WRITES)
> > +       if (mode & BLK_OPEN_WRITE) {
> > +               if (bdev_writes_blocked(bdev))
> > +                       bdev_unblock_writes(bdev);
> > +               else
> > +                       bdev->bd_writers--;
> > +       }
> > +       if (bdev_file->f_op == &def_blk_fops_restricted)
> 
> Uh, why are you leaving def_blk_fops_restricted check here? I'd expect you
> can delete def_blk_fops_restricted completely...

Copy-paste error when dumping this into here. Here's the full patch.

[-- Attachment #2: 0001-block-don-t-rely-on-BLK_OPEN_RESTRICT_WRITES-when-yi.patch --]
[-- Type: text/x-diff, Size: 1910 bytes --]

From 25609e947674d2c24fb18edd0dabb5a49f27f23d Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 23 Jan 2024 14:26:48 +0100
Subject: [PATCH] block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding
 write access

Make it possible to detect a block device that was opened with
restricted write access based only on BLK_OPEN_WRITE and
bdev->bd_writers < 0 so we won't have to claim another FMODE_* flag.

Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-31-adbd023e19cc@kernel.org
base-commit: 0bd1bf95a554f5f877724c27dbe33d4db0af4d0c
change-id: 20240129-vfs-bdev-file-bd_inode-385a56c57a51
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 9be8c3c683ae..b19cbcd6a4bf 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -799,16 +799,21 @@ static void bdev_claim_write_access(struct block_device *bdev, blk_mode_t mode)
 		bdev->bd_writers++;
 }
 
-static void bdev_yield_write_access(struct block_device *bdev, blk_mode_t mode)
+static void bdev_yield_write_access(struct file *bdev_file, blk_mode_t mode)
 {
+	struct block_device *bdev;
+
 	if (bdev_allow_write_mounted)
 		return;
 
+	bdev = file_bdev(bdev_file);
 	/* Yield exclusive or shared write access. */
-	if (mode & BLK_OPEN_RESTRICT_WRITES)
-		bdev_unblock_writes(bdev);
-	else if (mode & BLK_OPEN_WRITE)
-		bdev->bd_writers--;
+	if (mode & BLK_OPEN_WRITE) {
+		if (bdev_writes_blocked(bdev))
+			bdev_unblock_writes(bdev);
+		else
+			bdev->bd_writers--;
+	}
 }
 
 /**
@@ -1020,7 +1025,7 @@ void bdev_release(struct file *bdev_file)
 		sync_blockdev(bdev);
 
 	mutex_lock(&disk->open_mutex);
-	bdev_yield_write_access(bdev, handle->mode);
+	bdev_yield_write_access(bdev_file, handle->mode);
 
 	if (handle->holder)
 		bd_end_claim(bdev, handle->holder);
-- 
2.43.0
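
For readers who want to see why the BLK_OPEN_RESTRICT_WRITES check can
go away, the bookkeeping can be modeled in user space. This is a sketch
under assumptions: struct model_bdev, MODEL_BLK_OPEN_WRITE and the
model_*() helpers are invented for illustration, and it assumes that a
negative bd_writers means "writes blocked" and that unblocking resets
the counter to 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the BLK_OPEN_WRITE bit; value is arbitrary. */
#define MODEL_BLK_OPEN_WRITE (1u << 1)

/* Hypothetical, stripped-down stand-in for struct block_device. */
struct model_bdev {
	int bd_writers;	/* > 0: number of writers, < 0: writes blocked */
};

static bool model_writes_blocked(const struct model_bdev *bdev)
{
	return bdev->bd_writers < 0;
}

/* Model of bdev_yield_write_access() after the patch. */
static void model_yield_write_access(struct model_bdev *bdev,
				     unsigned int mode)
{
	if (!(mode & MODEL_BLK_OPEN_WRITE))
		return;

	/*
	 * While writes are blocked no other writer can exist, so a
	 * writable open being released must be the one that blocked them.
	 */
	if (model_writes_blocked(bdev))
		bdev->bd_writers = 0;	/* assumed effect of unblocking */
	else
		bdev->bd_writers--;	/* drop one shared writer */
}
```

The release path therefore needs only the write bit and the counter's
sign, which is Jan's argument for dropping def_blk_fops_restricted.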



* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-02-02  6:43       ` Christoph Hellwig
@ 2024-02-02 11:46         ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-02 11:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Fri, Feb 02, 2024 at 07:43:24AM +0100, Christoph Hellwig wrote:
> On Thu, Feb 01, 2024 at 06:08:29PM +0100, Christian Brauner wrote:
> > > > +	/*
> > > > +	 * O_EXCL is one of those flags that the VFS clears once it's done with
> > > > +	 * the operation. So don't raise it here either.
> > > > +	 */
> > > > +	if (mode & BLK_OPEN_NDELAY)
> > > > +		flags |= O_NDELAY;
> > > 
> > > O_EXCL isn't dealt with in this helper at all.
> > 
> > Yeah, on purpose was my point bc we can just rely on @holder and passing
> > _EXCL without holder is invalid. But I could add it.
> 
> Ok.  I found it weird to have the comment next to BLK_OPEN_NDELAY
> as it looked like it sneaked through.  Especially as BLK_OPEN_EXCL
> has literally nothing to do with O_EXCL at all as the latter is a
> namespace operation flag.  So even if the comment was intentional
> I think we're probably better off without it.
> 
> > Yes, I had considered that and it would work but there's the issue that
> > we need to figure out how to handle BLK_OPEN_RESTRICT_WRITES. It has no
> > corresponding O_* flag that would let us indicate this.
> 
> Oh, indeed.
> 
> >
> > So I had
> > considered:
> > 
> > 1/ Expose bdev_file_open_excl() so callers don't need to pass any
> >    specific flags. Nearly all filesystems would effectively use this
> >    helper as sb_open_mode() adds it implicitly. That would have the
> >    side-effect of introducing another open helper ofc; possibly two if
> >    we take _by_dev() and _by_path() into account.
> > 
> > 2/ Abuse an O_* flag to mean BLK_OPEN_RESTRICT_WRITES. For example,
> >    O_TRUNC or O_NOCTTY which is pretty yucky.
> > 
> > 3/ Introduce an internal O_* flag which is also ugly. Vomitorious and my
> >    co-maintainers would likely chop off my hands so I can't go near a
> >    computer again.
> > 
> > 3/ Make O_EXCL when passed together with bdev_file_open_by_*() always
> >    imply BLK_OPEN_RESTRICT_WRITES.
> > 
> > The 3/ option would probably be the cleanest one and I think that all
> > filesystems now pass at least a holder and holder ops so this _should_
> > work.
> 
> 2 and 3 sound pretty horrible.  3 would work and look clean for the
> block side, but O_ flags are a mess so I wouldn't go there.

My numbering is obviously wrong btw. That last point should obviously be 4/

> 
> Maybe a variant of 1 that allows for a non-exclusive open and clearly
> marks that?
> 
> Or just leave it as-is for now and look into that later.

Ok.


* Re: [PATCH v2 31/34] block: use file->f_op to indicate restricted writes
  2024-02-02 11:45         ` Christian Brauner
@ 2024-02-02 11:51           ` Jan Kara
  0 siblings, 0 replies; 146+ messages in thread
From: Jan Kara @ 2024-02-02 11:51 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Fri 02-02-24 12:45:49, Christian Brauner wrote:
> On Thu, Feb 01, 2024 at 06:36:31PM +0100, Jan Kara wrote:
> > On Thu 01-02-24 17:16:02, Christian Brauner wrote:
> > > On Thu, Feb 01, 2024 at 12:08:58PM +0100, Jan Kara wrote:
> > > > On Tue 23-01-24 14:26:48, Christian Brauner wrote:
> > > > > Make it possible to detect a block device that was opened with
> > > > > restricted write access solely based on the file operations it was
> > > > > opened with. This avoids wasting an FMODE_* flag.
> > > > > 
> > > > > def_blk_fops isn't needed to check whether something is a block device;
> > > > > checking the inode type is enough for that. And def_blk_fops_restricted
> > > > > can be kept private to the block layer.
> > > > > 
> > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > 
> > > > I don't think we need def_blk_fops_restricted. If we have BLK_OPEN_WRITE
> > > > file against a bdev with bdev_writes_blocked() == true, we are sure this is
> > > > the handle blocking other writes so we can unblock them in
> > > > bdev_yield_write_access()...
> > 
> > ...
> > 
> > > -       if (mode & BLK_OPEN_RESTRICT_WRITES)
> > > +       if (mode & BLK_OPEN_WRITE) {
> > > +               if (bdev_writes_blocked(bdev))
> > > +                       bdev_unblock_writes(bdev);
> > > +               else
> > > +                       bdev->bd_writers--;
> > > +       }
> > > +       if (bdev_file->f_op == &def_blk_fops_restricted)
> > 
> > Uh, why are you leaving def_blk_fops_restricted check here? I'd expect you
> > can delete def_blk_fops_restricted completely...
> 
> Copy-paste error when dumping this into here. Here's the full patch.

Yes, the full patch looks good to me! Thanks!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 00/34] Open block devices as files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (35 preceding siblings ...)
  2024-01-29 10:56 ` [PATCH RFC 0/2] fs & block: remove bd_inode Christian Brauner
@ 2024-02-05 11:55 ` Christian Brauner
  2024-02-05 14:19   ` Jan Kara
  2024-03-21 22:17 ` Matthew Wilcox
  37 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-02-05 11:55 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block

On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> Hey Christoph,
> Hey Jan,
> Hey Jens,
> 
> This opens block devices as files. Instead of introducing a separate
> indirection into bdev_open_by_*() via struct bdev_handle we can just
> make bdev_file_open_by_*() return a struct file. Opening and closing a
> block device from setup_bdev_super() and in all other places just
> becomes equivalent to opening and closing a file.
> 
> This has held up in xfstests and in blktests so far and it seems stable
> and clean. The equivalence of opening and closing block devices to
> regular files is a win in and of itself imho. Added to that is the
> ability to do away with struct bdev_handle completely and make various
> low-level helpers private to the block layer.
> 
> All places where we currently stash a struct bdev_handle we just stash a
> file and use an accessor such as file_bdev() akin to I_BDEV() to get to
> the block device.
> 
> It's now also possible to use file->f_mapping as a replacement for
> bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as
> an alternative to bdev->bd_inode allowing us to significantly reduce or
> even fully remove bdev->bd_inode in follow-up patches.
> 
> In addition, we could get rid of sb->s_bdev and various other places
> that stash the block device directly and instead stash the block device
> file. Again, this is follow-up work.
> 
> Thanks!
> Christian
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---

With all fixes applied I've moved this into vfs.super on vfs/vfs.git so
this gets some exposure in -next asap. Please let me know if you have
quarrels with that.


* Re: [PATCH v2 00/34] Open block devices as files
  2024-02-05 11:55 ` [PATCH v2 00/34] Open block devices as files Christian Brauner
@ 2024-02-05 14:19   ` Jan Kara
  2024-02-06 13:39     ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-05 14:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Hi!

On Mon 05-02-24 12:55:18, Christian Brauner wrote:
> On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> > Hey Christoph,
> > Hey Jan,
> > Hey Jens,
> > 
> > This opens block devices as files. Instead of introducing a separate
> > indirection into bdev_open_by_*() via struct bdev_handle we can just
> > make bdev_file_open_by_*() return a struct file. Opening and closing a
> > block device from setup_bdev_super() and in all other places just
> > becomes equivalent to opening and closing a file.
> > 
> > This has held up in xfstests and in blktests so far and it seems stable
> > and clean. The equivalence of opening and closing block devices to
> > regular files is a win in and of itself imho. Added to that is the
> > ability to do away with struct bdev_handle completely and make various
> > low-level helpers private to the block layer.
> > 
> > All places where we currently stash a struct bdev_handle we just stash a
> > file and use an accessor such as file_bdev() akin to I_BDEV() to get to
> > the block device.
> > 
> > It's now also possible to use file->f_mapping as a replacement for
> > bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as
> > an alternative to bdev->bd_inode allowing us to significantly reduce or
> > even fully remove bdev->bd_inode in follow-up patches.
> > 
> > In addition, we could get rid of sb->s_bdev and various other places
> > that stash the block device directly and instead stash the block device
> > file. Again, this is follow-up work.
> > 
> > Thanks!
> > Christian
> > 
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> 
> With all fixes applied I've moved this into vfs.super on vfs/vfs.git so
> this gets some exposure in -next asap. Please let me know if you have
> quarrels with that.

No quarrels really. I went through the patches and all of them look fine to
me, so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

I have just noticed that in "bdev: make struct bdev_handle private to the
block layer" in bdev_open() we are still leaking the handle in case of
error. This is however temporary (until the end of the series when we get
rid of handles altogether) so whatever.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 00/34] Open block devices as files
  2024-02-05 14:19   ` Jan Kara
@ 2024-02-06 13:39     ` Christian Brauner
  2024-02-06 13:58       ` Jan Kara
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-02-06 13:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

On Mon, Feb 05, 2024 at 03:19:11PM +0100, Jan Kara wrote:
> Hi!
> 
> On Mon 05-02-24 12:55:18, Christian Brauner wrote:
> > On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> > > Hey Christoph,
> > > Hey Jan,
> > > Hey Jens,
> > > 
> > > This opens block devices as files. Instead of introducing a separate
> > > indirection into bdev_open_by_*() via struct bdev_handle we can just
> > > make bdev_file_open_by_*() return a struct file. Opening and closing a
> > > block device from setup_bdev_super() and in all other places just
> > > becomes equivalent to opening and closing a file.
> > > 
> > > This has held up in xfstests and in blktests so far and it seems stable
> > > and clean. The equivalence of opening and closing block devices to
> > > regular files is a win in and of itself imho. Added to that is the
> > > ability to do away with struct bdev_handle completely and make various
> > > low-level helpers private to the block layer.
> > > 
> > > All places where we currently stash a struct bdev_handle we just stash a
> > > file and use an accessor such as file_bdev() akin to I_BDEV() to get to
> > > the block device.
> > > 
> > > It's now also possible to use file->f_mapping as a replacement for
> > > bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as
> > > an alternative to bdev->bd_inode allowing us to significantly reduce or
> > > even fully remove bdev->bd_inode in follow-up patches.
> > > 
> > > In addition, we could get rid of sb->s_bdev and various other places
> > > that stash the block device directly and instead stash the block device
> > > file. Again, this is follow-up work.
> > > 
> > > Thanks!
> > > Christian
> > > 
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > 
> > With all fixes applied I've moved this into vfs.super on vfs/vfs.git so
> > this gets some exposure in -next asap. Please let me know if you have
> > quarrels with that.
> 
> No quarrels really. I went through the patches and all of them look fine to
> me, so feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> I have just noticed that in "bdev: make struct bdev_handle private to the
> block layer" in bdev_open() we are still leaking the handle in case of
> error. This is however temporary (until the end of the series when we get
> rid of handles altogether) so whatever.

Can you double-check what's in vfs.super right now? I thought I fixed
this up. I'll check too!


* Re: [PATCH v2 00/34] Open block devices as files
  2024-02-06 13:39     ` Christian Brauner
@ 2024-02-06 13:58       ` Jan Kara
  2024-02-06 16:10         ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-02-06 13:58 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue 06-02-24 14:39:13, Christian Brauner wrote:
> On Mon, Feb 05, 2024 at 03:19:11PM +0100, Jan Kara wrote:
> > Hi!
> > 
> > On Mon 05-02-24 12:55:18, Christian Brauner wrote:
> > > On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> > > > Hey Christoph,
> > > > Hey Jan,
> > > > Hey Jens,
> > > > 
> > > > This opens block devices as files. Instead of introducing a separate
> > > > indirection into bdev_open_by_*() via struct bdev_handle we can just
> > > > make bdev_file_open_by_*() return a struct file. Opening and closing a
> > > > block device from setup_bdev_super() and in all other places just
> > > > becomes equivalent to opening and closing a file.
> > > > 
> > > > This has held up in xfstests and in blktests so far and it seems stable
> > > > and clean. The equivalence of opening and closing block devices to
> > > > regular files is a win in and of itself imho. Added to that is the
> > > > ability to do away with struct bdev_handle completely and make various
> > > > low-level helpers private to the block layer.
> > > > 
> > > > All places where we currently stash a struct bdev_handle we just stash a
> > > > file and use an accessor such as file_bdev() akin to I_BDEV() to get to
> > > > the block device.
> > > > 
> > > > It's now also possible to use file->f_mapping as a replacement for
> > > > bdev->bd_inode->i_mapping and file->f_inode or file->f_mapping->host as
> > > > an alternative to bdev->bd_inode allowing us to significantly reduce or
> > > > even fully remove bdev->bd_inode in follow-up patches.
> > > > 
> > > > In addition, we could get rid of sb->s_bdev and various other places
> > > > that stash the block device directly and instead stash the block device
> > > > file. Again, this is follow-up work.
> > > > 
> > > > Thanks!
> > > > Christian
> > > > 
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > ---
> > > 
> > > With all fixes applied I've moved this into vfs.super on vfs/vfs.git so
> > > this gets some exposure in -next asap. Please let me know if you have
> > > quarrels with that.
> > 
> > No quarrels really. I went through the patches and all of them look fine to
> > me so feel free to add:
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > 
> > I have just noticed that in "bdev: make struct bdev_handle private to the
> > block layer" in bdev_open() we are still leaking the handle in case of
> > error. This is however temporary (until the end of the series when we get
> > rid of handles altogether) so whatever.
> 
> Can you double-check what's in vfs.super right now? I thought I fixed
> this up. I'll check too!

Well, you've fixed the "double allocation" issue but there's still a
problem that you do:

int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
	      const struct blk_holder_ops *hops, struct file *bdev_file)
{
...
	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);
	if (!handle)
		return -ENOMEM;
 	if (holder) {
 		mode |= BLK_OPEN_EXCL;
 		ret = bd_prepare_to_claim(bdev, holder, hops);
 		if (ret)
			return ret;
 	} else {
...


So in case bd_prepare_to_claim() fails we forget to free the allocated
handle.
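
For illustration only, here is a user-space sketch of the usual shape of
such a fix: the failure path jumps to a label that frees the allocation
instead of returning directly. Every name in it (handle_model,
open_model(), prepare_to_claim_model(), live_handles) is made up for the
example; it is not the real bdev_open() and makes no claim about how the
final patch looks.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bdev_handle. */
struct handle_model {
	void *holder;
};

/* Number of handles allocated and not yet freed. */
static int live_handles;

/* Stand-in for bd_prepare_to_claim(): fails on request. */
static int prepare_to_claim_model(int should_fail)
{
	return should_fail ? -EBUSY : 0;
}

static int open_model(void *holder, int claim_fails,
		      struct handle_model **out)
{
	struct handle_model *handle;
	int ret;

	handle = malloc(sizeof(*handle));
	if (!handle)
		return -ENOMEM;
	live_handles++;

	if (holder) {
		ret = prepare_to_claim_model(claim_fails);
		if (ret)
			goto free_handle;	/* not a bare "return ret" */
	}

	handle->holder = holder;
	*out = handle;
	return 0;

free_handle:
	free(handle);
	live_handles--;
	return ret;
}

static void close_model(struct handle_model *handle)
{
	assert(handle);
	free(handle);
	live_handles--;
}
```

With the goto in place a failed claim leaves live_handles at zero; with a
bare "return ret" it would stay at one, which is the leak described above.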

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-02-06 13:58       ` Jan Kara
@ 2024-02-06 16:10         ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-06 16:10 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens Axboe, Darrick J. Wong, linux-fsdevel,
	linux-block

> > Can you double-check what's in vfs.super right now? I thought I fixed
> > this up. I'll check too!
> 
> Well, you've fixed the "double allocation" issue but there's still a
> problem that you do:
> 
> int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
> 	      const struct blk_holder_ops *hops, struct file *bdev_file)
> {
> ...
> 	handle = kmalloc(sizeof(struct bdev_handle), GFP_KERNEL);
> 	if (!handle)
> 		return -ENOMEM;
>  	if (holder) {
>  		mode |= BLK_OPEN_EXCL;
>  		ret = bd_prepare_to_claim(bdev, holder, hops);
>  		if (ret)
> 			return ret;
>  	} else {
> ...
> 
> 
> So in case bd_prepare_to_claim() fails we forget to free the allocated
> handle.

Grumble grumble grumble, thank you! Fixing.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-02-01 17:08     ` Christian Brauner
  2024-02-02  6:43       ` Christoph Hellwig
@ 2024-02-09 11:39       ` Christian Brauner
  1 sibling, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-09 11:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

> > How about the following:
> > 
> >  - alloc_file loses the actual file allocation and gets a new name
> >    (unfortunately init_file is already taken), callers call
> >    alloc_empty_file_noaccount or alloc_empty_file plus the
> >    new helper.
> >  - similarly __alloc_file_pseudo is split into a helper creating
> >    a path for mnt and inode, and callers call that plus the
> >    file allocation
> > 
> > ?
> 
> Ok, let me see how far I get.

Ok, it's in vfs.super.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 15:36         ` Christoph Hellwig
@ 2024-02-19 13:34           ` Yu Kuai
  2024-02-19 13:42           ` Yu Kuai
  1 sibling, 0 replies; 146+ messages in thread
From: Yu Kuai @ 2024-02-19 13:34 UTC (permalink / raw)
  To: Christoph Hellwig, Christian Brauner
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

Hi,

On 2024/01/29 23:36, Christoph Hellwig wrote:
> On Mon, Jan 29, 2024 at 04:29:32PM +0100, Christian Brauner wrote:
>> On Mon, Jan 29, 2024 at 03:37:09PM +0100, Christoph Hellwig wrote:
>>> Most of these really should be using proper high level APIs.  The
>>> last round of work on this is here:
>>>
>>> https://lore.kernel.org/linux-nilfs/4b11a311-c121-1f44-0ccf-a3966a396994@huaweicloud.com/
>>
>> Are you saying that I should just drop this patch here?
> 
> I think we need to order the work:
> 
>   - get your use struct file as bdev handle series in
>   - rebase the above series on top of that, including some bigger changes
>     like block2mtd which can then use normal file read/write APIs

I'm working on that now, mostly convert to use bdev_file, and file_inode
or f_mapingo
>   - rebase what is left of this series on top of that, and hopefully not
>     much of this patch and a lot less of patch 1 will be left at that
>     point.
> 
> .
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers
  2024-01-29 15:36         ` Christoph Hellwig
  2024-02-19 13:34           ` Yu Kuai
@ 2024-02-19 13:42           ` Yu Kuai
  1 sibling, 0 replies; 146+ messages in thread
From: Yu Kuai @ 2024-02-19 13:42 UTC (permalink / raw)
  To: Christoph Hellwig, Christian Brauner
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block,
	yukuai (C)

Hi,

On 2024/01/29 23:36, Christoph Hellwig wrote:
> On Mon, Jan 29, 2024 at 04:29:32PM +0100, Christian Brauner wrote:
>> On Mon, Jan 29, 2024 at 03:37:09PM +0100, Christoph Hellwig wrote:
>>> Most of these really should be using proper high level APIs.  The
>>> last round of work on this is here:
>>>
>>> https://lore.kernel.org/linux-nilfs/4b11a311-c121-1f44-0ccf-a3966a396994@huaweicloud.com/
>>
>> Are you saying that I should just drop this patch here?
> 
> I think we need to order the work:
> 
>   - get your use struct file as bdev handle series in
>   - rebase the above series on top of that, including some bigger changes
>     like block2mtd which can then use normal file read/write APIs

I'm working on that now, mostly converting users to bdev_file, with
file_inode() to get 'bd_inode' or f_mapping to get 'bd_inode->i_mapping',
so that all filesystems and drivers can avoid accessing 'bd_inode'.

>   - rebase what is left of this series on top of that, and hopefully not
>     much of this patch and a lot less of patch 1 will be left at that
>     point.

This is done by one huge patch for now, and there really is nothing left
of this set. I'm still testing and splitting it into a patchset. I'll post
an RFC version soon.

Thanks,
Kuai

> 
> .
> 


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 1/2] fs & block: remove bdev->bd_inode
  2024-01-29 10:56   ` [PATCH RFC 1/2] fs & block: remove bdev->bd_inode Christian Brauner
@ 2024-02-20 11:57     ` Yu Kuai
  2024-02-21  7:36       ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Yu Kuai @ 2024-02-20 11:57 UTC (permalink / raw)
  To: Christian Brauner, Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Darrick J. Wong, linux-fsdevel, linux-block, yukuai (C)

Hi, Christian

On 2024/01/29 18:56, Christian Brauner wrote:
> The only user that doesn't rely on files is the block layer itself in
> block/fops.c where we only have access to the block device. As the bdev
> filesystem doesn't open block devices as files obviously.
> 
> This introduces a union into struct buffer_head and struct iomap. The
> union encompasses both struct block_device and struct file. In both
> cases a flag is used to differentiate whether a block device or a proper
> file was stashed. Simple accessors bh_bdev() and iomap_bdev() are used
> to return the block device in the really low-level functions where it's
> needed. These are overall just a few callsites.

I just realized that your implementation for iomap and buffer_head is
better. If you don't mind, I'll split the related changes into a separate
patch and send them out together.
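
To make the quoted description concrete, a rough user-space sketch of a
flag-discriminated union with an accessor follows. It is only a model
under assumed names (bh_model, file_model, bdev_model, has_file,
bh_bdev_model()); the real struct buffer_head layout and the real
bh_bdev() may look quite different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct block_device and struct file. */
struct bdev_model {
	int id;
};

struct file_model {
	struct bdev_model *bdev;
};

/* Hypothetical stand-in for the part of buffer_head that matters here. */
struct bh_model {
	bool has_file;	/* the flag: which union member is valid */
	union {
		struct bdev_model *bdev;
		struct file_model *file;
	} u;
};

/* Models file_bdev(): get from the file to the block device. */
static struct bdev_model *file_bdev_model(const struct file_model *file)
{
	return file->bdev;
}

/* Models bh_bdev(): return the block device whichever member was stashed. */
static struct bdev_model *bh_bdev_model(const struct bh_model *bh)
{
	assert(bh);
	return bh->has_file ? file_bdev_model(bh->u.file) : bh->u.bdev;
}
```

Low-level callers only ever go through the accessor, so they do not need
to know whether a file or a bare block device was stashed.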

Thanks,
Kuai


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH RFC 1/2] fs & block: remove bdev->bd_inode
  2024-02-20 11:57     ` Yu Kuai
@ 2024-02-21  7:36       ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-02-21  7:36 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, yukuai (C)

On Tue, Feb 20, 2024 at 07:57:12PM +0800, Yu Kuai wrote:
> Hi, Christian
> 
> On 2024/01/29 18:56, Christian Brauner wrote:
> > The only user that doesn't rely on files is the block layer itself in
> > block/fops.c where we only have access to the block device. As the bdev
> > filesystem doesn't open block devices as files obviously.
> > 
> > This introduces a union into struct buffer_head and struct iomap. The
> > union encompasses both struct block_device and struct file. In both
> > cases a flag is used to differentiate whether a block device or a proper
> > file was stashed. Simple accessors bh_bdev() and iomap_bdev() are used
> > to return the block device in the really low-level functions where it's
> > needed. These are overall just a few callsites.
> 
> I just realized that your implementation for iomap and buffer_head is
> better. If you don't mind, I'll split the related changes into a separate
> patch and send them out together.

Sure!

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
  2024-01-29 16:02   ` Christoph Hellwig
@ 2024-03-13  2:32   ` Christoph Hellwig
  2024-03-14 11:10     ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-03-13  2:32 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

Now that this is in mainline it seems to cause blktests to crash
nbd/003 with a rather non-obvious oops for me:

nbd/003 (mount/unmount concurrently with NBD_CLEAR_SOCK)    
runtime  8.139s  ...
[  802.941685] run blktests nbd/003 at 2024-03-12 14:51:20
[  803.171958] nbd0: detected capacity change from 0 to 20971520
[  803.183725] block nbd0: shutting down sockets
[  803.184645] I/O error, dev nbd0, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
[  803.185156] EXT4-fs (nbd0): unable to read superblock
[  803.186214] I/O error, dev nbd0, sector 20968432 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[  803.186770] I/O error, dev nbd0, sector 20968432 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  803.187026] Buffer I/O error on dev nbd0, logical block 10484216, async page read
[  803.187335] I/O error, dev nbd0, sector 20968434 op 0x0:(READ) flags 0x0 phys_seg 3 prio class 0
[  803.187593] Buffer I/O error on dev nbd0, logical block 10484217, async page read
[  803.187809] Buffer I/O error on dev nbd0, logical block 10484218, async page read
[  803.188027] Buffer I/O error on dev nbd0, logical block 10484219, async page read
[  803.194147] I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  803.194400] Buffer I/O error on dev nbd0, logical block 0, async page read
[  803.194634] I/O error, dev nbd0, sector 2 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  803.194880] Buffer I/O error on dev nbd0, logical block 1, async page read
[  803.195296] I/O error, dev nbd0, sector 4 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  803.195548] Buffer I/O error on dev nbd0, logical block 2, async page read
[  803.195781] I/O error, dev nbd0, sector 6 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  803.196015] Buffer I/O error on dev nbd0, logical block 3, async page read
[  803.196246] Buffer I/O error on dev nbd0, logical block 0, async page read
[  803.196743] ldm_validate_partition_table(): Disk read failed.
[  803.197375] Dev nbd0: unable to read RDB block 0
[  803.198007]  nbd0: unable to read partition table
[  803.198467] EXT4-fs (nbd0): unable to read superblock
[  803.199444] ldm_validate_partition_table(): Disk read failed.
[  803.200487] Dev nbd0: unable to read RDB block 0
[  803.201046]  nbd0: unable to read partition table
[  803.207369] (udev-worker): attempt to access beyond end of device
[  803.207369] nbd0: rw=0, sector=2, nr_sectors = 2 limit=0
[  803.208100] (udev-worker): attempt to access beyond end of device
[  803.208100] nbd0: rw=0, sector=4, nr_sectors = 2 limit=0
[  803.208807] (udev-worker): attempt to access beyond end of device
[  803.208807] nbd0: rw=0, sector=6, nr_sectors = 2 limit=0
[  803.209197] ldm_validate_partition_table(): Disk read failed.
[  803.209365] Dev nbd0: unable to read RDB block 0
[  803.209506]  nbd0: unable to read partition table
[  803.209679] nbd0: partition table beyond EOD, truncated
[  803.210132] mount_clear_soc: attempt to access beyond end of device
[  803.210132] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.210896] EXT4-fs (nbd0): unable to read superblock
[  803.212990] mount_clear_soc: attempt to access beyond end of device
[  803.212990] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.213374] EXT4-fs (nbd0): unable to read superblock
[  803.221502] mount_clear_soc: attempt to access beyond end of device
[  803.221502] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.221868] EXT4-fs (nbd0): unable to read superblock
[  803.223990] mount_clear_soc: attempt to access beyond end of device
[  803.223990] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.224334] EXT4-fs (nbd0): unable to read superblock
[  803.229317] mount_clear_soc: attempt to access beyond end of device
[  803.229317] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.229665] EXT4-fs (nbd0): unable to read superblock
[  803.231698] mount_clear_soc: attempt to access beyond end of device
[  803.231698] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.232068] EXT4-fs (nbd0): unable to read superblock
[  803.233702] mount_clear_soc: attempt to access beyond end of device
[  803.233702] nbd0: rw=4096, sector=2, nr_sectors = 2 limit=0
[  803.234049] EXT4-fs (nbd0): unable to read superblock
[  803.235263] general protection fault, maybe for address 0x0: 0000 [#1] PREEMPT SMP NOPTI
[  803.235505] CPU: 1 PID: 53579 Comm: mount_clear_soc Not tainted 6.8.0+ #2288
[  803.235712] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/014
[  803.235973] RIP: 0010:srso_alias_safe_ret+0x5/0x7
[  803.236118] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[  803.236637] RSP: 0018:ffffc900000d4ef8 EFLAGS: 00010202
[  803.236789] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
[  803.236991] RDX: 6b6b6b6b6b6b6b6b RSI: ffffffff833adc6a RDI: ffff888112581870
[  803.237200] RBP: ffffffff8124ae64 R08: 00000000ffffffff R09: 00000000ffffffff
[  803.237402] R10: 0000000000000002 R11: ffff8881130cb458 R12: ffff8881130caa40
[  803.237605] R13: 0000000000000000 R14: 0000000000031688 R15: ffffffff8124adde
[  803.237811] FS:  0000000000000000(0000) GS:ffff8881f9d00000(0000) knlGS:0000000000000000
[  803.238038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  803.238206] CR2: 00007f2bb5507740 CR3: 0000000110dd2000 CR4: 0000000000750ef0
[  803.238412] PKRU: 55555554
[  803.238496] Call Trace:
[  803.238571]  <IRQ>
[  803.238634]  ? die_addr+0x31/0x80
[  803.238740]  ? exc_general_protection+0x24a/0x480
[  803.238886]  ? asm_exc_general_protection+0x26/0x30
[  803.239028]  ? rcu_core+0x34e/0xae0
[  803.239141]  ? rcu_core+0x3d4/0xae0
[  803.239255]  ? srso_alias_safe_ret+0x5/0x7
[  803.239379]  ? srso_alias_return_thunk+0x5/0xfbef5
[  803.239523]  ? rcu_core+0x3d9/0xae0
[  803.239636]  ? __do_softirq+0xec/0x481
[  803.239757]  ? __irq_exit_rcu+0x89/0xe0
[  803.239874]  ? irq_exit_rcu+0x9/0x30
[  803.239982]  ? sysvec_apic_timer_interrupt+0xa1/0xd0
[  803.240130]  </IRQ>
[  803.240197]  <TASK>
[  803.240265]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  803.240423]  ? __memcg_slab_free_hook+0x11e/0x230
[  803.240566]  ? __memcg_slab_free_hook+0x68/0x230
[  803.240705]  ? unlink_anon_vmas+0x78/0x200
[  803.240833]  ? kmem_cache_free+0x2ca/0x310
[  803.240960]  ? unlink_anon_vmas+0x78/0x200
[  803.241086]  ? free_pgtables+0x141/0x260
[  803.241218]  ? exit_mmap+0x194/0x440
[  803.241337]  ? __mmput+0x3a/0x110
[  803.241445]  ? do_exit+0x2bf/0xb10
[  803.241553]  ? do_group_exit+0x31/0x90
[  803.241669]  ? __x64_sys_exit_group+0x13/0x20
[  803.241801]  ? do_syscall_64+0x75/0x150
[  803.241921]  ? entry_SYSCALL_64_after_hwframe+0x6c/0x74

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-03-13  2:32   ` Christoph Hellwig
@ 2024-03-14 11:10     ` Christian Brauner
  2024-03-14 14:47       ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-03-14 11:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Mar 12, 2024 at 07:32:35PM -0700, Christoph Hellwig wrote:
> Now that this is in mainline it seems to cause blktests to crash
> nbd/003 with a rather non-obvious oops for me:

Ok, will be looking into that next.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-03-14 11:10     ` Christian Brauner
@ 2024-03-14 14:47       ` Christian Brauner
  2024-03-14 16:45         ` Christian Brauner
  2024-03-14 16:58         ` Jan Kara
  0 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-14 14:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Thu, Mar 14, 2024 at 12:10:59PM +0100, Christian Brauner wrote:
> On Tue, Mar 12, 2024 at 07:32:35PM -0700, Christoph Hellwig wrote:
> > Now that this is in mainline it seems to cause blktests to crash
> > nbd/003 with a rather non-obvious oops for me:
> 
> Ok, will be looking into that next.

Ok, I know what's going on. Basically, fput() on the block device runs
asynchronously which means that bdev->bd_holder can still be set to @sb
after it has already been freed. Let me illustrate what I mean:

P1                                                 P2
mount(sb)                                          fd = open("/dev/nbd", ...)
-> file = bdev_file_open_by_dev(..., sb, ...)
   bdev->bd_holder = sb;

// Later on:

umount(sb)
->kill_block_super(sb)
|----> [fput() -> deferred via task work]
-> put_super(sb) -> frees the sb via rcu
|
|                                                 nbd_ioctl(NBD_CLEAR_SOCK)
|                                                 -> disk_force_media_change()
|                                                    -> bdev_mark_dead()
|                                                       -> fs_bdev_mark_dead()
|                                                          // Finds bdev->bd_holder == sb
|-> file->release::bdev_release()                          // Tries to get reference to it but it's freed; frees it again
   bdev->bd_holder = NULL;

Two solutions that come to my mind:

(1) Indicate to fput() that this is an internal block device open and
    thus just close it synchronously. This is fine. First, because the block
    device superblock is never unmounted or anything so there's no risk
    from hanging there for any reason. Second, bdev_release() also ran
    synchronously so any issue that we'd see from calling
    file->f_op->release::bdev_release() we would have seen from
    bdev_release() itself. See [1.1] for a patch.

(2) Take a temporary reference to the holder when opening the block
    device. This is also fine afaict because we differentiate between
    passive and active references on superblocks and what we already do
    in fs_bdev_mark_dead() and friends. This means we don't have to mess
    around with fput(). See [1.2] for a patch.

(3) Revert and cry. No patch.

Personally, I think (2) is more desirable because we don't lose the
async property of fput() and we don't need to have a new FMODE_* flag.
I'd like to do some more testing with this. Thoughts?
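
To spell out the idea behind (2) independently of the patches, here is a
small user-space model, not kernel code. All names (sb_model,
holder_ops_model, claim_model(), release_model(), holder_usable()) are
invented for the sketch, and a plain int stands in for the passive
s_count that is normally protected by sb_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical superblock: passive count plus a "memory is gone" marker. */
struct sb_model {
	int s_count;
	bool freed;
};

struct holder_ops_model {
	void (*get_holder)(void *holder);
	void (*put_holder)(void *holder);
};

/* Hypothetical block device that remembers its holder. */
struct bdev_model {
	void *bd_holder;
	const struct holder_ops_model *hops;
};

static void sb_get(void *data)
{
	struct sb_model *sb = data;

	sb->s_count++;
}

static void sb_put(void *data)
{
	struct sb_model *sb = data;

	assert(!sb->freed && sb->s_count > 0);
	if (--sb->s_count == 0)
		sb->freed = true;	/* models the final free */
}

static const struct holder_ops_model sb_holder_ops = {
	.get_holder = sb_get,
	.put_holder = sb_put,
};

/* Open side: stash the holder and pin it. */
static void claim_model(struct bdev_model *bdev, void *holder,
			const struct holder_ops_model *hops)
{
	bdev->bd_holder = holder;
	bdev->hops = hops;
	if (hops && hops->get_holder)
		hops->get_holder(holder);
}

/* Release side, possibly run late: clear the holder, then unpin it. */
static void release_model(struct bdev_model *bdev)
{
	void *holder = bdev->bd_holder;
	const struct holder_ops_model *hops = bdev->hops;

	bdev->bd_holder = NULL;
	bdev->hops = NULL;
	if (hops && hops->put_holder)
		hops->put_holder(holder);
}

/* What a mark_dead-style lookup through bd_holder would find. */
static bool holder_usable(const struct bdev_model *bdev)
{
	const struct sb_model *sb = bdev->bd_holder;

	return sb && !sb->freed;
}
```

In this model the umount-side sb_put() can run before the deferred
release_model() and the holder still stays valid for anyone who finds it
through bd_holder; it only goes away once the release has cleared the
pointer and dropped the extra reference.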

[1.1]:
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c       | 1 +
 fs/file_table.c    | 5 +++++
 include/linux/fs.h | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index e7adaaf1c219..d0c208a04b04 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -969,6 +969,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		return bdev_file;
 	}
 	ihold(bdev->bd_inode);
+	bdev_file->f_mode |= FMODE_BLOCKDEV;
 
 	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
 	if (ret) {
diff --git a/fs/file_table.c b/fs/file_table.c
index 4f03beed4737..48d35dd67020 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -473,6 +473,11 @@ void fput(struct file *file)
 	if (atomic_long_dec_and_test(&file->f_count)) {
 		struct task_struct *task = current;
 
+		if (unlikely((file->f_mode & FMODE_BLOCKDEV))) {
+			__fput(file);
+			return;
+		}
+
 		if (unlikely(!(file->f_mode & (FMODE_BACKING | FMODE_OPENED)))) {
 			file_free(file);
 			return;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d5d5a4ee24f0..ceac9c0316a6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -121,6 +121,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 #define FMODE_PWRITE		((__force fmode_t)0x10)
 /* File is opened for execution with sys_execve / sys_uselib */
 #define FMODE_EXEC		((__force fmode_t)0x20)
+
+/* File is opened as block device. */
+#define FMODE_BLOCKDEV		((__force fmode_t)0x100)
 /* 32bit hashes as llseek() offset (for directories) */
 #define FMODE_32BITHASH         ((__force fmode_t)0x200)
 /* 64bit hashes as llseek() offset (for directories) */
-- 
2.43.0

[1.2]:
Sketched-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           |  4 ++++
 fs/super.c             | 17 +++++++++++++++++
 include/linux/blkdev.h |  3 +++
 3 files changed, 24 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index e7adaaf1c219..a0d5960dc2b9 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -627,6 +627,8 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 		whole->bd_holder = NULL;
 	mutex_unlock(&bdev_lock);
 
+	if (bdev->bd_holder_ops && bdev->bd_holder_ops->put_holder)
+		bdev->bd_holder_ops->put_holder(holder);
 	/*
 	 * If this was the last claim, remove holder link and unblock evpoll if
 	 * it was a write holder.
@@ -902,6 +904,8 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 		bdev_file->f_mode |= FMODE_NOWAIT;
 	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
 	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
+	if (hops && hops->get_holder)
+		hops->get_holder(holder);
 	bdev_file->private_data = holder;
 
 	return 0;
diff --git a/fs/super.c b/fs/super.c
index ee05ab6b37e7..64dbbdbed93a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1515,11 +1515,28 @@ static int fs_bdev_thaw(struct block_device *bdev)
 	return error;
 }
 
+static void fs_bdev_super_get(void *data)
+{
+	struct super_block *sb = data;
+
+	spin_lock(&sb_lock);
+	sb->s_count++;
+	spin_unlock(&sb_lock);
+}
+
+static void fs_bdev_super_put(void *data)
+{
+	struct super_block *sb = data;
+	put_super(sb);
+}
+
 const struct blk_holder_ops fs_holder_ops = {
 	.mark_dead		= fs_bdev_mark_dead,
 	.sync			= fs_bdev_sync,
 	.freeze			= fs_bdev_freeze,
 	.thaw			= fs_bdev_thaw,
+	.get_holder		= fs_bdev_super_get,
+	.put_holder		= fs_bdev_super_put,
 };
 EXPORT_SYMBOL_GPL(fs_holder_ops);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f9b87c39cab0..d919e8bcb2c1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1505,6 +1505,9 @@ struct blk_holder_ops {
 	 * Thaw the file system mounted on the block device.
 	 */
 	int (*thaw)(struct block_device *bdev);
+
+	void (*get_holder)(void *holder);
+	void (*put_holder)(void *holder);
 };
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-03-14 14:47       ` Christian Brauner
@ 2024-03-14 16:45         ` Christian Brauner
  2024-03-14 16:58         ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-14 16:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Thu, Mar 14, 2024 at 03:47:52PM +0100, Christian Brauner wrote:
> On Thu, Mar 14, 2024 at 12:10:59PM +0100, Christian Brauner wrote:
> > On Tue, Mar 12, 2024 at 07:32:35PM -0700, Christoph Hellwig wrote:
> > > Now that this is in mainline it seems to cause blktests to crash
> > > nbd/003 with a rather non-obvious oops for me:
> > 
> > Ok, will be looking into that next.
> 
> Ok, I know what's going on. Basically, fput() on the block device runs
> asynchronously which means that bdev->bd_holder can still be set to @sb
> after it has already been freed. Let me illustrate what I mean:
> 
> P1                                                 P2
> mount(sb)                                          fd = open("/dev/nbd", ...)
> -> file = bdev_file_open_by_dev(..., sb, ...)
>    bdev->bd_holder = sb;
> 
> // Later on:
> 
> umount(sb)
> ->kill_block_super(sb)
> |----> [fput() -> deferred via task work]
> -> put_super(sb) -> frees the sb via rcu
> |
> |                                                 nbd_ioctl(NBD_CLEAR_SOCK)
> |                                                 -> disk_force_media_change()
> |                                                    -> bdev_mark_dead()
> |                                                       -> fs_bdev_mark_dead()
> |                                                          // Finds bdev->bd_holder == sb
> |-> file->release::bdev_release()                          // Tries to get reference to it but it's freed; frees it again
>    bdev->bd_holder = NULL;
> 
> Two solutions that come to my mind:
> 
> (1) Indicate to fput() that this is an internal block device open and
>     thus just close it synchronously. This is fine. First, because the block
>     device superblock is never unmounted or anything so there's no risk
>     from hanging there for any reason. Second, bdev_release() also ran
>     synchronously so any issue that we'd see from calling
>     file->f_op->release::bdev_release() we would have seen from
>     bdev_release() itself. See [1.1] for a patch.
> 
> (2) Take a temporary reference to the holder when opening the block
>     device. This is also fine afaict because we differentiate between
>     passive and active references on superblocks and what we already do
>     in fs_bdev_mark_dead() and friends. This means we don't have to mess
>     around with fput(). See [1.2] for a patch.
> 
> (3) Revert and cry. No patch.
> 
> Personally, I think (2) is more desirable because we don't lose the
> async property of fput() and we don't need to have a new FMODE_* flag.
> I'd like to do some more testing with this. Thoughts?
> 
> [1.1]:
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
>  block/bdev.c       | 1 +
>  fs/file_table.c    | 5 +++++
>  include/linux/fs.h | 3 +++
>  3 files changed, 9 insertions(+)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index e7adaaf1c219..d0c208a04b04 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -969,6 +969,7 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
>  		return bdev_file;
>  	}
>  	ihold(bdev->bd_inode);
> +	bdev_file->f_mode |= FMODE_BLOCKDEV;
>  
>  	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
>  	if (ret) {
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 4f03beed4737..48d35dd67020 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -473,6 +473,11 @@ void fput(struct file *file)
>  	if (atomic_long_dec_and_test(&file->f_count)) {
>  		struct task_struct *task = current;
>  
> +		if (unlikely((file->f_mode & FMODE_BLOCKDEV))) {
> +			__fput(file);
> +			return;
> +		}
> +
>  		if (unlikely(!(file->f_mode & (FMODE_BACKING | FMODE_OPENED)))) {
>  			file_free(file);
>  			return;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index d5d5a4ee24f0..ceac9c0316a6 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -121,6 +121,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
>  #define FMODE_PWRITE		((__force fmode_t)0x10)
>  /* File is opened for execution with sys_execve / sys_uselib */
>  #define FMODE_EXEC		((__force fmode_t)0x20)
> +
> +/* File is opened as block device. */
> +#define FMODE_BLOCKDEV		((__force fmode_t)0x100)
>  /* 32bit hashes as llseek() offset (for directories) */
>  #define FMODE_32BITHASH         ((__force fmode_t)0x200)
>  /* 64bit hashes as llseek() offset (for directories) */
> -- 
> 2.43.0
> 
> [1.2]:
> Sketched-by: Christian Brauner <brauner@kernel.org>
> ---
>  block/bdev.c           |  4 ++++
>  fs/super.c             | 17 +++++++++++++++++
>  include/linux/blkdev.h |  3 +++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index e7adaaf1c219..a0d5960dc2b9 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -627,6 +627,8 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
>  		whole->bd_holder = NULL;
>  	mutex_unlock(&bdev_lock);
>  
> +	if (bdev->bd_holder_ops && bdev->bd_holder_ops->put_holder)
> +		bdev->bd_holder_ops->put_holder(holder);

That should move into bdev_release() obviously so it mirrors
bdev_open(). Plus, currently it's a nop because we just NULLed
bdev->bd_holder_ops above. But you get the idea, I hope.

>  	/*
>  	 * If this was the last claim, remove holder link and unblock evpoll if
>  	 * it was a write holder.
> @@ -902,6 +904,8 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
>  		bdev_file->f_mode |= FMODE_NOWAIT;
>  	bdev_file->f_mapping = bdev->bd_inode->i_mapping;
>  	bdev_file->f_wb_err = filemap_sample_wb_err(bdev_file->f_mapping);
> +	if (hops && hops->get_holder)
> +		hops->get_holder(holder);
>  	bdev_file->private_data = holder;
>  
>  	return 0;
> diff --git a/fs/super.c b/fs/super.c
> index ee05ab6b37e7..64dbbdbed93a 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1515,11 +1515,28 @@ static int fs_bdev_thaw(struct block_device *bdev)
>  	return error;
>  }
>  
> +static void fs_bdev_super_get(void *data)
> +{
> +	struct super_block *sb = data;
> +
> +	spin_lock(&sb_lock);
> +	sb->s_count++;
> +	spin_unlock(&sb_lock);
> +}
> +
> +static void fs_bdev_super_put(void *data)
> +{
> +	struct super_block *sb = data;
> +	put_super(sb);
> +}
> +
>  const struct blk_holder_ops fs_holder_ops = {
>  	.mark_dead		= fs_bdev_mark_dead,
>  	.sync			= fs_bdev_sync,
>  	.freeze			= fs_bdev_freeze,
>  	.thaw			= fs_bdev_thaw,
> +	.get_holder		= fs_bdev_super_get,
> +	.put_holder		= fs_bdev_super_put,
>  };
>  EXPORT_SYMBOL_GPL(fs_holder_ops);
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index f9b87c39cab0..d919e8bcb2c1 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1505,6 +1505,9 @@ struct blk_holder_ops {
>  	 * Thaw the file system mounted on the block device.
>  	 */
>  	int (*thaw)(struct block_device *bdev);
> +
> +	void (*get_holder)(void *holder);
> +	void (*put_holder)(void *holder);
>  };
>  
>  /*
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 01/34] bdev: open block device as files
  2024-03-14 14:47       ` Christian Brauner
  2024-03-14 16:45         ` Christian Brauner
@ 2024-03-14 16:58         ` Jan Kara
  2024-03-15 13:23           ` [PATCH] fs,block: get holder during claim Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-03-14 16:58 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christoph Hellwig, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block

On Thu 14-03-24 15:47:52, Christian Brauner wrote:
> On Thu, Mar 14, 2024 at 12:10:59PM +0100, Christian Brauner wrote:
> > On Tue, Mar 12, 2024 at 07:32:35PM -0700, Christoph Hellwig wrote:
> > > Now that this is in mainline it seems to cause blktests to crash
> > > nbd/003 with a rather non-obvious oops for me:
> > 
> > Ok, will be looking into that next.
> 
> Ok, I know what's going on. Basically, fput() on the block device runs
> asynchronously which means that bdev->bd_holder can still be set to @sb
> after it has already been freed. Let me illustrate what I mean:
> 
> P1                                                 P2
> mount(sb)                                          fd = open("/dev/nbd", ...)
> -> file = bdev_file_open_by_dev(..., sb, ...)
>    bdev->bd_holder = sb;
> 
> // Later on:
> 
> umount(sb)
> ->kill_block_super(sb)
> |----> [fput() -> deferred via task work]
> -> put_super(sb) -> frees the sb via rcu
> |
> |                                                 nbd_ioctl(NBD_CLEAR_SOCK)
> |                                                 -> disk_force_media_change()
> |                                                    -> bdev_mark_dead()
> |                                                       -> fs_mark_dead()
> |                                                          // Finds bdev->bd_holder == sb
> |-> file->release::bdev_release()                          // Tries to get reference to it but it's freed; frees it again
>    bdev->bd_holder = NULL;
> 
> Two solutions that come to my mind:
> 
> (1) Indicate to fput() that this is an internal block device open and
>     thus just close it synchronously. This is fine. First, because the block
>     device superblock is never unmounted or anything so there's no risk
>     from hanging there for any reason. Second, bdev_release() also ran
>     synchronously so any issue that we'd see from calling
>     file->f_op->release::bdev_release() we would have seen from
>     bdev_release() itself. See [1.1] for a patch.
> 
> (2) Take a temporary reference to the holder when opening the block
>     device. This is also fine afaict because we differentiate between
>     passive and active references on superblocks and what we already do
>     in fs_bdev_mark_dead() and friends. This means we don't have to mess
>     around with fput(). See [1.2] for a patch.
> 
> (3) Revert and cry. No patch.
> 
> Personally, I think (2) is more desirable because we don't lose the
> async property of fput() and we don't need to have a new FMODE_* flag.
> I'd like to do some more testing with this. Thoughts?

Yeah, 2) looks like the least painful solution to me as well.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* [PATCH] fs,block: get holder during claim
  2024-03-14 16:58         ` Jan Kara
@ 2024-03-15 13:23           ` Christian Brauner
  2024-03-15 14:28             ` Jan Kara
                               ` (2 more replies)
  0 siblings, 3 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-15 13:23 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig, Jens Axboe
  Cc: Christian Brauner, Darrick J. Wong, linux-fsdevel, linux-block

Now that we open block devices as files we need to deal with the
realities that closing is a deferred operation. An operation on the
block device such as e.g., freeze, thaw, or removal that runs
concurrently with umount, tries to acquire a stable reference on the
holder. The holder might already be gone though. Make that reliable by
grabbing a passive reference to the holder during bdev_open() and
releasing it during bdev_release().

Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only
Reported-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/ZfEQQ9jZZVes0WCZ@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Hey all,

I ran blktests with nbd enabled which contains a reliable repro for the
issue. Thanks to Christoph for pointing in that direction. The
underlying issue is not reproducible anymore with this patch applied.
xfstests and blktests pass.
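
To illustrate the pairing, here is a minimal userspace sketch, not kernel
code. All model_* names are made up for this example, `passive` stands in
for sb->s_count, and freeing is reduced to setting a flag. It only shows why
taking the reference at claim time keeps the holder valid until the deferred
release has run:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct model_holder_ops {
	void (*get_holder)(void *holder);
	void (*put_holder)(void *holder);
};

struct model_sb {
	int passive;	/* models sb->s_count */
	int freed;	/* set once the last passive reference is dropped */
};

struct model_bdev {
	void *holder;
	const struct model_holder_ops *hops;
};

static void model_sb_get(void *data)
{
	struct model_sb *sb = data;

	sb->passive++;
}

static void model_sb_put(void *data)
{
	struct model_sb *sb = data;

	if (--sb->passive == 0)
		sb->freed = 1;
}

static const struct model_holder_ops model_fs_ops = {
	.get_holder	= model_sb_get,
	.put_holder	= model_sb_put,
};

/* Claim the device and pin the holder for as long as the claim exists. */
static void model_claim(struct model_bdev *bdev, void *holder,
			const struct model_holder_ops *hops)
{
	bdev->holder = holder;
	bdev->hops = hops;
	if (hops && hops->get_holder)
		hops->get_holder(holder);
}

/* End the claim; sample the ops before clearing them, then unpin. */
static void model_end_claim(struct model_bdev *bdev)
{
	const struct model_holder_ops *hops = bdev->hops;
	void *holder = bdev->holder;

	bdev->holder = NULL;
	bdev->hops = NULL;
	if (hops && hops->put_holder)
		hops->put_holder(holder);
}
```

In this model, the umount side can drop its own reference first and the
holder still stays alive; only model_end_claim(), standing in for the
deferred release, drops the last one.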

Thanks!
Christian
---
 block/bdev.c           |  7 +++++++
 fs/super.c             | 18 ++++++++++++++++++
 include/linux/blkdev.h | 10 ++++++++++
 3 files changed, 35 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index e7adaaf1c219..7a5f611c3d2e 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -583,6 +583,9 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
 	mutex_unlock(&bdev->bd_holder_lock);
 	bd_clear_claiming(whole, holder);
 	mutex_unlock(&bdev_lock);
+
+	if (hops && hops->get_holder)
+		hops->get_holder(holder);
 }
 
 /**
@@ -605,6 +608,7 @@ EXPORT_SYMBOL(bd_abort_claiming);
 static void bd_end_claim(struct block_device *bdev, void *holder)
 {
 	struct block_device *whole = bdev_whole(bdev);
+	const struct blk_holder_ops *hops = bdev->bd_holder_ops;
 	bool unblock = false;
 
 	/*
@@ -627,6 +631,9 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 		whole->bd_holder = NULL;
 	mutex_unlock(&bdev_lock);
 
+	if (hops && hops->put_holder)
+		hops->put_holder(holder);
+
 	/*
 	 * If this was the last claim, remove holder link and unblock evpoll if
 	 * it was a write holder.
diff --git a/fs/super.c b/fs/super.c
index ee05ab6b37e7..71d9779c42b1 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1515,11 +1515,29 @@ static int fs_bdev_thaw(struct block_device *bdev)
 	return error;
 }
 
+static void fs_bdev_super_get(void *data)
+{
+	struct super_block *sb = data;
+
+	spin_lock(&sb_lock);
+	sb->s_count++;
+	spin_unlock(&sb_lock);
+}
+
+static void fs_bdev_super_put(void *data)
+{
+	struct super_block *sb = data;
+
+	put_super(sb);
+}
+
 const struct blk_holder_ops fs_holder_ops = {
 	.mark_dead		= fs_bdev_mark_dead,
 	.sync			= fs_bdev_sync,
 	.freeze			= fs_bdev_freeze,
 	.thaw			= fs_bdev_thaw,
+	.get_holder		= fs_bdev_super_get,
+	.put_holder		= fs_bdev_super_put,
 };
 EXPORT_SYMBOL_GPL(fs_holder_ops);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f9b87c39cab0..c3e8f7cf96be 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1505,6 +1505,16 @@ struct blk_holder_ops {
 	 * Thaw the file system mounted on the block device.
 	 */
 	int (*thaw)(struct block_device *bdev);
+
+	/*
+	 * If needed, get a reference to the holder.
+	 */
+	void (*get_holder)(void *holder);
+
+	/*
+	 * Release the holder.
+	 */
+	void (*put_holder)(void *holder);
 };
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH] fs,block: get holder during claim
  2024-03-15 13:23           ` [PATCH] fs,block: get holder during claim Christian Brauner
@ 2024-03-15 14:28             ` Jan Kara
  2024-03-19 16:24               ` remove holder ops Christian Brauner
  2024-03-17 20:53             ` [PATCH] fs,block: get holder during claim Christoph Hellwig
  2024-03-18  9:10             ` Yi Zhang
  2 siblings, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-03-15 14:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Fri 15-03-24 14:23:07, Christian Brauner wrote:
> Now that we open block devices as files we need to deal with the
> realities that closing is a deferred operation. An operation on the
> block device such as e.g., freeze, thaw, or removal that runs
> concurrently with umount, tries to acquire a stable reference on the
> holder. The holder might already be gone though. Make that reliable by
> grabbing a passive reference to the holder during bdev_open() and
> releasing it during bdev_release().
> 
> Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Link: https://lore.kernel.org/r/ZfEQQ9jZZVes0WCZ@infradead.org
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> Hey all,
> 
> I ran blktests with nbd enabled which contains a reliable repro for the
> issue. Thanks to Christoph for pointing in that direction. The
> underlying issue is not reproducible anymore with this patch applied.
> xfstests and blktests pass.

Thanks for the fix! It looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza


-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH] fs,block: get holder during claim
  2024-03-15 13:23           ` [PATCH] fs,block: get holder during claim Christian Brauner
  2024-03-15 14:28             ` Jan Kara
@ 2024-03-17 20:53             ` Christoph Hellwig
  2024-03-18  8:33               ` Christian Brauner
  2024-03-18  9:10             ` Yi Zhang
  2 siblings, 1 reply; 146+ messages in thread
From: Christoph Hellwig @ 2024-03-17 20:53 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Fri, Mar 15, 2024 at 02:23:07PM +0100, Christian Brauner wrote:
> Now that we open block devices as files we need to deal with the
> realities that closing is a deferred operation. An operation on the
> block device such as e.g., freeze, thaw, or removal that runs
> concurrently with umount, tries to acquire a stable reference on the
> holder. The holder might already be gone though. Make that reliable by
> grabbing a passive reference to the holder during bdev_open() and
> releasing it during bdev_release().

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Does bcachefs also need a fix for its holder ops?  Or does it get to
keep the pieces as it has its own NULL holder_ops and obviously doesn't
care about getting any of this right?

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH] fs,block: get holder during claim
  2024-03-17 20:53             ` [PATCH] fs,block: get holder during claim Christoph Hellwig
@ 2024-03-18  8:33               ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-18  8:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

On Sun, Mar 17, 2024 at 01:53:51PM -0700, Christoph Hellwig wrote:
> On Fri, Mar 15, 2024 at 02:23:07PM +0100, Christian Brauner wrote:
> > Now that we open block devices as files we need to deal with the
> > realities that closing is a deferred operation. An operation on the
> > block device such as e.g., freeze, thaw, or removal that runs
> > concurrently with umount, tries to acquire a stable reference on the
> > holder. The holder might already be gone though. Make that reliable by
> > grabbing a passive reference to the holder during bdev_open() and
> > releasing it during bdev_release().
> 
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Does bcachefs also need a fix for its holder ops?  Or does it get to
> keep the pieces as it has its own NULL holder_ops and obviously doesn't
> care about getting any of this right?

It has empty holder ops and so behaves equivalently to having NULL
holder ops. IOW, the block layer cannot access the holder.
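
As a small userspace illustration of that equivalence (a sketch with
made-up names, not the kernel code): with the `hops && hops->get_holder`
style of guard used in the patch, an ops table whose members are all NULL
takes the same path as a NULL ops pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced ops table; the real struct has more members. */
struct model_holder_ops {
	void (*get_holder)(void *holder);
	void (*put_holder)(void *holder);
};

/* An "empty" ops table: every callback is NULL. */
static const struct model_holder_ops model_empty_ops = { NULL, NULL };

static int model_calls;

static void model_count_get(void *holder)
{
	(void)holder;
	model_calls++;
}

static const struct model_holder_ops model_counting_ops = {
	.get_holder	= model_count_get,
};

/* Returns 1 if a callback was invoked, 0 if the guard skipped it. */
static int model_try_get_holder(const struct model_holder_ops *hops,
				void *holder)
{
	if (hops && hops->get_holder) {
		hops->get_holder(holder);
		return 1;
	}
	return 0;
}
```

Both model_try_get_holder(NULL, ...) and
model_try_get_holder(&model_empty_ops, ...) return 0 here, so the holder
is never touched in either case.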

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH] fs,block: get holder during claim
  2024-03-15 13:23           ` [PATCH] fs,block: get holder during claim Christian Brauner
  2024-03-15 14:28             ` Jan Kara
  2024-03-17 20:53             ` [PATCH] fs,block: get holder during claim Christoph Hellwig
@ 2024-03-18  9:10             ` Yi Zhang
  2 siblings, 0 replies; 146+ messages in thread
From: Yi Zhang @ 2024-03-18  9:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Fri, Mar 15, 2024 at 9:23 PM Christian Brauner <brauner@kernel.org> wrote:
>
> Now that we open block devices as files we need to deal with the
> realities that closing is a deferred operation. An operation on the
> block device such as e.g., freeze, thaw, or removal that runs
> concurrently with umount, tries to acquire a stable reference on the
> holder. The holder might already be gone though. Make that reliable by
> grabbing a passive reference to the holder during bdev_open() and
> releasing it during bdev_release().
>
> Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Link: https://lore.kernel.org/r/ZfEQQ9jZZVes0WCZ@infradead.org
> Signed-off-by: Christian Brauner <brauner@kernel.org>

Verified that the blktests nbd/003 panic is fixed by this patch,
feel free to add:

Tested-by: Yi Zhang <yi.zhang@redhat.com>






--
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: remove holder ops
  2024-03-15 14:28             ` Jan Kara
@ 2024-03-19 16:24               ` Christian Brauner
  2024-03-19 17:03                 ` Matthew Wilcox
  2024-03-19 23:13                 ` Christoph Hellwig
  0 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-19 16:24 UTC (permalink / raw)
  To: Jan Kara, Christoph Hellwig
  Cc: Jens Axboe, Darrick J. Wong, linux-fsdevel, linux-block

[-- Attachment #1: Type: text/plain, Size: 996 bytes --]

> > I ran blktests with nbd enabled which contains a reliable repro for the
> > issue. Thanks to Christoph for pointing in that direction. The
> > underlying issue is not reproducible anymore with this patch applied.
> > xfstests and blktests pass.
> 
> Thanks for the fix! It looks good to me. Feel free to add:

So Linus complained about the fact that we have holder ops when really
they currently aren't needed by anything apart from filesystems. And I think
that he's right and we should consider removing the holder ops and just
calling helpers directly. If there are users outside of filesystems we can
always reintroduce them. So I went ahead and drafted a patch series. I
think it ends up simplifying things, it is easier to follow, and
we can handle the lifetime of the superblock more cleanly without relying on
callbacks. Appending the patch series here and pushed to
vfs.bdev.holder. Want to gauge your thoughts before sending it out.

https://gitlab.com/brauner/linux/-/commits/vfs.bdev.holder
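
The general shape of the first step, reduced to a userspace sketch with
made-up model_* names (it does not reproduce the real signatures): a
generic open that takes an untyped holder plus an ops table, and a thin
typed wrapper for filesystems that fixes the ops so callers only pass
their superblock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; illustration only. */
struct model_ops {
	const char *name;
};

struct model_sb {
	int id;
};

struct model_file {
	void *holder;
	const struct model_ops *ops;
};

static const struct model_ops model_fs_ops = { "fs" };

/* Generic open: any holder, any (possibly NULL) ops table. */
static struct model_file model_open(void *holder, const struct model_ops *ops)
{
	struct model_file f = { holder, ops };

	return f;
}

/* Filesystem wrapper: typed holder, ops chosen by the helper itself. */
static struct model_file model_fsopen(struct model_sb *sb)
{
	return model_open(sb, &model_fs_ops);
}
```

Once every filesystem caller goes through the wrapper, the ops table no
longer has to be visible to those callers, which is what allows the
export to be dropped in the patch below.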

[-- Attachment #2: 0001-RFC-block-add-bdev_fsopen.patch --]
[-- Type: text/x-diff, Size: 4006 bytes --]

From 5a2a2d554a1ba069469ea3e58473a734a60b1f4d Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 19 Mar 2024 10:40:56 +0100
Subject: [PATCH 1/5] [RFC] block: add bdev_fsopen*()

Use a dedicated helper instead of open-coding bdev_file_open_by_*()
everywhere with NULL arguments apart from a few places. This is in
preparation of removing holder ops.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c           | 12 ++++++++++++
 fs/ext4/super.c        |  4 +---
 fs/super.c             |  3 +--
 fs/xfs/xfs_super.c     |  4 ++--
 include/linux/blkdev.h |  3 +++
 5 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 7a5f611c3d2e..48bf8ca8b161 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1012,6 +1012,18 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 }
 EXPORT_SYMBOL(bdev_file_open_by_path);
 
+struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb)
+{
+	return bdev_file_open_by_dev(dev, mode, sb, &fs_holder_ops);
+}
+EXPORT_SYMBOL(bdev_fsopen);
+
+struct file *bdev_fsopen_path(const char *path, blk_mode_t mode, struct super_block *sb)
+{
+	return bdev_file_open_by_path(path, mode, sb, &fs_holder_ops);
+}
+EXPORT_SYMBOL(bdev_fsopen_path);
+
 void bdev_release(struct file *bdev_file)
 {
 	struct block_device *bdev = file_bdev(bdev_file);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index cfb8449c731f..45bf4278a20a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5851,9 +5851,7 @@ static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 	struct ext4_super_block *es;
 	int errno;
 
-	bdev_file = bdev_file_open_by_dev(j_dev,
-		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
-		sb, &fs_holder_ops);
+	bdev_file = bdev_fsopen(j_dev, BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES, sb);
 	if (IS_ERR(bdev_file)) {
 		ext4_msg(sb, KERN_ERR,
 			 "failed to open journal device unknown-block(%u,%u) %ld",
diff --git a/fs/super.c b/fs/super.c
index 71d9779c42b1..71bfa4ffaa7d 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1539,7 +1539,6 @@ const struct blk_holder_ops fs_holder_ops = {
 	.get_holder		= fs_bdev_super_get,
 	.put_holder		= fs_bdev_super_put,
 };
-EXPORT_SYMBOL_GPL(fs_holder_ops);
 
 int setup_bdev_super(struct super_block *sb, int sb_flags,
 		struct fs_context *fc)
@@ -1548,7 +1547,7 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 	struct file *bdev_file;
 	struct block_device *bdev;
 
-	bdev_file = bdev_file_open_by_dev(sb->s_dev, mode, sb, &fs_holder_ops);
+	bdev_file = bdev_fsopen(sb->s_dev, mode, sb);
 	if (IS_ERR(bdev_file)) {
 		if (fc)
 			errorf(fc, "%s: Can't open blockdev", fc->source);
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c21f10ab0f5d..4c0d208d1d06 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -366,9 +366,9 @@ xfs_blkdev_get(
 {
 	int			error = 0;
 
-	*bdev_filep = bdev_file_open_by_path(name,
+	*bdev_filep = bdev_fsopen_path(name,
 		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
-		mp->m_super, &fs_holder_ops);
+		mp->m_super);
 	if (IS_ERR(*bdev_filep)) {
 		error = PTR_ERR(*bdev_filep);
 		*bdev_filep = NULL;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c3e8f7cf96be..5aa2117812d1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1536,6 +1536,9 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 		void *holder, const struct blk_holder_ops *hops);
+struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb);
+struct file *bdev_fsopen_path(const char *path, blk_mode_t mode,
+			      struct super_block *sb);
 int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 		const struct blk_holder_ops *hops);
 void bd_abort_claiming(struct block_device *bdev, void *holder);
-- 
2.43.0


[-- Attachment #3: 0002-RFC-fs-block-remove-holder_ops-argument-_by_path.patch --]
[-- Type: text/x-diff, Size: 13970 bytes --]

From e21591ed892657ab2b601f398b3273f93be035ce Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 19 Mar 2024 10:59:45 +0100
Subject: [PATCH 2/5] [RFC] fs,block: remove holder_ops argument *_by_path()

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c                        | 53 ++++++++++++++++++++++++-----
 drivers/block/drbd/drbd_nl.c        |  2 +-
 drivers/block/rnbd/rnbd-srv.c       |  2 +-
 drivers/md/bcache/super.c           |  2 +-
 drivers/mtd/devices/block2mtd.c     |  2 +-
 drivers/nvme/target/io-cmd-bdev.c   |  2 +-
 drivers/target/target_core_iblock.c |  3 +-
 drivers/target/target_core_pscsi.c  |  2 +-
 fs/bcachefs/super-io.c              |  7 ++--
 fs/btrfs/dev-replace.c              |  2 +-
 fs/btrfs/volumes.c                  |  6 ++--
 fs/erofs/super.c                    |  2 +-
 fs/f2fs/super.c                     |  2 +-
 fs/nfs/blocklayout/dev.c            |  2 +-
 fs/reiserfs/journal.c               |  2 +-
 include/linux/blkdev.h              |  8 ++++-
 16 files changed, 68 insertions(+), 31 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 48bf8ca8b161..230559fe098c 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -952,8 +952,7 @@ static unsigned blk_to_file_flags(blk_mode_t mode)
 	return flags;
 }
 
-struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-				   const struct blk_holder_ops *hops)
+static struct file *bdev_file_open(dev_t dev, blk_mode_t mode, void *holder)
 {
 	struct file *bdev_file;
 	struct block_device *bdev;
@@ -977,7 +976,10 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 	}
 	ihold(bdev->bd_inode);
 
-	ret = bdev_open(bdev, mode, holder, hops, bdev_file);
+	if (mode & BLK_OPEN_MOUNTED)
+		ret = bdev_open(bdev, mode, holder, &fs_holder_ops, bdev_file);
+	else
+		ret = bdev_open(bdev, mode, holder, NULL, bdev_file);
 	if (ret) {
 		/* We failed to open the block device. Let ->release() know. */
 		bdev_file->private_data = ERR_PTR(ret);
@@ -986,11 +988,19 @@ struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 	}
 	return bdev_file;
 }
+
+struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
+		const struct blk_holder_ops *hops)
+{
+	if (mode & ~BLK_OPEN_VALID_FLAGS)
+		return ERR_PTR(-EINVAL);
+
+	return bdev_file_open(dev, mode, holder);
+}
 EXPORT_SYMBOL(bdev_file_open_by_dev);
 
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
-				    void *holder,
-				    const struct blk_holder_ops *hops)
+				    void *holder)
 {
 	struct file *file;
 	dev_t dev;
@@ -1000,7 +1010,7 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 	if (error)
 		return ERR_PTR(error);
 
-	file = bdev_file_open_by_dev(dev, mode, holder, hops);
+	file = bdev_file_open_by_dev(dev, mode, holder, NULL);
 	if (!IS_ERR(file) && (mode & BLK_OPEN_WRITE)) {
 		if (bdev_read_only(file_bdev(file))) {
 			fput(file);
@@ -1014,13 +1024,38 @@ EXPORT_SYMBOL(bdev_file_open_by_path);
 
 struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb)
 {
-	return bdev_file_open_by_dev(dev, mode, sb, &fs_holder_ops);
+	struct file *bdev_file;
+
+	if (WARN_ON_ONCE(!sb))
+		return ERR_PTR(-EINVAL);
+
+	bdev_file = bdev_file_open(dev, mode | BLK_OPEN_MOUNTED, sb);
+	if (!IS_ERR(bdev_file))
+		return bdev_file;
+	return bdev_file;
 }
 EXPORT_SYMBOL(bdev_fsopen);
 
-struct file *bdev_fsopen_path(const char *path, blk_mode_t mode, struct super_block *sb)
+struct file *bdev_fsopen_path(const char *path, blk_mode_t mode,
+			      struct super_block *sb)
 {
-	return bdev_file_open_by_path(path, mode, sb, &fs_holder_ops);
+	struct file *file;
+	dev_t dev;
+	int error;
+
+	error = lookup_bdev(path, &dev);
+	if (error)
+		return ERR_PTR(error);
+
+	file = bdev_fsopen(dev, mode, sb);
+	if (!IS_ERR(file) && (mode & BLK_OPEN_WRITE)) {
+		if (bdev_read_only(file_bdev(file))) {
+			fput(file);
+			file = ERR_PTR(-EACCES);
+		}
+	}
+
+	return file;
 }
 EXPORT_SYMBOL(bdev_fsopen_path);
 
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 5d65c9754d83..ffd4721190fd 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1634,7 +1634,7 @@ static struct file *open_backing_dev(struct drbd_device *device,
 	int err = 0;
 
 	file = bdev_file_open_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
-				      claim_ptr, NULL);
+				      claim_ptr);
 	if (IS_ERR(file)) {
 		drbd_err(device, "open(\"%s\") failed with %ld\n",
 				bdev_path, PTR_ERR(file));
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index f6e3a3c4b76c..cecd44e02945 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -716,7 +716,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
 		goto reject;
 	}
 
-	bdev_file = bdev_file_open_by_path(full_path, open_flags, NULL, NULL);
+	bdev_file = bdev_file_open_by_path(full_path, open_flags, NULL);
 	if (IS_ERR(bdev_file)) {
 		ret = PTR_ERR(bdev_file);
 		pr_err("Opening device '%s' on session %s failed, failed to open the block device, err: %pe\n",
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 330bcd9ea4a9..f8e94ee2c012 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2550,7 +2550,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 
 	ret = -EINVAL;
 	err = "failed to open device";
-	bdev_file = bdev_file_open_by_path(strim(path), BLK_OPEN_READ, NULL, NULL);
+	bdev_file = bdev_file_open_by_path(strim(path), BLK_OPEN_READ, NULL);
 	if (IS_ERR(bdev_file))
 		goto out_free_sb;
 
diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index 97a00ec9a4d4..2390b4010207 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -275,7 +275,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
 		return NULL;
 
 	/* Get a handle on the device */
-	bdev_file = bdev_file_open_by_path(devname, mode, dev, NULL);
+	bdev_file = bdev_file_open_by_path(devname, mode, dev);
 	if (IS_ERR(bdev_file))
 		bdev_file = mdtblock_early_get_bdev(devname, mode, timeout,
 						      dev);
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 6426aac2634a..3fbc6f510952 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -86,7 +86,7 @@ int nvmet_bdev_ns_enable(struct nvmet_ns *ns)
 		return -ENOTBLK;
 
 	ns->bdev_file = bdev_file_open_by_path(ns->device_path,
-				BLK_OPEN_READ | BLK_OPEN_WRITE, NULL, NULL);
+				BLK_OPEN_READ | BLK_OPEN_WRITE, NULL);
 	if (IS_ERR(ns->bdev_file)) {
 		ret = PTR_ERR(ns->bdev_file);
 		if (ret != -ENOTBLK) {
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 7f6ca8177845..4761b602441a 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -117,8 +117,7 @@ static int iblock_configure_device(struct se_device *dev)
 	else
 		dev->dev_flags |= DF_READ_ONLY;
 
-	bdev_file = bdev_file_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev,
-					NULL);
+	bdev_file = bdev_file_open_by_path(ib_dev->ibd_udev_path, mode, ib_dev);
 	if (IS_ERR(bdev_file)) {
 		ret = PTR_ERR(bdev_file);
 		goto out_free_bioset;
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index f98ebb18666b..c198b547d609 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -367,7 +367,7 @@ static int pscsi_create_type_disk(struct se_device *dev, struct scsi_device *sd)
 	 * for TYPE_DISK and TYPE_ZBC using supplied udev_path
 	 */
 	bdev_file = bdev_file_open_by_path(dev->udev_path,
-				BLK_OPEN_WRITE | BLK_OPEN_READ, pdv, NULL);
+				BLK_OPEN_WRITE | BLK_OPEN_READ, pdv);
 	if (IS_ERR(bdev_file)) {
 		pr_err("pSCSI: bdev_open_by_path() failed\n");
 		scsi_device_put(sd);
diff --git a/fs/bcachefs/super-io.c b/fs/bcachefs/super-io.c
index bceac29f3d86..26c98e3a51c8 100644
--- a/fs/bcachefs/super-io.c
+++ b/fs/bcachefs/super-io.c
@@ -24,9 +24,6 @@
 #include <linux/backing-dev.h>
 #include <linux/sort.h>
 
-static const struct blk_holder_ops bch2_sb_handle_bdev_ops = {
-};
-
 struct bch2_metadata_version {
 	u16		version;
 	const char	*name;
@@ -712,13 +709,13 @@ static int __bch2_read_super(const char *path, struct bch_opts *opts,
 	if (!opt_get(*opts, nochanges))
 		sb->mode |= BLK_OPEN_WRITE;
 
-	sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
+	sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder);
 	if (IS_ERR(sb->s_bdev_file) &&
 	    PTR_ERR(sb->s_bdev_file) == -EACCES &&
 	    opt_get(*opts, read_only)) {
 		sb->mode &= ~BLK_OPEN_WRITE;
 
-		sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder, &bch2_sb_handle_bdev_ops);
+		sb->s_bdev_file = bdev_file_open_by_path(path, sb->mode, sb->holder);
 		if (!IS_ERR(sb->s_bdev_file))
 			opt_set(*opts, nochanges, true);
 	}
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 7696beec4c21..db1f819365ad 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -256,7 +256,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	}
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					fs_info->bdev_holder, NULL);
+					fs_info->bdev_holder);
 	if (IS_ERR(bdev_file)) {
 		btrfs_err(fs_info, "target device %s is invalid!", device_path);
 		return PTR_ERR(bdev_file);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a2d07fa3cfdf..fa16296f7629 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -472,7 +472,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	struct block_device *bdev;
 	int ret;
 
-	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, NULL);
+	*bdev_file = bdev_file_open_by_path(device_path, flags, holder);
 
 	if (IS_ERR(*bdev_file)) {
 		ret = PTR_ERR(*bdev_file);
@@ -1341,7 +1341,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
 	 * values temporarily, as the device paths of the fsid are the only
 	 * required information for assembling the volume.
 	 */
-	bdev_file = bdev_file_open_by_path(path, flags, NULL, NULL);
+	bdev_file = bdev_file_open_by_path(path, flags, NULL);
 	if (IS_ERR(bdev_file))
 		return ERR_CAST(bdev_file);
 
@@ -2597,7 +2597,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		return -EROFS;
 
 	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					fs_info->bdev_holder, NULL);
+					fs_info->bdev_holder);
 	if (IS_ERR(bdev_file))
 		return PTR_ERR(bdev_file);
 
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 69308fd73e4a..777d64ac2202 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -202,7 +202,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 		dif->fscache = fscache;
 	} else if (!sbi->devs->flatdev) {
 		bdev_file = bdev_file_open_by_path(dif->path, BLK_OPEN_READ,
-						sb->s_type, NULL);
+						sb->s_type);
 		if (IS_ERR(bdev_file))
 			return PTR_ERR(bdev_file);
 		dif->bdev_file = bdev_file;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a6867f26f141..c2e03744d9ee 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4204,7 +4204,7 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 						SEGS_TO_BLKS(sbi,
 						FDEV(i).total_segments) - 1;
 				FDEV(i).bdev_file = bdev_file_open_by_path(
-					FDEV(i).path, mode, sbi->sb, NULL);
+					FDEV(i).path, mode, sbi->sb);
 			}
 		}
 		if (IS_ERR(FDEV(i).bdev_file))
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index 93ef7f864980..6ad005745ea0 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -312,7 +312,7 @@ bl_open_path(struct pnfs_block_volume *v, const char *prefix)
 		return ERR_PTR(-ENOMEM);
 
 	bdev_file = bdev_file_open_by_path(devname, BLK_OPEN_READ | BLK_OPEN_WRITE,
-					NULL, NULL);
+					NULL);
 	if (IS_ERR(bdev_file)) {
 		pr_warn("pNFS: failed to open device %s (%ld)\n",
 			devname, PTR_ERR(bdev_file));
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 6474529c4253..adea6c49d698 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2633,7 +2633,7 @@ static int journal_init_dev(struct super_block *super,
 	}
 
 	journal->j_bdev_file = bdev_file_open_by_path(jdev_name, blkdev_mode,
-						   holder, NULL);
+						   holder);
 	if (IS_ERR(journal->j_bdev_file)) {
 		result = PTR_ERR(journal->j_bdev_file);
 		journal->j_bdev_file = NULL;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 5aa2117812d1..2bbf9ddd0874 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -128,6 +128,12 @@ typedef unsigned int __bitwise blk_mode_t;
 #define BLK_OPEN_WRITE_IOCTL	((__force blk_mode_t)(1 << 4))
 /* open is exclusive wrt all other BLK_OPEN_WRITE opens to the device */
 #define BLK_OPEN_RESTRICT_WRITES	((__force blk_mode_t)(1 << 5))
+/* open is for filesystem */
+#define BLK_OPEN_MOUNTED ((__force blk_mode_t)(1 << 6))
+
+#define BLK_OPEN_VALID_FLAGS                                                \
+	(BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_EXCL | BLK_OPEN_NDELAY | \
+	 BLK_OPEN_WRITE_IOCTL | BLK_OPEN_RESTRICT_WRITES)
 
 struct gendisk {
 	/*
@@ -1535,7 +1541,7 @@ extern const struct blk_holder_ops fs_holder_ops;
 struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
 		const struct blk_holder_ops *hops);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
-		void *holder, const struct blk_holder_ops *hops);
+		void *holder);
 struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb);
 struct file *bdev_fsopen_path(const char *path, blk_mode_t mode,
 			      struct super_block *sb);
-- 
2.43.0
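
Note for readers of the archive: every caller converted above keeps the
same shape (open, IS_ERR() check, PTR_ERR() on failure) and only loses
the trailing holder-ops argument. A minimal userspace sketch of that
pattern follows. It assumes the usual error-pointer encoding from
include/linux/err.h; struct model_file, model_open_by_path() and
model_caller() are hypothetical stand-ins and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Only negative errno values in [-MAX_ERRNO, -1] are encodable. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO bytes of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct file; not the kernel type. */
struct model_file {
	const char *path;
};

/*
 * Hypothetical stand-in for the three-argument bdev_file_open_by_path().
 * It "fails" with -ENOENT for anything that is not an absolute path.
 */
static struct model_file *model_open_by_path(const char *path,
					     unsigned int mode, void *holder)
{
	static struct model_file f;

	(void)mode;
	(void)holder;
	if (!path || path[0] != '/')
		return ERR_PTR(-ENOENT);
	f.path = path;
	return &f;
}

/* The caller pattern repeated throughout the diff. */
static long model_caller(const char *path)
{
	struct model_file *bdev_file = model_open_by_path(path, 0, NULL);

	if (IS_ERR(bdev_file))
		return PTR_ERR(bdev_file);
	return 0;
}
```

With this model, model_caller("/dev/sda") returns 0 and
model_caller("sda") returns -ENOENT, mirroring how the converted call
sites propagate the error.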


[-- Attachment #4: 0003-RFC-fs-block-remove-holder_ops-argument-from-_by_dev.patch --]
[-- Type: text/x-diff, Size: 11409 bytes --]

From 232edeaf0d2abecb0b949bf8e106a4c81c094245 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 19 Mar 2024 11:02:19 +0100
Subject: [PATCH 3/5] [RFC] fs,block: remove holder_ops argument from
 *_by_dev()

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c                       | 5 ++---
 block/genhd.c                      | 2 +-
 block/ioctl.c                      | 2 +-
 drivers/block/pktcdvd.c            | 4 ++--
 drivers/block/xen-blkback/xenbus.c | 2 +-
 drivers/block/zram/zram_drv.c      | 2 +-
 drivers/md/bcache/super.c          | 2 +-
 drivers/md/dm.c                    | 2 +-
 drivers/md/md.c                    | 2 +-
 drivers/mtd/devices/block2mtd.c    | 2 +-
 drivers/s390/block/dasd_genhd.c    | 2 +-
 fs/jfs/jfs_logmgr.c                | 2 +-
 fs/nfs/blocklayout/dev.c           | 2 +-
 fs/ocfs2/cluster/heartbeat.c       | 2 +-
 fs/reiserfs/journal.c              | 2 +-
 include/linux/blkdev.h             | 3 +--
 kernel/power/swap.c                | 4 ++--
 mm/swapfile.c                      | 2 +-
 18 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 230559fe098c..7db603bcf560 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -989,8 +989,7 @@ static struct file *bdev_file_open(dev_t dev, blk_mode_t mode, void *holder)
 	return bdev_file;
 }
 
-struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-		const struct blk_holder_ops *hops)
+struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder)
 {
 	if (mode & ~BLK_OPEN_VALID_FLAGS)
 		return ERR_PTR(-EINVAL);
@@ -1010,7 +1009,7 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 	if (error)
 		return ERR_PTR(error);
 
-	file = bdev_file_open_by_dev(dev, mode, holder, NULL);
+	file = bdev_file_open_by_dev(dev, mode, holder);
 	if (!IS_ERR(file) && (mode & BLK_OPEN_WRITE)) {
 		if (bdev_read_only(file_bdev(file))) {
 			fput(file);
diff --git a/block/genhd.c b/block/genhd.c
index bb29a68e1d67..513ad4318fe3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -367,7 +367,7 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 
 	set_bit(GD_NEED_PART_SCAN, &disk->state);
 	file = bdev_file_open_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL,
-				     NULL, NULL);
+				     NULL);
 	if (IS_ERR(file))
 		ret = PTR_ERR(file);
 	else
diff --git a/block/ioctl.c b/block/ioctl.c
index 0c76137adcaa..044980c7953b 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -488,7 +488,7 @@ static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
 	if (mode & BLK_OPEN_EXCL)
 		return set_blocksize(bdev, n);
 
-	file = bdev_file_open_by_dev(bdev->bd_dev, mode, &bdev, NULL);
+	file = bdev_file_open_by_dev(bdev->bd_dev, mode, &bdev);
 	if (IS_ERR(file))
 		return -EBUSY;
 	ret = set_blocksize(bdev, n);
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 21728e9ea5c3..4804a299ee59 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2176,7 +2176,7 @@ static int pkt_open_dev(struct pktcdvd_device *pd, bool write)
 	 * so open should not fail.
 	 */
 	bdev_file = bdev_file_open_by_dev(file_bdev(pd->bdev_file)->bd_dev,
-				       BLK_OPEN_READ, pd, NULL);
+				       BLK_OPEN_READ, pd);
 	if (IS_ERR(bdev_file)) {
 		ret = PTR_ERR(bdev_file);
 		goto out;
@@ -2512,7 +2512,7 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 	}
 
 	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_NDELAY,
-				       NULL, NULL);
+				       NULL);
 	if (IS_ERR(bdev_file))
 		return PTR_ERR(bdev_file);
 	sdev = scsi_device_from_queue(file_bdev(bdev_file)->bd_disk->queue);
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 0621878940ae..0127790baa97 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -492,7 +492,7 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
 	vbd->pdevice  = MKDEV(major, minor);
 
 	bdev_file = bdev_file_open_by_dev(vbd->pdevice, vbd->readonly ?
-				 BLK_OPEN_READ : BLK_OPEN_WRITE, NULL, NULL);
+				 BLK_OPEN_READ : BLK_OPEN_WRITE, NULL);
 
 	if (IS_ERR(bdev_file)) {
 		pr_warn("xen_vbd_create: device %08x could not be opened\n",
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index f0639df6cd18..16cf32b448e5 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -514,7 +514,7 @@ static ssize_t backing_dev_store(struct device *dev,
 	}
 
 	bdev_file = bdev_file_open_by_dev(inode->i_rdev,
-				BLK_OPEN_READ | BLK_OPEN_WRITE, zram, NULL);
+				BLK_OPEN_READ | BLK_OPEN_WRITE, zram);
 	if (IS_ERR(bdev_file)) {
 		err = PTR_ERR(bdev_file);
 		bdev_file = NULL;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index f8e94ee2c012..d0248cefb064 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -2571,7 +2571,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
 
 	/* Now reopen in exclusive mode with proper holder */
 	bdev_file2 = bdev_file_open_by_dev(file_bdev(bdev_file)->bd_dev,
-			BLK_OPEN_READ | BLK_OPEN_WRITE, holder, NULL);
+			BLK_OPEN_READ | BLK_OPEN_WRITE, holder);
 	fput(bdev_file);
 	bdev_file = bdev_file2;
 	if (IS_ERR(bdev_file)) {
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 56aa2a8b9d71..4a7231ba23cd 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -736,7 +736,7 @@ static struct table_device *open_table_device(struct mapped_device *md,
 		return ERR_PTR(-ENOMEM);
 	refcount_set(&td->count, 1);
 
-	bdev_file = bdev_file_open_by_dev(dev, mode, _dm_claim_ptr, NULL);
+	bdev_file = bdev_file_open_by_dev(dev, mode, _dm_claim_ptr);
 	if (IS_ERR(bdev_file)) {
 		r = PTR_ERR(bdev_file);
 		goto out_free_td;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e575e74aabf5..a15da59e7cac 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3780,7 +3780,7 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
 
 	rdev->bdev_file = bdev_file_open_by_dev(newdev,
 			BLK_OPEN_READ | BLK_OPEN_WRITE,
-			super_format == -2 ? &claim_rdev : rdev, NULL);
+			super_format == -2 ? &claim_rdev : rdev);
 	if (IS_ERR(rdev->bdev_file)) {
 		pr_warn("md: could not open device unknown-block(%u,%u).\n",
 			MAJOR(newdev), MINOR(newdev));
diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index 2390b4010207..851729a99eb2 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -249,7 +249,7 @@ static struct file __ref *mdtblock_early_get_bdev(const char *devname,
 		wait_for_device_probe();
 
 		if (!early_lookup_bdev(devname, &devt)) {
-			bdev_file = bdev_file_open_by_dev(devt, mode, dev, NULL);
+			bdev_file = bdev_file_open_by_dev(devt, mode, dev);
 			if (!IS_ERR(bdev_file))
 				break;
 		}
diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
index 4533dd055ca8..723a9ebf591b 100644
--- a/drivers/s390/block/dasd_genhd.c
+++ b/drivers/s390/block/dasd_genhd.c
@@ -137,7 +137,7 @@ int dasd_scan_partitions(struct dasd_block *block)
 	int rc;
 
 	bdev_file = bdev_file_open_by_dev(disk_devt(block->gdp), BLK_OPEN_READ,
-				       NULL, NULL);
+				       NULL);
 	if (IS_ERR(bdev_file)) {
 		DBF_DEV_EVENT(DBF_ERR, block->base,
 			      "scan partitions error, blkdev_get returned %ld",
diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 73389c68e251..1ee57d6fd4d4 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -1101,7 +1101,7 @@ int lmLogOpen(struct super_block *sb)
 	 */
 
 	bdev_file = bdev_file_open_by_dev(sbi->logdev,
-			BLK_OPEN_READ | BLK_OPEN_WRITE, log, NULL);
+			BLK_OPEN_READ | BLK_OPEN_WRITE, log);
 	if (IS_ERR(bdev_file)) {
 		rc = PTR_ERR(bdev_file);
 		goto free;
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index 6ad005745ea0..036939882edc 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -244,7 +244,7 @@ bl_parse_simple(struct nfs_server *server, struct pnfs_block_dev *d,
 		return -EIO;
 
 	bdev_file = bdev_file_open_by_dev(dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
-				       NULL, NULL);
+				       NULL);
 	if (IS_ERR(bdev_file)) {
 		printk(KERN_WARNING "pNFS: failed to open device %d:%d (%ld)\n",
 			MAJOR(dev), MINOR(dev), PTR_ERR(bdev_file));
diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 1bde1281d514..e0404fad7b00 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -1796,7 +1796,7 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
 		goto out2;
 
 	reg->hr_bdev_file = bdev_file_open_by_dev(f.file->f_mapping->host->i_rdev,
-			BLK_OPEN_WRITE | BLK_OPEN_READ, NULL, NULL);
+			BLK_OPEN_WRITE | BLK_OPEN_READ, NULL);
 	if (IS_ERR(reg->hr_bdev_file)) {
 		ret = PTR_ERR(reg->hr_bdev_file);
 		reg->hr_bdev_file = NULL;
diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index adea6c49d698..ef359e2f4755 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -2617,7 +2617,7 @@ static int journal_init_dev(struct super_block *super,
 		if (jdev == super->s_dev)
 			holder = NULL;
 		journal->j_bdev_file = bdev_file_open_by_dev(jdev, blkdev_mode,
-							  holder, NULL);
+							  holder);
 		if (IS_ERR(journal->j_bdev_file)) {
 			result = PTR_ERR(journal->j_bdev_file);
 			journal->j_bdev_file = NULL;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2bbf9ddd0874..7d08afe09849 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1538,8 +1538,7 @@ extern const struct blk_holder_ops fs_holder_ops;
 	(BLK_OPEN_READ | BLK_OPEN_RESTRICT_WRITES | \
 	 (((flags) & SB_RDONLY) ? 0 : BLK_OPEN_WRITE))
 
-struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder,
-		const struct blk_holder_ops *hops);
+struct file *bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *holder);
 struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 		void *holder);
 struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb);
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 5bc04bfe2db1..0ce027c66857 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -364,7 +364,7 @@ static int swsusp_swap_check(void)
 	root_swap = res;
 
 	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
-			BLK_OPEN_WRITE, NULL, NULL);
+			BLK_OPEN_WRITE, NULL);
 	if (IS_ERR(hib_resume_bdev_file))
 		return PTR_ERR(hib_resume_bdev_file);
 
@@ -1572,7 +1572,7 @@ int swsusp_check(bool exclusive)
 	int error;
 
 	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
-				BLK_OPEN_READ, holder, NULL);
+				BLK_OPEN_READ, holder);
 	if (!IS_ERR(hib_resume_bdev_file)) {
 		set_blocksize(file_bdev(hib_resume_bdev_file), PAGE_SIZE);
 		clear_page(swsusp_header);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4919423cce76..d4f08b14a68d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2786,7 +2786,7 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
 
 	if (S_ISBLK(inode->i_mode)) {
 		p->bdev_file = bdev_file_open_by_dev(inode->i_rdev,
-				BLK_OPEN_READ | BLK_OPEN_WRITE, p, NULL);
+				BLK_OPEN_READ | BLK_OPEN_WRITE, p);
 		if (IS_ERR(p->bdev_file)) {
 			error = PTR_ERR(p->bdev_file);
 			p->bdev_file = NULL;
-- 
2.43.0
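
Note for readers of the archive: bdev_file_open_by_dev() rejects any mode
bit outside BLK_OPEN_VALID_FLAGS, and BLK_OPEN_MOUNTED is deliberately
left out of that mask, so only the bdev_fsopen*() helpers can set it. A
rough userspace sketch of that check is below. Bits 4 to 6 are taken from
the blkdev.h hunk; the values for bits 0 to 3 are an assumption since the
diff does not show them, and model_check_mode() and model_fsopen_mode()
are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

typedef unsigned int blk_mode_t;

/* Bits 0-3: assumed values, not visible in the diff. */
#define BLK_OPEN_READ			(1u << 0)
#define BLK_OPEN_WRITE			(1u << 1)
#define BLK_OPEN_EXCL			(1u << 2)
#define BLK_OPEN_NDELAY			(1u << 3)
/* Bits 4-6: as shown in the include/linux/blkdev.h hunk. */
#define BLK_OPEN_WRITE_IOCTL		(1u << 4)
#define BLK_OPEN_RESTRICT_WRITES	(1u << 5)
#define BLK_OPEN_MOUNTED		(1u << 6)

#define BLK_OPEN_VALID_FLAGS                                                \
	(BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_EXCL | BLK_OPEN_NDELAY | \
	 BLK_OPEN_WRITE_IOCTL | BLK_OPEN_RESTRICT_WRITES)

/* Model of the check at the top of bdev_file_open_by_dev(). */
static int model_check_mode(blk_mode_t mode)
{
	if (mode & ~BLK_OPEN_VALID_FLAGS)
		return -EINVAL;
	return 0;
}

/*
 * Model of what bdev_fsopen() does with the mode: it ORs in
 * BLK_OPEN_MOUNTED itself before calling the internal open helper,
 * which is why the flag must not be part of the public mask.
 */
static blk_mode_t model_fsopen_mode(blk_mode_t mode)
{
	assert((BLK_OPEN_VALID_FLAGS & BLK_OPEN_MOUNTED) == 0);
	return mode | BLK_OPEN_MOUNTED;
}
```

In this model a driver passing BLK_OPEN_READ | BLK_OPEN_WRITE is
accepted, while one that tries to smuggle in BLK_OPEN_MOUNTED gets
-EINVAL.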


[-- Attachment #5: 0004-RFC-fs-block-call-helpers-in-block-layer-directly.patch --]
[-- Type: text/x-diff, Size: 8072 bytes --]

From bac19560ff52cf7140051ca8c477b6e1ba783f23 Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 19 Mar 2024 11:27:28 +0100
Subject: [PATCH 4/5] [RFC] fs,block: call helpers in block layer directly

block: pass mode to bd_finish_claiming()

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c              | 34 +++++++++++++++++-----------------
 block/ioctl.c             |  5 +++--
 fs/internal.h             |  7 +++++++
 fs/super.c                | 12 ++++++------
 include/linux/blk_types.h |  1 +
 5 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 7db603bcf560..6aed12b12bfa 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -241,8 +241,8 @@ int bdev_freeze(struct block_device *bdev)
 	}
 
 	mutex_lock(&bdev->bd_holder_lock);
-	if (bdev->bd_holder_ops && bdev->bd_holder_ops->freeze) {
-		error = bdev->bd_holder_ops->freeze(bdev);
+	if (bdev->bd_mounted) {
+		error = fs_bdev_freeze(bdev);
 		lockdep_assert_not_held(&bdev->bd_holder_lock);
 	} else {
 		mutex_unlock(&bdev->bd_holder_lock);
@@ -284,8 +284,8 @@ int bdev_thaw(struct block_device *bdev)
 		goto out;
 
 	mutex_lock(&bdev->bd_holder_lock);
-	if (bdev->bd_holder_ops && bdev->bd_holder_ops->thaw) {
-		error = bdev->bd_holder_ops->thaw(bdev);
+	if (bdev->bd_mounted) {
+		error = fs_bdev_thaw(bdev);
 		lockdep_assert_not_held(&bdev->bd_holder_lock);
 	} else {
 		mutex_unlock(&bdev->bd_holder_lock);
@@ -564,7 +564,7 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
  * open by the holder and wake up all waiters for exclusive open to finish.
  */
 static void bd_finish_claiming(struct block_device *bdev, void *holder,
-		const struct blk_holder_ops *hops)
+		const struct blk_holder_ops *hops, blk_mode_t mode)
 {
 	struct block_device *whole = bdev_whole(bdev);
 
@@ -579,13 +579,12 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
 	bdev->bd_holders++;
 	mutex_lock(&bdev->bd_holder_lock);
 	bdev->bd_holder = holder;
+	if (mode & BLK_OPEN_MOUNTED)
+		bdev->bd_mounted = true;
 	bdev->bd_holder_ops = hops;
 	mutex_unlock(&bdev->bd_holder_lock);
 	bd_clear_claiming(whole, holder);
 	mutex_unlock(&bdev_lock);
-
-	if (hops && hops->get_holder)
-		hops->get_holder(holder);
 }
 
 /**
@@ -608,8 +607,7 @@ EXPORT_SYMBOL(bd_abort_claiming);
 static void bd_end_claim(struct block_device *bdev, void *holder)
 {
 	struct block_device *whole = bdev_whole(bdev);
-	const struct blk_holder_ops *hops = bdev->bd_holder_ops;
-	bool unblock = false;
+	bool unblock = false, mounted = false;
 
 	/*
 	 * Release a claim on the device.  The holder fields are protected with
@@ -619,10 +617,12 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 	WARN_ON_ONCE(bdev->bd_holder != holder);
 	WARN_ON_ONCE(--bdev->bd_holders < 0);
 	WARN_ON_ONCE(--whole->bd_holders < 0);
+	mounted = bdev->bd_mounted;
 	if (!bdev->bd_holders) {
 		mutex_lock(&bdev->bd_holder_lock);
 		bdev->bd_holder = NULL;
 		bdev->bd_holder_ops = NULL;
+		bdev->bd_mounted = false;
 		mutex_unlock(&bdev->bd_holder_lock);
 		if (bdev->bd_write_holder)
 			unblock = true;
@@ -631,8 +631,8 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 		whole->bd_holder = NULL;
 	mutex_unlock(&bdev_lock);
 
-	if (hops && hops->put_holder)
-		hops->put_holder(holder);
+	if (mounted)
+		fs_bdev_super_put(holder);
 
 	/*
 	 * If this was the last claim, remove holder link and unblock evpoll if
@@ -883,7 +883,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 		goto put_module;
 	bdev_claim_write_access(bdev, mode);
 	if (holder) {
-		bd_finish_claiming(bdev, holder, hops);
+		bd_finish_claiming(bdev, holder, hops, mode);
 
 		/*
 		 * Block event polling for write claims if requested.  Any write
@@ -1030,7 +1030,7 @@ struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb)
 
 	bdev_file = bdev_file_open(dev, mode | BLK_OPEN_MOUNTED, sb);
 	if (!IS_ERR(bdev_file))
-		return bdev_file;
+		fs_bdev_super_get(sb);
 	return bdev_file;
 }
 EXPORT_SYMBOL(bdev_fsopen);
@@ -1158,9 +1158,9 @@ EXPORT_SYMBOL(lookup_bdev);
 void bdev_mark_dead(struct block_device *bdev, bool surprise)
 {
 	mutex_lock(&bdev->bd_holder_lock);
-	if (bdev->bd_holder_ops && bdev->bd_holder_ops->mark_dead)
-		bdev->bd_holder_ops->mark_dead(bdev, surprise);
-	else {
+	if (bdev->bd_mounted) {
+		fs_bdev_mark_dead(bdev, surprise);
+	} else {
 		mutex_unlock(&bdev->bd_holder_lock);
 		sync_blockdev(bdev);
 	}
diff --git a/block/ioctl.c b/block/ioctl.c
index 044980c7953b..4f06b4566b35 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -11,6 +11,7 @@
 #include <linux/blktrace_api.h>
 #include <linux/pr.h>
 #include <linux/uaccess.h>
+#include "../fs/internal.h"
 #include "blk.h"
 
 static int blkpg_do_ioctl(struct block_device *bdev,
@@ -376,8 +377,8 @@ static int blkdev_flushbuf(struct block_device *bdev, unsigned cmd,
 		return -EACCES;
 
 	mutex_lock(&bdev->bd_holder_lock);
-	if (bdev->bd_holder_ops && bdev->bd_holder_ops->sync)
-		bdev->bd_holder_ops->sync(bdev);
+	if (bdev->bd_mounted)
+		fs_bdev_sync(bdev);
 	else {
 		mutex_unlock(&bdev->bd_holder_lock);
 		sync_blockdev(bdev);
diff --git a/fs/internal.h b/fs/internal.h
index 7ca738904e34..1d5875183b46 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -318,3 +318,10 @@ struct stashed_operations {
 int path_from_stashed(struct dentry **stashed, struct vfsmount *mnt, void *data,
 		      struct path *path);
 void stashed_dentry_prune(struct dentry *dentry);
+
+int fs_bdev_freeze(struct block_device *bdev);
+int fs_bdev_thaw(struct block_device *bdev);
+void fs_bdev_mark_dead(struct block_device *bdev, bool surprise);
+void fs_bdev_sync(struct block_device *bdev);
+void fs_bdev_super_get(void *);
+void fs_bdev_super_put(void *);
diff --git a/fs/super.c b/fs/super.c
index 71bfa4ffaa7d..383b945b363d 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1394,7 +1394,7 @@ static struct super_block *bdev_super_lock(struct block_device *bdev, bool excl)
 	return sb;
 }
 
-static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
+void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
 {
 	struct super_block *sb;
 
@@ -1412,7 +1412,7 @@ static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
 	super_unlock_shared(sb);
 }
 
-static void fs_bdev_sync(struct block_device *bdev)
+void fs_bdev_sync(struct block_device *bdev)
 {
 	struct super_block *sb;
 
@@ -1454,7 +1454,7 @@ static struct super_block *get_bdev_super(struct block_device *bdev)
  * Return: If the freeze was successful zero is returned. If the freeze
  *         failed a negative error code is returned.
  */
-static int fs_bdev_freeze(struct block_device *bdev)
+int fs_bdev_freeze(struct block_device *bdev)
 {
 	struct super_block *sb;
 	int error = 0;
@@ -1494,7 +1494,7 @@ static int fs_bdev_freeze(struct block_device *bdev)
  *         as it may have been frozen multiple times (kernel may hold a
  *         freeze or might be frozen from other block devices).
  */
-static int fs_bdev_thaw(struct block_device *bdev)
+int fs_bdev_thaw(struct block_device *bdev)
 {
 	struct super_block *sb;
 	int error;
@@ -1515,7 +1515,7 @@ static int fs_bdev_thaw(struct block_device *bdev)
 	return error;
 }
 
-static void fs_bdev_super_get(void *data)
+void fs_bdev_super_get(void *data)
 {
 	struct super_block *sb = data;
 
@@ -1524,7 +1524,7 @@ static void fs_bdev_super_get(void *data)
 	spin_unlock(&sb_lock);
 }
 
-static void fs_bdev_super_put(void *data)
+void fs_bdev_super_put(void *data)
 {
 	struct super_block *sb = data;
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index cb1526ec44b5..afcc58d04ce7 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -57,6 +57,7 @@ struct block_device {
 	void *			bd_claiming;
 	void *			bd_holder;
 	const struct blk_holder_ops *bd_holder_ops;
+	bool			bd_mounted;
 	struct mutex		bd_holder_lock;
 	int			bd_holders;
 	struct kobject		*bd_holder_dir;
-- 
2.43.0


[-- Attachment #6: 0005-RFC-fs-block-remove-holder-ops.patch --]
[-- Type: text/x-diff, Size: 13053 bytes --]

From 463ee9ed2668963f2edd2b4d7dcc4669c76c4e4b Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Tue, 19 Mar 2024 11:32:48 +0100
Subject: [PATCH 5/5] [RFC] fs,block: remove holder ops

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 block/bdev.c              | 46 ++++++++++++---------------------------
 block/blk.h               |  2 +-
 block/fops.c              |  2 +-
 block/genhd.c             |  3 +--
 block/partitions/core.c   |  4 ++--
 drivers/block/loop.c      |  2 +-
 fs/internal.h             |  4 ++--
 fs/super.c                | 17 ++-------------
 include/linux/blk_types.h |  1 -
 include/linux/blkdev.h    | 39 +--------------------------------
 10 files changed, 25 insertions(+), 95 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 6aed12b12bfa..83705c80a9f7 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -111,7 +111,7 @@ int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
 	 * under live filesystem.
 	 */
 	if (!(mode & BLK_OPEN_EXCL)) {
-		int err = bd_prepare_to_claim(bdev, truncate_bdev_range, NULL);
+		int err = bd_prepare_to_claim(bdev, truncate_bdev_range);
 		if (err)
 			goto invalidate;
 	}
@@ -462,31 +462,21 @@ long nr_blockdev_pages(void)
  * bd_may_claim - test whether a block device can be claimed
  * @bdev: block device of interest
  * @holder: holder trying to claim @bdev
- * @hops: holder ops
  *
  * Test whether @bdev can be claimed by @holder.
  *
  * RETURNS:
  * %true if @bdev can be claimed, %false otherwise.
  */
-static bool bd_may_claim(struct block_device *bdev, void *holder,
-		const struct blk_holder_ops *hops)
+static bool bd_may_claim(struct block_device *bdev, void *holder)
 {
 	struct block_device *whole = bdev_whole(bdev);
 
 	lockdep_assert_held(&bdev_lock);
 
-	if (bdev->bd_holder) {
-		/*
-		 * The same holder can always re-claim.
-		 */
-		if (bdev->bd_holder == holder) {
-			if (WARN_ON_ONCE(bdev->bd_holder_ops != hops))
-				return false;
-			return true;
-		}
-		return false;
-	}
+	/* The same holder can always re-claim. */
+	if (bdev->bd_holder)
+		return bdev->bd_holder == holder;
 
 	/*
 	 * If the whole devices holder is set to bd_may_claim, a partition on
@@ -502,7 +492,6 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
  * bd_prepare_to_claim - claim a block device
  * @bdev: block device of interest
  * @holder: holder trying to claim @bdev
- * @hops: holder ops.
  *
  * Claim @bdev.  This function fails if @bdev is already claimed by another
  * holder and waits if another claiming is in progress. return, the caller
@@ -511,8 +500,7 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
  * RETURNS:
  * 0 if @bdev can be claimed, -EBUSY otherwise.
  */
-int bd_prepare_to_claim(struct block_device *bdev, void *holder,
-		const struct blk_holder_ops *hops)
+int bd_prepare_to_claim(struct block_device *bdev, void *holder)
 {
 	struct block_device *whole = bdev_whole(bdev);
 
@@ -521,7 +509,7 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 retry:
 	mutex_lock(&bdev_lock);
 	/* if someone else claimed, fail */
-	if (!bd_may_claim(bdev, holder, hops)) {
+	if (!bd_may_claim(bdev, holder)) {
 		mutex_unlock(&bdev_lock);
 		return -EBUSY;
 	}
@@ -558,18 +546,16 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
  * bd_finish_claiming - finish claiming of a block device
  * @bdev: block device of interest
  * @holder: holder that has claimed @bdev
- * @hops: block device holder operations
  *
  * Finish exclusive open of a block device. Mark the device as exlusively
  * open by the holder and wake up all waiters for exclusive open to finish.
  */
-static void bd_finish_claiming(struct block_device *bdev, void *holder,
-		const struct blk_holder_ops *hops, blk_mode_t mode)
+static void bd_finish_claiming(struct block_device *bdev, void *holder, blk_mode_t mode)
 {
 	struct block_device *whole = bdev_whole(bdev);
 
 	mutex_lock(&bdev_lock);
-	BUG_ON(!bd_may_claim(bdev, holder, hops));
+	BUG_ON(!bd_may_claim(bdev, holder));
 	/*
 	 * Note that for a whole device bd_holders will be incremented twice,
 	 * and bd_holder will be set to bd_may_claim before being set to holder
@@ -581,7 +567,6 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
 	bdev->bd_holder = holder;
 	if (mode & BLK_OPEN_MOUNTED)
 		bdev->bd_mounted = true;
-	bdev->bd_holder_ops = hops;
 	mutex_unlock(&bdev->bd_holder_lock);
 	bd_clear_claiming(whole, holder);
 	mutex_unlock(&bdev_lock);
@@ -621,7 +606,6 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 	if (!bdev->bd_holders) {
 		mutex_lock(&bdev->bd_holder_lock);
 		bdev->bd_holder = NULL;
-		bdev->bd_holder_ops = NULL;
 		bdev->bd_mounted = false;
 		mutex_unlock(&bdev->bd_holder_lock);
 		if (bdev->bd_write_holder)
@@ -633,7 +617,6 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 
 	if (mounted)
 		fs_bdev_super_put(holder);
-
 	/*
 	 * If this was the last claim, remove holder link and unblock evpoll if
 	 * it was a write holder.
@@ -835,7 +818,6 @@ static void bdev_yield_write_access(struct file *bdev_file)
  * @bdev: block device to open
  * @mode: open mode (BLK_OPEN_*)
  * @holder: exclusive holder identifier
- * @hops: holder operations
  * @bdev_file: file for the block device
  *
  * Open the block device. If @holder is not %NULL, the block device is opened
@@ -848,7 +830,7 @@ static void bdev_yield_write_access(struct file *bdev_file)
  * zero on success, -errno on failure.
  */
 int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
-	      const struct blk_holder_ops *hops, struct file *bdev_file)
+	      struct file *bdev_file)
 {
 	bool unblock_events = true;
 	struct gendisk *disk = bdev->bd_disk;
@@ -856,7 +838,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 
 	if (holder) {
 		mode |= BLK_OPEN_EXCL;
-		ret = bd_prepare_to_claim(bdev, holder, hops);
+		ret = bd_prepare_to_claim(bdev, holder);
 		if (ret)
 			return ret;
 	} else {
@@ -883,7 +865,7 @@ int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
 		goto put_module;
 	bdev_claim_write_access(bdev, mode);
 	if (holder) {
-		bd_finish_claiming(bdev, holder, hops, mode);
+		bd_finish_claiming(bdev, holder, mode);
 
 		/*
 		 * Block event polling for write claims if requested.  Any write
@@ -977,9 +959,9 @@ static struct file *bdev_file_open(dev_t dev, blk_mode_t mode, void *holder)
 	ihold(bdev->bd_inode);
 
 	if (mode & BLK_OPEN_MOUNTED)
-		ret = bdev_open(bdev, mode, holder, &fs_holder_ops, bdev_file);
+		ret = bdev_open(bdev, mode, holder, bdev_file);
 	else
-		ret = bdev_open(bdev, mode, holder, NULL, bdev_file);
+		ret = bdev_open(bdev, mode, holder, bdev_file);
 	if (ret) {
 		/* We failed to open the block device. Let ->release() know. */
 		bdev_file->private_data = ERR_PTR(ret);
diff --git a/block/blk.h b/block/blk.h
index 5cac4e29ae17..dca9c12e205e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -598,7 +598,7 @@ static inline void bio_issue_init(struct bio_issue *issue,
 
 void bdev_release(struct file *bdev_file);
 int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
-	      const struct blk_holder_ops *hops, struct file *bdev_file);
+	      struct file *bdev_file);
 int bdev_permission(dev_t dev, blk_mode_t mode, void *holder);
 
 #endif /* BLK_INTERNAL_H */
diff --git a/block/fops.c b/block/fops.c
index 679d9b752fe8..a04d7d3b4189 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -617,7 +617,7 @@ static int blkdev_open(struct inode *inode, struct file *filp)
 	if (!bdev)
 		return -ENXIO;
 
-	ret = bdev_open(bdev, mode, filp->private_data, NULL, filp);
+	ret = bdev_open(bdev, mode, filp->private_data, filp);
 	if (ret)
 		blkdev_put_no_open(bdev);
 	return ret;
diff --git a/block/genhd.c b/block/genhd.c
index 513ad4318fe3..e6d64a226c86 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -359,8 +359,7 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 	 * scanners.
 	 */
 	if (!(mode & BLK_OPEN_EXCL)) {
-		ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions,
-					  NULL);
+		ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
 		if (ret)
 			return ret;
 	}
diff --git a/block/partitions/core.c b/block/partitions/core.c
index b11e88c82c8c..e229580891c3 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -462,10 +462,10 @@ int bdev_del_partition(struct gendisk *disk, int partno)
 
 	/*
 	 * We verified that @part->bd_openers is zero above and so
-	 * @part->bd_holder{_ops} can't be set. And since we hold
+	 * @part->bd_mounted can't be set. And since we hold
 	 * @disk->open_mutex the device can't be claimed by anyone.
 	 *
-	 * So no need to call @part->bd_holder_ops->mark_dead() here.
+	 * So no need to call fs_bdev_mark_dead() here.
 	 * Just delete the partition and invalidate it.
 	 */
 
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 28a95fd366fe..92728e728473 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1014,7 +1014,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
 	 * here to avoid changing device under exclusive owner.
 	 */
 	if (!(mode & BLK_OPEN_EXCL)) {
-		error = bd_prepare_to_claim(bdev, loop_configure, NULL);
+		error = bd_prepare_to_claim(bdev, loop_configure);
 		if (error)
 			goto out_putf;
 	}
diff --git a/fs/internal.h b/fs/internal.h
index 1d5875183b46..48550b2987a9 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -323,5 +323,5 @@ int fs_bdev_freeze(struct block_device *bdev);
 int fs_bdev_thaw(struct block_device *bdev);
 void fs_bdev_mark_dead(struct block_device *bdev, bool surprise);
 void fs_bdev_sync(struct block_device *bdev);
-void fs_bdev_super_get(void *);
-void fs_bdev_super_put(void *);
+void fs_bdev_super_get(struct super_block *sb);
+void fs_bdev_super_put(struct super_block *sb);
diff --git a/fs/super.c b/fs/super.c
index 383b945b363d..84932ff1dd35 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1515,31 +1515,18 @@ int fs_bdev_thaw(struct block_device *bdev)
 	return error;
 }
 
-void fs_bdev_super_get(void *data)
+void fs_bdev_super_get(struct super_block *sb)
 {
-	struct super_block *sb = data;
-
 	spin_lock(&sb_lock);
 	sb->s_count++;
 	spin_unlock(&sb_lock);
 }
 
-void fs_bdev_super_put(void *data)
+void fs_bdev_super_put(struct super_block *sb)
 {
-	struct super_block *sb = data;
-
 	put_super(sb);
 }
 
-const struct blk_holder_ops fs_holder_ops = {
-	.mark_dead		= fs_bdev_mark_dead,
-	.sync			= fs_bdev_sync,
-	.freeze			= fs_bdev_freeze,
-	.thaw			= fs_bdev_thaw,
-	.get_holder		= fs_bdev_super_get,
-	.put_holder		= fs_bdev_super_put,
-};
-
 int setup_bdev_super(struct super_block *sb, int sb_flags,
 		struct fs_context *fc)
 {
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index afcc58d04ce7..2ad9c0963830 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -56,7 +56,6 @@ struct block_device {
 	spinlock_t		bd_size_lock; /* for bd_inode->i_size updates */
 	void *			bd_claiming;
 	void *			bd_holder;
-	const struct blk_holder_ops *bd_holder_ops;
 	bool			bd_mounted;
 	struct mutex		bd_holder_lock;
 	int			bd_holders;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7d08afe09849..c4040f703479 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1494,42 +1494,6 @@ void blkdev_show(struct seq_file *seqf, off_t offset);
 #define BLKDEV_MAJOR_MAX	0
 #endif
 
-struct blk_holder_ops {
-	void (*mark_dead)(struct block_device *bdev, bool surprise);
-
-	/*
-	 * Sync the file system mounted on the block device.
-	 */
-	void (*sync)(struct block_device *bdev);
-
-	/*
-	 * Freeze the file system mounted on the block device.
-	 */
-	int (*freeze)(struct block_device *bdev);
-
-	/*
-	 * Thaw the file system mounted on the block device.
-	 */
-	int (*thaw)(struct block_device *bdev);
-
-	/*
-	 * If needed, get a reference to the holder.
-	 */
-	void (*get_holder)(void *holder);
-
-	/*
-	 * Release the holder.
-	 */
-	void (*put_holder)(void *holder);
-};
-
-/*
- * For filesystems using @fs_holder_ops, the @holder argument passed to
- * helpers used to open and claim block devices via
- * bd_prepare_to_claim() must point to a superblock.
- */
-extern const struct blk_holder_ops fs_holder_ops;
-
 /*
  * Return the correct open flags for blkdev_get_by_* for super block flags
  * as stored in sb->s_flags.
@@ -1544,8 +1508,7 @@ struct file *bdev_file_open_by_path(const char *path, blk_mode_t mode,
 struct file *bdev_fsopen(dev_t dev, blk_mode_t mode, struct super_block *sb);
 struct file *bdev_fsopen_path(const char *path, blk_mode_t mode,
 			      struct super_block *sb);
-int bd_prepare_to_claim(struct block_device *bdev, void *holder,
-		const struct blk_holder_ops *hops);
+int bd_prepare_to_claim(struct block_device *bdev, void *holder);
 void bd_abort_claiming(struct block_device *bdev, void *holder);
 
 /* just for blk-cgroup, don't use elsewhere */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: remove holder ops
  2024-03-19 16:24               ` remove holder ops Christian Brauner
@ 2024-03-19 17:03                 ` Matthew Wilcox
  2024-03-19 23:13                 ` Christoph Hellwig
  1 sibling, 0 replies; 146+ messages in thread
From: Matthew Wilcox @ 2024-03-19 17:03 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Mar 19, 2024 at 05:24:44PM +0100, Christian Brauner wrote:
> @@ -631,8 +631,8 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
>  		whole->bd_holder = NULL;
>  	mutex_unlock(&bdev_lock);
>  
> -	if (hops && hops->put_holder)
> -		hops->put_holder(holder);
> +	if (mounted)
> +		fs_bdev_super_put(holder);

I think you haven't gone quite far enough here.  Call super_put()
directly and make bd_end_claim() take a super_block pointer.


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: remove holder ops
  2024-03-19 16:24               ` remove holder ops Christian Brauner
  2024-03-19 17:03                 ` Matthew Wilcox
@ 2024-03-19 23:13                 ` Christoph Hellwig
  1 sibling, 0 replies; 146+ messages in thread
From: Christoph Hellwig @ 2024-03-19 23:13 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Mar 19, 2024 at 05:24:44PM +0100, Christian Brauner wrote:
> So Linus complained about the fact that we have holder ops when really
> it currently isn't needed by anything apart from filesystems.

Err, that's just because we haven't gotten to it.  The whole point of the
ops is that we do want device removal (and in the future resize)
notification for other users as well.  In fact what drove me is to
have proper notifications for removed devices in md parity raid, I've
just not gotten to that quite yet as I've been busy with other things.

But even if that wasn't the case, calling straight from block layer code
into file systems is just wrong.

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
                   ` (36 preceding siblings ...)
  2024-02-05 11:55 ` [PATCH v2 00/34] Open block devices as files Christian Brauner
@ 2024-03-21 22:17 ` Matthew Wilcox
  2024-03-22  3:38   ` Kent Overstreet
  2024-03-22 12:31   ` Christian Brauner
  37 siblings, 2 replies; 146+ messages in thread
From: Matthew Wilcox @ 2024-03-21 22:17 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> This opens block devices as files. Instead of introducing a separate
> indirection into bdev_open_by_*() via struct bdev_handle we can just
> make bdev_file_open_by_*() return a struct file. Opening and closing a
> block device from setup_bdev_super() and in all other places just
> becomes equivalent to opening and closing a file.
> 
> This has held up in xfstests and in blktests so far and it seems stable
> and clean. The equivalence of opening and closing block devices to
> regular files is a win in and of itself imho. Added to that is the
> ability to do away with struct bdev_handle completely and make various
> low-level helpers private to the block layer.

It fails to hold up in xfstests for me.

git bisect leads to:

commit 321de651fa565dcf76c017b257bdf15ec7fff45d
Author: Christian Brauner <brauner@kernel.org>
Date:   Tue Jan 23 14:26:48 2024 +0100

    block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access

QA output created by 015
mkfs failed
(see /ktest-out/xfstests/generic/015.full for details)
umount: /dev/vdc: not mounted.

** mkfs failed with extra mkfs options added to "-m reflink=1,rmapbt=1 -i sparse=1,nrext64=1" by test 015 **
** attempting to mkfs using only test 015 options: -d size=268435456 -b size=4096 **
mkfs.xfs: cannot open /dev/vdc: Device or resource busy
mkfs failed

About half the xfstests fail this way (722 of 1387 tests)

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-03-21 22:17 ` Matthew Wilcox
@ 2024-03-22  3:38   ` Kent Overstreet
  2024-03-22 13:56     ` Christian Brauner
  2024-03-22 12:31   ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Kent Overstreet @ 2024-03-22  3:38 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christian Brauner, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block

On Thu, Mar 21, 2024 at 10:17:55PM +0000, Matthew Wilcox wrote:
> On Tue, Jan 23, 2024 at 02:26:17PM +0100, Christian Brauner wrote:
> > This opens block devices as files. Instead of introducing a separate
> > indirection into bdev_open_by_*() via struct bdev_handle we can just
> > make bdev_file_open_by_*() return a struct file. Opening and closing a
> > block device from setup_bdev_super() and in all other places just
> > becomes equivalent to opening and closing a file.
> > 
> > This has held up in xfstests and in blktests so far and it seems stable
> > and clean. The equivalence of opening and closing block devices to
> > regular files is a win in and of itself imho. Added to that is the
> > ability to do away with struct bdev_handle completely and make various
> > low-level helpers private to the block layer.
> 
> It fails to hold up in xfstests for me.
> 
> git bisect leads to:
> 
> commit 321de651fa565dcf76c017b257bdf15ec7fff45d
> Author: Christian Brauner <brauner@kernel.org>
> Date:   Tue Jan 23 14:26:48 2024 +0100
> 
>     block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
> 
> QA output created by 015
> mkfs failed
> (see /ktest-out/xfstests/generic/015.full for details)
> umount: /dev/vdc: not mounted.
> 
> ** mkfs failed with extra mkfs options added to "-m reflink=1,rmapbt=1 -i sparse=1,nrext64=1" by test 015 **
> ** attempting to mkfs using only test 015 options: -d size=268435456 -b size=4096 **
> mkfs.xfs: cannot open /dev/vdc: Device or resource busy
> mkfs failed
> 
> About half the xfstests fail this way (722 of 1387 tests)

Christian, let's chat about testing at LSF - I was looking at this too
because we thought it was a ktest update that broke at first, but if we
can get you using the automated test infrastructure I built this could
get caught before hitting -next

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-03-21 22:17 ` Matthew Wilcox
  2024-03-22  3:38   ` Kent Overstreet
@ 2024-03-22 12:31   ` Christian Brauner
  2024-03-22 12:40     ` Matthew Wilcox
  1 sibling, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-03-22 12:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

> ** mkfs failed with extra mkfs options added to "-m reflink=1,rmapbt=1 -i sparse=1,nrext64=1" by test 015 **
> ** attempting to mkfs using only test 015 options: -d size=268435456 -b size=4096 **
> mkfs.xfs: cannot open /dev/vdc: Device or resource busy
> mkfs failed
> 
> About half the xfstests fail this way (722 of 1387 tests)

Thanks for the report. Can you please show me the kernel config and the
xfstests config that was used for this?

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-03-22 12:31   ` Christian Brauner
@ 2024-03-22 12:40     ` Matthew Wilcox
  2024-03-22 13:53       ` Christian Brauner
  0 siblings, 1 reply; 146+ messages in thread
From: Matthew Wilcox @ 2024-03-22 12:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

[-- Attachment #1: Type: text/plain, Size: 1111 bytes --]

On Fri, Mar 22, 2024 at 01:31:23PM +0100, Christian Brauner wrote:
> > ** mkfs failed with extra mkfs options added to "-m reflink=1,rmapbt=1 -i sparse=1,nrext64=1" by test 015 **
> > ** attempting to mkfs using only test 015 options: -d size=268435456 -b size=4096 **
> > mkfs.xfs: cannot open /dev/vdc: Device or resource busy
> > mkfs failed
> > 
> > About half the xfstests fail this way (722 of 1387 tests)
> 
> Thanks for the report. Can you please show me the kernel config and the
> xfstests config that was used for this?

Kernel config attached.

I'll have to defer to Kent on the xfstests config that's used.  It might
be this:

    cat << EOF > /ktest/tests/xfstests/local.config
TEST_DEV=${ktest_scratch_dev[0]}
TEST_DIR=$TEST_DIR
SCRATCH_DEV=${ktest_scratch_dev[1]}
SCRATCH_MNT=/mnt/scratch
LOGWRITES_DEV=${ktest_scratch_dev[2]}
RESULT_BASE=/ktest-out/xfstests
LOGGER_PROG=true
EOF


Also, while generic/015 is the first to fail, you can't just run
generic/015.  You can't even just run 012, 013, 014, 015.  I haven't
bisected to exactly how many predecessor tests are necessary to get
a failure.

[-- Attachment #2: config-brauner.gz --]
[-- Type: application/gzip, Size: 18686 bytes --]

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-03-22 12:40     ` Matthew Wilcox
@ 2024-03-22 13:53       ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-22 13:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block

On Fri, Mar 22, 2024 at 12:40:03PM +0000, Matthew Wilcox wrote:
> On Fri, Mar 22, 2024 at 01:31:23PM +0100, Christian Brauner wrote:
> > > ** mkfs failed with extra mkfs options added to "-m reflink=1,rmapbt=1 -i sparse=1,nrext64=1" by test 015 **
> > > ** attempting to mkfs using only test 015 options: -d size=268435456 -b size=4096 **
> > > mkfs.xfs: cannot open /dev/vdc: Device or resource busy
> > > mkfs failed
> > > 
> > > About half the xfstests fail this way (722 of 1387 tests)
> > 
> > Thanks for the report. Can you please show me the kernel config and the
> > xfstests config that was used for this?
> 
> Kernel config attached.
> 
> I'll have to defer to Kent on the xfstests config that's used.  It might
> be this:
> 
>     cat << EOF > /ktest/tests/xfstests/local.config
> TEST_DEV=${ktest_scratch_dev[0]}
> TEST_DIR=$TEST_DIR
> SCRATCH_DEV=${ktest_scratch_dev[1]}
> SCRATCH_MNT=/mnt/scratch
> LOGWRITES_DEV=${ktest_scratch_dev[2]}
> RESULT_BASE=/ktest-out/xfstests
> LOGGER_PROG=true
> EOF
> 
> 
> Also, while generic/015 is the first to fail, you can't just run
> generic/015.  You can't even just run 012, 013, 014, 015.  I haven't
> bisected to exactly how many predecessor tests are necessary to get
> a failure.

Thanks for the info. So it's as I suspected. The config that was used
has
# CONFIG_BLK_DEV_WRITE_MOUNTED is not set
That means it isn't possible to write to mounted block devices. The
default is CONFIG_BLK_DEV_WRITE_MOUNTED=y.

I go through all block-based filesystems with xfstests for such changes
with various config options. My test matrix hasn't been updated to
specifically unset CONFIG_BLK_DEV_WRITE_MOUNTED which is why this
escaped. I'll send a fix shortly.
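
As a rough illustration of the behaviour described above, here is a toy
user-space C model (not kernel code) of what leaving
CONFIG_BLK_DEV_WRITE_MOUNTED unset means for openers. The names toy_bdev,
toy_open() and the TOY_OPEN_* flags are hypothetical stand-ins, and the
single "mounted" flag is an assumption that greatly simplifies the kernel's
real bookkeeping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the kernel's types or flags. */
enum { TOY_OPEN_READ = 1, TOY_OPEN_WRITE = 2 };

struct toy_bdev {
	bool mounted;	/* assumption: a filesystem currently holds the device */
};

/*
 * allow_write_mounted models CONFIG_BLK_DEV_WRITE_MOUNTED=y (true)
 * versus "is not set" (false).
 */
static int toy_open(const struct toy_bdev *bdev, int mode,
		    bool allow_write_mounted)
{
	/* With the option unset, writers are refused while mounted. */
	if (!allow_write_mounted && (mode & TOY_OPEN_WRITE) && bdev->mounted)
		return -EBUSY;
	return 0;
}

/* Walks through the four interesting combinations. */
static void toy_write_mounted_demo(void)
{
	struct toy_bdev dev = { .mounted = true };

	/* Default config: writing to a mounted device is allowed. */
	assert(toy_open(&dev, TOY_OPEN_WRITE, true) == 0);
	/* Option unset: the same open fails with -EBUSY. */
	assert(toy_open(&dev, TOY_OPEN_WRITE, false) == -EBUSY);
	/* Read-only opens are unaffected in this model. */
	assert(toy_open(&dev, TOY_OPEN_READ, false) == 0);
	/* Once unmounted, writers are admitted again. */
	dev.mounted = false;
	assert(toy_open(&dev, TOY_OPEN_WRITE, false) == 0);
}
```

Calling toy_write_mounted_demo() from a main() exercises all four cases. In
this model, an -EBUSY on a device that is no longer mounted would mean the
"mounted" state was not cleared, which is only one possible reading of the
mkfs.xfs failure reported earlier in the thread.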

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 00/34] Open block devices as files
  2024-03-22  3:38   ` Kent Overstreet
@ 2024-03-22 13:56     ` Christian Brauner
  0 siblings, 0 replies; 146+ messages in thread
From: Christian Brauner @ 2024-03-22 13:56 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Matthew Wilcox, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block

> Christian, let's chat about testing at LSF - I was looking at this too
> because we thought it was a ktest update that broke at first, but if we
> can get you using the automated test infrastructure I built this could
> get caught before hitting -next

I already do automated testing. This specific error depends on a new
config option CONFIG_BLK_DEV_WRITE_MOUNTED which wasn't reflected in my
test matrix. I've fixed that now. But I'm always happy to have the tree
integrated with other automated testing as well!

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
  2024-01-29 16:14   ` Christoph Hellwig
  2024-01-31 18:15   ` Jan Kara
@ 2024-04-15  9:26   ` Ming Lei
  2024-04-15 12:35     ` Christian Brauner
  2 siblings, 1 reply; 146+ messages in thread
From: Ming Lei @ 2024-04-15  9:26 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

Hello,

On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
>  drivers/md/dm.c               | 23 +++++++++++++----------
>  drivers/md/md.c               | 12 ++++++------
>  drivers/md/md.h               |  2 +-
>  include/linux/device-mapper.h |  2 +-
>  4 files changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 8dcabf84d866..87de5b5682ad 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c

...

> @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
>  {
>  	if (md->disk->slave_dir)
>  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> -	bdev_release(td->dm_dev.bdev_handle);
> +	fput(td->dm_dev.bdev_file);

The above change caused a regression in 'dmsetup remove_all'.

blkdev_release() is delayed because of fput(), so dm_lock_for_deletion()
returns -EBUSY and this dm disk is skipped in remove_all().

Forcing DMF_DEFERRED_REMOVE to be set might solve it, but the device
mapper maintainers need to check whether that is safe.

Or is there a better solution?

thanks,
Ming


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15  9:26   ` Ming Lei
@ 2024-04-15 12:35     ` Christian Brauner
  2024-04-15 13:56       ` Mike Snitzer
  2024-04-15 14:35       ` Ming Lei
  0 siblings, 2 replies; 146+ messages in thread
From: Christian Brauner @ 2024-04-15 12:35 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> Hello,
> 
> On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > ---
> >  drivers/md/dm.c               | 23 +++++++++++++----------
> >  drivers/md/md.c               | 12 ++++++------
> >  drivers/md/md.h               |  2 +-
> >  include/linux/device-mapper.h |  2 +-
> >  4 files changed, 21 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 8dcabf84d866..87de5b5682ad 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> 
> ...
> 
> > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> >  {
> >  	if (md->disk->slave_dir)
> >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > -	bdev_release(td->dm_dev.bdev_handle);
> > +	fput(td->dm_dev.bdev_file);
> 
> The above change caused regression on 'dmsetup remove_all'.
> 
> blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> returns -EBUSY, then this dm disk is skipped in remove_all().
> 
> Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> mapper guys to check if it is safe.
> 
> Or other better solution?

Yeah, I think there is. You can just switch all fput() instances in
device mapper to bdev_fput() which is mainline now. This will yield the
device and make it able to be reclaimed. Should be as simple as the
patch below. Could you test this and send a patch based on this (I'm on
a prolonged vacation so I don't have time right now.):

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 56aa2a8b9d71..0f681a1e70af 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -765,7 +765,7 @@ static struct table_device *open_table_device(struct mapped_device *md,
        return td;

 out_blkdev_put:
-       fput(bdev_file);
+       bdev_fput(bdev_file);
 out_free_td:
        kfree(td);
        return ERR_PTR(r);
@@ -778,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
 {
        if (md->disk->slave_dir)
                bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
-       fput(td->dm_dev.bdev_file);
+       bdev_fput(td->dm_dev.bdev_file);
        put_dax(td->dm_dev.dax_dev);
        list_del(&td->list);
        kfree(td);


^ permalink raw reply related	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 12:35     ` Christian Brauner
@ 2024-04-15 13:56       ` Mike Snitzer
  2024-04-15 14:35       ` Ming Lei
  1 sibling, 0 replies; 146+ messages in thread
From: Mike Snitzer @ 2024-04-15 13:56 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Ming Lei, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block, dm-devel,
	Mikulas Patocka

On Mon, Apr 15 2024 at  8:35P -0400,
Christian Brauner <brauner@kernel.org> wrote:

> On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > Hello,
> > 
> > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > >  drivers/md/md.c               | 12 ++++++------
> > >  drivers/md/md.h               |  2 +-
> > >  include/linux/device-mapper.h |  2 +-
> > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index 8dcabf84d866..87de5b5682ad 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > 
> > ...
> > 
> > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > >  {
> > >  	if (md->disk->slave_dir)
> > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > -	bdev_release(td->dm_dev.bdev_handle);
> > > +	fput(td->dm_dev.bdev_file);
> > 
> > The above change caused regression on 'dmsetup remove_all'.
> > 
> > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > returns -EBUSY, then this dm disk is skipped in remove_all().
> > 
> > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > mapper guys to check if it is safe.
> > 
> > Or other better solution?
> 
> Yeah, I think there is. You can just switch all fput() instances in
> device mapper to bdev_fput() which is mainline now. This will yield the
> device and make it able to be reclaimed. Should be as simple as the
> patch below. Could you test this and send a patch based on this (I'm on
> a prolonged vacation so I don't have time right now.):
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 56aa2a8b9d71..0f681a1e70af 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -765,7 +765,7 @@ static struct table_device *open_table_device(struct mapped_device *md,
>         return td;
> 
>  out_blkdev_put:
> -       fput(bdev_file);
> +       bdev_fput(bdev_file);
>  out_free_td:
>         kfree(td);
>         return ERR_PTR(r);
> @@ -778,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
>  {
>         if (md->disk->slave_dir)
>                 bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> -       fput(td->dm_dev.bdev_file);
> +       bdev_fput(td->dm_dev.bdev_file);
>         put_dax(td->dm_dev.dax_dev);
>         list_del(&td->list);
>         kfree(td);
> 
> 

Thanks. I'll work with Ming and others to take care of it. Have a great vacation!

^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 12:35     ` Christian Brauner
  2024-04-15 13:56       ` Mike Snitzer
@ 2024-04-15 14:35       ` Ming Lei
  2024-04-15 14:53         ` Christian Brauner
  1 sibling, 1 reply; 146+ messages in thread
From: Ming Lei @ 2024-04-15 14:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > Hello,
> > 
> > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > ---
> > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > >  drivers/md/md.c               | 12 ++++++------
> > >  drivers/md/md.h               |  2 +-
> > >  include/linux/device-mapper.h |  2 +-
> > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index 8dcabf84d866..87de5b5682ad 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > 
> > ...
> > 
> > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > >  {
> > >  	if (md->disk->slave_dir)
> > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > -	bdev_release(td->dm_dev.bdev_handle);
> > > +	fput(td->dm_dev.bdev_file);
> > 
> > The above change caused regression on 'dmsetup remove_all'.
> > 
> > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > returns -EBUSY, then this dm disk is skipped in remove_all().
> > 
> > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > mapper guys to check if it is safe.
> > 
> > Or other better solution?
> 
> Yeah, I think there is. You can just switch all fput() instances in
> device mapper to bdev_fput() which is mainline now. This will yield the
> device and make it able to be reclaimed. Should be as simple as the
> patch below. Could you test this and send a patch based on this (I'm on
> a prolonged vacation so I don't have time right now.):

Unfortunately it doesn't work.

The problem here is that blkdev_release() is delayed, which changes the
behavior of 'dmsetup remove_all' and leaves some dm disks unremoved.

Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().

Thanks,
Ming


^ permalink raw reply	[flat|nested] 146+ messages in thread

* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 14:35       ` Ming Lei
@ 2024-04-15 14:53         ` Christian Brauner
  2024-04-15 15:11           ` Ming Lei
  0 siblings, 1 reply; 146+ messages in thread
From: Christian Brauner @ 2024-04-15 14:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > Hello,
> > > 
> > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > ---
> > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > >  drivers/md/md.c               | 12 ++++++------
> > > >  drivers/md/md.h               |  2 +-
> > > >  include/linux/device-mapper.h |  2 +-
> > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > 
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > 
> > > ...
> > > 
> > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > >  {
> > > >  	if (md->disk->slave_dir)
> > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > +	fput(td->dm_dev.bdev_file);
> > > 
> > > The above change caused regression on 'dmsetup remove_all'.
> > > 
> > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > 
> > > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > > mapper guys to check if it is safe.
> > > 
> > > Or other better solution?
> > 
> > Yeah, I think there is. You can just switch all fput() instances in
> > device mapper to bdev_fput() which is mainline now. This will yield the
> > device and make it able to be reclaimed. Should be as simple as the
> > patch below. Could you test this and send a patch based on this (I'm on
> > a prolonged vacation so I don't have time right now.):
> 
> Unfortunately it doesn't work.
> 
> Here the problem is that blkdev_release() is delayed, which changes
> 'dmsetup remove_all' behavior, and causes that some of dm disks aren't
> removed.
> 
> Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().

So you really need blkdev_release() itself to be synchronous? Groan, in
that case use __fput_sync() instead of fput() which ensures that this
file is closed synchronously.
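
To make the difference concrete, here is a minimal userspace sketch of the
behavior discussed above. It is not kernel code: every name in it
(model_dev, model_file, model_fput() and so on) is hypothetical, and the
"pending" list only stands in for whatever mechanism defers the final
release. It assumes, as described in this thread, that a plain put defers
the release while a sync put runs it before returning, and that deletion is
refused while the device still counts as open.

```c
/* Userspace model only. All names are hypothetical, none are kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_dev {
	int open_count;			/* dropped only when release runs */
};

struct model_file {
	int refcount;
	struct model_dev *dev;
	struct model_file *next_pending;	/* link in the deferred list */
};

/* Stands in for the queue of deferred final releases. */
struct model_file *pending;

void model_open(struct model_file *f, struct model_dev *dev)
{
	f->refcount = 1;
	f->dev = dev;
	f->next_pending = NULL;
	dev->open_count++;
}

/* Models the release callback: the device stops counting as open. */
void model_release(struct model_file *f)
{
	f->dev->open_count--;
}

/* Deferred put: the last reference only queues the release. */
void model_fput(struct model_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		f->next_pending = pending;
		pending = f;
	}
}

/* Synchronous put: the last reference runs the release before returning. */
void model_fput_sync(struct model_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		model_release(f);
}

/* Runs all queued releases, e.g. once the caller has finished its loop. */
void model_run_pending(void)
{
	while (pending) {
		struct model_file *f = pending;

		pending = f->next_pending;
		f->next_pending = NULL;
		model_release(f);
	}
}

/* Refuses deletion while the device is still open. */
int model_lock_for_deletion(const struct model_dev *dev)
{
	return dev->open_count > 0 ? -EBUSY : 0;
}
```

In this model, calling model_fput() on the only reference and then
model_lock_for_deletion() right away yields -EBUSY until
model_run_pending() has run, whereas model_fput_sync() lets the deletion
check succeed immediately. That mirrors the remove_all() symptom reported
above, under the stated assumptions.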


* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 14:53         ` Christian Brauner
@ 2024-04-15 15:11           ` Ming Lei
  2024-04-15 15:53             ` Mike Snitzer
  2024-04-15 16:22             ` Jan Kara
  0 siblings, 2 replies; 146+ messages in thread
From: Ming Lei @ 2024-04-15 15:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 04:53:42PM +0200, Christian Brauner wrote:
> On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> > On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > > Hello,
> > > > 
> > > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > ---
> > > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > > >  drivers/md/md.c               | 12 ++++++------
> > > > >  drivers/md/md.h               |  2 +-
> > > > >  include/linux/device-mapper.h |  2 +-
> > > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > 
> > > > ...
> > > > 
> > > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > > >  {
> > > > >  	if (md->disk->slave_dir)
> > > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > > +	fput(td->dm_dev.bdev_file);
> > > > 
> > > > The above change caused regression on 'dmsetup remove_all'.
> > > > 
> > > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > > 
> > > > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > > > mapper guys to check if it is safe.
> > > > 
> > > > Or other better solution?
> > > 
> > > Yeah, I think there is. You can just switch all fput() instances in
> > > device mapper to bdev_fput() which is mainline now. This will yield the
> > > device and make it able to be reclaimed. Should be as simple as the
> > > patch below. Could you test this and send a patch based on this (I'm on
> > > a prolonged vacation so I don't have time right now.):
> > 
> > Unfortunately it doesn't work.
> > 
> > Here the problem is that blkdev_release() is delayed, which changes
> > 'dmsetup remove_all' behavior, and causes that some of dm disks aren't
> > removed.
> > 
> > Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().
> 
> So you really need blkdev_release() itself to be synchronous? Groan, in

At least the current dm implementation more or less relies on this, and
it could be addressed by forcibly setting DMF_DEFERRED_REMOVE in
remove_all().

> that case use __fput_sync() instead of fput() which ensures that this
> file is closed synchronously.

I tried __fput_sync(), but it triggers the following panic:

[  113.486522] ------------[ cut here ]------------
[  113.486524] kernel BUG at fs/file_table.c:453!
[  113.486531] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  113.488878] CPU: 6 PID: 1919 Comm: dmsetup Kdump: loaded Not tainted 5.14.0+ #23
[  113.490114] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[  113.491661] RIP: 0010:__fput_sync+0x25/0x30
[  113.492562] Code: 90 90 90 90 90 0f 1f 44 00 00 f0 48 ff 4f 38 75 14 65 48 8b 04 25 40 25 03 00 f6 40 36 20 74 0a e9 20 fd ff ff c3 cc cc cc cc <0f0
[  113.493926] RSP: 0018:ffffb76581003c20 EFLAGS: 00010246
[  113.494220] RAX: ffff92eca6ef8000 RBX: ffff92ed176c3c18 RCX: 000000008080007c
[  113.494632] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff92ec844cac00
[  113.495033] RBP: ffff92ed176c3c00 R08: 0000000000000001 R09: 0000000000000000
[  113.495378] R10: ffffb76581003b00 R11: ffffb76581003b68 R12: ffff92ec8fccec20
[  113.495723] R13: ffff92ec8431b400 R14: ffff92ec8431b508 R15: ffff92ec8fccec00
[  113.496108] FS:  00007f5be5638840(0000) GS:ffff92f0ebb80000(0000) knlGS:0000000000000000
[  113.496581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  113.496907] CR2: 00007f5be54694b0 CR3: 0000000108e54003 CR4: 0000000000770ef0
[  113.497308] PKRU: 55555554
[  113.497469] Call Trace:
[  113.497613]  <TASK>
[  113.497741]  ? show_trace_log_lvl+0x1c4/0x2df
[  113.497997]  ? show_trace_log_lvl+0x1c4/0x2df
[  113.498251]  ? dm_put_table_device+0x64/0xd0 [dm_mod]
[  113.498553]  ? __die_body.cold+0x8/0xd
[  113.498768]  ? die+0x2b/0x50
[  113.498937]  ? do_trap+0xce/0x120
[  113.499129]  ? __fput_sync+0x25/0x30
[  113.499337]  ? do_error_trap+0x65/0x80
[  113.499577]  ? __fput_sync+0x25/0x30
[  113.499787]  ? exc_invalid_op+0x4e/0x70
[  113.500011]  ? __fput_sync+0x25/0x30
[  113.500239]  ? asm_exc_invalid_op+0x16/0x20
[  113.500842]  ? __fput_sync+0x25/0x30
[  113.501387]  dm_put_table_device+0x64/0xd0 [dm_mod]
[  113.502047]  dm_put_device+0x80/0x110 [dm_mod]
[  113.502650]  stripe_dtr+0x2f/0x50 [dm_mod]
[  113.503218]  dm_table_destroy+0x59/0x120 [dm_mod]
[  113.503842]  __dm_destroy+0x114/0x1e0 [dm_mod]
[  113.504402]  dm_hash_remove_all+0x63/0x160 [dm_mod]
[  113.505028]  remove_all+0x1e/0x30 [dm_mod]
[  113.505602]  ctl_ioctl+0x19f/0x290 [dm_mod]
[  113.506146]  dm_ctl_ioctl+0xa/0x20 [dm_mod]
[  113.506717]  __x64_sys_ioctl+0x87/0xc0
[  113.507230]  do_syscall_64+0x5c/0xf0
[  113.507755]  ? exc_page_fault+0x62/0x150
[  113.508309]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  113.508945] RIP: 0033:0x7f5be543ec6b



Thanks. 
Ming



* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 15:11           ` Ming Lei
@ 2024-04-15 15:53             ` Mike Snitzer
  2024-04-15 16:22             ` Jan Kara
  1 sibling, 0 replies; 146+ messages in thread
From: Mike Snitzer @ 2024-04-15 15:53 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christian Brauner, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 11:11:50PM +0800, Ming Lei wrote:
> On Mon, Apr 15, 2024 at 04:53:42PM +0200, Christian Brauner wrote:
> > On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> > > On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > > > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > > > Hello,
> > > > > 
> > > > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > ---
> > > > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > > > >  drivers/md/md.c               | 12 ++++++------
> > > > > >  drivers/md/md.h               |  2 +-
> > > > > >  include/linux/device-mapper.h |  2 +-
> > > > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > > > --- a/drivers/md/dm.c
> > > > > > +++ b/drivers/md/dm.c
> > > > > 
> > > > > ...
> > > > > 
> > > > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > > > >  {
> > > > > >  	if (md->disk->slave_dir)
> > > > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > > > +	fput(td->dm_dev.bdev_file);
> > > > > 
> > > > > The above change caused regression on 'dmsetup remove_all'.
> > > > > 
> > > > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > > > 
> > > > > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > > > > mapper guys to check if it is safe.
> > > > > 
> > > > > Or other better solution?
> > > > 
> > > > Yeah, I think there is. You can just switch all fput() instances in
> > > > device mapper to bdev_fput() which is mainline now. This will yield the
> > > > device and make it able to be reclaimed. Should be as simple as the
> > > > patch below. Could you test this and send a patch based on this (I'm on
> > > > a prolonged vacation so I don't have time right now.):
> > > 
> > > Unfortunately it doesn't work.
> > > 
> > > Here the problem is that blkdev_release() is delayed, which changes
> > > 'dmsetup remove_all' behavior, and causes that some of dm disks aren't
> > > removed.
> > > 
> > > Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().
> > 
> > So you really need blkdev_release() itself to be synchronous? Groan, in
> 
> At least the current dm implementation relies on this way sort of, and
> it could be addressed by forcing to mark DMF_DEFERRED_REMOVE in
> remove_all().

You floated that earlier in this thread, etc: no, that would change
the interface.  DMF_DEFERRED_REMOVE gives people options to allow for
async device closes, etc.  But I don't want to impose it as some faux
equivalent to the sync model remove_all has always provided.

And what about simple 'dmsetup remove'? remove_all just loops doing
remove... so isn't 'dmsetup remove' also being forced to be async as
of commit a28d893eb3270 ("md: port block device access to file")?

dm.c:dm_put_device -> dm_put_table_device -> close_table_device

> > that case use __fput_sync() instead of fput() which ensures that this
> > file is closed synchronously.
> 
> I tried __fput_sync(), but the following panic is caused:

Ok, so more work needed.  But we need to preserve the existing sync
interface for DM device removal.

Mike


* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 15:11           ` Ming Lei
  2024-04-15 15:53             ` Mike Snitzer
@ 2024-04-15 16:22             ` Jan Kara
  2024-04-16  0:27               ` Ming Lei
  1 sibling, 1 reply; 146+ messages in thread
From: Jan Kara @ 2024-04-15 16:22 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christian Brauner, Jan Kara, Christoph Hellwig, Jens Axboe,
	Darrick J. Wong, linux-fsdevel, linux-block, Mike Snitzer,
	dm-devel, Mikulas Patocka

On Mon 15-04-24 23:11:50, Ming Lei wrote:
> On Mon, Apr 15, 2024 at 04:53:42PM +0200, Christian Brauner wrote:
> > On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> > > On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > > > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > > > Hello,
> > > > > 
> > > > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > ---
> > > > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > > > >  drivers/md/md.c               | 12 ++++++------
> > > > > >  drivers/md/md.h               |  2 +-
> > > > > >  include/linux/device-mapper.h |  2 +-
> > > > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > > > --- a/drivers/md/dm.c
> > > > > > +++ b/drivers/md/dm.c
> > > > > 
> > > > > ...
> > > > > 
> > > > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > > > >  {
> > > > > >  	if (md->disk->slave_dir)
> > > > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > > > +	fput(td->dm_dev.bdev_file);
> > > > > 
> > > > > The above change caused regression on 'dmsetup remove_all'.
> > > > > 
> > > > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > > > 
> > > > > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > > > > mapper guys to check if it is safe.
> > > > > 
> > > > > Or other better solution?
> > > > 
> > > > Yeah, I think there is. You can just switch all fput() instances in
> > > > device mapper to bdev_fput() which is mainline now. This will yield the
> > > > device and make it able to be reclaimed. Should be as simple as the
> > > > patch below. Could you test this and send a patch based on this (I'm on
> > > > a prolonged vacation so I don't have time right now.):
> > > 
> > > Unfortunately it doesn't work.
> > > 
> > > Here the problem is that blkdev_release() is delayed, which changes
> > > 'dmsetup remove_all' behavior, and causes that some of dm disks aren't
> > > removed.
> > > 
> > > Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().
> > 
> > So you really need blkdev_release() itself to be synchronous? Groan, in
> 
> At least the current dm implementation relies on this way sort of, and
> it could be addressed by forcing to mark DMF_DEFERRED_REMOVE in
> remove_all().
> 
> > that case use __fput_sync() instead of fput() which ensures that this
> > file is closed synchronously.
> 
> I tried __fput_sync(), but the following panic is caused:
> 
> [  113.486522] ------------[ cut here ]------------
> [  113.486524] kernel BUG at fs/file_table.c:453!
> [  113.486531] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [  113.488878] CPU: 6 PID: 1919 Comm: dmsetup Kdump: loaded Not tainted 5.14.0+ #23

Wait, how come this is a 5.14 kernel? Apparently you're crashing on:

BUG_ON(!(task->flags & PF_KTHREAD));

but that is not present in current upstream (BUG_ON was removed in 6.6-rc1
by commit 021a160abf62c).
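
A small userspace sketch of why the caller matters here, assuming only what
is stated above: the old check fires for any task that is not a kernel
thread, and current upstream no longer has it. The names and the flag value
are hypothetical stand-ins, not the kernel's definitions, and the model
ignores that the real check is only reached when the last reference is
dropped.

```c
/* Userspace model only. Names and the flag value are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_PF_KTHREAD 0x1u	/* stand-in bit, not the kernel's value */

struct model_task {
	unsigned int flags;
};

/* Old behavior as quoted above: a non-kthread caller trips the check. */
bool model_old_sync_put_would_bug(const struct model_task *task)
{
	return !(task->flags & MODEL_PF_KTHREAD);
}

/* Behavior with the check removed: no caller trips it. */
bool model_new_sync_put_would_bug(const struct model_task *task)
{
	(void)task;
	return false;
}
```

A user task such as dmsetup has the kthread bit clear, so the old model
reports a BUG while the new one does not. That is consistent with the trace
coming from an ioctl issued by dmsetup on an older kernel.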

								Honza

> [  113.490114] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
> [  113.491661] RIP: 0010:__fput_sync+0x25/0x30
> [  113.492562] Code: 90 90 90 90 90 0f 1f 44 00 00 f0 48 ff 4f 38 75 14 65 48 8b 04 25 40 25 03 00 f6 40 36 20 74 0a e9 20 fd ff ff c3 cc cc cc cc <0f0
> [  113.493926] RSP: 0018:ffffb76581003c20 EFLAGS: 00010246
> [  113.494220] RAX: ffff92eca6ef8000 RBX: ffff92ed176c3c18 RCX: 000000008080007c
> [  113.494632] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff92ec844cac00
> [  113.495033] RBP: ffff92ed176c3c00 R08: 0000000000000001 R09: 0000000000000000
> [  113.495378] R10: ffffb76581003b00 R11: ffffb76581003b68 R12: ffff92ec8fccec20
> [  113.495723] R13: ffff92ec8431b400 R14: ffff92ec8431b508 R15: ffff92ec8fccec00
> [  113.496108] FS:  00007f5be5638840(0000) GS:ffff92f0ebb80000(0000) knlGS:0000000000000000
> [  113.496581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  113.496907] CR2: 00007f5be54694b0 CR3: 0000000108e54003 CR4: 0000000000770ef0
> [  113.497308] PKRU: 55555554
> [  113.497469] Call Trace:
> [  113.497613]  <TASK>
> [  113.497741]  ? show_trace_log_lvl+0x1c4/0x2df
> [  113.497997]  ? show_trace_log_lvl+0x1c4/0x2df
> [  113.498251]  ? dm_put_table_device+0x64/0xd0 [dm_mod]
> [  113.498553]  ? __die_body.cold+0x8/0xd
> [  113.498768]  ? die+0x2b/0x50
> [  113.498937]  ? do_trap+0xce/0x120
> [  113.499129]  ? __fput_sync+0x25/0x30
> [  113.499337]  ? do_error_trap+0x65/0x80
> [  113.499577]  ? __fput_sync+0x25/0x30
> [  113.499787]  ? exc_invalid_op+0x4e/0x70
> [  113.500011]  ? __fput_sync+0x25/0x30
> [  113.500239]  ? asm_exc_invalid_op+0x16/0x20
> [  113.500842]  ? __fput_sync+0x25/0x30
> [  113.501387]  dm_put_table_device+0x64/0xd0 [dm_mod]
> [  113.502047]  dm_put_device+0x80/0x110 [dm_mod]
> [  113.502650]  stripe_dtr+0x2f/0x50 [dm_mod]
> [  113.503218]  dm_table_destroy+0x59/0x120 [dm_mod]
> [  113.503842]  __dm_destroy+0x114/0x1e0 [dm_mod]
> [  113.504402]  dm_hash_remove_all+0x63/0x160 [dm_mod]
> [  113.505028]  remove_all+0x1e/0x30 [dm_mod]
> [  113.505602]  ctl_ioctl+0x19f/0x290 [dm_mod]
> [  113.506146]  dm_ctl_ioctl+0xa/0x20 [dm_mod]
> [  113.506717]  __x64_sys_ioctl+0x87/0xc0
> [  113.507230]  do_syscall_64+0x5c/0xf0
> [  113.507755]  ? exc_page_fault+0x62/0x150
> [  113.508309]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [  113.508945] RIP: 0033:0x7f5be543ec6b
> 
> 
> 
> Thanks. 
> Ming
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v2 04/34] md: port block device access to file
  2024-04-15 16:22             ` Jan Kara
@ 2024-04-16  0:27               ` Ming Lei
  0 siblings, 0 replies; 146+ messages in thread
From: Ming Lei @ 2024-04-16  0:27 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christian Brauner, Christoph Hellwig, Jens Axboe, Darrick J. Wong,
	linux-fsdevel, linux-block, Mike Snitzer, dm-devel,
	Mikulas Patocka

On Mon, Apr 15, 2024 at 06:22:10PM +0200, Jan Kara wrote:
> On Mon 15-04-24 23:11:50, Ming Lei wrote:
> > On Mon, Apr 15, 2024 at 04:53:42PM +0200, Christian Brauner wrote:
> > > On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> > > > On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > > > > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > > > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > > > > > ---
> > > > > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > > > > >  drivers/md/md.c               | 12 ++++++------
> > > > > > >  drivers/md/md.h               |  2 +-
> > > > > > >  include/linux/device-mapper.h |  2 +-
> > > > > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > > > > --- a/drivers/md/dm.c
> > > > > > > +++ b/drivers/md/dm.c
> > > > > > 
> > > > > > ...
> > > > > > 
> > > > > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > > > > >  {
> > > > > > >  	if (md->disk->slave_dir)
> > > > > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > > > > +	fput(td->dm_dev.bdev_file);
> > > > > > 
> > > > > > The above change caused regression on 'dmsetup remove_all'.
> > > > > > 
> > > > > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > > > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > > > > 
> > > > > > Force to mark DMF_DEFERRED_REMOVE might solve it, but need our device
> > > > > > mapper guys to check if it is safe.
> > > > > > 
> > > > > > Or other better solution?
> > > > > 
> > > > > Yeah, I think there is. You can just switch all fput() instances in
> > > > > device mapper to bdev_fput() which is mainline now. This will yield the
> > > > > device and make it able to be reclaimed. Should be as simple as the
> > > > > patch below. Could you test this and send a patch based on this (I'm on
> > > > > a prolonged vacation so I don't have time right now.):
> > > > 
> > > > Unfortunately it doesn't work.
> > > > 
> > > > Here the problem is that blkdev_release() is delayed, which changes
> > > > 'dmsetup remove_all' behavior, and causes that some of dm disks aren't
> > > > removed.
> > > > 
> > > > Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().
> > > 
> > > So you really need blkdev_release() itself to be synchronous? Groan, in
> > 
> > At least the current dm implementation relies on this way sort of, and
> > it could be addressed by forcing to mark DMF_DEFERRED_REMOVE in
> > remove_all().
> > 
> > > that case use __fput_sync() instead of fput() which ensures that this
> > > file is closed synchronously.
> > 
> > I tried __fput_sync(), but the following panic is caused:
> > 
> > [  113.486522] ------------[ cut here ]------------
> > [  113.486524] kernel BUG at fs/file_table.c:453!
> > [  113.486531] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > [  113.488878] CPU: 6 PID: 1919 Comm: dmsetup Kdump: loaded Not tainted 5.14.0+ #23
> 
> Wait, how come this is 5.14 kernel? Apparently you're crashing on:
> 
> BUG_ON(!(task->flags & PF_KTHREAD));
> 
> but that is not present in current upstream (BUG_ON was removed in 6.6-rc1
> by commit 021a160abf62c).

Indeed, I just tried the change on v6.9-rc3, and it looks like it works.


Thanks,
Ming




Thread overview: 146+ messages
2024-01-23 13:26 [PATCH v2 00/34] Open block devices as files Christian Brauner
2024-01-23 13:26 ` [PATCH v2 01/34] bdev: open block device " Christian Brauner
2024-01-29 16:02   ` Christoph Hellwig
2024-02-01 17:08     ` Christian Brauner
2024-02-02  6:43       ` Christoph Hellwig
2024-02-02 11:46         ` Christian Brauner
2024-02-09 11:39       ` Christian Brauner
2024-03-13  2:32   ` Christoph Hellwig
2024-03-14 11:10     ` Christian Brauner
2024-03-14 14:47       ` Christian Brauner
2024-03-14 16:45         ` Christian Brauner
2024-03-14 16:58         ` Jan Kara
2024-03-15 13:23           ` [PATCH] fs,block: get holder during claim Christian Brauner
2024-03-15 14:28             ` Jan Kara
2024-03-19 16:24               ` remove holder ops Christian Brauner
2024-03-19 17:03                 ` Matthew Wilcox
2024-03-19 23:13                 ` Christoph Hellwig
2024-03-17 20:53             ` [PATCH] fs,block: get holder during claim Christoph Hellwig
2024-03-18  8:33               ` Christian Brauner
2024-03-18  9:10             ` Yi Zhang
2024-01-23 13:26 ` [PATCH v2 02/34] block/ioctl: port blkdev_bszset() to file Christian Brauner
2024-01-29 16:14   ` Christoph Hellwig
2024-01-31 18:10   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 03/34] block/genhd: port disk_scan_partitions() " Christian Brauner
2024-01-29 16:14   ` Christoph Hellwig
2024-01-31 18:13   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 04/34] md: port block device access " Christian Brauner
2024-01-29 16:14   ` Christoph Hellwig
2024-01-31 18:15   ` Jan Kara
2024-04-15  9:26   ` Ming Lei
2024-04-15 12:35     ` Christian Brauner
2024-04-15 13:56       ` Mike Snitzer
2024-04-15 14:35       ` Ming Lei
2024-04-15 14:53         ` Christian Brauner
2024-04-15 15:11           ` Ming Lei
2024-04-15 15:53             ` Mike Snitzer
2024-04-15 16:22             ` Jan Kara
2024-04-16  0:27               ` Ming Lei
2024-01-23 13:26 ` [PATCH v2 05/34] swap: port block device usage " Christian Brauner
2024-01-29 16:15   ` Christoph Hellwig
2024-01-31 18:16   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 06/34] power: port block device access " Christian Brauner
2024-01-29 16:15   ` Christoph Hellwig
2024-01-31 18:17   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 07/34] xfs: port block device access to files Christian Brauner
2024-01-29 16:17   ` Christoph Hellwig
2024-02-01 14:33     ` Christian Brauner
2024-01-31 18:19   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 08/34] drbd: port block device access to file Christian Brauner
2024-01-31 18:22   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 09/34] pktcdvd: " Christian Brauner
2024-01-31 18:26   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 10/34] rnbd: " Christian Brauner
2024-01-31 18:28   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 11/34] xen: " Christian Brauner
2024-01-31 18:31   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 12/34] zram: " Christian Brauner
2024-01-31 18:32   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 13/34] bcache: port block device access to files Christian Brauner
2024-02-01  9:45   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 14/34] block2mtd: port " Christian Brauner
2024-02-01  9:47   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 15/34] nvme: port block device access to file Christian Brauner
2024-02-01  9:48   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 16/34] s390: " Christian Brauner
2024-02-01 10:11   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 17/34] target: " Christian Brauner
2024-02-01 10:12   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 18/34] bcachefs: " Christian Brauner
2024-02-01 10:13   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 19/34] btrfs: port " Christian Brauner
2024-02-01 10:16   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 20/34] erofs: " Christian Brauner
2024-02-01 10:16   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 21/34] ext4: port block " Christian Brauner
2024-02-01 10:18   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 22/34] f2fs: port block device access to files Christian Brauner
2024-02-01 10:19   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 23/34] jfs: port block device access to file Christian Brauner
2024-02-01 10:19   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 24/34] nfs: port block device access to files Christian Brauner
2024-02-01 10:22   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 25/34] ocfs2: port block device access to file Christian Brauner
2024-02-01 10:22   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 26/34] reiserfs: " Christian Brauner
2024-02-01 10:24   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 27/34] bdev: remove bdev_open_by_path() Christian Brauner
2024-01-29 16:17   ` Christoph Hellwig
2024-02-01 10:24   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 28/34] bdev: make bdev_release() private to block layer Christian Brauner
2024-01-29 16:19   ` Christoph Hellwig
2024-02-01 10:26   ` Jan Kara
2024-02-01 14:48     ` Christian Brauner
2024-01-23 13:26 ` [PATCH v2 29/34] bdev: make struct bdev_handle private to the " Christian Brauner
2024-01-29 16:22   ` Christoph Hellwig
2024-02-01 14:50     ` Christian Brauner
2024-02-01 10:54   ` Jan Kara
2024-02-01 15:07     ` Christian Brauner
2024-02-01 17:42       ` Jan Kara
2024-02-01 11:23   ` Jan Kara
2024-02-01 14:52     ` Christian Brauner
2024-01-23 13:26 ` [PATCH v2 30/34] bdev: remove bdev pointer from struct bdev_handle Christian Brauner
2024-01-29 16:22   ` Christoph Hellwig
2024-02-01 10:57   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
2024-01-29 16:49   ` Christoph Hellwig
2024-01-29 17:09     ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Christian Brauner
2024-01-30  8:32       ` Christoph Hellwig
2024-01-30  9:11         ` Christian Brauner
2024-02-01 11:08   ` [PATCH v2 31/34] block: use file->f_op to indicate restricted writes Jan Kara
2024-02-01 16:16     ` Christian Brauner
2024-02-01 17:36       ` Jan Kara
2024-02-02 11:45         ` Christian Brauner
2024-02-02 11:51           ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 32/34] block: remove bdev_handle completely Christian Brauner
2024-01-29 16:50   ` Christoph Hellwig
2024-02-01 11:20   ` Jan Kara
2024-02-01 16:18     ` Christian Brauner
2024-01-23 13:26 ` [PATCH v2 33/34] block: expose bdev_file_inode() Christian Brauner
2024-02-01 10:09   ` Jan Kara
2024-01-23 13:26 ` [PATCH v2 34/34] ext4: rely on sb->f_bdev only Christian Brauner
2024-02-01 11:34   ` Jan Kara
2024-02-01 13:40     ` Christian Brauner
2024-01-29  6:17 ` [PATCH v2 00/34] Open block devices as files Christoph Hellwig
2024-01-29 10:17   ` Christian Brauner
2024-01-29 10:56 ` [PATCH RFC 0/2] fs & block: remove bd_inode Christian Brauner
2024-01-29 10:56   ` [PATCH RFC 1/2] fs & block: remove bdev->bd_inode Christian Brauner
2024-02-20 11:57     ` Yu Kuai
2024-02-21  7:36       ` Christian Brauner
2024-01-29 10:56   ` [PATCH RFC 2/2] fs,drivers: remove bdev_inode() usage outside of block layer and drivers Christian Brauner
2024-01-29 14:37     ` Christoph Hellwig
2024-01-29 15:29       ` Christian Brauner
2024-01-29 15:36         ` Christoph Hellwig
2024-02-19 13:34           ` Yu Kuai
2024-02-19 13:42           ` Yu Kuai
2024-02-05 11:55 ` [PATCH v2 00/34] Open block devices as files Christian Brauner
2024-02-05 14:19   ` Jan Kara
2024-02-06 13:39     ` Christian Brauner
2024-02-06 13:58       ` Jan Kara
2024-02-06 16:10         ` Christian Brauner
2024-03-21 22:17 ` Matthew Wilcox
2024-03-22  3:38   ` Kent Overstreet
2024-03-22 13:56     ` Christian Brauner
2024-03-22 12:31   ` Christian Brauner
2024-03-22 12:40     ` Matthew Wilcox
2024-03-22 13:53       ` Christian Brauner
