public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@linux.dev>
To: linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
	Christian Brauner <brauner@kernel.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 07/20] block: Don't block on s_umount from __invalidate_super()
Date: Wed, 12 Jul 2023 17:11:02 -0400	[thread overview]
Message-ID: <20230712211115.2174650-8-kent.overstreet@linux.dev> (raw)
In-Reply-To: <20230712211115.2174650-1-kent.overstreet@linux.dev>

__invalidate_super() is used to flush any filesystem mounted on a
device, generally on some sort of media change event.

However, when unmounting a filesystem and closing the underlying block
devices, we can deadlock if the block driver then calls
__invalidate_device() (e.g. because the block device goes away when it
is no longer in use).

This happens with bcachefs on top of loopback, and can be triggered by
fstests generic/042:

  put_super
    -> blkdev_put
    -> lo_release
    -> disk_force_media_change
    -> __invalidate_device
    -> get_super

This isn't inherently specific to bcachefs - it hasn't shown up with
other filesystems before because most other filesystems use the sget()
mechanism for opening/closing block devices (and enforcing exclusion),
however sget() has its own downsides and weird/sketchy behaviour w.r.t.
block device open lifetime - if that ever gets fixed more code will run
into this issue.

The __invalidate_device() call here is really a best effort "I just
yanked the device for a mounted filesystem, please try not to lose my
data" - if it's ever actually needed the user has already done something
crazy, and we probably shouldn't make things worse by deadlocking.
Switching to a trylock seems in keeping with what the code is trying to
do.

If we ever get revoke() at the block layer, perhaps we would look at
rearchitecting to use that instead.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
---
 block/bdev.c       |  2 +-
 fs/super.c         | 40 +++++++++++++++++++++++++++++++---------
 include/linux/fs.h |  1 +
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 21c63bfef3..a4d7e8732c 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -934,7 +934,7 @@ EXPORT_SYMBOL(lookup_bdev);
 
 int __invalidate_device(struct block_device *bdev, bool kill_dirty)
 {
-	struct super_block *sb = get_super(bdev);
+	struct super_block *sb = try_get_super(bdev);
 	int res = 0;
 
 	if (sb) {
diff --git a/fs/super.c b/fs/super.c
index 04bc62ab7d..a2decce02f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -791,14 +791,7 @@ void iterate_supers_type(struct file_system_type *type,
 
 EXPORT_SYMBOL(iterate_supers_type);
 
-/**
- * get_super - get the superblock of a device
- * @bdev: device to get the superblock for
- *
- * Scans the superblock list and finds the superblock of the file system
- * mounted on the device given. %NULL is returned if no match is found.
- */
-struct super_block *get_super(struct block_device *bdev)
+static struct super_block *__get_super(struct block_device *bdev, bool try)
 {
 	struct super_block *sb;
 
@@ -813,7 +806,12 @@ struct super_block *get_super(struct block_device *bdev)
 		if (sb->s_bdev == bdev) {
 			sb->s_count++;
 			spin_unlock(&sb_lock);
-			down_read(&sb->s_umount);
+
+			if (!try)
+				down_read(&sb->s_umount);
+			else if (!down_read_trylock(&sb->s_umount))
+				return NULL;
+
 			/* still alive? */
 			if (sb->s_root && (sb->s_flags & SB_BORN))
 				return sb;
@@ -828,6 +826,30 @@ struct super_block *get_super(struct block_device *bdev)
 	return NULL;
 }
 
+/**
+ * get_super - get the superblock of a device
+ * @bdev: device to get the superblock for
+ *
+ * Scans the superblock list and finds the superblock of the file system
+ * mounted on the device given. %NULL is returned if no match is found.
+ */
+struct super_block *get_super(struct block_device *bdev)
+{
+	return __get_super(bdev, false);
+}
+
+/**
+ * try_get_super - get the superblock of a device, using trylock on sb->s_umount
+ * @bdev: device to get the superblock for
+ *
+ * Scans the superblock list and finds the superblock of the file system
+ * mounted on the device given. %NULL is returned if no match is found.
+ */
+struct super_block *try_get_super(struct block_device *bdev)
+{
+	return __get_super(bdev, true);
+}
+
 /**
  * get_active_super - get an active reference to the superblock of a device
  * @bdev: device to get the superblock for
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 133f0640fb..bd5105bc92 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2897,6 +2897,7 @@ extern struct file_system_type *get_filesystem(struct file_system_type *fs);
 extern void put_filesystem(struct file_system_type *fs);
 extern struct file_system_type *get_fs_type(const char *name);
 extern struct super_block *get_super(struct block_device *);
+extern struct super_block *try_get_super(struct block_device *);
 extern struct super_block *get_active_super(struct block_device *bdev);
 extern void drop_super(struct super_block *sb);
 extern void drop_super_exclusive(struct super_block *sb);
-- 
2.40.1


  parent reply	other threads:[~2023-07-12 21:12 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-12 21:10 [PATCH 00/20] bcachefs prereqs patch series Kent Overstreet
2023-07-12 21:10 ` [PATCH 01/20] sched: Add task_struct->faults_disabled_mapping Kent Overstreet
2023-07-12 21:10 ` [PATCH 02/20] fs: factor out d_mark_tmpfile() Kent Overstreet
2023-07-12 21:10 ` [PATCH 03/20] iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic() Kent Overstreet
2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
2023-07-24 17:31   ` Christoph Hellwig
2023-07-25  3:00     ` Kent Overstreet
2023-07-26 13:20       ` Christoph Hellwig
2023-08-01 18:59         ` Kent Overstreet
2023-07-25  2:59   ` Matthew Wilcox
2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
2023-07-24 17:34   ` Christoph Hellwig
2023-07-25  2:43     ` Kent Overstreet
2023-07-26 13:23       ` Christoph Hellwig
2023-08-01 19:04         ` Kent Overstreet
2023-08-02 11:47           ` Christoph Hellwig
2023-08-02 16:44             ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
2023-07-24 17:35   ` Christoph Hellwig
2023-07-25  2:45     ` Kent Overstreet
2023-07-26 13:21       ` Christoph Hellwig
2023-08-01 19:05         ` Kent Overstreet
2023-07-12 21:11 ` Kent Overstreet [this message]
2023-07-12 21:11 ` [PATCH 08/20] stacktrace: Export stack_trace_save_tsk Kent Overstreet
2023-07-12 21:11 ` [PATCH 09/20] lib/string_helpers: string_get_size() now returns characters wrote Kent Overstreet
2023-07-12 21:11 ` [PATCH 10/20] lib: Export errname Kent Overstreet
2023-07-13  7:10   ` Eric Biggers
2023-07-12 21:11 ` [PATCH 11/20] locking/osq: Export osq_(lock|unlock) Kent Overstreet
2023-08-02 20:16   ` Waiman Long
2023-08-02 20:44     ` Kent Overstreet
2023-08-02 21:09       ` Waiman Long
2023-08-02 21:42         ` Kent Overstreet
2023-10-10  8:09           ` [NAK] " Ingo Molnar
2023-10-18 21:04             ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 12/20] bcache: move closures to lib/ Kent Overstreet
2023-07-13  3:21   ` Randy Dunlap
2023-07-13  3:52     ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 13/20] MAINTAINERS: Add entry for closures Kent Overstreet
2023-07-12 21:11 ` [PATCH 14/20] closures: closure_wait_event() Kent Overstreet
2023-07-12 21:11 ` [PATCH 15/20] closures: closure_nr_remaining() Kent Overstreet
2023-07-12 21:11 ` [PATCH 16/20] closures: Add a missing include Kent Overstreet
2023-07-12 21:11 ` [PATCH 17/20] MAINTAINERS: Add entry for generic-radix-tree Kent Overstreet
2023-07-12 21:11 ` [PATCH 18/20] lib/generic-radix-tree.c: Don't overflow in peek() Kent Overstreet
2023-07-12 21:11 ` [PATCH 19/20] lib/generic-radix-tree.c: Add a missing include Kent Overstreet
2023-07-25  3:04   ` Matthew Wilcox
2023-07-25  3:36     ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 20/20] lib/generic-radix-tree.c: Add peek_prev() Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230712211115.2174650-8-kent.overstreet@linux.dev \
    --to=kent.overstreet@linux.dev \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox