linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Qu Wenruo <wqu@suse.com>, Christoph Hellwig <hch@lst.de>
Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	 viro@zeniv.linux.org.uk, jack@suse.cz
Subject: Re: [PATCH 1/6] fs: add a new remove_bdev() super operations callback
Date: Thu, 26 Jun 2025 10:38:11 +0200	[thread overview]
Message-ID: <20250626-schildern-flutlicht-36fa57d43570@brauner> (raw)
In-Reply-To: <c8853ae1710df330e600a02efe629a3b196dde88.1750895337.git.wqu@suse.com>

On Thu, Jun 26, 2025 at 09:23:42AM +0930, Qu Wenruo wrote:
> The new remove_bdev() call back is mostly for multi-device filesystems
> to handle device removal.
> 
> Some multi-devices filesystems like btrfs can have the ability to handle
> device lose according to the setup (e.g. all chunks have extra mirrors),
> thus losing a block device will not interrupt the normal operations.
> 
> Btrfs will soon implement this call back by:
> 
> - Automatically degrade the fs if read-write operations can be
>   maintained
> 
> - Shutdown the fs if read-write operations can not be maintained
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/super.c         |  4 +++-
>  include/linux/fs.h | 18 ++++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/super.c b/fs/super.c
> index 80418ca8e215..07845d2f9ec4 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1463,7 +1463,9 @@ static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise)
>  		sync_filesystem(sb);
>  	shrink_dcache_sb(sb);
>  	evict_inodes(sb);
> -	if (sb->s_op->shutdown)
> +	if (sb->s_op->remove_bdev)
> +		sb->s_op->remove_bdev(sb, bdev, surprise);
> +	else if (sb->s_op->shutdown)
>  		sb->s_op->shutdown(sb);

This makes ->remove_bdev() and ->shutdown() mutually exclusive. I really
really dislike this pattern. It introduces the possibility that a
filesystem accidently implement both variants and assumes both are
somehow called. That can be solved by an assert at superblock initation
time but it's still nasty.

The other thing is that this just reeks of being the wrong api. We
should absolutely aim for the methods to not be mutually exclusive. I
hate that with a passion. That's just an ugly api and I want to have as
little of that as possible in our code.

>  
>  	super_unlock_shared(sb);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b085f161ed22..5e84e06c7354 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2367,7 +2367,25 @@ struct super_operations {
>  				  struct shrink_control *);
>  	long (*free_cached_objects)(struct super_block *,
>  				    struct shrink_control *);
> +	/*
> +	 * Callback to shutdown the fs.
> +	 *
> +	 * If a fs can not afford losing any block device, implement this callback.
> +	 */
>  	void (*shutdown)(struct super_block *sb);
> +
> +	/*
> +	 * Callback to handle a block device removal.
> +	 *
> +	 * Recommended to implement this for multi-device filesystems, as they
> +	 * may afford losing a block device and continue operations.
> +	 *
> +	 * @surprse:	indicates a surprise removal. If true the device/media is
> +	 *		already gone. Otherwise we're prepareing for an orderly
> +	 *		removal.
> +	 */
> +	void (*remove_bdev)(struct super_block *sb, struct block_device *bdev,
> +			    bool surprise);
>  };

Yeah, I think that's just not a good api. That looks a lot to me like we
should just collapse both functions even though earlier discussion said
we shouldn't. Just do:

s/shutdown/remove_bdev/

or

s/shutdown/shutdown_bdev()

The filesystem will know whether it has to kill the filesystem or if it
can keep going even if the device is lost. Hell, if we have to we could
just have it return whether it killed the superblock or just the device
by giving the method a return value. But for now it probably doesn't
matter.

  reply	other threads:[~2025-06-26  8:38 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-25 23:53 [PATCH 0/6] btrfs: add remove_bdev() callback Qu Wenruo
2025-06-25 23:53 ` [PATCH 1/6] fs: add a new remove_bdev() super operations callback Qu Wenruo
2025-06-26  8:38   ` Christian Brauner [this message]
2025-06-26  8:44     ` Qu Wenruo
2025-06-26  9:10       ` Christian Brauner
2025-06-26  9:18         ` Qu Wenruo
2025-06-26  9:29           ` Christian Brauner
2025-06-26 10:14     ` Christoph Hellwig
2025-06-26 10:36       ` Qu Wenruo
2025-06-26 11:02         ` Christoph Hellwig
2025-07-01 11:35       ` Christian Brauner
2025-06-25 23:53 ` [PATCH 2/6] btrfs: introduce a new fs state, EMERGENCY_SHUTDOWN Qu Wenruo
2025-06-25 23:53 ` [PATCH 3/6] btrfs: reject file operations if in shutdown state Qu Wenruo
2025-06-25 23:53 ` [PATCH 4/6] btrfs: reject delalloc ranges if the fs is shutdown Qu Wenruo
2025-06-25 23:53 ` [PATCH 5/6] btrfs: implement shutdown ioctl Qu Wenruo
2025-06-25 23:53 ` [PATCH 6/6] btrfs: implement remove_bdev super operation callback Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250626-schildern-flutlicht-36fa57d43570@brauner \
    --to=brauner@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).