Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC] btrfs: reject transaction creation for read-only mount except for log recovery
Date: Thu, 16 Dec 2021 17:32:12 +0800	[thread overview]
Message-ID: <7cd18b3c-e6bd-f5fc-3c6f-92da2f65cd3b@suse.com> (raw)
In-Reply-To: <20211216091441.53270-1-wqu@suse.com>



On 2021/12/16 17:14, Qu Wenruo wrote:
> [BUG]
> The following super simple script would crash btrfs at unmount time, if
> CONFIG_BTRFS_ASSERT() is set.
> 
>   mkfs.btrfs -f $dev
>   mount $dev $mnt
>   xfs_io -f -c "pwrite 0 4k" $mnt/file
>   umount $mnt
>   mount -r ro $dev $mnt
>   btrfs scrub start -Br $mnt
>   umount $mnt
> 
> This will trigger the following ASSERT() introduced by commit
> 0a31daa4b602 ("btrfs: add assertion for empty list of transactions at
> late stage of umount").
> 
> That patch is deifnitely not the cause, it just makes enough noise for
> us developer.
> 
> [CAUSE]
> We will start transaction for the following call chain during scrub:
> 
> scrub_enumerate_chunks()
> |- btrfs_inc_block_group_ro()
>     |- btrfs_join_transaction()
> 
> However for RO mount, there is no running transaction at all, thus
> btrfs_join_transaction() will start a new transaction.
> 
> But since the fs is already read-only, there is no way to commit the
> transaction, thus triggering the ASSERT().
> 
> The bug should be there for a long time. As I can still reproduce the
> crash at v5.10 kernel.
> 
> [FIX]
> Currently I choose to separate the log recovery code transaction with
> other transactions, and reject all other transactions if the filesystem
> is mounted read-only.
> 
> But I'm not sure if this is the best solution, thus this patch still
> requires for comments.
> 
> There is some alternatives I can thing of:
> 
> - Don't start new transaction in btrfs_inc_block_group_ro().
>    We have btrfs_join_transaction_nostart(), but even with that we still
>    can't ensure there is no new transaction started and committed after
>    we called btrfs_join_transaction_nostart() and got -ENOENT.
> 
> - Allow btrfs to commit empty transaction without writing any thing
>    If we know this transaction contains no dirty metadata, we allow
>    it to be "committed" even on RO mount, although no real data will
>    be written to disk.
> 
> And even with current fix, I'm not 100% sure if we won't get a crash if
> we run read-only scrub with frequently ro/rw remount.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/block-group.c |  6 +++++-
>   fs/btrfs/transaction.c | 14 ++++++++++++++
>   fs/btrfs/transaction.h |  5 +++++
>   fs/btrfs/tree-log.c    |  2 +-
>   4 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 1db24e6d6d90..13391d562189 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2546,6 +2546,9 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
>   
>   	do {
>   		trans = btrfs_join_transaction(root);
> +		if (IS_ERR(trans) && PTR_ERR(trans) == -EROFS)
> +			return 0;
> +

Here I should still go through ro_block_group_mutex locking and 
inc_block_group_ro().

By this, we don't need to change the BUG_ON() call.

But the core idea is still the same.

Thanks,
Qu
>   		if (IS_ERR(trans))
>   			return PTR_ERR(trans);
>   
> @@ -2621,7 +2624,8 @@ void btrfs_dec_block_group_ro(struct btrfs_block_group *cache)
>   	struct btrfs_space_info *sinfo = cache->space_info;
>   	u64 num_bytes;
>   
> -	BUG_ON(!cache->ro);
> +	if (cache->ro)
> +		return;
>   
>   	spin_lock(&sinfo->lock);
>   	spin_lock(&cache->lock);
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 03de89b45f27..306eaeb41ec9 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -591,6 +591,13 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
>   	if (BTRFS_FS_ERROR(fs_info))
>   		return ERR_PTR(-EROFS);
>   
> +	/*
> +	 * If the FS is mounted RO, we only allow transaction for log recovery,
> +	 * no regular transaction can be started.
> +	 */
> +	if (sb_rdonly(fs_info->sb) && !(type & __TRANS_START_IGNORE_RO))
> +		return ERR_PTR(-EROFS);
> +
>   	if (current->journal_info) {
>   		WARN_ON(type & TRANS_EXTWRITERS);
>   		h = current->journal_info;
> @@ -781,6 +788,13 @@ struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
>   				 BTRFS_RESERVE_FLUSH_ALL, true);
>   }
>   
> +struct btrfs_trans_handle *btrfs_start_transaction_log_recover(
> +			struct btrfs_root *root, unsigned int num_items)
> +{
> +	return start_transaction(root, num_items, TRANS_START_LOG_RECOVER,
> +				 BTRFS_RESERVE_FLUSH_ALL, true);
> +}
> +
>   struct btrfs_trans_handle *btrfs_start_transaction_fallback_global_rsv(
>   					struct btrfs_root *root,
>   					unsigned int num_items)
> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> index 1852ed9de7fd..24a743d91eff 100644
> --- a/fs/btrfs/transaction.h
> +++ b/fs/btrfs/transaction.h
> @@ -106,8 +106,11 @@ struct btrfs_transaction {
>   #define __TRANS_JOIN_NOLOCK	(1U << 12)
>   #define __TRANS_DUMMY		(1U << 13)
>   #define __TRANS_JOIN_NOSTART	(1U << 14)
> +#define __TRANS_START_IGNORE_RO	(1U << 15)
>   
>   #define TRANS_START		(__TRANS_START | __TRANS_FREEZABLE)
> +#define TRANS_START_LOG_RECOVER	(__TRANS_START | __TRANS_FREEZABLE | \
> +				 __TRANS_START_IGNORE_RO)
>   #define TRANS_ATTACH		(__TRANS_ATTACH)
>   #define TRANS_JOIN		(__TRANS_JOIN | __TRANS_FREEZABLE)
>   #define TRANS_JOIN_NOLOCK	(__TRANS_JOIN_NOLOCK)
> @@ -201,6 +204,8 @@ static inline void btrfs_clear_skip_qgroup(struct btrfs_trans_handle *trans)
>   int btrfs_end_transaction(struct btrfs_trans_handle *trans);
>   struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
>   						   unsigned int num_items);
> +struct btrfs_trans_handle *btrfs_start_transaction_log_recover(
> +			struct btrfs_root *root, unsigned int num_items);
>   struct btrfs_trans_handle *btrfs_start_transaction_fallback_global_rsv(
>   					struct btrfs_root *root,
>   					unsigned int num_items);
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 69f901813ea8..42369fa9a038 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -6478,7 +6478,7 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
>   
>   	set_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags);
>   
> -	trans = btrfs_start_transaction(fs_info->tree_root, 0);
> +	trans = btrfs_start_transaction_log_recover(fs_info->tree_root, 0);
>   	if (IS_ERR(trans)) {
>   		ret = PTR_ERR(trans);
>   		goto error;


  reply	other threads:[~2021-12-16  9:32 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-16  9:14 [PATCH RFC] btrfs: reject transaction creation for read-only mount except for log recovery Qu Wenruo
2021-12-16  9:32 ` Qu Wenruo [this message]
2021-12-16 10:25 ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7cd18b3c-e6bd-f5fc-3c6f-92da2f65cd3b@suse.com \
    --to=wqu@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox