All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Long Li <leo.lilong@huawei.com>
Cc: cem@kernel.org, linux-xfs@vger.kernel.org, david@fromorbit.com,
	yi.zhang@huawei.com, houtao1@huawei.com, yangerkun@huawei.com,
	lonuxli.64@gmail.com
Subject: Re: [PATCH 2/2] xfs: fix mount hang during primary superblock recovery failure
Date: Mon, 6 Jan 2025 11:55:41 -0800	[thread overview]
Message-ID: <20250106195541.GL6174@frogsfrogsfrogs> (raw)
In-Reply-To: <20241231023423.656128-3-leo.lilong@huawei.com>

On Tue, Dec 31, 2024 at 10:34:23AM +0800, Long Li wrote:
> When mounting an image containing a log with sb modifications that require
> log replay, the mount process hang all the time and stack as follows:
> 
>   [root@localhost ~]# cat /proc/557/stack
>   [<0>] xfs_buftarg_wait+0x31/0x70
>   [<0>] xfs_buftarg_drain+0x54/0x350
>   [<0>] xfs_mountfs+0x66e/0xe80
>   [<0>] xfs_fs_fill_super+0x7f1/0xec0
>   [<0>] get_tree_bdev_flags+0x186/0x280
>   [<0>] get_tree_bdev+0x18/0x30
>   [<0>] xfs_fs_get_tree+0x1d/0x30
>   [<0>] vfs_get_tree+0x2d/0x110
>   [<0>] path_mount+0xb59/0xfc0
>   [<0>] do_mount+0x92/0xc0
>   [<0>] __x64_sys_mount+0xc2/0x160
>   [<0>] x64_sys_call+0x2de4/0x45c0
>   [<0>] do_syscall_64+0xa7/0x240
>   [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> During log recovery, while updating the in-memory superblock from the
> primary SB buffer, if an error is encountered, such as superblock
> corruption occurs or some other reasons, we will proceed to out_release
> and release the xfs_buf. However, this is insufficient because the
> xfs_buf's log item has already been initialized and the xfs_buf is held
> by the buffer log item as follows, the xfs_buf will not be released,
> causing the mount thread to hang.
> 
>   xlog_recover_do_primary_sb_buffer
>     xlog_recover_do_reg_buffer
>       xlog_recover_validate_buf_type
>         xfs_buf_item_init(bp, mp)
> 
> The solution is straightforward: we simply need to allow it to be
> handled by the normal buffer write process. The filesystem will be
> shutdown before the submission of buffer_list in xlog_do_recovery_pass(),

What shuts it down?  If xlog_recover_do_primary_sb_buffer trips over
something like "mp->m_sb.sb_rgcount < orig_rgcount" then we haven't shut
anything down yet.  Am I missing something? <confused>

--D

> ensuring the correct release of the xfs_buf.
> 
> Fixes: 6a18765b54e2 ("xfs: update the file system geometry after recoverying superblock buffers")
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
>  fs/xfs/xfs_buf_item_recover.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
> index 3d0c6402cb36..ec2a42ef66ff 100644
> --- a/fs/xfs/xfs_buf_item_recover.c
> +++ b/fs/xfs/xfs_buf_item_recover.c
> @@ -1079,7 +1079,7 @@ xlog_recover_buf_commit_pass2(
>  		error = xlog_recover_do_primary_sb_buffer(mp, item, bp, buf_f,
>  				current_lsn);
>  		if (error)
> -			goto out_release;
> +			goto out_writebuf;
>  
>  		/* Update the rt superblock if we have one. */
>  		if (xfs_has_rtsb(mp) && mp->m_rtsb_bp) {
> @@ -1096,6 +1096,7 @@ xlog_recover_buf_commit_pass2(
>  		xlog_recover_do_reg_buffer(mp, item, bp, buf_f, current_lsn);
>  	}
>  
> +out_writebuf:
>  	/*
>  	 * Perform delayed write on the buffer.  Asynchronous writes will be
>  	 * slower when taking into account all the buffers to be flushed.
> -- 
> 2.39.2
> 
> 

  reply	other threads:[~2025-01-06 19:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-31  2:34 [PATCH 0/2] xfs: fix two issues regarding mount failures Long Li
2024-12-31  2:34 ` [PATCH 1/2] xfs: correct the sb_rgcount when the disk not support rt volume Long Li
2025-01-06 19:52   ` Darrick J. Wong
2025-01-07 13:11     ` Long Li
2025-01-08  0:32       ` Darrick J. Wong
2025-01-08  1:26         ` Long Li
2025-01-08  6:58       ` Christoph Hellwig
2025-01-08  7:22         ` Long Li
2025-01-08  6:55     ` Christoph Hellwig
2025-01-08  7:10       ` Long Li
2024-12-31  2:34 ` [PATCH 2/2] xfs: fix mount hang during primary superblock recovery failure Long Li
2025-01-06 19:55   ` Darrick J. Wong [this message]
2025-01-07 13:39     ` Long Li
2025-01-08  0:34       ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250106195541.GL6174@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=cem@kernel.org \
    --cc=david@fromorbit.com \
    --cc=houtao1@huawei.com \
    --cc=leo.lilong@huawei.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=lonuxli.64@gmail.com \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.