From: Leah Rumancik <leah.rumancik@gmail.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: up(ic_sema) if flushing data device fails
Date: Mon, 23 Oct 2023 17:44:14 -0700
Message-ID: <ZTcTXnVVTX747zqP@google.com>
In-Reply-To: <20231023212221.GV3195650@frogsfrogsfrogs>
On Mon, Oct 23, 2023 at 02:22:21PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 23, 2023 at 11:14:10AM -0700, Leah Rumancik wrote:
> > We flush the data device cache before we issue external log IO. Since
> > 7d839e325af2, we check the return value of the flush, and if the flush
> > failed, we shut down the log immediately and return. However, the
> > iclog->ic_sema is left in a decremented state so let's add an up().
> > Prior to this patch, xfs/438 would fail consistently when running with
> > an external log device:
> >
> > sync
> >   -> xfs_log_force
> >   -> xlog_write_iclog
> >       -> down(&iclog->ic_sema)
> >       -> blkdev_issue_flush (fail causes us to initiate shutdown)
> >           -> xlog_force_shutdown
> >           -> return
> >
> > unmount
> >   -> xfs_log_umount
> >       -> xlog_wait_iclog_completion
> >           -> down(&iclog->ic_sema) --------> HANG
> >
> > There is a second early return / shutdown. Add an up() there as well.
>
> Ow. Yes, I think it's correct that both of those error returns need to
> drop ic_sema since we don't submit_bio, so there is no xlog_ioend_work
> to do it for us.
>
> > Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
>
> Hmm. This bug was introduced in b5d721eaae47e ("xfs: external logs need
> to flush data device"), not 7d839. That said, this patch only applies
> cleanly to 7d839e325af2.
>
> b5d721 was introduced in 5.14 and 7d839 came in via 6.0, so ... this is
> where I would have spat out:
>
> Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> Actually-Fixes: b5d721eaae47e ("xfs: external logs need to flush data device")

I'm not sure I follow. Before 7d839e325af2, we didn't return early when
the flush failed, so ic_sema was never left held. 7d839e325af2 was
itself a fix for eef983ffeae7, though. Do you try to keep Fixes tags
pointing at the original commit rather than at fixes of fixes?

>
> > Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
> > ---
> >
> > Notes:
> > Tested auto group for xfs/4k and xfs/logdev configs with no regressions
> > seen.
> >
> > fs/xfs/xfs_log.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 51c100c86177..b4a8105299c2 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1926,6 +1926,7 @@ xlog_write_iclog(
> >  	 */
> >  	if (log->l_targ != log->l_mp->m_ddev_targp &&
> >  	    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
> > +		up(&iclog->ic_sema);
> >  		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> >  		return;
> >  	}
> > @@ -1936,6 +1937,7 @@ xlog_write_iclog(
> >  	iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
> >
> >  	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
> > +		up(&iclog->ic_sema);
> >  		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
>
> I wonder if these two should both become a cleanup clause at the end?

Sure, that sounds good. I'll send a new version.

Thanks for the review! :)

- Leah

>
> 	if (log->l_targ != log->l_mp->m_ddev_targp &&
> 	    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev))
> 		goto shutdown;
>
> 	...
>
> 	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count))
> 		goto shutdown;
>
> 	...
>
> 	submit_bio(&iclog->ic_bio);
> 	return;
>
> shutdown:
> 	up(&iclog->ic_sema);
> 	xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
>
> We've screwed this up twice now, not a whole lot of people actually
> use external logs, and somehow I've never seen this on my test
> fleet.
>
> Anyway the code change looks correct so modulo the stylistic thing,
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>
> --D
>
> >  		return;
> >  	}
> > --
> > 2.42.0.758.gaed0368e0e-goog
> >
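
For readers who want to see the cleanup-clause idea outside the kernel, here is a minimal single-threaded userspace sketch of it. It is not the XFS code: every `fake_*` name is a hypothetical stand-in invented for illustration, the semaphore is modelled as a plain counter, and the function returns an int only so the outcome can be checked (the real xlog_write_iclog returns void). It only illustrates that both failure paths funnel through one label that releases the semaphore, so a later waiter (the unmount path in the trace above) is not left blocked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counting semaphore, modelled as a counter (no real blocking). */
struct fake_sema {
	int count;
};

/* Take the semaphore if available; a real down() would sleep instead. */
static bool fake_down_trylock(struct fake_sema *s)
{
	if (s->count <= 0)
		return false;
	s->count--;
	return true;
}

static void fake_up(struct fake_sema *s)
{
	s->count++;
}

/* Hypothetical stand-in for an iclog: only the fields this sketch needs. */
struct fake_iclog {
	struct fake_sema ic_sema;
	bool submitted;
};

/* Stand-ins for the flush and map steps: nonzero means failure. */
static int fake_flush(bool fail) { return fail ? -5 : 0; }
static int fake_map(bool fail)   { return fail ? -5 : 0; }

/* Models I/O completion, which releases the semaphore on the success path. */
static void fake_ioend_work(struct fake_iclog *iclog)
{
	fake_up(&iclog->ic_sema);
}

/* Returns 0 on success, nonzero if the sketch "shut down". */
static int fake_write_iclog(struct fake_iclog *iclog, bool flush_fails,
			    bool map_fails)
{
	bool got = fake_down_trylock(&iclog->ic_sema);

	assert(got);	/* the sketch assumes the semaphore is free on entry */
	(void)got;

	if (fake_flush(flush_fails))
		goto shutdown;
	if (fake_map(map_fails))
		goto shutdown;

	/* Success: the I/O is "submitted" and completion does the release. */
	iclog->submitted = true;
	fake_ioend_work(iclog);
	return 0;

shutdown:
	/* No I/O was submitted, so nothing else will release the semaphore. */
	fake_up(&iclog->ic_sema);
	return -5;
}

/* True if a later waiter (like unmount) could take the semaphore. */
static bool later_waiter_can_proceed(struct fake_iclog *iclog)
{
	if (!fake_down_trylock(&iclog->ic_sema))
		return false;
	fake_up(&iclog->ic_sema);
	return true;
}
```

Removing the `fake_up()` under the `shutdown:` label reproduces the shape of the reported bug in this model: the counter stays at zero after a failed flush and `later_waiter_can_proceed()` returns false.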
Thread overview (4+ messages):
2023-10-23 18:14 [PATCH] xfs: up(ic_sema) if flushing data device fails Leah Rumancik
2023-10-23 21:22 ` Darrick J. Wong
2023-10-24 0:44 ` Leah Rumancik [this message]
2023-10-24 1:41 ` Darrick J. Wong