public inbox for linux-xfs@vger.kernel.org
From: Leah Rumancik <leah.rumancik@gmail.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: up(ic_sema) if flushing data device fails
Date: Mon, 23 Oct 2023 17:44:14 -0700
Message-ID: <ZTcTXnVVTX747zqP@google.com>
In-Reply-To: <20231023212221.GV3195650@frogsfrogsfrogs>

On Mon, Oct 23, 2023 at 02:22:21PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 23, 2023 at 11:14:10AM -0700, Leah Rumancik wrote:
> > We flush the data device cache before we issue external log IO. Since
> > 7d839e325af2, we check the return value of the flush, and if the flush
> > failed, we shut down the log immediately and return. However, the
> > iclog->ic_sema is left in a decremented state so let's add an up().
> > Prior to this patch, xfs/438 would fail consistently when running with
> > an external log device:
> > 
> > sync
> >   -> xfs_log_force
> >   -> xlog_write_iclog
> >       -> down(&iclog->ic_sema)
> >       -> blkdev_issue_flush (fail causes us to initiate shutdown)
> >           -> xlog_force_shutdown
> >           -> return
> > 
> > unmount
> >   -> xfs_log_umount
> >       -> xlog_wait_iclog_completion
> >           -> down(&iclog->ic_sema) --------> HANG
> > 
> > There is a second early return / shutdown. Add an up() there as well.
> 
> Ow.  Yes, I think it's correct that both of those error returns need to
> drop ic_sema since we don't submit_bio, so there is no xlog_ioend_work
> to do it for us.
> 
> > Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> 
> Hmm.  This bug was introduced in b5d721eaae47e ("xfs: external logs need
> to flush data device"), not 7d839.  That said, this patch only applies
> cleanly to 7d839e325af2.
> 
> b5d721 was introduced in 5.14 and 7d839 came in via 6.0, so ... this is
> where I would have spat out:
> 
> Fixes: 7d839e325af2 ("xfs: check return codes when flushing block devices")
> Actually-Fixes: b5d721eaae47e ("xfs: external logs need to flush data device")

I'm not sure I follow. Before 7d839e325af2, we didn't return if there
was an issue with the flush so there wasn't an issue with ic_sema. 7d839e325af2
was a fix for eef983ffeae7 though. Do you try to keep fixes tags
associated with the original commit as opposed to fixes of fixes?

> 
> > Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
> > ---
> > 
> > Notes:
> >     Tested auto group for xfs/4k and xfs/logdev configs with no regressions
> >     seen.
> > 
> >  fs/xfs/xfs_log.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 51c100c86177..b4a8105299c2 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1926,6 +1926,7 @@ xlog_write_iclog(
> >  		 */
> >  		if (log->l_targ != log->l_mp->m_ddev_targp &&
> >  		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
> > +			up(&iclog->ic_sema);
> >  			xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> >  			return;
> >  		}
> > @@ -1936,6 +1937,7 @@ xlog_write_iclog(
> >  	iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
> >  
> >  	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
> > +		up(&iclog->ic_sema);
> >  		xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> 
> I wonder if these two should both become a cleanup clause at the end?

Sure, that sounds good, I'll create a new version.

Thanks for the review! :)
- Leah
> 
> 		if (log->l_targ != log->l_mp->m_ddev_targp &&
> 		    blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev))
> 			goto shutdown;
> 
> ...
> 
> 	if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count))
> 		goto shutdown;
> 
> ...
> 
> 	submit_bio(&iclog->ic_bio);
> 	return;
> 
> shutdown:
> 	up(&iclog->ic_sema);
> 	xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
> 
> I suggest this seeing as we've screwed this up twice now, not a whole
> lot of people actually use external logs, and somehow I've never seen
> this on my test fleet.
> 
> Anyway the code change looks correct so modulo the stylistic thing,
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> 
> --D
> 
> >  		return;
> >  	}
> > -- 
> > 2.42.0.758.gaed0368e0e-goog
> > 
