public inbox for linux-xfs@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/5] xfs: external logs need to flush data device
Date: Thu, 22 Jul 2021 16:10:19 -0700
Message-ID: <20210722231019.GO559212@magnolia>
In-Reply-To: <20210722214539.GP664593@dread.disaster.area>

On Fri, Jul 23, 2021 at 07:45:39AM +1000, Dave Chinner wrote:
> On Thu, Jul 22, 2021 at 11:14:45AM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 22, 2021 at 11:53:32AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > The recent journal flush/FUA changes replaced the flushing of the
> > > data device on every iclog write with an up-front async data device
> > > cache flush. Unfortunately, the assumption on which this was
> > > based has been proven incorrect by the flush vs log tail update
> > > ordering issue. As the fix for that issue uses the
> > > XLOG_ICL_NEED_FLUSH flag to indicate that the data device needs a
> > > cache flush, we now need to (once again) ensure that an iclog write
> > > to an external log that needs a cache flush actually issues a cache
> > > flush to the data device as well as the log device.
> > > 
> > > Fixes: eef983ffeae7 ("xfs: journal IO cache flush reductions")
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 19 +++++++++++--------
> > >  1 file changed, 11 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 96434cc4df6e..a3c4d48195d9 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -827,13 +827,6 @@ xlog_write_unmount_record(
> > >  	/* account for space used by record data */
> > >  	ticket->t_curr_res -= sizeof(ulf);
> > >  
> > > -	/*
> > > -	 * For external log devices, we need to flush the data device cache
> > > -	 * first to ensure all metadata writeback is on stable storage before we
> > > -	 * stamp the tail LSN into the unmount record.
> > > -	 */
> > > -	if (log->l_targ != log->l_mp->m_ddev_targp)
> > > -		blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev);
> > >  	return xlog_write(log, &vec, ticket, NULL, NULL, XLOG_UNMOUNT_TRANS);
> > >  }
> > >  
> > > @@ -1796,10 +1789,20 @@ xlog_write_iclog(
> > >  	 * metadata writeback and causing priority inversions.
> > >  	 */
> > >  	iclog->ic_bio.bi_opf = REQ_OP_WRITE | REQ_META | REQ_SYNC | REQ_IDLE;
> > > -	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH)
> > > +	if (iclog->ic_flags & XLOG_ICL_NEED_FLUSH) {
> > >  		iclog->ic_bio.bi_opf |= REQ_PREFLUSH;
> > > +		/*
> > > +		 * For external log devices, we also need to flush the data
> > > +		 * device cache first to ensure all metadata writeback covered
> > > +		 * by the LSN in this iclog is on stable storage. This is slow,
> > > +		 * but it *must* complete before we issue the external log IO.
> > 
> > I'm a little confused about what's going on here.  We're about to write
> > a log record to disk, with h_tail_lsn reflecting the tail of the log and
> > h_lsn reflecting the current head of the log (i.e. this record).
> > 
> > If the log tail has moved forward since the last log record was written
> > and this fs has an external log, we need to flush the data device
> > because the AIL could have written logged items back into the filesystem
> > and we need to ensure those items have been persisted before we write to
> > the log the fact that the tail moved forward.  The AIL itself doesn't
> > issue cache flushes (nor does it need to), so that's why we do that
> > here.
> > 
> > Why don't we need a flush like this if only FUA is set?  Is it not
> > possible to write a checkpoint that fits within a single iclog after the
> > log tail has moved forward?
> 
> Yes, it is, and that race condition is exactly what the next
> patch in the series addresses. If the log tail moves after we issued
> the data device cache flush (which happens before we start writing
> the checkpoint to the iclogs), then we detect that when releasing the
> commit iclog and set the XLOG_ICL_NEED_FLUSH flag on it. That will
> then trigger this code to issue a data device cache flush....

Aha, yeah, I noticed that after scanning the next few patches.
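
For anyone following along, here is a minimal userspace sketch of the
detection Dave describes above. It is a model of the idea only, not the
code from the next patch: struct model_iclog, MODEL_ICL_NEED_FLUSH and
model_release_commit_iclog() are hypothetical names, and comparing two
plain LSN values is an assumption about how "the tail moved" is noticed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XLOG_ICL_NEED_FLUSH; not the kernel value. */
#define MODEL_ICL_NEED_FLUSH	(1u << 0)

/* Hypothetical, heavily reduced model of an in-core log buffer. */
struct model_iclog {
	unsigned int	flags;
	uint64_t	tail_lsn;	/* tail stamped into the record */
};

/*
 * tail_at_flush: log tail seen when the up-front data device cache
 *                flush was issued, before the checkpoint was written.
 * current_tail:  log tail at the time the commit iclog is released.
 *
 * If the tail moved in between, the earlier flush does not cover the
 * metadata writeback that allowed the move, so the iclog is marked as
 * needing another flush. Returns true if the tail moved.
 */
static bool
model_release_commit_iclog(
	struct model_iclog	*iclog,
	uint64_t		tail_at_flush,
	uint64_t		current_tail)
{
	bool			moved = current_tail != tail_at_flush;

	iclog->tail_lsn = current_tail;
	if (moved)
		iclog->flags |= MODEL_ICL_NEED_FLUSH;
	return moved;
}
```

In this model an unchanged tail leaves the flags alone, while a tail
that advanced between the flush and the release sets the flag, which is
what then triggers the flush handling in xlog_write_iclog().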

> IOWs, for external logs, the XLOG_ICL_NEED_FLUSH flag indicates that
> both the data device and the log device need a cache flush, rather
> than just the log device. I think it could be split into two flags,
> but then my head explodes thinking about log forces and trying to
> determine what type of flush is implied (and what flags we'd need to
> set) when we return log_flushed = true....

Maybe later when we're not focussed on recovery failures.
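
To make the single-flag semantics concrete, a rough userspace sketch of
which flushes a flagged iclog write implies. This is a model under
stated assumptions, not the kernel implementation: the MODEL_* flags,
struct model_flush_plan and model_plan_iclog_write() are hypothetical,
and the "internal log shares the data device" shortcut is my reading of
why only external logs need the extra flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the XLOG_ICL_NEED_* flags. */
enum {
	MODEL_NEED_FLUSH	= 1 << 0,
	MODEL_NEED_FUA		= 1 << 1,
};

/* What one iclog write has to do, in the order the fields are listed. */
struct model_flush_plan {
	bool	flush_data_dev_first;	/* must complete before the log IO */
	bool	log_preflush;		/* REQ_PREFLUSH on the log write */
	bool	log_fua;		/* REQ_FUA on the log write */
};

static struct model_flush_plan
model_plan_iclog_write(
	unsigned int	ic_flags,
	bool		external_log)
{
	struct model_flush_plan	plan = { false, false, false };

	if (ic_flags & MODEL_NEED_FLUSH) {
		plan.log_preflush = true;
		/*
		 * Assumption: with an internal log the preflush already
		 * hits the data device, so only external logs need the
		 * separate data device flush.
		 */
		plan.flush_data_dev_first = external_log;
	}
	if (ic_flags & MODEL_NEED_FUA)
		plan.log_fua = true;

	/* A data device flush is never planned without the log preflush. */
	assert(!plan.flush_data_dev_first || plan.log_preflush);
	return plan;
}
```

With only the FUA flag set, the plan contains no data device flush at
all, which is the case I was asking about earlier and the reason the
tail-movement check has to set the flush flag.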

In the meantime, I'm satisfied enough to
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

Thread overview: 22+ messages
2021-07-22  1:53 [PATCH 0/5] xfs: fix log cache flush regressions Dave Chinner
2021-07-22  1:53 ` [PATCH 1/5] xfs: flush data dev on external log write Dave Chinner
2021-07-22  6:41   ` Christoph Hellwig
2021-07-22 15:52   ` Darrick J. Wong
2021-07-22  1:53 ` [PATCH 2/5] xfs: external logs need to flush data device Dave Chinner
2021-07-22  6:48   ` Christoph Hellwig
2021-07-22 18:14   ` Darrick J. Wong
2021-07-22 21:45     ` Dave Chinner
2021-07-22 23:10       ` Darrick J. Wong [this message]
2021-07-22  1:53 ` [PATCH 3/5] xfs: fix ordering violation between cache flushes and tail updates Dave Chinner
2021-07-22  7:06   ` Christoph Hellwig
2021-07-22  7:28     ` Dave Chinner
2021-07-22 19:12     ` Darrick J. Wong
2021-07-22  1:53 ` [PATCH 4/5] xfs: log forces imply data device cache flushes Dave Chinner
2021-07-22  7:14   ` Christoph Hellwig
2021-07-22  7:32     ` Dave Chinner
2021-07-22 19:30   ` Darrick J. Wong
2021-07-22 22:12     ` Dave Chinner
2021-07-22 23:13       ` Darrick J. Wong
2021-07-22  1:53 ` [PATCH 5/5] xfs: avoid unnecessary waits in xfs_log_force_lsn() Dave Chinner
2021-07-22  7:15   ` Christoph Hellwig
2021-07-22 19:13   ` Darrick J. Wong
