Re: [PATCH 4/8] libxfs: enable tools to check that metadata updates have been committed

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: sandeen@sandeen.net, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/8] libxfs: enable tools to check that metadata updates have been committed
Date: Thu, 20 Feb 2020 10:26:42 -0800	[thread overview]
Message-ID: <20200220182642.GZ9506@magnolia> (raw)
In-Reply-To: <20200220175850.GH48977@bfoster>

On Thu, Feb 20, 2020 at 12:58:50PM -0500, Brian Foster wrote:
> On Thu, Feb 20, 2020 at 08:46:38AM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 20, 2020 at 09:06:12AM -0500, Brian Foster wrote:
> > > On Wed, Feb 19, 2020 at 05:42:06PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Add a new function that will ensure that everything we changed has
> > > > landed on stable media, and report the results.  Subsequent commits will
> > > > teach the individual programs to report when things go wrong.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  include/xfs_mount.h |    3 +++
> > > >  libxfs/init.c       |   43 +++++++++++++++++++++++++++++++++++++++++++
> > > >  libxfs/libxfs_io.h  |    2 ++
> > > >  libxfs/rdwr.c       |   27 +++++++++++++++++++++++++--
> > > >  4 files changed, 73 insertions(+), 2 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/include/xfs_mount.h b/include/xfs_mount.h
> > > > index 29b3cc1b..c80aaf69 100644
> > > > --- a/include/xfs_mount.h
> > > > +++ b/include/xfs_mount.h
> > > > @@ -187,4 +187,7 @@ extern xfs_mount_t	*libxfs_mount (xfs_mount_t *, xfs_sb_t *,
> > > >  extern void	libxfs_umount (xfs_mount_t *);
> > > >  extern void	libxfs_rtmount_destroy (xfs_mount_t *);
> > > >  
> > > > +void libxfs_flush_devices(struct xfs_mount *mp, int *datadev, int *logdev,
> > > > +		int *rtdev);
> > > > +
> > > >  #endif	/* __XFS_MOUNT_H__ */
> > > > diff --git a/libxfs/init.c b/libxfs/init.c
> > > > index a0d4b7f4..d1d3f4df 100644
> > > > --- a/libxfs/init.c
> > > > +++ b/libxfs/init.c
> > > > @@ -569,6 +569,8 @@ libxfs_buftarg_alloc(
> > > >  	}
> > > >  	btp->bt_mount = mp;
> > > >  	btp->dev = dev;
> > > > +	btp->lost_writes = false;
> > > > +
> > > >  	return btp;
> > > >  }
> > > >  
> > > > @@ -791,6 +793,47 @@ libxfs_rtmount_destroy(xfs_mount_t *mp)
> > > >  	mp->m_rsumip = mp->m_rbmip = NULL;
> > > >  }
> > > >  
> > > > +static inline int
> > > > +libxfs_flush_buftarg(
> > > > +	struct xfs_buftarg	*btp)
> > > > +{
> > > > +	if (btp->lost_writes)
> > > > +		return -ENOTRECOVERABLE;
> > > 
> > > I'm curious why we'd want to skip the flush just because some writes
> > > happened to fail..? I suppose the fs might be borked, but it seems a
> > > little strange to at least not try the flush, particularly since we
> > > might still flush any of the other two possible devices.
> > 
> > My thinking here was that if the write verifiers (or the pwrite() calls
> > themselves) failed then there's no point in telling the disk to flush
> > its write cache since we already know it's not in sync with the buffer
> > cache.
> > 
> 
> I suppose, but it seems there is some value in flushing what we did
> write.. That's effectively historical behavior (since we ignored
> errors), right?

It's the historical behavior, yes.  I don't think it makes much sense,
but OTOH I'm not opposed to restoring that.

> > > > +
> > > > +	return libxfs_blkdev_issue_flush(btp);
> > > > +}
> > > > +
> > > > +/*
> > > > + * Purge the buffer cache to write all dirty buffers to disk and free all
> > > > + * incore buffers.  Buffers that cannot be written will cause the lost_writes
> > > > + * flag to be set in the buftarg.  If there were no lost writes, flush the
> > > > + * device to make sure the writes made it to stable storage.
> > > > + *
> > > > + * For each device, the return code will be set to -ENOTRECOVERABLE if we
> > > > + * couldn't write something to disk; or the results of the block device flush
> > > > + * operation.
> > > 
> > > Why not -EIO?
> > 
> > Originally I thought it might be useful to be able to distinguish
> > between "dirty buffers never even made it out of the buffer cache" vs.
> > "dirty buffers were sent to the disk but the disk sent back media
> > errors", though in the end the userspace tools don't make any
> > distinction.
> > 
> > That said, looking at this again, maybe I should track write verifier
> > failure separately so that we can return EFSCORRUPTED for that?
> > 
> 
> It's not clear to me that anything application level would care much
> about verifier failure vs. I/O failure, but I've no objection to doing
> something like that either.

Yeah.  The single usecase I can think of is where repair trips over a
write verifier and we should make it really obvious to the sysadmin that
repair is buggy and needs either (a) an upgrade or (b) a complaint filed
on linux-xfs.

--D

> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > + */
> > > > +void
> > > > +libxfs_flush_devices(
> > > > +	struct xfs_mount	*mp,
> > > > +	int			*datadev,
> > > > +	int			*logdev,
> > > > +	int			*rtdev)
> > > > +{
> > > > +	*datadev = *logdev = *rtdev = 0;
> > > > +
> > > > +	libxfs_bcache_purge();
> > > > +
> > > > +	if (mp->m_ddev_targp)
> > > > +		*datadev = libxfs_flush_buftarg(mp->m_ddev_targp);
> > > > +
> > > > +	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
> > > > +		*logdev = libxfs_flush_buftarg(mp->m_logdev_targp);
> > > > +
> > > > +	if (mp->m_rtdev_targp)
> > > > +		*rtdev = libxfs_flush_buftarg(mp->m_rtdev_targp);
> > > > +}
> > > > +
> > > >  /*
> > > >   * Release any resource obtained during a mount.
> > > >   */
> > > > diff --git a/libxfs/libxfs_io.h b/libxfs/libxfs_io.h
> > > > index 579df52b..fc0fd060 100644
> > > > --- a/libxfs/libxfs_io.h
> > > > +++ b/libxfs/libxfs_io.h
> > > > @@ -23,10 +23,12 @@ struct xfs_perag;
> > > >  struct xfs_buftarg {
> > > >  	struct xfs_mount	*bt_mount;
> > > >  	dev_t			dev;
> > > > +	bool			lost_writes;
> > > >  };
> > > >  
> > > >  extern void	libxfs_buftarg_init(struct xfs_mount *mp, dev_t ddev,
> > > >  				    dev_t logdev, dev_t rtdev);
> > > > +int libxfs_blkdev_issue_flush(struct xfs_buftarg *btp);
> > > >  
> > > >  #define LIBXFS_BBTOOFF64(bbs)	(((xfs_off_t)(bbs)) << BBSHIFT)
> > > >  
> > > > diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
> > > > index 8b47d438..92e497f9 100644
> > > > --- a/libxfs/rdwr.c
> > > > +++ b/libxfs/rdwr.c
> > > > @@ -17,6 +17,7 @@
> > > >  #include "xfs_inode_fork.h"
> > > >  #include "xfs_inode.h"
> > > >  #include "xfs_trans.h"
> > > > +#include "libfrog/platform.h"
> > > >  
> > > >  #include "libxfs.h"		/* for LIBXFS_EXIT_ON_FAILURE */
> > > >  
> > > > @@ -1227,9 +1228,11 @@ libxfs_brelse(
> > > >  
> > > >  	if (!bp)
> > > >  		return;
> > > > -	if (bp->b_flags & LIBXFS_B_DIRTY)
> > > > +	if (bp->b_flags & LIBXFS_B_DIRTY) {
> > > >  		fprintf(stderr,
> > > >  			"releasing dirty buffer to free list!\n");
> > > > +		bp->b_target->lost_writes = true;
> > > > +	}
> > > >  
> > > >  	pthread_mutex_lock(&xfs_buf_freelist.cm_mutex);
> > > >  	list_add(&bp->b_node.cn_mru, &xfs_buf_freelist.cm_list);
> > > > @@ -1248,9 +1251,11 @@ libxfs_bulkrelse(
> > > >  		return 0 ;
> > > >  
> > > >  	list_for_each_entry(bp, list, b_node.cn_mru) {
> > > > -		if (bp->b_flags & LIBXFS_B_DIRTY)
> > > > +		if (bp->b_flags & LIBXFS_B_DIRTY) {
> > > >  			fprintf(stderr,
> > > >  				"releasing dirty buffer (bulk) to free list!\n");
> > > > +			bp->b_target->lost_writes = true;
> > > > +		}
> > > >  		count++;
> > > >  	}
> > > >  
> > > > @@ -1479,6 +1484,24 @@ libxfs_irele(
> > > >  	kmem_cache_free(xfs_inode_zone, ip);
> > > >  }
> > > >  
> > > > +/*
> > > > + * Flush everything dirty in the kernel and disk write caches to stable media.
> > > > + * Returns 0 for success or a negative error code.
> > > > + */
> > > > +int
> > > > +libxfs_blkdev_issue_flush(
> > > > +	struct xfs_buftarg	*btp)
> > > > +{
> > > > +	int			fd, ret;
> > > > +
> > > > +	if (btp->dev == 0)
> > > > +		return 0;
> > > > +
> > > > +	fd = libxfs_device_to_fd(btp->dev);
> > > > +	ret = platform_flush_device(fd, btp->dev);
> > > > +	return ret ? -errno : 0;
> > > > +}
> > > > +
> > > >  /*
> > > >   * Write out a buffer list synchronously.
> > > >   *
> > > > 
> > > 
> > 
>

next prev parent reply	other threads:[~2020-02-20 18:26 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-20  1:41 [PATCH v2 0/8] xfsprogs: actually check that writes succeeded Darrick J. Wong
2020-02-20  1:41 ` [PATCH 1/8] libxfs: libxfs_buf_delwri_submit should write buffers immediately Darrick J. Wong
2020-02-20  1:41 ` [PATCH 2/8] libxfs: complain when write IOs fail Darrick J. Wong
2020-02-20  1:42 ` [PATCH 3/8] libxfs: return flush failures Darrick J. Wong
2020-02-20  1:42 ` [PATCH 4/8] libxfs: enable tools to check that metadata updates have been committed Darrick J. Wong
2020-02-20 14:06   ` Brian Foster
2020-02-20 16:46     ` Darrick J. Wong
2020-02-20 17:58       ` Brian Foster
2020-02-20 18:26         ` Darrick J. Wong [this message]
2020-02-20 18:50           ` Brian Foster
2020-02-20 23:40   ` Dave Chinner
2020-02-21  0:33     ` Darrick J. Wong
2020-02-20  1:42 ` [PATCH 5/8] xfs_db: " Darrick J. Wong
2020-02-20 14:06   ` Brian Foster
2020-02-20 16:58     ` Darrick J. Wong
2020-02-20 17:58       ` Brian Foster
2020-02-20 18:34         ` Darrick J. Wong
2020-02-21  0:01           ` Dave Chinner
2020-02-21  0:39             ` Darrick J. Wong
2020-02-21  1:17               ` Dave Chinner
2020-02-20  1:42 ` [PATCH 6/8] mkfs: " Darrick J. Wong
2020-02-20  1:42 ` [PATCH 7/8] xfs_repair: " Darrick J. Wong
2020-02-20  1:42 ` [PATCH 8/8] libfrog: always fsync when flushing a device Darrick J. Wong
2020-02-20 14:06   ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200220182642.GZ9506@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox