All of lore.kernel.org
 help / color / mirror / Atom feed
From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 15/15] btrfs: unify buffered and direct I/O read repair
Date: Fri, 20 Mar 2020 14:28:22 -0700	[thread overview]
Message-ID: <20200320212822.GD32817@vader> (raw)
In-Reply-To: <37bf11cc-92b3-2b15-ee87-0cbe8c662cc7@suse.com>

On Thu, Mar 19, 2020 at 10:53:22AM +0200, Nikolay Borisov wrote:
> 
> 
> On 9.03.20 г. 23:32 ч., Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > Currently, direct I/O has its own versions of bio_readpage_error() and
> > btrfs_check_repairable() (dio_read_error() and
> > btrfs_check_dio_repairable(), respectively). The main difference is that
> > the direct I/O version doesn't do read validation. The rework of direct
> > I/O repair makes it possible to do validation, so we can get rid of
> > btrfs_check_dio_repairable() and combine bio_readpage_error() and
> > dio_read_error() into a new helper, btrfs_submit_read_repair().
> > 
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> >  fs/btrfs/extent_io.c | 126 +++++++++++++++++++------------------------
> >  fs/btrfs/extent_io.h |  17 +++---
> >  fs/btrfs/inode.c     | 103 ++++-------------------------------
> >  3 files changed, 76 insertions(+), 170 deletions(-)
> > 
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index fad86ef4d09d..a5cbe04da803 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> 
> <snip>
> 
> > -/*
> > - * This is a generic handler for readpage errors. If other copies exist, read
> > - * those and write back good data to the failed position. Does not investigate
> > - * in remapping the failed extent elsewhere, hoping the device will be smart
> > - * enough to do this as needed
> > - */
> > -static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
> > -			      struct page *page, u64 start, u64 end,
> > -			      int failed_mirror)
> > +blk_status_t btrfs_submit_read_repair(struct inode *inode,
> > +				      struct bio *failed_bio, u64 phy_offset,
> > +				      struct page *page, unsigned int pgoff,
> > +				      u64 start, u64 end, int failed_mirror,
> > +				      submit_bio_hook_t *submit_bio_hook)
> >  {
> >  	struct io_failure_record *failrec;
> > -	struct inode *inode = page->mapping->host;
> > +	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> >  	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
> >  	struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
> > +	struct btrfs_io_bio *failed_io_bio = btrfs_io_bio(failed_bio);
> > +	struct btrfs_io_bio *io_bio;
> > +	int icsum = phy_offset >> inode->i_sb->s_blocksize_bits;
> >  	bool need_validation = false;
> >  	struct bio *bio;
> > -	int read_mode = 0;
> >  	blk_status_t status;
> >  	int ret;
> >  
> > +	btrfs_info(btrfs_sb(inode->i_sb),
> > +		   "Repair Read Error: read error at %llu", start);
> > +
> >  	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
> >  
> >  	ret = btrfs_get_io_failure_record(inode, start, end, &failrec);
> >  	if (ret)
> > -		return ret;
> > +		return errno_to_blk_status(ret);
> >  
> >  	/*
> >  	 * If there was an I/O error and the I/O was for multiple sectors, we
> >  	 * need to validate each sector individually.
> >  	 */
> >  	if (failed_bio->bi_status != BLK_STS_OK) {
> 
> Is this correct though, in case of buffered reads we are always called
> with bi_status != BLK_STS_OK (we are called from end_bio_extent_readpage
> in case uptodate is false,  which happens if failed_bio->bi_status is
> non-zero. Additionally the bio is guaranteed to not be cloned because
> there is : ASSERT(!bio_flagged(bio, BIO_CLONED));
> 
> The end effect of all of this is in case of buffered bios we never set
> need_revalidate, is this intentional?

For buffered I/O, this is called when bi_status != BLK_STS_OK OR
readpage_end_io_hook (i.e., check_data_csum()) failed. This check
distinguishes between those two cases: if we didn't hit an I/O error
(bi_status == BLK_STS_OK), then we don't need validation, otherwise, we
need validation if the bio is more than one sector.

> > -		u64 len = 0;
> > -		int i;
> > -
> > -		for (i = 0; i < failed_bio->bi_vcnt; i++) {
> > -			len += failed_bio->bi_io_vec[i].bv_len;
> > -			if (len > inode->i_sb->s_blocksize) {
> > +		if (bio_flagged(failed_bio, BIO_CLONED)) {
> 
> If I understand this correctly this is the "this is a DIO " branch. IMO
> it'd be clearer if you had bool is_dio = bio_flagged(failed_bio,
> BIO_CLONED) at the top of the function and you used that.

Repair bios for direct I/O aren't cloned, so is_dio isn't accurate. IMO
it shouldn't matter whether it came from direct I/O or not. If it's a
cloned bio, you get the size out of io_bio->iter, and if it's not, you
get it out of bi_io_vec.

> > +			if (failed_io_bio->iter.bi_size >
> > +			    inode->i_sb->s_blocksize)
> >  				need_validation = true;
> > -				break;
> > +		} else {
> 
> This branch will only ever be executed in case of DIO with csum failure.
> So either add a comment to demarcate when various leaves of the 2 'if'
> should be called or, and I think this would be the better solution,
> rewrite it.

As commented above, this outer branch is for I/O errors (not checksum
errors), and this specific branch is for non-cloned bios, which happens
to be buffered read bios and buffered or direct I/O repair bios. Would
it be clearer as:

static u64 btrfs_bio_size(struct bio *bio)
{
	if (bio_flagged(bio, BIO_CLONED))
		return bio->iter.bi_size;
	else
		return bio_size_all(bio);
}

blk_status_t btrfs_submit_read_repair(...)
{
	...
	/*
	 * If there was an I/O error and the I/O was for multiple sectors, we
	 * need to validate each sector individually.
	 */
	need_validation = (failed_bio->bi_status != BLK_STS_OK &&
			   btrfs_bio_size() > inode->i_sb->s_blocksize);
	...
}

  parent reply	other threads:[~2020-03-20 21:28 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-09 21:32 [PATCH 00/15] btrfs: read repair/direct I/O improvements Omar Sandoval
2020-03-09 21:32 ` [PATCH 01/15] btrfs: fix error handling when submitting direct I/O bio Omar Sandoval
2020-03-11 17:54   ` Josef Bacik
2020-03-17 13:46   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 02/15] btrfs: fix double __endio_write_update_ordered in direct I/O Omar Sandoval
2020-03-10 16:30   ` Christoph Hellwig
2020-03-11  9:03     ` Omar Sandoval
2020-03-17 14:04   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 03/15] btrfs: look at full bi_io_vec for repair decision Omar Sandoval
2020-03-10 16:33   ` Christoph Hellwig
2020-03-11  9:07     ` Omar Sandoval
2020-03-16 10:48       ` Christoph Hellwig
2020-03-17 14:38   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 04/15] btrfs: don't do repair validation for checksum errors Omar Sandoval
2020-03-11 17:55   ` Josef Bacik
2020-03-09 21:32 ` [PATCH 05/15] btrfs: clarify btrfs_lookup_bio_sums documentation Omar Sandoval
2020-03-11 17:56   ` Josef Bacik
2020-03-11 18:23     ` Omar Sandoval
2020-03-11 18:34       ` Josef Bacik
2020-03-17 14:38   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 06/15] btrfs: rename __readpage_endio_check to check_data_csum Omar Sandoval
2020-03-10 14:46   ` Johannes Thumshirn
2020-03-11 17:57   ` Josef Bacik
2020-03-17 14:39   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 07/15] btrfs: make btrfs_check_repairable() static Omar Sandoval
2020-03-10 14:53   ` Johannes Thumshirn
2020-03-11 17:58   ` Josef Bacik
2020-03-17 14:52   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 08/15] btrfs: move btrfs_dio_private to inode.c Omar Sandoval
2020-03-10 14:56   ` Johannes Thumshirn
2020-03-11  8:48     ` Omar Sandoval
2020-03-17 14:53   ` Nikolay Borisov
2020-03-19 16:16   ` David Sterba
2020-03-09 21:32 ` [PATCH 09/15] btrfs: kill btrfs_dio_private->private Omar Sandoval
2020-03-11 17:59   ` Josef Bacik
2020-03-17 14:54   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 10/15] btrfs: convert btrfs_dio_private->pending_bios to refcount_t Omar Sandoval
2020-03-11 18:00   ` Josef Bacik
2020-03-17 15:10   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 11/15] btrfs: put direct I/O checksums in btrfs_dio_private instead of bio Omar Sandoval
2020-03-11 18:04   ` Josef Bacik
2020-03-17 16:37   ` Nikolay Borisov
2020-04-03 16:18   ` David Sterba
2020-03-09 21:32 ` [PATCH 12/15] btrfs: get rid of one layer of bios in direct I/O Omar Sandoval
2020-03-10 16:38   ` Christoph Hellwig
2020-03-11  9:19     ` Omar Sandoval
2020-03-16 10:49       ` Christoph Hellwig
2020-03-11 18:07   ` Josef Bacik
2020-03-17 16:48   ` Nikolay Borisov
2020-03-09 21:32 ` [PATCH 13/15] btrfs: simplify direct I/O read repair Omar Sandoval
2020-03-11 18:16   ` Josef Bacik
2020-04-03 16:40   ` David Sterba
2020-04-03 18:05     ` Omar Sandoval
2020-04-16 10:08       ` David Sterba
2020-03-09 21:32 ` [PATCH 14/15] btrfs: get rid of endio_repair_workers Omar Sandoval
2020-03-11 18:16   ` Josef Bacik
2020-03-09 21:32 ` [PATCH 15/15] btrfs: unify buffered and direct I/O read repair Omar Sandoval
2020-03-11 18:19   ` Josef Bacik
2020-03-19  8:53   ` Nikolay Borisov
2020-03-19  9:03     ` Christoph Hellwig
2020-03-20 21:28     ` Omar Sandoval [this message]
2020-03-10 16:39 ` [PATCH 00/15] btrfs: read repair/direct I/O improvements Christoph Hellwig
2020-03-11  9:22   ` Omar Sandoval
2020-03-18 16:33     ` Goldwyn Rodrigues
2020-03-19 14:08       ` David Sterba
2020-03-18 22:07 ` David Sterba
2020-03-20 21:29   ` Omar Sandoval
2020-03-20 14:10 ` Christoph Hellwig
2020-03-20 14:11   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200320212822.GD32817@vader \
    --to=osandov@osandov.com \
    --cc=hch@lst.de \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.