From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 3/4] btrfs: submit read time repair only for each corrupted sector
Date: Mon, 3 May 2021 10:08:55 +0800 [thread overview]
Message-ID: <20210503020856.93333-4-wqu@suse.com> (raw)
In-Reply-To: <20210503020856.93333-1-wqu@suse.com>
Currently btrfs_submit_read_repair() has some extra check on whether the
failed bio needs extra validation for repair.
But we can avoid all these extra mechanism if we submit the repair for
each sector.
By this, each read repair can be easily handled without the need to
verify which sector is corrupted.
This will also benefit subpage, as one subpage bvec can contain several
sectors, making the extra verification more complex.
So this patch will:
- Introduce repair_one_sector()
The main code submitting repair, which is more or less the same as old
btrfs_submit_read_repair().
But this time, it only repair one sector.
- Make btrfs_submit_read_repair() to handle sectors differently
For sectors without csum error, just release them like what we did
in end_bio_extent_readpage().
Although in this context we don't have process_extent structure, thus
we have to do extent tree operations sector by sector.
This is slower, but since it's only in csum mismatch path, it should
be fine.
For sectors with csum error, we submit repair for each sector.
This patch will focus on the change on the repair path, the extra
validation code is still kept as is, and will be cleaned up later.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 144 ++++++++++++++++++++++++++++++-------------
fs/btrfs/extent_io.h | 1 +
fs/btrfs/inode.c | 2 +-
3 files changed, 104 insertions(+), 43 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0787fae5f7f1..dca204730d17 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2494,7 +2494,7 @@ void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start, u64 end)
}
static struct io_failure_record *btrfs_get_io_failure_record(struct inode *inode,
- u64 start, u64 end)
+ u64 start)
{
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct io_failure_record *failrec;
@@ -2502,6 +2502,7 @@ static struct io_failure_record *btrfs_get_io_failure_record(struct inode *inode
struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
+ const u32 sectorsize = fs_info->sectorsize;
int ret;
u64 logical;
@@ -2525,7 +2526,7 @@ static struct io_failure_record *btrfs_get_io_failure_record(struct inode *inode
return ERR_PTR(-ENOMEM);
failrec->start = start;
- failrec->len = end - start + 1;
+ failrec->len = sectorsize;
failrec->this_mirror = 0;
failrec->bio_flags = 0;
failrec->in_validation = 0;
@@ -2564,12 +2565,13 @@ static struct io_failure_record *btrfs_get_io_failure_record(struct inode *inode
free_extent_map(em);
/* Set the bits in the private failure tree */
- ret = set_extent_bits(failure_tree, start, end,
+ ret = set_extent_bits(failure_tree, start, start + sectorsize - 1,
EXTENT_LOCKED | EXTENT_DIRTY);
if (ret >= 0) {
ret = set_state_failrec(failure_tree, start, failrec);
/* Set the bits in the inode's tree */
- ret = set_extent_bits(tree, start, end, EXTENT_DAMAGED);
+ ret = set_extent_bits(tree, start, start + sectorsize - 1,
+ EXTENT_DAMAGED);
} else if (ret < 0) {
kfree(failrec);
return ERR_PTR(ret);
@@ -2682,11 +2684,11 @@ static bool btrfs_io_needs_validation(struct inode *inode, struct bio *bio)
return false;
}
-blk_status_t btrfs_submit_read_repair(struct inode *inode,
- struct bio *failed_bio, u32 bio_offset,
- struct page *page, unsigned int pgoff,
- u64 start, u64 end, int failed_mirror,
- submit_bio_hook_t *submit_bio_hook)
+static int repair_one_sector(struct inode *inode,
+ struct bio *failed_bio, u32 bio_offset,
+ struct page *page, unsigned int pgoff,
+ u64 start, int failed_mirror,
+ submit_bio_hook_t *submit_bio_hook)
{
struct io_failure_record *failrec;
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2704,16 +2706,22 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode,
BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
- failrec = btrfs_get_io_failure_record(inode, start, end);
+ failrec = btrfs_get_io_failure_record(inode, start);
if (IS_ERR(failrec))
- return errno_to_blk_status(PTR_ERR(failrec));
-
- need_validation = btrfs_io_needs_validation(inode, failed_bio);
+ return PTR_ERR(failrec);
+ /*
+ * We will only submit repair for one sector, thus we don't need
+ * extra validation anymore.
+ *
+ * TODO: All those extra validation related code will be cleaned up
+ * later.
+ */
+ need_validation = false;
if (!btrfs_check_repairable(inode, need_validation, failrec,
failed_mirror)) {
free_io_failure(failure_tree, tree, failrec);
- return BLK_STS_IOERR;
+ return -EIO;
}
repair_bio = btrfs_io_bio_alloc(1);
@@ -2747,7 +2755,79 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode,
free_io_failure(failure_tree, tree, failrec);
bio_put(repair_bio);
}
- return status;
+ return blk_status_to_errno(status);
+}
+
+static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
+{
+ struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
+
+ ASSERT(page_offset(page) <= start &&
+ start + len <= page_offset(page) + PAGE_SIZE);
+
+ if (uptodate) {
+ btrfs_page_set_uptodate(fs_info, page, start, len);
+ } else {
+ btrfs_page_clear_uptodate(fs_info, page, start, len);
+ btrfs_page_set_error(fs_info, page, start, len);
+ }
+
+ if (fs_info->sectorsize == PAGE_SIZE)
+ unlock_page(page);
+ else if (is_data_inode(page->mapping->host))
+ /*
+ * For subpage data, unlock the page if we're the last reader.
+ * For subpage metadata, page lock is not utilized for read.
+ */
+ btrfs_subpage_end_reader(fs_info, page, start, len);
+}
+
+blk_status_t btrfs_submit_read_repair(struct inode *inode,
+ struct bio *failed_bio, u32 bio_offset,
+ struct page *page, unsigned int pgoff,
+ u64 start, u64 end, int failed_mirror,
+ unsigned int error_bitmap,
+ submit_bio_hook_t *submit_bio_hook)
+{
+ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+ const u32 sectorsize = fs_info->sectorsize;
+ int nr_bits = (end + 1 - start) / sectorsize;
+ int i;
+
+ BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
+
+ /* We're here because we had some read errors or csum mismatch */
+ ASSERT(error_bitmap);
+
+ /* Iterate through all the sectors in the range */
+ for (i = 0; i < nr_bits; i++) {
+ int ret;
+ unsigned int offset = i * sectorsize;
+
+ if (!(error_bitmap & (1 << i))) {
+ struct extent_state *cached = NULL;
+
+ /* This sector has no error, just finish the read. */
+ end_page_read(page, true, start + offset, sectorsize);
+ set_extent_uptodate(&BTRFS_I(inode)->io_tree,
+ start + offset,
+ start + offset + sectorsize - 1,
+ &cached, GFP_ATOMIC);
+ unlock_extent_cached_atomic(&BTRFS_I(inode)->io_tree,
+ start + offset,
+ start + offset + sectorsize - 1,
+ &cached);
+ continue;
+ }
+
+ /* This sector has been corrupted, repair it */
+ ret = repair_one_sector(inode, failed_bio, bio_offset + offset,
+ page, pgoff + offset, start + offset,
+ failed_mirror, submit_bio_hook);
+ if (ret < 0)
+ return errno_to_blk_status(ret);
+ }
+ return BLK_STS_OK;
}
/* lots and lots of room for performance fixes in the end_bio funcs */
@@ -2904,30 +2984,6 @@ static void begin_page_read(struct btrfs_fs_info *fs_info, struct page *page)
btrfs_subpage_start_reader(fs_info, page, page_offset(page), PAGE_SIZE);
}
-static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
-{
- struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
-
- ASSERT(page_offset(page) <= start &&
- start + len <= page_offset(page) + PAGE_SIZE);
-
- if (uptodate) {
- btrfs_page_set_uptodate(fs_info, page, start, len);
- } else {
- btrfs_page_clear_uptodate(fs_info, page, start, len);
- btrfs_page_set_error(fs_info, page, start, len);
- }
-
- if (fs_info->sectorsize == PAGE_SIZE)
- unlock_page(page);
- else if (is_data_inode(page->mapping->host))
- /*
- * For subpage data, unlock the page if we're the last reader.
- * For subpage metadata, page lock is not utilized for read.
- */
- btrfs_subpage_end_reader(fs_info, page, start, len);
-}
-
/*
* Find extent buffer for a givne bytenr.
*
@@ -2990,6 +3046,7 @@ static void end_bio_extent_readpage(struct bio *bio)
struct inode *inode = page->mapping->host;
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
const u32 sectorsize = fs_info->sectorsize;
+ unsigned int error_bitmap = (unsigned int)-1;
u64 start;
u64 end;
u32 len;
@@ -3024,12 +3081,14 @@ static void end_bio_extent_readpage(struct bio *bio)
mirror = io_bio->mirror_num;
if (likely(uptodate)) {
- if (is_data_inode(inode))
- ret = btrfs_verify_data_csum(io_bio,
+ if (is_data_inode(inode)) {
+ error_bitmap = btrfs_verify_data_csum(io_bio,
bio_offset, page, start, end);
- else
+ ret = error_bitmap;
+ } else {
ret = btrfs_validate_metadata_buffer(io_bio,
page, start, end, mirror);
+ }
if (ret)
uptodate = 0;
else
@@ -3058,6 +3117,7 @@ static void end_bio_extent_readpage(struct bio *bio)
page,
start - page_offset(page),
start, end, mirror,
+ error_bitmap,
btrfs_submit_data_bio)) {
uptodate = !bio->bi_status;
ASSERT(bio_offset + len > bio_offset);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 1d7bc27719da..89820028c4bc 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -312,6 +312,7 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode,
struct bio *failed_bio, u32 bio_offset,
struct page *page, unsigned int pgoff,
u64 start, u64 end, int failed_mirror,
+ unsigned int error_bitmap,
submit_bio_hook_t *submit_bio_hook);
#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e9db33afcb5d..4fc6e6766234 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7941,7 +7941,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
bvec.bv_page, pgoff,
start,
start + sectorsize - 1,
- io_bio->mirror_num,
+ io_bio->mirror_num, 1,
submit_dio_repair_bio);
if (status)
err = status;
--
2.31.1
next prev parent reply other threads:[~2021-05-03 2:09 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-03 2:08 [PATCH v3 0/4] btrfs: make read time repair to be only submitted for each corrupted sector Qu Wenruo
2021-05-03 2:08 ` [PATCH v3 1/4] btrfs: remove the dead branch in btrfs_io_needs_validation() Qu Wenruo
2021-05-03 17:05 ` David Sterba
2021-05-03 23:39 ` Qu Wenruo
2021-05-10 20:14 ` David Sterba
2021-05-10 23:56 ` Qu Wenruo
2021-05-11 11:27 ` David Sterba
2021-05-03 2:08 ` [PATCH v3 2/4] btrfs: make btrfs_verify_data_csum() to return a bitmap Qu Wenruo
2021-05-03 2:08 ` Qu Wenruo [this message]
2021-05-10 20:32 ` [PATCH v3 3/4] btrfs: submit read time repair only for each corrupted sector David Sterba
2021-05-11 1:42 ` Qu Wenruo
2021-05-11 11:35 ` David Sterba
2021-05-03 2:08 ` [PATCH v3 4/4] btrfs: remove io_failure_record::in_validation Qu Wenruo
2021-05-10 20:41 ` [PATCH v3 0/4] btrfs: make read time repair to be only submitted for each corrupted sector David Sterba
2021-05-11 1:07 ` Qu Wenruo
2021-05-11 11:59 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210503020856.93333-4-wqu@suse.com \
--to=wqu@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox