From: Liu Bo <bo.li.liu@oracle.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 3/3] Btrfs: make raid6 rebuild retry more
Date: Mon, 4 Dec 2017 15:40:37 -0700 [thread overview]
Message-ID: <20171204224037.7556-4-bo.li.liu@oracle.com> (raw)
In-Reply-To: <20171204224037.7556-1-bo.li.liu@oracle.com>
There is a scenario that can end up with rebuild process failing to
return good content, i.e.
suppose that all disks can be read without problems and if the content
that was read out doesn't match its checksum, currently for raid6
btrfs at most retries twice,
- the 1st retry is to rebuild with all other stripes, it'll eventually
be a raid5 xor rebuild,
- if the 1st fails, the 2nd retry will deliberately fail parity p so
that it will do raid6 style rebuild,
however, the chances are that another non-parity stripe content also
has something corrupted, so that the above retries are not able to
return correct content, and users will think of this as data loss.
More seriouly, if the loss happens on some important internal btree
roots, it could refuse to mount.
This extends btrfs to do more retries and each retry fails only one
stripe. Since raid6 can tolerate 2 disk failures, if there is one
more failure besides the failure on which we're recovering, this can
always work.
The worst case is to retry as many times as the number of raid6 disks,
but given the fact that such a scenario is really rare in practice,
it's still acceptable.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
fs/btrfs/raid56.c | 18 ++++++++++++++----
fs/btrfs/volumes.c | 9 ++++++++-
2 files changed, 22 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 8d09535..064d5bc 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2166,11 +2166,21 @@ int raid56_parity_recover(struct btrfs_fs_info *fs_info, struct bio *bio,
}
/*
- * reconstruct from the q stripe if they are
- * asking for mirror 3
+ * Loop retry:
+ * for 'mirror == 2', reconstruct from all other stripes.
+ * for 'mirror_num > 2', select a stripe to fail on every retry.
*/
- if (mirror_num == 3)
- rbio->failb = rbio->real_stripes - 2;
+ if (mirror_num > 2) {
+ /*
+ * 'mirror == 3' is to fail the p stripe and
+ * reconstruct from the q stripe. 'mirror > 3' is to
+ * fail a data stripe and reconstruct from p+q stripe.
+ */
+ rbio->failb = rbio->real_stripes - (mirror_num - 1);
+ ASSERT(rbio->failb > 0);
+ if (rbio->failb <= rbio->faila)
+ rbio->failb--;
+ }
ret = lock_stripe_add(rbio);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b397375..95371f8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5094,7 +5094,14 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
ret = 2;
else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
- ret = 3;
+ /*
+ * There could be two corrupted data stripes, we need
+ * to loop retry in order to rebuild the correct data.
+ *
+ * Fail a stripe at a time on every retry except the
+ * stripe under reconstruction.
+ */
+ ret = map->num_stripes;
else
ret = 1;
free_extent_map(em);
--
2.9.4
next prev parent reply other threads:[~2017-12-04 23:43 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-04 22:40 [PATCH 0/3] Btrfs: loop retry on raid6 read failures Liu Bo
2017-12-04 22:40 ` [PATCH 1/3] Btrfs: remove redundant check in rbio_can_merge Liu Bo
2017-12-05 18:20 ` David Sterba
2017-12-04 22:40 ` [PATCH 2/3] Btrfs: do not merge rbios if their fail stripe index are not identical Liu Bo
2017-12-05 18:24 ` David Sterba
2017-12-04 22:40 ` Liu Bo [this message]
2017-12-05 8:07 ` [PATCH 3/3] Btrfs: make raid6 rebuild retry more Qu Wenruo
2017-12-05 18:04 ` Liu Bo
2017-12-05 19:29 ` David Sterba
2017-12-05 18:09 ` David Sterba
2017-12-05 22:55 ` Liu Bo
2017-12-06 0:11 ` Qu Wenruo
2017-12-07 0:26 ` Liu Bo
2017-12-05 8:08 ` Qu Wenruo
2017-12-05 18:26 ` [PATCH 0/3] Btrfs: loop retry on raid6 read failures David Sterba
2018-01-05 15:54 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171204224037.7556-4-bo.li.liu@oracle.com \
--to=bo.li.liu@oracle.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).