linux-btrfs.vger.kernel.org archive mirror
From: Zhaolei <zhaolei@cn.fujitsu.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>, Miao Xie <miaox@cn.fujitsu.com>
Subject: [PATCH 12/16] Btrfs: Combine per-page recover in dev-replace and scrub
Date: Mon, 19 Jan 2015 19:21:01 +0800
Message-ID: <1421666465-3892-13-git-send-email-zhaolei@cn.fujitsu.com>
In-Reply-To: <1421666465-3892-1-git-send-email-zhaolei@cn.fujitsu.com>

From: Zhao Lei <zhaolei@cn.fujitsu.com>

The per-page recovery code for dev-replace and for scrub is similar, so
combine the two loops to make the code cleaner and easier to maintain.
Some conditions that the separate loops missed are also handled as a
benefit of this combination.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/scrub.c | 110 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 57 insertions(+), 53 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 285f8cf..de959f6 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1099,19 +1099,47 @@ nodatasum_case:
 		}
 	}
 
+	/*
+	 * In case of I/O errors in the area that is supposed to be
+	 * repaired, continue by picking good copies of those pages.
+	 * Select the good pages from mirrors to rewrite bad pages from
+	 * the area to fix. Afterwards verify the checksum of the block
+	 * that is supposed to be repaired. This verification step is
+	 * only done for the purpose of statistic counting and for the
+	 * final scrub report, whether errors remain.
+	 * A perfect algorithm could make use of the checksum and try
+	 * all possible combinations of pages from the different mirrors
+	 * until the checksum verification succeeds. For example, when
+	 * the 2nd page of mirror #1 faces I/O errors, and the 2nd page
+	 * of mirror #2 is readable but the final checksum test fails,
+	 * then the 2nd page of mirror #3 could be tried, whether now
+	 * the final checksum succeeds. But this would be a rare
+	 * exception and is therefore not implemented. At least it is
+	 * avoided that the good copy is overwritten.
+	 * A more useful improvement would be to pick the sectors
+	 * without I/O error based on sector sizes (512 bytes on legacy
+	 * disks) instead of on PAGE_SIZE. Then maybe 512 byte of one
+	 * mirror could be repaired by taking 512 byte of a different
+	 * mirror, even if other 512 byte sectors in the same PAGE_SIZE
+	 * area are unreadable.
+	 */
+
 	/* can only fix I/O errors from here on */
 	if (sblock_bad->no_io_error_seen)
 		goto did_not_correct_error;
 
-	/*
-	 * for dev_replace, pick good pages and write to the target device.
-	 */
-	if (sctx->is_dev_replace) {
-		success = 1;
-		for (page_num = 0; page_num < sblock_bad->page_count;
-		     page_num++) {
-			struct scrub_block *sblock_other = NULL;
+	success = 1;
+	for (page_num = 0; page_num < sblock_bad->page_count;
+	     page_num++) {
+		struct scrub_page *page_bad = sblock_bad->pagev[page_num];
+		struct scrub_block *sblock_other = NULL;
 
+		/* skip no-io-error page in scrub */
+		if (!page_bad->io_error && !sctx->is_dev_replace)
+			continue;
+
+		/* try to find no-io-error page in mirrors */
+		if (page_bad->io_error) {
 			for (mirror_index = 0;
 			     mirror_index < BTRFS_MAX_MIRRORS &&
 			     sblocks_for_recheck[mirror_index].page_count > 0;
@@ -1123,18 +1151,20 @@ nodatasum_case:
 					break;
 				}
 			}
+			if (!sblock_other)
+				success = 0;
+		}
 
-			if (!sblock_other) {
-				/*
-				 * did not find a mirror to fetch the page
-				 * from. scrub_write_page_to_dev_replace()
-				 * handles this case (page->io_error), by
-				 * filling the block with zeros before
-				 * submitting the write request
-				 */
+		if (sctx->is_dev_replace) {
+			/*
+			 * did not find a mirror to fetch the page
+			 * from. scrub_write_page_to_dev_replace()
+			 * handles this case (page->io_error), by
+			 * filling the block with zeros before
+			 * submitting the write request
+			 */
+			if (!sblock_other)
 				sblock_other = sblock_bad;
-				success = 0;
-			}
 
 			if (scrub_write_page_to_dev_replace(sblock_other,
 							    page_num) != 0) {
@@ -1144,9 +1174,15 @@ nodatasum_case:
 					num_write_errors);
 				success = 0;
 			}
+		} else if (sblock_other) {
+			ret = scrub_repair_page_from_good_copy(sblock_bad,
+							       sblock_other,
+							       page_num, 0);
+			if (0 == ret)
+				page_bad->io_error = 0;
+			else
+				success = 0;
 		}
-
-		goto out;
 	}
 
 	/*
@@ -1175,39 +1211,7 @@ nodatasum_case:
 	 * area are unreadable.
 	 */
 
-	success = 1;
-	for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
-		struct scrub_page *page_bad = sblock_bad->pagev[page_num];
-
-		if (!page_bad->io_error)
-			continue;
-
-		for (mirror_index = 0;
-		     mirror_index < BTRFS_MAX_MIRRORS &&
-		     sblocks_for_recheck[mirror_index].page_count > 0;
-		     mirror_index++) {
-			struct scrub_block *sblock_other = sblocks_for_recheck +
-							   mirror_index;
-			struct scrub_page *page_other = sblock_other->pagev[
-							page_num];
-
-			if (!page_other->io_error) {
-				ret = scrub_repair_page_from_good_copy(
-					sblock_bad, sblock_other, page_num, 0);
-				if (0 == ret) {
-					page_bad->io_error = 0;
-					break; /* succeeded for this page */
-				}
-			}
-		}
-
-		if (page_bad->io_error) {
-			/* did not find a mirror to copy the page from */
-			success = 0;
-		}
-	}
-
-	if (success) {
+	if (success && !sctx->is_dev_replace) {
 		if (is_metadata || have_csum) {
 			/*
 			 * need to verify the checksum now that all
-- 
1.8.5.1
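
For readers who want to see the control flow of the combined loop without
the surrounding kernel context, here is a minimal userspace sketch. It is
only a model under stated assumptions: struct page_state, struct
block_state, struct recover_stats, find_good_mirror() and recover_pages()
are hypothetical names that do not exist in fs/btrfs/scrub.c, MAX_MIRRORS
and MAX_PAGES are stand-in constants, and both the repair and the write to
the target device are assumed to always succeed, which the real code does
not assume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MIRRORS 3	/* hypothetical stand-in for BTRFS_MAX_MIRRORS */
#define MAX_PAGES   4	/* hypothetical number of pages per block */

/* Simplified model of a scrub page: only the I/O error flag is kept. */
struct page_state {
	bool io_error;
};

/* Simplified model of a scrub block; page_count == 0 marks an unused slot. */
struct block_state {
	int page_count;
	struct page_state pagev[MAX_PAGES];
};

/* Counters used only to observe what the sketch did. */
struct recover_stats {
	int repaired;		/* pages rewritten in place (scrub) */
	int written;		/* pages written to the target (dev-replace) */
	int zero_filled;	/* dev-replace pages that had no good copy */
};

/* Return the first mirror whose copy of page_num read cleanly, or NULL. */
static const struct block_state *
find_good_mirror(const struct block_state *mirrors, int page_num)
{
	for (int i = 0; i < MAX_MIRRORS && mirrors[i].page_count > 0; i++) {
		if (!mirrors[i].pagev[page_num].io_error)
			return &mirrors[i];
	}
	return NULL;
}

/*
 * One loop for both modes, mirroring the shape of the patched code:
 * scrub touches only pages with I/O errors, dev-replace writes every page.
 * Returns true when every page with an I/O error had a good copy.
 */
static bool recover_pages(struct block_state *bad,
			  const struct block_state *mirrors,
			  bool is_dev_replace, struct recover_stats *st)
{
	bool success = true;

	assert(bad->page_count <= MAX_PAGES);

	for (int page_num = 0; page_num < bad->page_count; page_num++) {
		struct page_state *page_bad = &bad->pagev[page_num];
		const struct block_state *other = NULL;

		/* scrub: a page that read cleanly needs no repair */
		if (!page_bad->io_error && !is_dev_replace)
			continue;

		/* look for a clean copy only when this page is bad */
		if (page_bad->io_error) {
			other = find_good_mirror(mirrors, page_num);
			if (!other)
				success = false;
		}

		if (is_dev_replace) {
			/*
			 * No clean copy: the page is written from the bad
			 * block itself, which the real write helper
			 * zero-fills when io_error is set.
			 */
			if (!other && page_bad->io_error)
				st->zero_filled++;
			st->written++;	/* assumption: the write succeeds */
		} else if (other) {
			/* assumption: the in-place repair succeeds */
			page_bad->io_error = false;
			st->repaired++;
		}
	}
	return success;
}
```

Calling recover_pages() with is_dev_replace set to false visits only the
pages that had I/O errors, while setting it to true visits every page, so
one function can stand in for both of the loops that the patch removes.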

