[PATCH 0/4] Btrfs: RAID 5/6 missing device replace

* [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
@ 2015-05-11  7:58 Omar Sandoval
  2015-05-11  7:58 ` [PATCH 1/4] Btrfs: remove misleading handling of missing device scrub Omar Sandoval
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Omar Sandoval @ 2015-05-11  7:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip, Omar Sandoval

A user reported on Bugzilla that they were seeing kernel BUGs when
attempting to replace a missing device on a RAID 6 array. After
identifying the apparent cause of the BUG, I reached the conclusion that
there wasn't a quick fix. Maybe Miao Xie can point something out that I
missed, as he originally implemented device replace on RAID 5/6 :)

Patch 4 has the details, but the main problem is that we can't create
bios for a missing device, so the main scrub code path isn't very
useful. On RAID 5/6, since we only have one mirror for any piece of
data, the missing device is the only mirror we can use. Clearly (unless
I missed something), this case needs to be handled differently.
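
To illustrate why the data on a missing RAID 5 device has to be rebuilt
rather than read, here is a minimal userspace sketch of XOR parity
reconstruction. It is not kernel code: STRIPE_LEN, compute_parity() and
rebuild_missing() are hypothetical names that do not appear in the
patches, and RAID 6 with two failed stripes needs a second syndrome that
this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRIPE_LEN 8 /* hypothetical, kept tiny for illustration */

/* XOR all data stripes together to produce the RAID 5 parity stripe. */
void compute_parity(unsigned char data[][STRIPE_LEN], size_t nr_data,
		    unsigned char parity[STRIPE_LEN])
{
	memset(parity, 0, STRIPE_LEN);
	for (size_t i = 0; i < nr_data; i++)
		for (size_t j = 0; j < STRIPE_LEN; j++)
			parity[j] ^= data[i][j];
}

/*
 * Rebuild data stripe `missing` from the parity and the surviving data
 * stripes. data[missing] is never read, just as a missing device cannot
 * be read.
 */
void rebuild_missing(unsigned char data[][STRIPE_LEN], size_t nr_data,
		     const unsigned char parity[STRIPE_LEN],
		     size_t missing, unsigned char out[STRIPE_LEN])
{
	memcpy(out, parity, STRIPE_LEN);
	for (size_t i = 0; i < nr_data; i++) {
		if (i == missing)
			continue;
		for (size_t j = 0; j < STRIPE_LEN; j++)
			out[j] ^= data[i][j];
	}
}
```

The point of the sketch is that the rebuild only needs the other stripes,
so no bio ever has to be created for the missing device.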

These patches are on top of v4.1-rc2. I ran the scrub and replace
xfstests and the script below, which also reproduces the original BUG.

Thanks!

Omar Sandoval (4):
  Btrfs: remove misleading handling of missing device scrub
  Btrfs: count devices correctly in readahead during RAID 5/6 replace
  Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
  Btrfs: fix device replace of a missing RAID 5/6 device

 fs/btrfs/raid56.c |  87 +++++++++++++++++++++++++----
 fs/btrfs/raid56.h |  10 +++-
 fs/btrfs/reada.c  |   4 +-
 fs/btrfs/scrub.c  | 164 +++++++++++++++++++++++++++++++++++++++++++++---------
 4 files changed, 225 insertions(+), 40 deletions(-)

Testing script:

----
#!/bin/bash

USAGE="Usage: $0 [eio|missing] [raid0|raid1|raid5|raid6]"

if [ "$1" = "-h" ]; then
	echo "$USAGE"
	exit
fi

MODE="${1:-missing}"
RAID="${2:-raid5}"

case "$MODE" in
	eio|missing)
		;;
	*)
		echo "$USAGE" >&2
		exit 1
		;;
esac

case "$RAID" in
	raid[0156])
		;;
	*)
		echo "$USAGE" >&2
		exit 1
		;;
esac

NUM_DISKS=4
NUM_RAID_DISKS=3
SRC_DISK=1
TARGET_DISK=3
NUM_SECTORS=$((1024 * 1024))
LOOP_DEVICES=()
DM_DEVICES=()

cleanup () {
	echo "Done. Press enter to clean up..."
	read -r
	if findmnt /mnt; then
		umount /mnt
	fi
	for DM in "${DM_DEVICES[@]}"; do
		dmsetup remove "$DM"
	done
	for LOOP in "${LOOP_DEVICES[@]}"; do
		losetup --detach "$LOOP"
	done
	for ((i = 0; i < NUM_DISKS; i++)); do
		rm -f disk${i}.img
	done
}
trap 'cleanup; exit 1' ERR

echo "Creating disk images..."
for ((i = 0; i < NUM_DISKS; i++)); do
	rm -f disk${i}.img
	dd if=/dev/zero of=disk${i}.img bs=512 seek=$NUM_SECTORS count=0
	LOOP_DEVICES+=("$(losetup --find --show disk${i}.img)")
done

echo "Creating device-mapper devices..."
for LOOP in "${LOOP_DEVICES[@]}"; do
	DM="${LOOP/\/dev\/loop/dm}"
	dmsetup create "$DM" --table "0 $NUM_SECTORS linear $LOOP 0"
	DM_DEVICES+=("$DM")
done

echo "Creating filesystem..."
FS_DEVICES=("${DM_DEVICES[@]:0:$NUM_RAID_DISKS}")
FS_DEVICES=("${FS_DEVICES[@]/#/\/dev\/mapper\/}")
MOUNT_DEVICE="${FS_DEVICES[$(((SRC_DISK + 1) % NUM_RAID_DISKS))]}"
mkfs.btrfs -d "$RAID" -m "$RAID" "${FS_DEVICES[@]}"
mount "$MOUNT_DEVICE" /mnt
cp -r ~/xfstests /mnt
sync

case "$MODE" in
	eio)
		echo "Killing disk..."
		dmsetup suspend "${DM_DEVICES[$SRC_DISK]}"
		dmsetup reload "${DM_DEVICES[$SRC_DISK]}" --table "0 $NUM_SECTORS error"
		dmsetup resume "${DM_DEVICES[$SRC_DISK]}"
		;;
	missing)
		echo "Removing disk and remounting degraded..."
		umount /mnt
		dmsetup remove "${DM_DEVICES[$SRC_DISK]}"
		unset "DM_DEVICES[$SRC_DISK]"
		mount -o degraded "$MOUNT_DEVICE" /mnt
		;;
esac

echo "Replacing disk..."
btrfs replace start -B $((SRC_DISK + 1)) /dev/mapper/"${DM_DEVICES[$TARGET_DISK]}" /mnt

echo "Scrubbing to double-check..."
btrfs scrub start -Br /mnt

cleanup
----

-- 
2.4.0


* [PATCH 1/4] Btrfs: remove misleading handling of missing device scrub
  2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
@ 2015-05-11  7:58 ` Omar Sandoval
  2015-05-11  7:58 ` [PATCH 2/4] Btrfs: count devices correctly in readahead during RAID 5/6 replace Omar Sandoval
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Omar Sandoval @ 2015-05-11  7:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip, Omar Sandoval

scrub_submit() claims that it can handle a bio with a NULL block device,
but this is misleading, as calling bio_add_page() on a bio with a NULL
->bi_bdev would've already crashed. Delete this, as we're about to
properly handle a missing block device.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
---
 fs/btrfs/scrub.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ab58115..633fa7b 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2087,21 +2087,7 @@ static void scrub_submit(struct scrub_ctx *sctx)
 	sbio = sctx->bios[sctx->curr];
 	sctx->curr = -1;
 	scrub_pending_bio_inc(sctx);
-
-	if (!sbio->bio->bi_bdev) {
-		/*
-		 * this case should not happen. If btrfs_map_block() is
-		 * wrong, it could happen for dev-replace operations on
-		 * missing devices when no mirrors are available, but in
-		 * this case it should already fail the mount.
-		 * This case is handled correctly (but _very_ slowly).
-		 */
-		printk_ratelimited(KERN_WARNING
-			"BTRFS: scrub_submit(bio bdev == NULL) is unexpected!\n");
-		bio_endio(sbio->bio, -EIO);
-	} else {
-		btrfsic_submit_bio(READ, sbio->bio);
-	}
+	btrfsic_submit_bio(READ, sbio->bio);
 }
 
 static int scrub_add_page_to_rd_bio(struct scrub_ctx *sctx,
-- 
2.4.0


* [PATCH 2/4] Btrfs: count devices correctly in readahead during RAID 5/6 replace
  2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
  2015-05-11  7:58 ` [PATCH 1/4] Btrfs: remove misleading handling of missing device scrub Omar Sandoval
@ 2015-05-11  7:58 ` Omar Sandoval
  2015-05-11  7:58 ` [PATCH 3/4] Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation Omar Sandoval
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Omar Sandoval @ 2015-05-11  7:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip, Omar Sandoval

Commit 5fbc7c59fd22 ("Btrfs: fix unfinished readahead thread for raid5/6
degraded mounting") fixed a problem where we would skip a missing device
when we shouldn't have because there are no other mirrors to read from
in RAID 5/6. After commit 2c8cdd6ee4e7 ("Btrfs, replace: write dirty
pages into the replace target device"), the fix doesn't work when we're
doing a missing device replace on RAID 5/6: the replace target is
counted as a mirror, so we're tricked into thinking we can safely skip
the missing device. The fix is to count only the real stripes and decide
based on that.
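
As a rough userspace sketch of the counting change (not the kernel code):
struct mapping and both helpers are hypothetical stand-ins for the
num_stripes and num_tgtdevs fields of the bbio, and the "more than one
real stripe" rule for skipping a missing device is my reading of the
earlier fix, stated here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mapping returned for one logical extent. */
struct mapping {
	int num_stripes;	/* all stripes, including dev-replace targets */
	int num_tgtdevs;	/* stripes that point at the replace target */
};

/* Stripes that actually hold data to read from (real_stripes in the patch). */
int real_stripes(const struct mapping *m)
{
	return m->num_stripes - m->num_tgtdevs;
}

/*
 * Assumption: readahead may skip a missing device only when some other
 * real stripe can serve the read.
 */
bool can_skip_missing(const struct mapping *m)
{
	return real_stripes(m) > 1;
}
```

With one real stripe plus one replace target, the old count of 2 would
allow the skip, while the corrected count of 1 does not.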

Signed-off-by: Omar Sandoval <osandov@osandov.com>
---
 fs/btrfs/reada.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index 0e7beea..4645cd1 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -328,6 +328,7 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
 	struct btrfs_device *prev_dev;
 	u32 blocksize;
 	u64 length;
+	int real_stripes;
 	int nzones = 0;
 	int i;
 	unsigned long index = logical >> PAGE_CACHE_SHIFT;
@@ -369,7 +370,8 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
 		goto error;
 	}
 
-	for (nzones = 0; nzones < bbio->num_stripes; ++nzones) {
+	real_stripes = bbio->num_stripes - bbio->num_tgtdevs;
+	for (nzones = 0; nzones < real_stripes; ++nzones) {
 		struct reada_zone *zone;
 
 		dev = bbio->stripes[nzones].dev;
-- 
2.4.0


* [PATCH 3/4] Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
  2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
  2015-05-11  7:58 ` [PATCH 1/4] Btrfs: remove misleading handling of missing device scrub Omar Sandoval
  2015-05-11  7:58 ` [PATCH 2/4] Btrfs: count devices correctly in readahead during RAID 5/6 replace Omar Sandoval
@ 2015-05-11  7:58 ` Omar Sandoval
  2015-05-11  7:58 ` [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device Omar Sandoval
  2015-05-26 17:05 ` [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
  4 siblings, 0 replies; 12+ messages in thread
From: Omar Sandoval @ 2015-05-11  7:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip, Omar Sandoval

The current RAID 5/6 recovery code isn't quite prepared to handle
missing devices. In particular, it expects a bio that we previously
attempted to use in the read path, meaning that it has valid pages
allocated. However, missing devices have a NULL blkdev, and we can't
call bio_add_page() on a bio with a NULL blkdev. We could do manual
manipulation of bio->bi_io_vec, but that's pretty gross. So instead, add
a separate path that allows us to manually add pages to the rbio.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
---
 fs/btrfs/raid56.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 fs/btrfs/raid56.h | 10 +++++--
 fs/btrfs/scrub.c  |  3 +-
 3 files changed, 86 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index fa72068..6fe2613 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -61,9 +61,10 @@
 #define RBIO_CACHE_SIZE 1024
 
 enum btrfs_rbio_ops {
-	BTRFS_RBIO_WRITE	= 0,
-	BTRFS_RBIO_READ_REBUILD	= 1,
-	BTRFS_RBIO_PARITY_SCRUB	= 2,
+	BTRFS_RBIO_WRITE,
+	BTRFS_RBIO_READ_REBUILD,
+	BTRFS_RBIO_PARITY_SCRUB,
+	BTRFS_RBIO_REBUILD_MISSING,
 };
 
 struct btrfs_raid_bio {
@@ -602,6 +603,10 @@ static int rbio_can_merge(struct btrfs_raid_bio *last,
 	    cur->operation == BTRFS_RBIO_PARITY_SCRUB)
 		return 0;
 
+	if (last->operation == BTRFS_RBIO_REBUILD_MISSING ||
+	    cur->operation == BTRFS_RBIO_REBUILD_MISSING)
+		return 0;
+
 	return 1;
 }
 
@@ -793,7 +798,10 @@ static noinline void unlock_stripe(struct btrfs_raid_bio *rbio)
 
 			if (next->operation == BTRFS_RBIO_READ_REBUILD)
 				async_read_rebuild(next);
-			else if (next->operation == BTRFS_RBIO_WRITE) {
+			else if (next->operation == BTRFS_RBIO_REBUILD_MISSING) {
+				steal_rbio(rbio, next);
+				async_read_rebuild(next);
+			} else if (next->operation == BTRFS_RBIO_WRITE) {
 				steal_rbio(rbio, next);
 				async_rmw_stripe(next);
 			} else if (next->operation == BTRFS_RBIO_PARITY_SCRUB) {
@@ -1809,7 +1817,8 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
 	faila = rbio->faila;
 	failb = rbio->failb;
 
-	if (rbio->operation == BTRFS_RBIO_READ_REBUILD) {
+	if (rbio->operation == BTRFS_RBIO_READ_REBUILD ||
+	    rbio->operation == BTRFS_RBIO_REBUILD_MISSING) {
 		spin_lock_irq(&rbio->bio_list_lock);
 		set_bit(RBIO_RMW_LOCKED_BIT, &rbio->flags);
 		spin_unlock_irq(&rbio->bio_list_lock);
@@ -1834,7 +1843,8 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
 			 * if we're rebuilding a read, we have to use
 			 * pages from the bio list
 			 */
-			if (rbio->operation == BTRFS_RBIO_READ_REBUILD &&
+			if ((rbio->operation == BTRFS_RBIO_READ_REBUILD ||
+			     rbio->operation == BTRFS_RBIO_REBUILD_MISSING) &&
 			    (stripe == faila || stripe == failb)) {
 				page = page_in_rbio(rbio, stripe, pagenr, 0);
 			} else {
@@ -1943,7 +1953,8 @@ pstripe:
 			 * if we're rebuilding a read, we have to use
 			 * pages from the bio list
 			 */
-			if (rbio->operation == BTRFS_RBIO_READ_REBUILD &&
+			if ((rbio->operation == BTRFS_RBIO_READ_REBUILD ||
+			     rbio->operation == BTRFS_RBIO_REBUILD_MISSING) &&
 			    (stripe == faila || stripe == failb)) {
 				page = page_in_rbio(rbio, stripe, pagenr, 0);
 			} else {
@@ -1965,6 +1976,8 @@ cleanup_io:
 			clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
 
 		rbio_orig_end_io(rbio, err, err == 0);
+	} else if (rbio->operation == BTRFS_RBIO_REBUILD_MISSING) {
+		rbio_orig_end_io(rbio, err, err == 0);
 	} else if (err == 0) {
 		rbio->faila = -1;
 		rbio->failb = -1;
@@ -2101,7 +2114,8 @@ out:
 	return 0;
 
 cleanup:
-	if (rbio->operation == BTRFS_RBIO_READ_REBUILD)
+	if (rbio->operation == BTRFS_RBIO_READ_REBUILD ||
+	    rbio->operation == BTRFS_RBIO_REBUILD_MISSING)
 		rbio_orig_end_io(rbio, -EIO, 0);
 	return -EIO;
 }
@@ -2232,8 +2246,9 @@ raid56_parity_alloc_scrub_rbio(struct btrfs_root *root, struct bio *bio,
 	return rbio;
 }
 
-void raid56_parity_add_scrub_pages(struct btrfs_raid_bio *rbio,
-				   struct page *page, u64 logical)
+/* Used for both parity scrub and missing. */
+void raid56_add_scrub_pages(struct btrfs_raid_bio *rbio, struct page *page,
+			    u64 logical)
 {
 	int stripe_offset;
 	int index;
@@ -2668,3 +2683,55 @@ void raid56_parity_submit_scrub_rbio(struct btrfs_raid_bio *rbio)
 	if (!lock_stripe_add(rbio))
 		async_scrub_parity(rbio);
 }
+
+/* The following code is used for dev replace of a missing RAID 5/6 device. */
+
+struct btrfs_raid_bio *
+raid56_alloc_missing_rbio(struct btrfs_root *root, struct bio *bio,
+			  struct btrfs_bio *bbio, u64 length)
+{
+	struct btrfs_raid_bio *rbio;
+
+	rbio = alloc_rbio(root, bbio, length);
+	if (IS_ERR(rbio))
+		return NULL;
+
+	rbio->operation = BTRFS_RBIO_REBUILD_MISSING;
+	bio_list_add(&rbio->bio_list, bio);
+	/*
+	 * This is a special bio which is used to hold the completion handler
+	 * and makes this rbio similar to the other types
+	 */
+	ASSERT(!bio->bi_iter.bi_size);
+
+	rbio->faila = find_logical_bio_stripe(rbio, bio);
+	if (rbio->faila == -1) {
+		BUG();
+		kfree(rbio);
+		return NULL;
+	}
+
+	return rbio;
+}
+
+static void missing_raid56_work(struct btrfs_work *work)
+{
+	struct btrfs_raid_bio *rbio;
+
+	rbio = container_of(work, struct btrfs_raid_bio, work);
+	__raid56_parity_recover(rbio);
+}
+
+static void async_missing_raid56(struct btrfs_raid_bio *rbio)
+{
+	btrfs_init_work(&rbio->work, btrfs_rmw_helper,
+			missing_raid56_work, NULL, NULL);
+
+	btrfs_queue_work(rbio->fs_info->rmw_workers, &rbio->work);
+}
+
+void raid56_submit_missing_rbio(struct btrfs_raid_bio *rbio)
+{
+	if (!lock_stripe_add(rbio))
+		async_missing_raid56(rbio);
+}
diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h
index 2b5d797..8b69469 100644
--- a/fs/btrfs/raid56.h
+++ b/fs/btrfs/raid56.h
@@ -48,15 +48,21 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
 int raid56_parity_write(struct btrfs_root *root, struct bio *bio,
 			       struct btrfs_bio *bbio, u64 stripe_len);
 
+void raid56_add_scrub_pages(struct btrfs_raid_bio *rbio, struct page *page,
+			    u64 logical);
+
 struct btrfs_raid_bio *
 raid56_parity_alloc_scrub_rbio(struct btrfs_root *root, struct bio *bio,
 			       struct btrfs_bio *bbio, u64 stripe_len,
 			       struct btrfs_device *scrub_dev,
 			       unsigned long *dbitmap, int stripe_nsectors);
-void raid56_parity_add_scrub_pages(struct btrfs_raid_bio *rbio,
-				   struct page *page, u64 logical);
 void raid56_parity_submit_scrub_rbio(struct btrfs_raid_bio *rbio);
 
+struct btrfs_raid_bio *
+raid56_alloc_missing_rbio(struct btrfs_root *root, struct bio *bio,
+			  struct btrfs_bio *bbio, u64 length);
+void raid56_submit_missing_rbio(struct btrfs_raid_bio *rbio);
+
 int btrfs_alloc_stripe_hash_table(struct btrfs_fs_info *info);
 void btrfs_free_stripe_hash_table(struct btrfs_fs_info *info);
 #endif
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 633fa7b..b94694d 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2699,8 +2699,7 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
 		goto rbio_out;
 
 	list_for_each_entry(spage, &sparity->spages, list)
-		raid56_parity_add_scrub_pages(rbio, spage->page,
-					      spage->logical);
+		raid56_add_scrub_pages(rbio, spage->page, spage->logical);
 
 	scrub_pending_bio_inc(sctx);
 	raid56_parity_submit_scrub_rbio(rbio);
-- 
2.4.0


* [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
  2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
                   ` (2 preceding siblings ...)
  2015-05-11  7:58 ` [PATCH 3/4] Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation Omar Sandoval
@ 2015-05-11  7:58 ` Omar Sandoval
  2015-06-11 10:29   ` Zhao Lei
  2015-05-26 17:05 ` [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
  4 siblings, 1 reply; 12+ messages in thread
From: Omar Sandoval @ 2015-05-11  7:58 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip, Omar Sandoval

The original implementation of device replace on RAID 5/6 seems to have
missed support for replacing a missing device. When this is attempted,
we end up calling bio_add_page() on a bio with a NULL ->bi_bdev, which
crashes when we try to dereference it. This happens because
btrfs_map_block() has no choice but to return us the missing device
because RAID 5/6 don't have any alternate mirrors to read from, and a
missing device has a NULL bdev.

The idea implemented here is to handle the missing device case
separately, which should only happen when we're replacing a missing RAID
5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
reconstruct the data from parity, check it with
scrub_recheck_block_checksum(), and write it out with
scrub_write_block_to_dev_replace().
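
The rebuild, check, write sequence can be summarized with a small
userspace sketch of the decision the worker makes once the rebuild
completes. The enum, struct and function names are hypothetical; only
the ordering of the checks is taken from scrub_missing_raid56_worker()
in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for one rebuilt block. */
enum rebuild_outcome {
	REBUILD_READ_ERROR,	/* counted as a read error */
	REBUILD_UNCORRECTABLE,	/* rebuilt data failed verification */
	REBUILD_WRITE_TO_TARGET,	/* data is good, write to the replace target */
};

/* Hypothetical summary of the flags the worker inspects. */
struct rebuilt_block {
	bool io_error;		/* the parity rebuild itself failed */
	bool header_error;	/* metadata header check failed */
	bool checksum_error;	/* checksum did not match */
};

/* I/O errors take precedence; verification errors are only checked after. */
enum rebuild_outcome classify_rebuilt_block(const struct rebuilt_block *b)
{
	if (b->io_error)
		return REBUILD_READ_ERROR;
	if (b->header_error || b->checksum_error)
		return REBUILD_UNCORRECTABLE;
	return REBUILD_WRITE_TO_TARGET;
}
```

Only a block that was rebuilt without I/O errors and passes verification
is written to the replace target.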

Reported-by: Philip <bugzilla@philip-seeger.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141
Signed-off-by: Omar Sandoval <osandov@osandov.com>
---
 fs/btrfs/scrub.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 135 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index b94694d..a13f91a 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -125,6 +125,7 @@ struct scrub_block {
 		/* It is for the data with checksum */
 		unsigned int	data_corrected:1;
 	};
+	struct btrfs_work	work;
 };
 
 /* Used for the chunks with parity stripe such RAID5/6 */
@@ -2164,6 +2165,126 @@ again:
 	return 0;
 }
 
+static void scrub_missing_raid56_end_io(struct bio *bio, int error)
+{
+	struct scrub_block *sblock = bio->bi_private;
+	struct btrfs_fs_info *fs_info = sblock->sctx->dev_root->fs_info;
+
+	if (error)
+		sblock->no_io_error_seen = 0;
+
+	btrfs_queue_work(fs_info->scrub_workers, &sblock->work);
+}
+
+static void scrub_missing_raid56_worker(struct btrfs_work *work)
+{
+	struct scrub_block *sblock = container_of(work, struct scrub_block, work);
+	struct scrub_ctx *sctx = sblock->sctx;
+	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
+	unsigned int is_metadata;
+	unsigned int have_csum;
+	u8 *csum;
+	u64 generation;
+	u64 logical;
+	struct btrfs_device *dev;
+
+	is_metadata = !(sblock->pagev[0]->flags & BTRFS_EXTENT_FLAG_DATA);
+	have_csum = sblock->pagev[0]->have_csum;
+	csum = sblock->pagev[0]->csum;
+	generation = sblock->pagev[0]->generation;
+	logical = sblock->pagev[0]->logical;
+	dev = sblock->pagev[0]->dev;
+
+	if (sblock->no_io_error_seen) {
+		scrub_recheck_block_checksum(fs_info, sblock, is_metadata,
+					     have_csum, csum, generation,
+					     sctx->csum_size);
+	}
+
+	if (!sblock->no_io_error_seen) {
+		spin_lock(&sctx->stat_lock);
+		sctx->stat.read_errors++;
+		spin_unlock(&sctx->stat_lock);
+		printk_ratelimited_in_rcu(KERN_ERR
+			"BTRFS: I/O error rebuilding logical %llu for dev %s\n",
+			logical, rcu_str_deref(dev->name));
+	} else if (sblock->header_error || sblock->checksum_error) {
+		spin_lock(&sctx->stat_lock);
+		sctx->stat.uncorrectable_errors++;
+		spin_unlock(&sctx->stat_lock);
+		printk_ratelimited_in_rcu(KERN_ERR
+			"BTRFS: failed to rebuild valid logical %llu for dev %s\n",
+			logical, rcu_str_deref(dev->name));
+	} else {
+		scrub_write_block_to_dev_replace(sblock);
+	}
+
+	scrub_block_put(sblock);
+	scrub_pending_bio_dec(sctx);
+}
+
+static void scrub_missing_raid56_pages(struct scrub_block *sblock)
+{
+	struct scrub_ctx *sctx = sblock->sctx;
+	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
+	u64 length = sblock->page_count * PAGE_SIZE;
+	u64 logical = sblock->pagev[0]->logical;
+	struct btrfs_bio *bbio;
+	struct bio *bio;
+	struct btrfs_raid_bio *rbio;
+	int ret;
+	int i;
+
+	ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical, &length,
+			       &bbio, 0, 1);
+	if (ret || !bbio || !bbio->raid_map)
+		goto bbio_out;
+
+	if (WARN_ON(!sctx->is_dev_replace ||
+		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
+		/*
+		 * We shouldn't be scrubbing a missing device. Even for dev
+		 * replace, we should only get here for RAID 5/6. We either
+		 * managed to mount something with no mirrors remaining or
+		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
+		 */
+		goto bbio_out;
+	}
+
+	bio = btrfs_io_bio_alloc(GFP_NOFS, 0);
+	if (!bio)
+		goto bbio_out;
+
+	bio->bi_iter.bi_sector = logical >> 9;
+	bio->bi_private = sblock;
+	bio->bi_end_io = scrub_missing_raid56_end_io;
+
+	rbio = raid56_alloc_missing_rbio(sctx->dev_root, bio, bbio, length);
+	if (!rbio)
+		goto rbio_out;
+
+	for (i = 0; i < sblock->page_count; i++) {
+		struct scrub_page *spage = sblock->pagev[i];
+
+		raid56_add_scrub_pages(rbio, spage->page, spage->logical);
+	}
+
+	btrfs_init_work(&sblock->work, btrfs_scrub_helper,
+			scrub_missing_raid56_worker, NULL, NULL);
+	scrub_block_get(sblock);
+	scrub_pending_bio_inc(sctx);
+	raid56_submit_missing_rbio(rbio);
+	return;
+
+rbio_out:
+	bio_put(bio);
+bbio_out:
+	btrfs_put_bbio(bbio);
+	spin_lock(&sctx->stat_lock);
+	sctx->stat.malloc_errors++;
+	spin_unlock(&sctx->stat_lock);
+}
+
 static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
 		       u64 physical, struct btrfs_device *dev, u64 flags,
 		       u64 gen, int mirror_num, u8 *csum, int force,
@@ -2227,19 +2348,23 @@ leave_nomem:
 	}
 
 	WARN_ON(sblock->page_count == 0);
-	for (index = 0; index < sblock->page_count; index++) {
-		struct scrub_page *spage = sblock->pagev[index];
-		int ret;
+	if (dev->missing) {
+		scrub_missing_raid56_pages(sblock);
+	} else {
+		for (index = 0; index < sblock->page_count; index++) {
+			struct scrub_page *spage = sblock->pagev[index];
+			int ret;
 
-		ret = scrub_add_page_to_rd_bio(sctx, spage);
-		if (ret) {
-			scrub_block_put(sblock);
-			return ret;
+			ret = scrub_add_page_to_rd_bio(sctx, spage);
+			if (ret) {
+				scrub_block_put(sblock);
+				return ret;
+			}
 		}
-	}
 
-	if (force)
-		scrub_submit(sctx);
+		if (force)
+			scrub_submit(sctx);
+	}
 
 	/* last one frees, either here or in bio completion for last page */
 	scrub_block_put(sblock);
-- 
2.4.0


* Re: [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
  2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
                   ` (3 preceding siblings ...)
  2015-05-11  7:58 ` [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device Omar Sandoval
@ 2015-05-26 17:05 ` Omar Sandoval
  2015-06-11  3:52   ` Zhao Lei
  4 siblings, 1 reply; 12+ messages in thread
From: Omar Sandoval @ 2015-05-26 17:05 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Miao Xie, Philip

On Mon, May 11, 2015 at 12:58:11AM -0700, Omar Sandoval wrote:
> A user reported on Bugzilla that they were seeing kernel BUGs when
> attempting to replace a missing device on a RAID 6 array. After
> identifying the apparent cause of the BUG, I reached the conclusion that
> there wasn't a quick fix. Maybe Miao Xie can point something out that I
> missed, as he originally implemented device replace on RAID 5/6 :)
> 
> Patch 4 has the details, but the main problem is that we can't create
> bios for a missing device, so the main scrub code path isn't very
> useful. On RAID 5/6, since we only have one mirror for any piece of
> data, the missing device is the only mirror we can use. Clearly (unless
> I missed something), this case needs to be handled differently.
> 
> These patches are on top of v4.1-rc2. I ran the scrub and replace
> xfstests and the script below, which also reproduces the original BUG.
> 
> Thanks!
> 
> Omar Sandoval (4):
>   Btrfs: remove misleading handling of missing device scrub
>   Btrfs: count devices correctly in readahead during RAID 5/6 replace
>   Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
>   Btrfs: fix device replace of a missing RAID 5/6 device
> 
>  fs/btrfs/raid56.c |  87 +++++++++++++++++++++++++----
>  fs/btrfs/raid56.h |  10 +++-
>  fs/btrfs/reada.c  |   4 +-
>  fs/btrfs/scrub.c  | 164 +++++++++++++++++++++++++++++++++++++++++++++---------
>  4 files changed, 225 insertions(+), 40 deletions(-)
> 

Ping for review.

-- 
Omar

* RE: [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
  2015-05-26 17:05 ` [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
@ 2015-06-11  3:52   ` Zhao Lei
  2015-06-11  6:08     ` Omar Sandoval
  0 siblings, 1 reply; 12+ messages in thread
From: Zhao Lei @ 2015-06-11  3:52 UTC (permalink / raw)
  To: 'Omar Sandoval', linux-btrfs; +Cc: 'Miao Xie', 'Philip'

Hi, Omar Sandoval

I tested this patchset with my script, but I still see a general
protection fault.
NODE: kvm with virtio disk
ROOTFS: RHEL6 with btrfs-progs v4.0
KERNEL: v4.1-rc6 with the 4 patches in this patchset

My test may differ slightly from yours, but this looks like a
similar bug. Could you check it?

My script:
****************
*** NOTE: update FS_DEVS in your own env before run ***
****************

#!/bin/bash

FS_DEVS=(/dev/vdd /dev/vde /dev/vdf)
PRUNE_DEV=/dev/vde
MNT=/mnt/btrfs_test_for_raid56_strip

do_cmd()
{
	echo "   $*"
	local output
	local ret
	output=$("$@" 2>&1)
	ret="$?"
	[[ "$ret" != 0 ]] && {
		echo "$output"
	}
	return "$ret"
}

mkdir -p "$MNT"
for ((i = 0; i < 10; i++)); do
	umount "$MNT" &>/dev/null
done
dmesg -c >/dev/null

echo "1: Creating filesystem"
do_cmd mkfs.btrfs -f -d raid5 -m raid5 "${FS_DEVS[@]}" || exit 1
do_cmd mount "${FS_DEVS[0]}" "$MNT" || exit 1

echo "2: Write some data"
DATA_CNT=4
for ((i = 0; i < DATA_CNT; i++)); do
	size_m="$((1<<i))"
	do_cmd dd bs=1M if=/dev/urandom of="$MNT"/file_"$i" count="$size_m" || exit 1
done

echo "3: Prune a disk in fs"
do_cmd umount "$MNT" || exit 1
do_cmd dd bs=1M if=/dev/zero of="$PRUNE_DEV" count=1
do_cmd mount -o "degraded" "${FS_DEVS[0]}" "$MNT" || exit 1

echo "4: Do scrub"
do_cmd btrfs scrub start -B "$MNT"

echo "5: Checking result"
if dmesg | grep -q 'general protection fault'; then
	echo "Result Fail"
	dmesg | grep -A10000 'general protection fault'
else
	echo "Result OK"
fi

exit 0

Test result:
# ./bug.sh
1: Creating filesystem
   mkfs.btrfs -f -d raid5 -m raid5 /dev/vdd /dev/vde /dev/vdf
   mount /dev/vdd /mnt/btrfs_test_for_raid56_strip
2: Write some data
   dd bs=1M if=/dev/urandom of=/mnt/btrfs_test_for_raid56_strip/file_0 count=1
   dd bs=1M if=/dev/urandom of=/mnt/btrfs_test_for_raid56_strip/file_1 count=2
   dd bs=1M if=/dev/urandom of=/mnt/btrfs_test_for_raid56_strip/file_2 count=4
   dd bs=1M if=/dev/urandom of=/mnt/btrfs_test_for_raid56_strip/file_3 count=8
3: Prune a disk in fs
   umount /mnt/btrfs_test_for_raid56_strip
   dd bs=1M if=/dev/zero of=/dev/vde count=1
   mount -o degraded /dev/vdd /mnt/btrfs_test_for_raid56_strip
4: Do scrub
   btrfs scrub start -B /mnt/btrfs_test_for_raid56_strip
5: Checking result
Result Fail
[   48.899408] general protection fault: 0000 [#1] SMP
[   48.900021] Modules linked in:
[   48.900021] CPU: 0 PID: 2102 Comm: btrfs Not tainted 4.1.0-rc6_HEAD_c65b99f046843d2455aa231747b5a07a999a9f3d_ #53
[   48.900021] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[   48.900021] task: ffff88003f6f5500 ti: ffff88003c954000 task.ti: ffff88003c954000
[   48.900021] RIP: 0010:[<ffffffff81441395>]  [<ffffffff81441395>] bio_add_page+0x15/0x70
[   48.900021] RSP: 0018:ffff88003c957718  EFLAGS: 00010246
[   48.900021] RAX: ffff88003b3ac6a8 RBX: ffff88003cb19000 RCX: 007fffc4001e58ca
[   48.900021] RDX: 000210868b41ffca RSI: ffffea0001273b60 RDI: ffff88003b3ac6a8
...

Thanks
Zhaolei

> From: linux-btrfs-owner@vger.kernel.org
> [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Omar Sandoval
> Sent: Wednesday, May 27, 2015 1:06 AM
> To: linux-btrfs@vger.kernel.org
> Cc: Miao Xie; Philip
> Subject: Re: [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
> 
> On Mon, May 11, 2015 at 12:58:11AM -0700, Omar Sandoval wrote:
> > A user reported on Bugzilla that they were seeing kernel BUGs when
> > attempting to replace a missing device on a RAID 6 array. After
> > identifying the apparent cause of the BUG, I reached the conclusion
> > that there wasn't a quick fix. Maybe Miao Xie can point something out
> > that I missed, as he originally implemented device replace on RAID 5/6
> > :)
> >
> > Patch 4 has the details, but the main problem is that we can't create
> > bios for a missing device, so the main scrub code path isn't very
> > useful. On RAID 5/6, since we only have one mirror for any piece of
> > data, the missing device is the only mirror we can use. Clearly
> > (unless I missed something), this case needs to be handled differently.
> >
> > These patches are on top of v4.1-rc2. I ran the scrub and replace
> > xfstests and the script below, which also reproduces the original BUG.
> >
> > Thanks!
> >
> > Omar Sandoval (4):
> >   Btrfs: remove misleading handling of missing device scrub
> >   Btrfs: count devices correctly in readahead during RAID 5/6 replace
> >   Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
> >   Btrfs: fix device replace of a missing RAID 5/6 device
> >
> >  fs/btrfs/raid56.c |  87 +++++++++++++++++++++++++----
> >  fs/btrfs/raid56.h |  10 +++-
> >  fs/btrfs/reada.c  |   4 +-
> >  fs/btrfs/scrub.c  | 164 +++++++++++++++++++++++++++++++++++++++++++++---------
> >  4 files changed, 225 insertions(+), 40 deletions(-)
> >
> 
> Ping for review.
> 
> --
> Omar


* Re: [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
  2015-06-11  3:52   ` Zhao Lei
@ 2015-06-11  6:08     ` Omar Sandoval
  2015-06-12  9:42       ` wangyf
  0 siblings, 1 reply; 12+ messages in thread
From: Omar Sandoval @ 2015-06-11  6:08 UTC (permalink / raw)
  To: Zhao Lei; +Cc: linux-btrfs, 'Miao Xie', 'Philip'

On Thu, Jun 11, 2015 at 11:52:30AM +0800, Zhao Lei wrote:
> Hi, Omar Sandoval
> 
> I tested this patchset with my script, but I still see a general
> protection fault.
> NODE: kvm with virtio disk
> ROOTFS: RHEL6 with btrfs-progs v4.0
> KERNEL: v4.1-rc6 with the 4 patches in this patchset
> 
> My test may differ slightly from yours, but this looks like a
> similar bug. Could you check it?
> 

Hi, Zhao Lei,

Thanks for taking a look! I was able to reproduce this and it looks like
a similar bug, this time coming from the parity scrubbing code instead
of the data scrubbing code. I didn't test scrubbing a degraded
filesystem, only replacing a missing device, so that's probably why I
didn't run into it. I'll take a closer look.

Thanks,
-- 
Omar

* RE: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
  2015-05-11  7:58 ` [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device Omar Sandoval
@ 2015-06-11 10:29   ` Zhao Lei
  2015-06-12  8:12     ` Omar Sandoval
  0 siblings, 1 reply; 12+ messages in thread
From: Zhao Lei @ 2015-06-11 10:29 UTC (permalink / raw)
  To: 'Omar Sandoval', linux-btrfs; +Cc: 'Miao Xie', 'Philip'

Hi, Omar Sandoval

> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org
> [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Omar Sandoval
> Sent: Monday, May 11, 2015 3:58 PM
> To: linux-btrfs@vger.kernel.org
> Cc: Miao Xie; Philip; Omar Sandoval
> Subject: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
> 
> The original implementation of device replace on RAID 5/6 seems to have
> missed support for replacing a missing device. When this is attempted, we end
> up calling bio_add_page() on a bio with a NULL ->bi_bdev, which crashes when
> we try to dereference it. This happens because
> btrfs_map_block() has no choice but to return us the missing device because
> RAID 5/6 don't have any alternate mirrors to read from, and a missing device
> has a NULL bdev.
> 
> The idea implemented here is to handle the missing device case separately,
> which better only happen when we're replacing a missing RAID
> 5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
> reconstruct the data from parity, check it with
> scrub_recheck_block_checksum(), and write it out with
> scrub_write_block_to_dev_replace().
> 
> Reported-by: Philip <bugzilla@philip-seeger.de>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141
> Signed-off-by: Omar Sandoval <osandov@osandov.com>
> ---
>  fs/btrfs/scrub.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 135 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index b94694d..a13f91a 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -125,6 +125,7 @@ struct scrub_block {
>  		/* It is for the data with checksum */
>  		unsigned int	data_corrected:1;
>  	};
> +	struct btrfs_work	work;
>  };
> 
>  /* Used for the chunks with parity stripe such RAID5/6 */
> @@ -2164,6 +2165,126 @@ again:
>  	return 0;
>  }
> 
> +static void scrub_missing_raid56_end_io(struct bio *bio, int error)
> +{
> +	struct scrub_block *sblock = bio->bi_private;
> +	struct btrfs_fs_info *fs_info = sblock->sctx->dev_root->fs_info;
> +
> +	if (error)
> +		sblock->no_io_error_seen = 0;
> +
> +	btrfs_queue_work(fs_info->scrub_workers, &sblock->work);
> +}
> +
> +static void scrub_missing_raid56_worker(struct btrfs_work *work)
> +{
> +	struct scrub_block *sblock = container_of(work, struct scrub_block, work);
> +	struct scrub_ctx *sctx = sblock->sctx;
> +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> +	unsigned int is_metadata;
> +	unsigned int have_csum;
> +	u8 *csum;
> +	u64 generation;
> +	u64 logical;
> +	struct btrfs_device *dev;
> +
> +	is_metadata = !(sblock->pagev[0]->flags & BTRFS_EXTENT_FLAG_DATA);
> +	have_csum = sblock->pagev[0]->have_csum;
> +	csum = sblock->pagev[0]->csum;
> +	generation = sblock->pagev[0]->generation;
> +	logical = sblock->pagev[0]->logical;
> +	dev = sblock->pagev[0]->dev;
> +
> +	if (sblock->no_io_error_seen) {
> +		scrub_recheck_block_checksum(fs_info, sblock, is_metadata,
> +					     have_csum, csum, generation,
> +					     sctx->csum_size);
> +	}
> +
> +	if (!sblock->no_io_error_seen) {
> +		spin_lock(&sctx->stat_lock);
> +		sctx->stat.read_errors++;
> +		spin_unlock(&sctx->stat_lock);
> +		printk_ratelimited_in_rcu(KERN_ERR
> +			"BTRFS: I/O error rebuilding logical %llu for dev %s\n",
> +			logical, rcu_str_deref(dev->name));
> +	} else if (sblock->header_error || sblock->checksum_error) {
> +		spin_lock(&sctx->stat_lock);
> +		sctx->stat.uncorrectable_errors++;
> +		spin_unlock(&sctx->stat_lock);
> +		printk_ratelimited_in_rcu(KERN_ERR
> +			"BTRFS: failed to rebuild valid logical %llu for dev %s\n",
> +			logical, rcu_str_deref(dev->name));
> +	} else {
> +		scrub_write_block_to_dev_replace(sblock);
> +	}
> +
> +	scrub_block_put(sblock);
> +	scrub_pending_bio_dec(sctx);
> +}
> +
> +static void scrub_missing_raid56_pages(struct scrub_block *sblock)
> +{
> +	struct scrub_ctx *sctx = sblock->sctx;
> +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> +	u64 length = sblock->page_count * PAGE_SIZE;
> +	u64 logical = sblock->pagev[0]->logical;
> +	struct btrfs_bio *bbio;
> +	struct bio *bio;
> +	struct btrfs_raid_bio *rbio;
> +	int ret;
> +	int i;
> +
> +	ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical, &length,
> +			       &bbio, 0, 1);
> +	if (ret || !bbio || !bbio->raid_map)
> +		goto bbio_out;
> +
> +	if (WARN_ON(!sctx->is_dev_replace ||
> +		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
> +		/*
> +		 * We shouldn't be scrubbing a missing device. Even for dev
> +		 * replace, we should only get here for RAID 5/6. We either
> +		 * managed to mount something with no mirrors remaining or
> +		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
> +		 */
> +		goto bbio_out;
> +	}
> +
> +	bio = btrfs_io_bio_alloc(GFP_NOFS, 0);
> +	if (!bio)
> +		goto bbio_out;
> +
> +	bio->bi_iter.bi_sector = logical >> 9;
> +	bio->bi_private = sblock;
> +	bio->bi_end_io = scrub_missing_raid56_end_io;
> +
> +	rbio = raid56_alloc_missing_rbio(sctx->dev_root, bio, bbio, length);
> +	if (!rbio)
> +		goto rbio_out;
> +
> +	for (i = 0; i < sblock->page_count; i++) {
> +		struct scrub_page *spage = sblock->pagev[i];
> +
> +		raid56_add_scrub_pages(rbio, spage->page, spage->logical);
> +	}
> +
> +	btrfs_init_work(&sblock->work, btrfs_scrub_helper,
> +			scrub_missing_raid56_worker, NULL, NULL);
> +	scrub_block_get(sblock);
> +	scrub_pending_bio_inc(sctx);
> +	raid56_submit_missing_rbio(rbio);
> +	return;
> +
> +rbio_out:
> +	bio_put(bio);
> +bbio_out:
> +	btrfs_put_bbio(bbio);
> +	spin_lock(&sctx->stat_lock);
> +	sctx->stat.malloc_errors++;
> +	spin_unlock(&sctx->stat_lock);
> +}
> +
>  static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
>  		       u64 physical, struct btrfs_device *dev, u64 flags,
>  		       u64 gen, int mirror_num, u8 *csum, int force,
> @@ -2227,19 +2348,23 @@ leave_nomem:
>  	}
> 
>  	WARN_ON(sblock->page_count == 0);
> -	for (index = 0; index < sblock->page_count; index++) {
> -		struct scrub_page *spage = sblock->pagev[index];
> -		int ret;
> +	if (dev->missing) {
> +		scrub_missing_raid56_pages(sblock);

Both the raid56 and the non-raid56 cases can reach this point.

If it is a missing non-raid56 device, calling scrub_missing_raid56_pages()
for it does not seem appropriate.

I haven't read this patch carefully, so please ignore this if it is not
a problem.

Thanks
Zhaolei

> +	} else {
> +		for (index = 0; index < sblock->page_count; index++) {
> +			struct scrub_page *spage = sblock->pagev[index];
> +			int ret;
> 
> -		ret = scrub_add_page_to_rd_bio(sctx, spage);
> -		if (ret) {
> -			scrub_block_put(sblock);
> -			return ret;
> +			ret = scrub_add_page_to_rd_bio(sctx, spage);
> +			if (ret) {
> +				scrub_block_put(sblock);
> +				return ret;
> +			}
>  		}
> -	}
> 
> -	if (force)
> -		scrub_submit(sctx);
> +		if (force)
> +			scrub_submit(sctx);
> +	}
> 
>  	/* last one frees, either here or in bio completion for last page */
>  	scrub_block_put(sblock);
> --
> 2.4.0
> 



* Re: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
  2015-06-11 10:29   ` Zhao Lei
@ 2015-06-12  8:12     ` Omar Sandoval
  2015-06-12  8:26       ` Zhao Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Omar Sandoval @ 2015-06-12  8:12 UTC (permalink / raw)
  To: Zhao Lei; +Cc: linux-btrfs, 'Miao Xie', 'Philip'

Hi, Zhaolei,

On Thu, Jun 11, 2015 at 06:29:15PM +0800, Zhao Lei wrote:
> Hi, Omar Sandoval
> 
> > -----Original Message-----
> > From: linux-btrfs-owner@vger.kernel.org
> > [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Omar Sandoval
> > Sent: Monday, May 11, 2015 3:58 PM
> > To: linux-btrfs@vger.kernel.org
> > Cc: Miao Xie; Philip; Omar Sandoval
> > Subject: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
> > 
> > The original implementation of device replace on RAID 5/6 seems to have
> > missed support for replacing a missing device. When this is attempted, we end
> > up calling bio_add_page() on a bio with a NULL ->bi_bdev, which crashes when
> > we try to dereference it. This happens because
> > btrfs_map_block() has no choice but to return us the missing device because
> > RAID 5/6 don't have any alternate mirrors to read from, and a missing device
> > has a NULL bdev.
> > 
> > The idea implemented here is to handle the missing device case separately,
> > which better only happen when we're replacing a missing RAID
> > 5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
> > reconstruct the data from parity, check it with
> > scrub_recheck_block_checksum(), and write it out with
> > scrub_write_block_to_dev_replace().
> > 
> > Reported-by: Philip <bugzilla@philip-seeger.de>
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141
> > Signed-off-by: Omar Sandoval <osandov@osandov.com>
> > ---
> >  fs/btrfs/scrub.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 135 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> > index b94694d..a13f91a 100644
> > --- a/fs/btrfs/scrub.c
> > +++ b/fs/btrfs/scrub.c
> > @@ -125,6 +125,7 @@ struct scrub_block {
> >  		/* It is for the data with checksum */
> >  		unsigned int	data_corrected:1;
> >  	};
> > +	struct btrfs_work	work;
> >  };
> > 
> >  /* Used for the chunks with parity stripe such RAID5/6 */
> > @@ -2164,6 +2165,126 @@ again:
> >  	return 0;
> >  }
> > 
> > +static void scrub_missing_raid56_end_io(struct bio *bio, int error)
> > +{
> > +	struct scrub_block *sblock = bio->bi_private;
> > +	struct btrfs_fs_info *fs_info = sblock->sctx->dev_root->fs_info;
> > +
> > +	if (error)
> > +		sblock->no_io_error_seen = 0;
> > +
> > +	btrfs_queue_work(fs_info->scrub_workers, &sblock->work);
> > +}
> > +
> > +static void scrub_missing_raid56_worker(struct btrfs_work *work)
> > +{
> > +	struct scrub_block *sblock = container_of(work, struct scrub_block, work);
> > +	struct scrub_ctx *sctx = sblock->sctx;
> > +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> > +	unsigned int is_metadata;
> > +	unsigned int have_csum;
> > +	u8 *csum;
> > +	u64 generation;
> > +	u64 logical;
> > +	struct btrfs_device *dev;
> > +
> > +	is_metadata = !(sblock->pagev[0]->flags & BTRFS_EXTENT_FLAG_DATA);
> > +	have_csum = sblock->pagev[0]->have_csum;
> > +	csum = sblock->pagev[0]->csum;
> > +	generation = sblock->pagev[0]->generation;
> > +	logical = sblock->pagev[0]->logical;
> > +	dev = sblock->pagev[0]->dev;
> > +
> > +	if (sblock->no_io_error_seen) {
> > +		scrub_recheck_block_checksum(fs_info, sblock, is_metadata,
> > +					     have_csum, csum, generation,
> > +					     sctx->csum_size);
> > +	}
> > +
> > +	if (!sblock->no_io_error_seen) {
> > +		spin_lock(&sctx->stat_lock);
> > +		sctx->stat.read_errors++;
> > +		spin_unlock(&sctx->stat_lock);
> > +		printk_ratelimited_in_rcu(KERN_ERR
> > +			"BTRFS: I/O error rebuilding logical %llu for dev %s\n",
> > +			logical, rcu_str_deref(dev->name));
> > +	} else if (sblock->header_error || sblock->checksum_error) {
> > +		spin_lock(&sctx->stat_lock);
> > +		sctx->stat.uncorrectable_errors++;
> > +		spin_unlock(&sctx->stat_lock);
> > +		printk_ratelimited_in_rcu(KERN_ERR
> > +			"BTRFS: failed to rebuild valid logical %llu for dev %s\n",
> > +			logical, rcu_str_deref(dev->name));
> > +	} else {
> > +		scrub_write_block_to_dev_replace(sblock);
> > +	}
> > +
> > +	scrub_block_put(sblock);

First of all, I tested a bit more and it looks like I need this here as
well:

+
+	if (sctx->is_dev_replace &&
+	    atomic_read(&sctx->wr_ctx.flush_all_writes)) {
+		mutex_lock(&sctx->wr_ctx.wr_lock);
+		scrub_wr_submit(sctx);
+		mutex_unlock(&sctx->wr_ctx.wr_lock);
+	}
+

I'll resend with this added once I get to the other bug you reported.

> > +	scrub_pending_bio_dec(sctx);
> > +}
> > +
> > +static void scrub_missing_raid56_pages(struct scrub_block *sblock)
> > +{
> > +	struct scrub_ctx *sctx = sblock->sctx;
> > +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> > +	u64 length = sblock->page_count * PAGE_SIZE;
> > +	u64 logical = sblock->pagev[0]->logical;
> > +	struct btrfs_bio *bbio;
> > +	struct bio *bio;
> > +	struct btrfs_raid_bio *rbio;
> > +	int ret;
> > +	int i;
> > +
> > +	ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical, &length,
> > +			       &bbio, 0, 1);
> > +	if (ret || !bbio || !bbio->raid_map)
> > +		goto bbio_out;
> > +
> > +	if (WARN_ON(!sctx->is_dev_replace ||
> > +		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
> > +		/*
> > +		 * We shouldn't be scrubbing a missing device. Even for dev
> > +		 * replace, we should only get here for RAID 5/6. We either
> > +		 * managed to mount something with no mirrors remaining or
> > +		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
> > +		 */
> > +		goto bbio_out;
> > +	}
> > +
> > +	bio = btrfs_io_bio_alloc(GFP_NOFS, 0);
> > +	if (!bio)
> > +		goto bbio_out;
> > +
> > +	bio->bi_iter.bi_sector = logical >> 9;
> > +	bio->bi_private = sblock;
> > +	bio->bi_end_io = scrub_missing_raid56_end_io;
> > +
> > +	rbio = raid56_alloc_missing_rbio(sctx->dev_root, bio, bbio, length);
> > +	if (!rbio)
> > +		goto rbio_out;
> > +
> > +	for (i = 0; i < sblock->page_count; i++) {
> > +		struct scrub_page *spage = sblock->pagev[i];
> > +
> > +		raid56_add_scrub_pages(rbio, spage->page, spage->logical);
> > +	}
> > +
> > +	btrfs_init_work(&sblock->work, btrfs_scrub_helper,
> > +			scrub_missing_raid56_worker, NULL, NULL);
> > +	scrub_block_get(sblock);
> > +	scrub_pending_bio_inc(sctx);
> > +	raid56_submit_missing_rbio(rbio);
> > +	return;
> > +
> > +rbio_out:
> > +	bio_put(bio);
> > +bbio_out:
> > +	btrfs_put_bbio(bbio);
> > +	spin_lock(&sctx->stat_lock);
> > +	sctx->stat.malloc_errors++;
> > +	spin_unlock(&sctx->stat_lock);
> > +}
> > +
> >  static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
> >  		       u64 physical, struct btrfs_device *dev, u64 flags,
> >  		       u64 gen, int mirror_num, u8 *csum, int force,
> > @@ -2227,19 +2348,23 @@ leave_nomem:
> >  	}
> > 
> >  	WARN_ON(sblock->page_count == 0);
> > -	for (index = 0; index < sblock->page_count; index++) {
> > -		struct scrub_page *spage = sblock->pagev[index];
> > -		int ret;
> > +	if (dev->missing) {
> > +		scrub_missing_raid56_pages(sblock);
> 
> Both the raid56 and the non-raid56 cases can reach this point.
> 
> If it is a missing non-raid56 device, calling scrub_missing_raid56_pages()
> for it does not seem appropriate.
> 
> I haven't read this patch carefully, so please ignore this if it is not
> a problem.

I think the chunk from above should clarify:

+	if (WARN_ON(!sctx->is_dev_replace ||
+		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
+		/*
+		 * We shouldn't be scrubbing a missing device. Even for dev
+		 * replace, we should only get here for RAID 5/6. We either
+		 * managed to mount something with no mirrors remaining or
+		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
+		 */
+		goto bbio_out;
+	}

btrfs_scrub_dev() errors out when you attempt to scrub a missing device:

	mutex_lock(&fs_info->fs_devices->device_list_mutex);
	dev = btrfs_find_device(fs_info, devid, NULL, NULL);
	if (!dev || (dev->missing && !is_dev_replace)) {
		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
		return -ENODEV;
	}

So we can't get here for scrub. Now for replace, in scrub_stripe(), we
try to remap the block from the missing device:

			if (is_dev_replace)
				scrub_remap_extent(fs_info, extent_logical,
						   extent_len, &extent_physical,
						   &extent_dev,
						   &extent_mirror_num);

So now let's consider what will happen with replace on the different
RAID levels:

- For RAID 0, the filesystem can't even be mounted
- For RAID 1 and RAID 10, we can always remap the block to another
  mirror
- For RAID 5 and 6, this won't actually remap anything because there
  isn't another mirror! And that's what causes the bug that this patch
  series addresses
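
To illustrate why RAID 5 needs a rebuild rather than a remap, here is a
minimal userspace sketch of single-parity reconstruction: the block of
the missing device is the XOR of the blocks on all surviving devices,
parity included. This is not code from the patch series; xor_rebuild()
and its parameters are hypothetical names, and RAID 6 with two missing
devices needs the more involved P/Q math that is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * blocks[] holds one buffer of len bytes per device in the stripe
 * (data devices plus the single parity device). The buffer at index
 * "missing" is overwritten with the reconstructed contents.
 */
static void xor_rebuild(uint8_t **blocks, size_t ndevs, size_t missing,
			size_t len)
{
	size_t d, i;

	assert(missing < ndevs);

	/* Start from zero, then fold in every surviving device. */
	memset(blocks[missing], 0, len);
	for (d = 0; d < ndevs; d++) {
		if (d == missing)
			continue;
		for (i = 0; i < len; i++)
			blocks[missing][i] ^= blocks[d][i];
	}
}
```

In the patch, the rebuilt block is then checked with
scrub_recheck_block_checksum() before it is written to the replace
target, so a bad reconstruction is counted as an error instead of being
written out.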

Also, consider that before this patch series, if we were to end up in
scrub_stripe() with a missing device, we would crash later when we do a
bio_add_page() on it, and I haven't seen any reports of that.

So I think that this is right, but please correct me if I'm wrong!
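
The argument above can be summarized as a small decision table. The
sketch below is a userspace model only, not kernel code: the enum names
and classify() are hypothetical, and it simply encodes the cases
described in this mail (scrub of a missing device is rejected, RAID 0
cannot be mounted, RAID 1/10 are remapped, RAID 5/6 need the rebuild
path).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum profile { RAID0, RAID1, RAID10, RAID5, RAID6 };

enum scrub_path {
	PATH_NORMAL_READ,	/* device present, read it directly */
	PATH_UNMOUNTABLE,	/* filesystem can't even be mounted */
	PATH_REJECT,		/* btrfs_scrub_dev() returns -ENODEV */
	PATH_READ_OTHER_MIRROR,	/* scrub_remap_extent() finds a mirror */
	PATH_REBUILD_MISSING,	/* scrub_missing_raid56_pages() */
};

static enum scrub_path classify(enum profile p, bool missing,
				bool is_dev_replace)
{
	if (!missing)
		return PATH_NORMAL_READ;
	if (p == RAID0)
		return PATH_UNMOUNTABLE;
	if (!is_dev_replace)
		return PATH_REJECT;

	switch (p) {
	case RAID1:
	case RAID10:
		return PATH_READ_OTHER_MIRROR;
	case RAID5:
	case RAID6:
		return PATH_REBUILD_MISSING;
	default:
		return PATH_REJECT;
	}
}
```

Under these assumptions, the only way to reach the dev->missing branch
in scrub_pages() is the RAID 5/6 replace case, which is what the
WARN_ON() in scrub_missing_raid56_pages() asserts.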

> Thanks
> Zhaolei

Thanks,
-- 
Omar


* RE: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
  2015-06-12  8:12     ` Omar Sandoval
@ 2015-06-12  8:26       ` Zhao Lei
  0 siblings, 0 replies; 12+ messages in thread
From: Zhao Lei @ 2015-06-12  8:26 UTC (permalink / raw)
  To: 'Omar Sandoval'; +Cc: linux-btrfs, 'Miao Xie', 'Philip'

Hi, Omar Sandoval

> -----Original Message-----
> From: Omar Sandoval [mailto:osandov@osandov.com]
> Sent: Friday, June 12, 2015 4:12 PM
> To: Zhao Lei
> Cc: linux-btrfs@vger.kernel.org; 'Miao Xie'; 'Philip'
> Subject: Re: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device
> 
> Hi, Zhaolei,
> 
> On Thu, Jun 11, 2015 at 06:29:15PM +0800, Zhao Lei wrote:
> > Hi, Omar Sandoval
> >
> > > -----Original Message-----
> > > From: linux-btrfs-owner@vger.kernel.org
> > > [mailto:linux-btrfs-owner@vger.kernel.org] On Behalf Of Omar
> > > Sandoval
> > > Sent: Monday, May 11, 2015 3:58 PM
> > > To: linux-btrfs@vger.kernel.org
> > > Cc: Miao Xie; Philip; Omar Sandoval
> > > Subject: [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6
> > > device
> > >
> > > The original implementation of device replace on RAID 5/6 seems to
> > > have missed support for replacing a missing device. When this is
> > > attempted, we end up calling bio_add_page() on a bio with a NULL
> > > ->bi_bdev, which crashes when we try to dereference it. This happens
> > > because
> > > btrfs_map_block() has no choice but to return us the missing device
> > > because RAID 5/6 don't have any alternate mirrors to read from, and
> > > a missing device has a NULL bdev.
> > >
> > > The idea implemented here is to handle the missing device case
> > > separately, which better only happen when we're replacing a missing
> > > RAID
> > > 5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
> > > reconstruct the data from parity, check it with
> > > scrub_recheck_block_checksum(), and write it out with
> > > scrub_write_block_to_dev_replace().
> > >
> > > Reported-by: Philip <bugzilla@philip-seeger.de>
> > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141
> > > Signed-off-by: Omar Sandoval <osandov@osandov.com>
> > > ---
> > >  fs/btrfs/scrub.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++++++----
> > >  1 file changed, 135 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> > > index b94694d..a13f91a 100644
> > > --- a/fs/btrfs/scrub.c
> > > +++ b/fs/btrfs/scrub.c
> > > @@ -125,6 +125,7 @@ struct scrub_block {
> > >  		/* It is for the data with checksum */
> > >  		unsigned int	data_corrected:1;
> > >  	};
> > > +	struct btrfs_work	work;
> > >  };
> > >
> > >  /* Used for the chunks with parity stripe such RAID5/6 */
> > > @@ -2164,6 +2165,126 @@ again:
> > >  	return 0;
> > >  }
> > >
> > > +static void scrub_missing_raid56_end_io(struct bio *bio, int error)
> > > +{
> > > +	struct scrub_block *sblock = bio->bi_private;
> > > +	struct btrfs_fs_info *fs_info = sblock->sctx->dev_root->fs_info;
> > > +
> > > +	if (error)
> > > +		sblock->no_io_error_seen = 0;
> > > +
> > > +	btrfs_queue_work(fs_info->scrub_workers, &sblock->work);
> > > +}
> > > +
> > > +static void scrub_missing_raid56_worker(struct btrfs_work *work)
> > > +{
> > > +	struct scrub_block *sblock = container_of(work, struct scrub_block, work);
> > > +	struct scrub_ctx *sctx = sblock->sctx;
> > > +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> > > +	unsigned int is_metadata;
> > > +	unsigned int have_csum;
> > > +	u8 *csum;
> > > +	u64 generation;
> > > +	u64 logical;
> > > +	struct btrfs_device *dev;
> > > +
> > > +	is_metadata = !(sblock->pagev[0]->flags & BTRFS_EXTENT_FLAG_DATA);
> > > +	have_csum = sblock->pagev[0]->have_csum;
> > > +	csum = sblock->pagev[0]->csum;
> > > +	generation = sblock->pagev[0]->generation;
> > > +	logical = sblock->pagev[0]->logical;
> > > +	dev = sblock->pagev[0]->dev;
> > > +
> > > +	if (sblock->no_io_error_seen) {
> > > +		scrub_recheck_block_checksum(fs_info, sblock, is_metadata,
> > > +					     have_csum, csum, generation,
> > > +					     sctx->csum_size);
> > > +	}
> > > +
> > > +	if (!sblock->no_io_error_seen) {
> > > +		spin_lock(&sctx->stat_lock);
> > > +		sctx->stat.read_errors++;
> > > +		spin_unlock(&sctx->stat_lock);
> > > +		printk_ratelimited_in_rcu(KERN_ERR
> > > +			"BTRFS: I/O error rebuilding logical %llu for dev %s\n",
> > > +			logical, rcu_str_deref(dev->name));
> > > +	} else if (sblock->header_error || sblock->checksum_error) {
> > > +		spin_lock(&sctx->stat_lock);
> > > +		sctx->stat.uncorrectable_errors++;
> > > +		spin_unlock(&sctx->stat_lock);
> > > +		printk_ratelimited_in_rcu(KERN_ERR
> > > +			"BTRFS: failed to rebuild valid logical %llu for dev %s\n",
> > > +			logical, rcu_str_deref(dev->name));
> > > +	} else {
> > > +		scrub_write_block_to_dev_replace(sblock);
> > > +	}
> > > +
> > > +	scrub_block_put(sblock);
> 
> First of all, I tested a bit more and it looks like I need this here as
> well:
> 
> +
> +	if (sctx->is_dev_replace &&
> +	    atomic_read(&sctx->wr_ctx.flush_all_writes)) {
> +		mutex_lock(&sctx->wr_ctx.wr_lock);
> +		scrub_wr_submit(sctx);
> +		mutex_unlock(&sctx->wr_ctx.wr_lock);
> +	}
> +
> 
> I'll resend with this added once I get to the other bug you reported.
> 
Great, I'll test it after you send the new version.

> > > +	scrub_pending_bio_dec(sctx);
> > > +}
> > > +
> > > +static void scrub_missing_raid56_pages(struct scrub_block *sblock)
> > > +{
> > > +	struct scrub_ctx *sctx = sblock->sctx;
> > > +	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
> > > +	u64 length = sblock->page_count * PAGE_SIZE;
> > > +	u64 logical = sblock->pagev[0]->logical;
> > > +	struct btrfs_bio *bbio;
> > > +	struct bio *bio;
> > > +	struct btrfs_raid_bio *rbio;
> > > +	int ret;
> > > +	int i;
> > > +
> > > +	ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical, &length,
> > > +			       &bbio, 0, 1);
> > > +	if (ret || !bbio || !bbio->raid_map)
> > > +		goto bbio_out;
> > > +
> > > +	if (WARN_ON(!sctx->is_dev_replace ||
> > > +		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
> > > +		/*
> > > +		 * We shouldn't be scrubbing a missing device. Even for dev
> > > +		 * replace, we should only get here for RAID 5/6. We either
> > > +		 * managed to mount something with no mirrors remaining or
> > > +		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
> > > +		 */
> > > +		goto bbio_out;
> > > +	}
> > > +
> > > +	bio = btrfs_io_bio_alloc(GFP_NOFS, 0);
> > > +	if (!bio)
> > > +		goto bbio_out;
> > > +
> > > +	bio->bi_iter.bi_sector = logical >> 9;
> > > +	bio->bi_private = sblock;
> > > +	bio->bi_end_io = scrub_missing_raid56_end_io;
> > > +
> > > +	rbio = raid56_alloc_missing_rbio(sctx->dev_root, bio, bbio, length);
> > > +	if (!rbio)
> > > +		goto rbio_out;
> > > +
> > > +	for (i = 0; i < sblock->page_count; i++) {
> > > +		struct scrub_page *spage = sblock->pagev[i];
> > > +
> > > +		raid56_add_scrub_pages(rbio, spage->page, spage->logical);
> > > +	}
> > > +
> > > +	btrfs_init_work(&sblock->work, btrfs_scrub_helper,
> > > +			scrub_missing_raid56_worker, NULL, NULL);
> > > +	scrub_block_get(sblock);
> > > +	scrub_pending_bio_inc(sctx);
> > > +	raid56_submit_missing_rbio(rbio);
> > > +	return;
> > > +
> > > +rbio_out:
> > > +	bio_put(bio);
> > > +bbio_out:
> > > +	btrfs_put_bbio(bbio);
> > > +	spin_lock(&sctx->stat_lock);
> > > +	sctx->stat.malloc_errors++;
> > > +	spin_unlock(&sctx->stat_lock);
> > > +}
> > > +
> > >  static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
> > >  		       u64 physical, struct btrfs_device *dev, u64 flags,
> > >  		       u64 gen, int mirror_num, u8 *csum, int force,
> > > @@ -2227,19 +2348,23 @@ leave_nomem:
> > >  	}
> > >
> > >  	WARN_ON(sblock->page_count == 0);
> > > -	for (index = 0; index < sblock->page_count; index++) {
> > > -		struct scrub_page *spage = sblock->pagev[index];
> > > -		int ret;
> > > +	if (dev->missing) {
> > > +		scrub_missing_raid56_pages(sblock);
> >
> > Both the raid56 and the non-raid56 cases can reach this point.
> >
> > If it is a missing non-raid56 device, calling
> > scrub_missing_raid56_pages() for it does not seem appropriate.
> >
> > I haven't read this patch carefully, so please ignore this if it is
> > not a problem.
> 
> I think the chunk from above should clarify:
> 
> +	if (WARN_ON(!sctx->is_dev_replace ||
> +		    !(bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK))) {
> +		/*
> +		 * We shouldn't be scrubbing a missing device. Even for dev
> +		 * replace, we should only get here for RAID 5/6. We either
> +		 * managed to mount something with no mirrors remaining or
> +		 * there's a bug in scrub_remap_extent()/btrfs_map_block().
> +		 */
> +		goto bbio_out;
> +	}
> 
> btrfs_scrub_dev() errors out when you attempt to scrub a missing device:
> 
> 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
> 	dev = btrfs_find_device(fs_info, devid, NULL, NULL);
> 	if (!dev || (dev->missing && !is_dev_replace)) {
> 		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
> 		return -ENODEV;
> 	}
> 
> So we can't get here for scrub. Now for replace, in scrub_stripe(), we try to
> remap the block from the missing device:
> 
> 			if (is_dev_replace)
> 				scrub_remap_extent(fs_info, extent_logical,
> 						   extent_len, &extent_physical,
> 						   &extent_dev,
> 						   &extent_mirror_num);
> 
> So now let's consider what will happen with replace on the different RAID
> levels:
> 
> - For RAID 0, the filesystem can't even be mounted
> - For RAID 1 and RAID 10, we can always remap the block to another
>   mirror
> - For RAID 5 and 6, this won't actually remap anything because there
>   isn't another mirror! And that's what causes the bug that this patch
>   series addresses
> 
> Also, consider that before this patch series, if we were to end up in
> scrub_stripe() with a missing device, we would crash later when we do a
> bio_add_page() on it, and I haven't seen any reports of that.
> 
> So I think that this is right, but please correct me if I'm wrong!
> 

Thanks for your detailed explanation.
I agree that the code can only reach the line above in the raid5/6 case.

How about adding a comment like this?
if (dev->missing) {
    /*
     * This case only exists for raid5/6;
     * see the comment in scrub_missing_raid56_pages() for details.
     */
    scrub_missing_raid56_pages(sblock);
}

Thanks
Zhaolei

> > Thanks
> > Zhaolei
> 
> Thanks,
> --
> Omar



* Re: [PATCH 0/4] Btrfs: RAID 5/6 missing device replace
  2015-06-11  6:08     ` Omar Sandoval
@ 2015-06-12  9:42       ` wangyf
  0 siblings, 0 replies; 12+ messages in thread
From: wangyf @ 2015-06-12  9:42 UTC (permalink / raw)
  To: Omar Sandoval, Zhao Lei; +Cc: linux-btrfs, 'Miao Xie', 'Philip'

Hi. I have tested this patchset in a virtual machine.

Environment:
         Oracle VirtualBox 4.3.10 + Ubuntu 14.10 server + LVM 2.02.98
Kernel:
         4.1.0-rc7 (12 June, 2015 / Today) with your 4 patches.

Btrfs-progs: 4.0

Generic Test Procedure:
         mkfs.btrfs -f -m $RAID -d $RAID && mount
         cp some data
         lvremove a logical volume
         mount -o degraded
         btrfs replace with a new device
         btrfs scrub  /mnt

Without the patches, raid 1 and raid 10 are both OK; raid 5/6 causes a
NULL pointer dereference, and on shutdown -h I get:
         BTRFS info: suspending dev_replace for umount
With the patches, raid 1/10/5/6 are all OK.


On 2015-06-11 14:08, Omar Sandoval wrote:
> On Thu, Jun 11, 2015 at 11:52:30AM +0800, Zhao Lei wrote:
>> Hi, Omar Sandoval
>>
>> I tested this patchset with my script, but I still see a general
>> protection fault.
>> NODE: kvm with virtio disk
>> ROOTFS: RHEL6 with btrfs-progs v4.0
>> KERNEL: v4.1-rc6 with the 4 patches in this patchset
>>
>> My test may differ slightly from yours, but this looks like a
>> similar bug. Could you check it?
>>
> Hi, Zhao Lei,
>
> Thanks for taking a look! I was able to reproduce this and it looks like
> a similar bug, this time coming from the parity scrubbing code instead
> of the data scrubbing code. I didn't test scrubbing a degraded
> filesystem, only replacing a missing device, so that's probably why I
> didn't run into it. I'll take a closer look.
>
> Thanks,



end of thread, other threads:[~2015-06-12  9:43 UTC | newest]

Thread overview: 12+ messages
2015-05-11  7:58 [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
2015-05-11  7:58 ` [PATCH 1/4] Btrfs: remove misleading handling of missing device scrub Omar Sandoval
2015-05-11  7:58 ` [PATCH 2/4] Btrfs: count devices correctly in readahead during RAID 5/6 replace Omar Sandoval
2015-05-11  7:58 ` [PATCH 3/4] Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation Omar Sandoval
2015-05-11  7:58 ` [PATCH 4/4] Btrfs: fix device replace of a missing RAID 5/6 device Omar Sandoval
2015-06-11 10:29   ` Zhao Lei
2015-06-12  8:12     ` Omar Sandoval
2015-06-12  8:26       ` Zhao Lei
2015-05-26 17:05 ` [PATCH 0/4] Btrfs: RAID 5/6 missing device replace Omar Sandoval
2015-06-11  3:52   ` Zhao Lei
2015-06-11  6:08     ` Omar Sandoval
2015-06-12  9:42       ` wangyf
