* [PATCH 00/15] Btrfs: Cleanup for raid56 scrub
@ 2015-01-13 12:34 Zhaolei
2015-01-13 12:34 ` [PATCH 01/15] Btrfs: fix an out-of-bounds access of raid_map Zhaolei
` (14 more replies)
0 siblings, 15 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Hi, everyone,
These are cleanup patches for the raid56 scrub functions, based on a review
of the newly developed raid56 scrub code.
Some small typo fixes and cleanups for other functions are also included in
this patchset.
Thanks
Zhaolei
Zhao Lei (15):
Btrfs: fix an out-of-bounds access of raid_map
Btrfs: sort raid_map before adding tgtdev stripes
Btrfs: Make raid_map array be inlined in btrfs_bio structure
Btrfs: add ref_count and free function for btrfs_bio
Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON()
Btrfs: Remove unneeded force_write in scrub_write_block_to_dev_replace
Btrfs: Cleanup btrfs_bio_counter_inc_blocked()
Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event()
Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in
scrub_setup_recheck_block()
Btrfs: Avoid trustless page-level-repair in dev-replace
Btrfs: Separate finding-right-mirror and writing-to-target's process
in scrub_handle_errored_block()
Btrfs: Combine per-page recover in dev-replace and scrub
Btrfs: Simplify scrub_setup_recheck_block()'s argument
Btrfs: Include map_type in raid_bio
Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply
fs/btrfs/ctree.h | 3 +
fs/btrfs/dev-replace.c | 25 ++---
fs/btrfs/extent-tree.c | 2 +-
fs/btrfs/extent_io.c | 2 +-
fs/btrfs/inode.c | 3 +-
fs/btrfs/raid56.c | 104 +++++++------------
fs/btrfs/raid56.h | 11 +-
fs/btrfs/reada.c | 4 +-
fs/btrfs/scrub.c | 270 ++++++++++++++++++++++---------------------------
fs/btrfs/volumes.c | 176 +++++++++++++++++---------------
fs/btrfs/volumes.h | 18 ++--
11 files changed, 284 insertions(+), 334 deletions(-)
--
1.8.5.1
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 01/15] Btrfs: fix an out-of-bounds access of raid_map
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrub Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 02/15] Btrfs: sort raid_map before adding tgtdev stripes Zhaolei
` (13 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Under device replacement, the number of stripes on the target devices is
added into bbio->num_stripes, but only the raid_map entries for the stripes
that are not on the target devices are sorted. So when we need the real
raid_map, we must skip the stripes on the target devices.
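The index arithmetic the fix relies on can be shown in a minimal user-space sketch (simplified stand-ins for the kernel types; the parity sentinel values mirror the u64 definitions in fs/btrfs/raid56.h, but everything else here is illustrative, not the kernel code):

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's parity markers. */
#define RAID5_P_STRIPE ((unsigned long long)-2)
#define RAID6_Q_STRIPE ((unsigned long long)-1)

/*
 * The fixed indexing: raid_map only describes the source-device
 * stripes, so the last valid entry is at real_stripes - 1, where
 * real_stripes = num_stripes - num_tgtdevs.  Indexing with
 * num_stripes - 1 would read past the sorted part of the map.
 */
static int nr_raid_mirrors(const unsigned long long *raid_map,
			   int num_stripes, int num_tgtdevs)
{
	int real_stripes = num_stripes - num_tgtdevs;

	if (raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
		return 3;	/* RAID6: data, P and Q */
	return 2;		/* RAID5: data and P */
}
```

With one target-device stripe appended, a 5-entry bbio whose real stripes end in a Q stripe still reports 3 mirrors, because the lookup stops at real_stripes - 1.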
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 9e1569f..9d19065 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1291,7 +1291,9 @@ out:
static inline int scrub_nr_raid_mirrors(struct btrfs_bio *bbio, u64 *raid_map)
{
if (raid_map) {
- if (raid_map[bbio->num_stripes - 1] == RAID6_Q_STRIPE)
+ int real_stripes = bbio->num_stripes - bbio->num_tgtdevs;
+
+ if (raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
return 3;
else
return 2;
@@ -1412,7 +1414,8 @@ leave_nomem:
scrub_stripe_index_and_offset(logical, raid_map,
mapped_length,
- bbio->num_stripes,
+ bbio->num_stripes -
+ bbio->num_tgtdevs,
mirror_index,
&stripe_index,
&stripe_offset);
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 02/15] Btrfs: sort raid_map before adding tgtdev stripes
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrub Zhaolei
2015-01-13 12:34 ` [PATCH 01/15] Btrfs: fix an out-of-bounds access of raid_map Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 03/15] Btrfs: Make raid_map array be inlined in btrfs_bio structure Zhaolei
` (12 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Sorting before the target-device stripes are added avoids the complex
calculation of real stripes in the sort. It also lets us drop the code
that swapped tgtdev_map entries, because tgtdev_map is now built after
the sort and is in order from the start.
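A user-space sketch of the simplified sort (names and types are illustrative stand-ins, not the kernel code): once the target-device stripes are not yet appended, only raid_map and the parallel stripe array need swapping, and the parity sentinels, being the largest u64 values, bubble to the end:

```c
#include <assert.h>

#define RAID5_P_STRIPE ((unsigned long long)-2)
#define RAID6_Q_STRIPE ((unsigned long long)-1)

/* P/Q sentinels are (u64)-2 and (u64)-1, so an ascending sort
 * naturally places them after every real logical address. */
static int parity_smaller(unsigned long long a, unsigned long long b)
{
	return a > b;
}

/* Bubble-sort raid_map together with a parallel per-stripe array
 * (standing in for bbio->stripes); no tgtdev_map swap is needed. */
static void sort_parity_stripes(unsigned long long *raid_map,
				unsigned long long *physical,
				int num_stripes)
{
	int i, again = 1;
	unsigned long long l, p;

	while (again) {
		again = 0;
		for (i = 0; i < num_stripes - 1; i++) {
			if (parity_smaller(raid_map[i], raid_map[i + 1])) {
				l = raid_map[i];
				p = physical[i];
				raid_map[i] = raid_map[i + 1];
				physical[i] = physical[i + 1];
				raid_map[i + 1] = l;
				physical[i + 1] = p;
				again = 1;
			}
		}
	}
}
```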
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/volumes.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0144790..db779ca 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4876,18 +4876,17 @@ static inline int parity_smaller(u64 a, u64 b)
}
/* Bubble-sort the stripe set to put the parity/syndrome stripes last */
-static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map)
+static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map,
+ int num_stripes)
{
struct btrfs_bio_stripe s;
- int real_stripes = bbio->num_stripes - bbio->num_tgtdevs;
int i;
u64 l;
int again = 1;
- int m;
while (again) {
again = 0;
- for (i = 0; i < real_stripes - 1; i++) {
+ for (i = 0; i < num_stripes - 1; i++) {
if (parity_smaller(raid_map[i], raid_map[i+1])) {
s = bbio->stripes[i];
l = raid_map[i];
@@ -4896,13 +4895,6 @@ static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map)
bbio->stripes[i+1] = s;
raid_map[i+1] = l;
- if (bbio->tgtdev_map) {
- m = bbio->tgtdev_map[i];
- bbio->tgtdev_map[i] =
- bbio->tgtdev_map[i + 1];
- bbio->tgtdev_map[i + 1] = m;
- }
-
again = 1;
}
}
@@ -5340,6 +5332,9 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
if (rw & (REQ_WRITE | REQ_GET_READ_MIRRORS))
max_errors = btrfs_chunk_max_errors(map);
+ if (raid_map)
+ sort_parity_stripes(bbio, raid_map, num_stripes);
+
tgtdev_indexes = 0;
if (dev_replace_is_ongoing && (rw & (REQ_WRITE | REQ_DISCARD)) &&
dev_replace->tgtdev != NULL) {
@@ -5443,10 +5438,9 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
bbio->stripes[0].physical = physical_to_patch_in_first_stripe;
bbio->mirror_num = map->num_stripes + 1;
}
- if (raid_map) {
- sort_parity_stripes(bbio, raid_map);
+
+ if (raid_map_ret)
*raid_map_ret = raid_map;
- }
out:
if (dev_replace_is_ongoing)
btrfs_dev_replace_unlock(dev_replace);
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 03/15] Btrfs: Make raid_map array be inlined in btrfs_bio structure
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrub Zhaolei
2015-01-13 12:34 ` [PATCH 01/15] Btrfs: fix an out-of-bounds access of raid_map Zhaolei
2015-01-13 12:34 ` [PATCH 02/15] Btrfs: sort raid_map before adding tgtdev stripes Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio Zhaolei
` (11 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
This makes the code simpler and clearer: we no longer need to take care
to free bbio and raid_map together.
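The single-allocation layout can be sketched in user space (struct and field names are simplified stand-ins for the kernel's btrfs_bio, and the pointer arithmetic assumes the same tail ordering as the patch: stripes, then tgtdev_map, then raid_map):

```c
#include <assert.h>
#include <stdlib.h>

struct stripe { unsigned long long physical; };

struct bbio {
	int num_stripes;
	int num_tgtdevs;
	int *tgtdev_map;		/* points into the tail allocation */
	unsigned long long *raid_map;	/* likewise */
	struct stripe stripes[];	/* flexible array at the end */
};

/*
 * One allocation holds the struct, the stripe array, the tgtdev map
 * and the raid_map, so a single free() releases everything -- which
 * is exactly why the caller no longer tracks raid_map separately.
 */
static struct bbio *alloc_bbio(int total_stripes, int real_stripes)
{
	size_t sz = sizeof(struct bbio)
		  + sizeof(struct stripe) * total_stripes
		  + sizeof(int) * real_stripes
		  + sizeof(unsigned long long) * real_stripes;
	struct bbio *b = calloc(1, sz);

	if (!b)
		return NULL;
	b->num_stripes = total_stripes;
	b->num_tgtdevs = total_stripes - real_stripes;
	b->tgtdev_map = (int *)(b->stripes + total_stripes);
	b->raid_map = (unsigned long long *)(b->tgtdev_map + real_stripes);
	return b;
}
```

As in the kernel change, raid_map sits right after the tgtdev ints; the sketch uses an even real_stripes count so the u64 array stays naturally aligned.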
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/raid56.c | 77 ++++++++++++++++++-----------------------
fs/btrfs/raid56.h | 11 +++---
fs/btrfs/scrub.c | 31 ++++++-----------
fs/btrfs/volumes.c | 100 +++++++++++++++++++++++++----------------------------
fs/btrfs/volumes.h | 11 ++++--
5 files changed, 105 insertions(+), 125 deletions(-)
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 8ab2a17..e301d33 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -79,13 +79,6 @@ struct btrfs_raid_bio {
struct btrfs_fs_info *fs_info;
struct btrfs_bio *bbio;
- /*
- * logical block numbers for the start of each stripe
- * The last one or two are p/q. These are sorted,
- * so raid_map[0] is the start of our full stripe
- */
- u64 *raid_map;
-
/* while we're doing rmw on a stripe
* we put it into a hash table so we can
* lock the stripe and merge more rbios
@@ -303,7 +296,7 @@ static void cache_rbio_pages(struct btrfs_raid_bio *rbio)
*/
static int rbio_bucket(struct btrfs_raid_bio *rbio)
{
- u64 num = rbio->raid_map[0];
+ u64 num = rbio->bbio->raid_map[0];
/*
* we shift down quite a bit. We're using byte
@@ -606,8 +599,8 @@ static int rbio_can_merge(struct btrfs_raid_bio *last,
test_bit(RBIO_CACHE_BIT, &cur->flags))
return 0;
- if (last->raid_map[0] !=
- cur->raid_map[0])
+ if (last->bbio->raid_map[0] !=
+ cur->bbio->raid_map[0])
return 0;
/* we can't merge with different operations */
@@ -689,7 +682,7 @@ static noinline int lock_stripe_add(struct btrfs_raid_bio *rbio)
spin_lock_irqsave(&h->lock, flags);
list_for_each_entry(cur, &h->hash_list, hash_list) {
walk++;
- if (cur->raid_map[0] == rbio->raid_map[0]) {
+ if (cur->bbio->raid_map[0] == rbio->bbio->raid_map[0]) {
spin_lock(&cur->bio_list_lock);
/* can we steal this cached rbio's pages? */
@@ -842,18 +835,16 @@ done_nolock:
}
static inline void
-__free_bbio_and_raid_map(struct btrfs_bio *bbio, u64 *raid_map, int need)
+__free_bbio(struct btrfs_bio *bbio, int need)
{
- if (need) {
- kfree(raid_map);
+ if (need)
kfree(bbio);
- }
}
-static inline void free_bbio_and_raid_map(struct btrfs_raid_bio *rbio)
+static inline void free_bbio(struct btrfs_raid_bio *rbio)
{
- __free_bbio_and_raid_map(rbio->bbio, rbio->raid_map,
- !test_bit(RBIO_HOLD_BBIO_MAP_BIT, &rbio->flags));
+ __free_bbio(rbio->bbio,
+ !test_bit(RBIO_HOLD_BBIO_MAP_BIT, &rbio->flags));
}
static void __free_raid_bio(struct btrfs_raid_bio *rbio)
@@ -875,7 +866,7 @@ static void __free_raid_bio(struct btrfs_raid_bio *rbio)
}
}
- free_bbio_and_raid_map(rbio);
+ free_bbio(rbio);
kfree(rbio);
}
@@ -985,8 +976,7 @@ static unsigned long rbio_nr_pages(unsigned long stripe_len, int nr_stripes)
* this does not allocate any pages for rbio->pages.
*/
static struct btrfs_raid_bio *alloc_rbio(struct btrfs_root *root,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len)
+ struct btrfs_bio *bbio, u64 stripe_len)
{
struct btrfs_raid_bio *rbio;
int nr_data = 0;
@@ -1007,7 +997,6 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_root *root,
INIT_LIST_HEAD(&rbio->stripe_cache);
INIT_LIST_HEAD(&rbio->hash_list);
rbio->bbio = bbio;
- rbio->raid_map = raid_map;
rbio->fs_info = root->fs_info;
rbio->stripe_len = stripe_len;
rbio->nr_pages = num_pages;
@@ -1028,7 +1017,7 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_root *root,
rbio->bio_pages = p + sizeof(struct page *) * num_pages;
rbio->dbitmap = p + sizeof(struct page *) * num_pages * 2;
- if (raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
+ if (bbio->raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
nr_data = real_stripes - 2;
else
nr_data = real_stripes - 1;
@@ -1182,7 +1171,7 @@ static void index_rbio_pages(struct btrfs_raid_bio *rbio)
spin_lock_irq(&rbio->bio_list_lock);
bio_list_for_each(bio, &rbio->bio_list) {
start = (u64)bio->bi_iter.bi_sector << 9;
- stripe_offset = start - rbio->raid_map[0];
+ stripe_offset = start - rbio->bbio->raid_map[0];
page_index = stripe_offset >> PAGE_CACHE_SHIFT;
for (i = 0; i < bio->bi_vcnt; i++) {
@@ -1402,7 +1391,7 @@ static int find_logical_bio_stripe(struct btrfs_raid_bio *rbio,
logical <<= 9;
for (i = 0; i < rbio->nr_data; i++) {
- stripe_start = rbio->raid_map[i];
+ stripe_start = rbio->bbio->raid_map[i];
if (logical >= stripe_start &&
logical < stripe_start + rbio->stripe_len) {
return i;
@@ -1776,17 +1765,16 @@ static void btrfs_raid_unplug(struct blk_plug_cb *cb, bool from_schedule)
* our main entry point for writes from the rest of the FS.
*/
int raid56_parity_write(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len)
+ struct btrfs_bio *bbio, u64 stripe_len)
{
struct btrfs_raid_bio *rbio;
struct btrfs_plug_cb *plug = NULL;
struct blk_plug_cb *cb;
int ret;
- rbio = alloc_rbio(root, bbio, raid_map, stripe_len);
+ rbio = alloc_rbio(root, bbio, stripe_len);
if (IS_ERR(rbio)) {
- __free_bbio_and_raid_map(bbio, raid_map, 1);
+ __free_bbio(bbio, 1);
return PTR_ERR(rbio);
}
bio_list_add(&rbio->bio_list, bio);
@@ -1885,7 +1873,7 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
}
/* all raid6 handling here */
- if (rbio->raid_map[rbio->real_stripes - 1] ==
+ if (rbio->bbio->raid_map[rbio->real_stripes - 1] ==
RAID6_Q_STRIPE) {
/*
@@ -1922,8 +1910,9 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
* here due to a crc mismatch and we can't give them the
* data they want
*/
- if (rbio->raid_map[failb] == RAID6_Q_STRIPE) {
- if (rbio->raid_map[faila] == RAID5_P_STRIPE) {
+ if (rbio->bbio->raid_map[failb] == RAID6_Q_STRIPE) {
+ if (rbio->bbio->raid_map[faila] ==
+ RAID5_P_STRIPE) {
err = -EIO;
goto cleanup;
}
@@ -1934,7 +1923,7 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
goto pstripe;
}
- if (rbio->raid_map[failb] == RAID5_P_STRIPE) {
+ if (rbio->bbio->raid_map[failb] == RAID5_P_STRIPE) {
raid6_datap_recov(rbio->real_stripes,
PAGE_SIZE, faila, pointers);
} else {
@@ -2156,15 +2145,15 @@ cleanup:
* of the drive.
*/
int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len, int mirror_num, int generic_io)
+ struct btrfs_bio *bbio, u64 stripe_len,
+ int mirror_num, int generic_io)
{
struct btrfs_raid_bio *rbio;
int ret;
- rbio = alloc_rbio(root, bbio, raid_map, stripe_len);
+ rbio = alloc_rbio(root, bbio, stripe_len);
if (IS_ERR(rbio)) {
- __free_bbio_and_raid_map(bbio, raid_map, generic_io);
+ __free_bbio(bbio, generic_io);
return PTR_ERR(rbio);
}
@@ -2175,7 +2164,7 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
rbio->faila = find_logical_bio_stripe(rbio, bio);
if (rbio->faila == -1) {
BUG();
- __free_bbio_and_raid_map(bbio, raid_map, generic_io);
+ __free_bbio(bbio, generic_io);
kfree(rbio);
return -EIO;
}
@@ -2240,14 +2229,14 @@ static void read_rebuild_work(struct btrfs_work *work)
struct btrfs_raid_bio *
raid56_parity_alloc_scrub_rbio(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len, struct btrfs_device *scrub_dev,
+ struct btrfs_bio *bbio, u64 stripe_len,
+ struct btrfs_device *scrub_dev,
unsigned long *dbitmap, int stripe_nsectors)
{
struct btrfs_raid_bio *rbio;
int i;
- rbio = alloc_rbio(root, bbio, raid_map, stripe_len);
+ rbio = alloc_rbio(root, bbio, stripe_len);
if (IS_ERR(rbio))
return NULL;
bio_list_add(&rbio->bio_list, bio);
@@ -2279,10 +2268,10 @@ void raid56_parity_add_scrub_pages(struct btrfs_raid_bio *rbio,
int stripe_offset;
int index;
- ASSERT(logical >= rbio->raid_map[0]);
- ASSERT(logical + PAGE_SIZE <= rbio->raid_map[0] +
+ ASSERT(logical >= rbio->bbio->raid_map[0]);
+ ASSERT(logical + PAGE_SIZE <= rbio->bbio->raid_map[0] +
rbio->stripe_len * rbio->nr_data);
- stripe_offset = (int)(logical - rbio->raid_map[0]);
+ stripe_offset = (int)(logical - rbio->bbio->raid_map[0]);
index = stripe_offset >> PAGE_CACHE_SHIFT;
rbio->bio_pages[index] = page;
}
diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h
index 31d4a15..2b5d797 100644
--- a/fs/btrfs/raid56.h
+++ b/fs/btrfs/raid56.h
@@ -43,16 +43,15 @@ struct btrfs_raid_bio;
struct btrfs_device;
int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len, int mirror_num, int generic_io);
+ struct btrfs_bio *bbio, u64 stripe_len,
+ int mirror_num, int generic_io);
int raid56_parity_write(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len);
+ struct btrfs_bio *bbio, u64 stripe_len);
struct btrfs_raid_bio *
raid56_parity_alloc_scrub_rbio(struct btrfs_root *root, struct bio *bio,
- struct btrfs_bio *bbio, u64 *raid_map,
- u64 stripe_len, struct btrfs_device *scrub_dev,
+ struct btrfs_bio *bbio, u64 stripe_len,
+ struct btrfs_device *scrub_dev,
unsigned long *dbitmap, int stripe_nsectors);
void raid56_parity_add_scrub_pages(struct btrfs_raid_bio *rbio,
struct page *page, u64 logical);
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 9d19065..8f32520 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -66,7 +66,6 @@ struct scrub_ctx;
struct scrub_recover {
atomic_t refs;
struct btrfs_bio *bbio;
- u64 *raid_map;
u64 map_length;
};
@@ -849,7 +848,6 @@ static inline void scrub_put_recover(struct scrub_recover *recover)
{
if (atomic_dec_and_test(&recover->refs)) {
kfree(recover->bbio);
- kfree(recover->raid_map);
kfree(recover);
}
}
@@ -1288,12 +1286,12 @@ out:
return 0;
}
-static inline int scrub_nr_raid_mirrors(struct btrfs_bio *bbio, u64 *raid_map)
+static inline int scrub_nr_raid_mirrors(struct btrfs_bio *bbio)
{
- if (raid_map) {
+ if (bbio->raid_map) {
int real_stripes = bbio->num_stripes - bbio->num_tgtdevs;
- if (raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
+ if (bbio->raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
return 3;
else
return 2;
@@ -1339,7 +1337,6 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
{
struct scrub_recover *recover;
struct btrfs_bio *bbio;
- u64 *raid_map;
u64 sublen;
u64 mapped_length;
u64 stripe_offset;
@@ -1360,35 +1357,31 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
sublen = min_t(u64, length, PAGE_SIZE);
mapped_length = sublen;
bbio = NULL;
- raid_map = NULL;
/*
* with a length of PAGE_SIZE, each returned stripe
* represents one mirror
*/
ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical,
- &mapped_length, &bbio, 0, &raid_map);
+ &mapped_length, &bbio, 0, 1);
if (ret || !bbio || mapped_length < sublen) {
kfree(bbio);
- kfree(raid_map);
return -EIO;
}
recover = kzalloc(sizeof(struct scrub_recover), GFP_NOFS);
if (!recover) {
kfree(bbio);
- kfree(raid_map);
return -ENOMEM;
}
atomic_set(&recover->refs, 1);
recover->bbio = bbio;
- recover->raid_map = raid_map;
recover->map_length = mapped_length;
BUG_ON(page_index >= SCRUB_PAGES_PER_RD_BIO);
- nmirrors = scrub_nr_raid_mirrors(bbio, raid_map);
+ nmirrors = scrub_nr_raid_mirrors(bbio);
for (mirror_index = 0; mirror_index < nmirrors;
mirror_index++) {
struct scrub_block *sblock;
@@ -1412,7 +1405,7 @@ leave_nomem:
sblock->pagev[page_index] = page;
page->logical = logical;
- scrub_stripe_index_and_offset(logical, raid_map,
+ scrub_stripe_index_and_offset(logical, bbio->raid_map,
mapped_length,
bbio->num_stripes -
bbio->num_tgtdevs,
@@ -1461,7 +1454,7 @@ static void scrub_bio_wait_endio(struct bio *bio, int error)
static inline int scrub_is_page_on_raid56(struct scrub_page *page)
{
- return page->recover && page->recover->raid_map;
+ return page->recover && page->recover->bbio->raid_map;
}
static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
@@ -1478,7 +1471,6 @@ static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
bio->bi_end_io = scrub_bio_wait_endio;
ret = raid56_parity_recover(fs_info->fs_root, bio, page->recover->bbio,
- page->recover->raid_map,
page->recover->map_length,
page->mirror_num, 0);
if (ret)
@@ -2708,7 +2700,6 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
struct btrfs_raid_bio *rbio;
struct scrub_page *spage;
struct btrfs_bio *bbio = NULL;
- u64 *raid_map = NULL;
u64 length;
int ret;
@@ -2719,8 +2710,8 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
length = sparity->logic_end - sparity->logic_start + 1;
ret = btrfs_map_sblock(sctx->dev_root->fs_info, WRITE,
sparity->logic_start,
- &length, &bbio, 0, &raid_map);
- if (ret || !bbio || !raid_map)
+ &length, &bbio, 0, 1);
+ if (ret || !bbio || !bbio->raid_map)
goto bbio_out;
bio = btrfs_io_bio_alloc(GFP_NOFS, 0);
@@ -2732,8 +2723,7 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
bio->bi_end_io = scrub_parity_bio_endio;
rbio = raid56_parity_alloc_scrub_rbio(sctx->dev_root, bio, bbio,
- raid_map, length,
- sparity->scrub_dev,
+ length, sparity->scrub_dev,
sparity->dbitmap,
sparity->nsectors);
if (!rbio)
@@ -2751,7 +2741,6 @@ rbio_out:
bio_put(bio);
bbio_out:
kfree(bbio);
- kfree(raid_map);
bitmap_or(sparity->ebitmap, sparity->ebitmap, sparity->dbitmap,
sparity->nsectors);
spin_lock(&sctx->stat_lock);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index db779ca..b364e14 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4876,8 +4876,7 @@ static inline int parity_smaller(u64 a, u64 b)
}
/* Bubble-sort the stripe set to put the parity/syndrome stripes last */
-static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map,
- int num_stripes)
+static void sort_parity_stripes(struct btrfs_bio *bbio, int num_stripes)
{
struct btrfs_bio_stripe s;
int i;
@@ -4887,13 +4886,14 @@ static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map,
while (again) {
again = 0;
for (i = 0; i < num_stripes - 1; i++) {
- if (parity_smaller(raid_map[i], raid_map[i+1])) {
+ if (parity_smaller(bbio->raid_map[i],
+ bbio->raid_map[i+1])) {
s = bbio->stripes[i];
- l = raid_map[i];
+ l = bbio->raid_map[i];
bbio->stripes[i] = bbio->stripes[i+1];
- raid_map[i] = raid_map[i+1];
+ bbio->raid_map[i] = bbio->raid_map[i+1];
bbio->stripes[i+1] = s;
- raid_map[i+1] = l;
+ bbio->raid_map[i+1] = l;
again = 1;
}
@@ -4904,7 +4904,7 @@ static void sort_parity_stripes(struct btrfs_bio *bbio, u64 *raid_map,
static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
struct btrfs_bio **bbio_ret,
- int mirror_num, u64 **raid_map_ret)
+ int mirror_num, int need_raid_map)
{
struct extent_map *em;
struct map_lookup *map;
@@ -4917,7 +4917,6 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 stripe_nr_orig;
u64 stripe_nr_end;
u64 stripe_len;
- u64 *raid_map = NULL;
int stripe_index;
int i;
int ret = 0;
@@ -5039,7 +5038,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 physical_of_found = 0;
ret = __btrfs_map_block(fs_info, REQ_GET_READ_MIRRORS,
- logical, &tmp_length, &tmp_bbio, 0, NULL);
+ logical, &tmp_length, &tmp_bbio, 0, 0);
if (ret) {
WARN_ON(tmp_bbio != NULL);
goto out;
@@ -5160,13 +5159,9 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
} else if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
BTRFS_BLOCK_GROUP_RAID6)) {
- u64 tmp;
-
- if (raid_map_ret &&
+ if (need_raid_map &&
((rw & (REQ_WRITE | REQ_GET_READ_MIRRORS)) ||
mirror_num > 1)) {
- int i, rot;
-
/* push stripe_nr back to the start of the full stripe */
stripe_nr = raid56_full_stripe_start;
do_div(stripe_nr, stripe_len * nr_data_stripes(map));
@@ -5175,32 +5170,12 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
num_stripes = map->num_stripes;
max_errors = nr_parity_stripes(map);
- raid_map = kmalloc_array(num_stripes, sizeof(u64),
- GFP_NOFS);
- if (!raid_map) {
- ret = -ENOMEM;
- goto out;
- }
-
- /* Work out the disk rotation on this stripe-set */
- tmp = stripe_nr;
- rot = do_div(tmp, num_stripes);
-
- /* Fill in the logical address of each stripe */
- tmp = stripe_nr * nr_data_stripes(map);
- for (i = 0; i < nr_data_stripes(map); i++)
- raid_map[(i+rot) % num_stripes] =
- em->start + (tmp + i) * map->stripe_len;
-
- raid_map[(i+rot) % map->num_stripes] = RAID5_P_STRIPE;
- if (map->type & BTRFS_BLOCK_GROUP_RAID6)
- raid_map[(i+rot+1) % num_stripes] =
- RAID6_Q_STRIPE;
-
*length = map->stripe_len;
stripe_index = 0;
stripe_offset = 0;
} else {
+ u64 tmp;
+
/*
* Mirror #0 or #1 means the original data block.
* Mirror #2 is RAID5 parity block.
@@ -5241,7 +5216,6 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
bbio = kzalloc(btrfs_bio_size(num_alloc_stripes, tgtdev_indexes),
GFP_NOFS);
if (!bbio) {
- kfree(raid_map);
ret = -ENOMEM;
goto out;
}
@@ -5249,6 +5223,34 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
if (dev_replace_is_ongoing)
bbio->tgtdev_map = (int *)(bbio->stripes + num_alloc_stripes);
+ /* build raid_map */
+ if (map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6) &&
+ need_raid_map && ((rw & (REQ_WRITE | REQ_GET_READ_MIRRORS)) ||
+ mirror_num > 1)) {
+ u64 tmp;
+ int i, rot;
+
+ bbio->raid_map = (u64 *)((void *)bbio->stripes +
+ sizeof(struct btrfs_bio_stripe) *
+ num_alloc_stripes +
+ sizeof(int) * tgtdev_indexes);
+
+ /* Work out the disk rotation on this stripe-set */
+ tmp = stripe_nr;
+ rot = do_div(tmp, num_stripes);
+
+ /* Fill in the logical address of each stripe */
+ tmp = stripe_nr * nr_data_stripes(map);
+ for (i = 0; i < nr_data_stripes(map); i++)
+ bbio->raid_map[(i+rot) % num_stripes] =
+ em->start + (tmp + i) * map->stripe_len;
+
+ bbio->raid_map[(i+rot) % map->num_stripes] = RAID5_P_STRIPE;
+ if (map->type & BTRFS_BLOCK_GROUP_RAID6)
+ bbio->raid_map[(i+rot+1) % num_stripes] =
+ RAID6_Q_STRIPE;
+ }
+
if (rw & REQ_DISCARD) {
int factor = 0;
int sub_stripes = 0;
@@ -5332,8 +5334,8 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
if (rw & (REQ_WRITE | REQ_GET_READ_MIRRORS))
max_errors = btrfs_chunk_max_errors(map);
- if (raid_map)
- sort_parity_stripes(bbio, raid_map, num_stripes);
+ if (bbio->raid_map)
+ sort_parity_stripes(bbio, num_stripes);
tgtdev_indexes = 0;
if (dev_replace_is_ongoing && (rw & (REQ_WRITE | REQ_DISCARD)) &&
@@ -5438,9 +5440,6 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
bbio->stripes[0].physical = physical_to_patch_in_first_stripe;
bbio->mirror_num = map->num_stripes + 1;
}
-
- if (raid_map_ret)
- *raid_map_ret = raid_map;
out:
if (dev_replace_is_ongoing)
btrfs_dev_replace_unlock(dev_replace);
@@ -5453,17 +5452,17 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
struct btrfs_bio **bbio_ret, int mirror_num)
{
return __btrfs_map_block(fs_info, rw, logical, length, bbio_ret,
- mirror_num, NULL);
+ mirror_num, 0);
}
/* For Scrub/replace */
int btrfs_map_sblock(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
struct btrfs_bio **bbio_ret, int mirror_num,
- u64 **raid_map_ret)
+ int need_raid_map)
{
return __btrfs_map_block(fs_info, rw, logical, length, bbio_ret,
- mirror_num, raid_map_ret);
+ mirror_num, need_raid_map);
}
int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree,
@@ -5802,7 +5801,6 @@ int btrfs_map_bio(struct btrfs_root *root, int rw, struct bio *bio,
u64 logical = (u64)bio->bi_iter.bi_sector << 9;
u64 length = 0;
u64 map_length;
- u64 *raid_map = NULL;
int ret;
int dev_nr = 0;
int total_devs = 1;
@@ -5813,7 +5811,7 @@ int btrfs_map_bio(struct btrfs_root *root, int rw, struct bio *bio,
btrfs_bio_counter_inc_blocked(root->fs_info);
ret = __btrfs_map_block(root->fs_info, rw, logical, &map_length, &bbio,
- mirror_num, &raid_map);
+ mirror_num, 1);
if (ret) {
btrfs_bio_counter_dec(root->fs_info);
return ret;
@@ -5826,15 +5824,13 @@ int btrfs_map_bio(struct btrfs_root *root, int rw, struct bio *bio,
bbio->fs_info = root->fs_info;
atomic_set(&bbio->stripes_pending, bbio->num_stripes);
- if (raid_map) {
+ if (bbio->raid_map) {
/* In this case, map_length has been set to the length of
a single stripe; not the whole write */
if (rw & WRITE) {
- ret = raid56_parity_write(root, bio, bbio,
- raid_map, map_length);
+ ret = raid56_parity_write(root, bio, bbio, map_length);
} else {
- ret = raid56_parity_recover(root, bio, bbio,
- raid_map, map_length,
+ ret = raid56_parity_recover(root, bio, bbio, map_length,
mirror_num, 1);
}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index d6fe73c..fb0e8c3 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -307,6 +307,12 @@ struct btrfs_bio {
int mirror_num;
int num_tgtdevs;
int *tgtdev_map;
+ /*
+ * logical block numbers for the start of each stripe
+ * The last one or two are p/q. These are sorted,
+ * so raid_map[0] is the start of our full stripe
+ */
+ u64 *raid_map;
struct btrfs_bio_stripe stripes[];
};
@@ -392,7 +398,8 @@ int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start,
#define btrfs_bio_size(total_stripes, real_stripes) \
(sizeof(struct btrfs_bio) + \
(sizeof(struct btrfs_bio_stripe) * (total_stripes)) + \
- (sizeof(int) * (real_stripes)))
+ (sizeof(int) * (real_stripes)) + \
+ (sizeof(u64) * (real_stripes)))
int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
@@ -400,7 +407,7 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
int btrfs_map_sblock(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
struct btrfs_bio **bbio_ret, int mirror_num,
- u64 **raid_map_ret);
+ int need_raid_map);
int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree,
u64 chunk_start, u64 physical, u64 devid,
u64 **logical, int *naddrs, int *stripe_len);
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrub Zhaolei
` (2 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 03/15] Btrfs: Make raid_map array be inlined in btrfs_bio structure Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-15 12:52 ` David Sterba
2015-01-13 12:34 ` [PATCH 05/15] Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON() Zhaolei
` (10 subsequent siblings)
14 siblings, 1 reply; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
1: A ref_count is simpler than the current RBIO_HOLD_BBIO_MAP_BIT flag
for keeping btrfs_bio's memory alive in the raid56 recovery implementation.
2: A free function for bbio makes the code cleaner and more flexible,
and adds type checking at compile time.
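The get/put pattern being introduced can be sketched as follows (a plain int stands in for the kernel's atomic_t, and the names are illustrative rather than the patch's exact helpers):

```c
#include <assert.h>
#include <stdlib.h>

struct bbio {
	int refs;	/* the kernel uses atomic_t; plain int here */
};

static struct bbio *alloc_bbio(void)
{
	struct bbio *b = calloc(1, sizeof(*b));

	if (b)
		b->refs = 1;	/* the allocator owns the first reference */
	return b;
}

static void get_bbio(struct bbio *b)
{
	b->refs++;
}

/* Drop one reference; free on the last drop.  Returns 1 when the
 * object was freed so the sketch's behavior is observable. */
static int put_bbio(struct bbio *b)
{
	if (--b->refs == 0) {
		free(b);
		return 1;
	}
	return 0;
}
```

Each holder takes a reference instead of setting a keep-alive flag, so ownership is explicit and the last put frees the bbio wherever it happens.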
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/extent-tree.c | 2 +-
fs/btrfs/extent_io.c | 2 +-
fs/btrfs/raid56.c | 39 +++++++++------------------------------
fs/btrfs/reada.c | 4 ++--
fs/btrfs/scrub.c | 12 ++++++------
fs/btrfs/volumes.c | 43 ++++++++++++++++++++++++++++++++++++-------
fs/btrfs/volumes.h | 10 +++-------
7 files changed, 58 insertions(+), 54 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1511658..da8fe79 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1925,7 +1925,7 @@ int btrfs_discard_extent(struct btrfs_root *root, u64 bytenr,
*/
ret = 0;
}
- kfree(bbio);
+ put_btrfs_bio(bbio);
}
if (actual_bytes)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4ebabd2..4f959e5 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2057,7 +2057,7 @@ int repair_io_failure(struct inode *inode, u64 start, u64 length, u64 logical,
sector = bbio->stripes[mirror_num-1].physical >> 9;
bio->bi_iter.bi_sector = sector;
dev = bbio->stripes[mirror_num-1].dev;
- kfree(bbio);
+ put_btrfs_bio(bbio);
if (!dev || !dev->bdev || !dev->writeable) {
bio_put(bio);
return -EIO;
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index e301d33..b1dadf7 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -58,15 +58,6 @@
*/
#define RBIO_CACHE_READY_BIT 3
-/*
- * bbio and raid_map is managed by the caller, so we shouldn't free
- * them here. And besides that, all rbios with this flag should not
- * be cached, because we need raid_map to check the rbios' stripe
- * is the same or not, but it is very likely that the caller has
- * free raid_map, so don't cache those rbios.
- */
-#define RBIO_HOLD_BBIO_MAP_BIT 4
-
#define RBIO_CACHE_SIZE 1024
enum btrfs_rbio_ops {
@@ -834,19 +825,6 @@ done_nolock:
remove_rbio_from_cache(rbio);
}
-static inline void
-__free_bbio(struct btrfs_bio *bbio, int need)
-{
- if (need)
- kfree(bbio);
-}
-
-static inline void free_bbio(struct btrfs_raid_bio *rbio)
-{
- __free_bbio(rbio->bbio,
- !test_bit(RBIO_HOLD_BBIO_MAP_BIT, &rbio->flags));
-}
-
static void __free_raid_bio(struct btrfs_raid_bio *rbio)
{
int i;
@@ -866,8 +844,7 @@ static void __free_raid_bio(struct btrfs_raid_bio *rbio)
}
}
- free_bbio(rbio);
-
+ put_btrfs_bio(rbio->bbio);
kfree(rbio);
}
@@ -1774,7 +1751,7 @@ int raid56_parity_write(struct btrfs_root *root, struct bio *bio,
rbio = alloc_rbio(root, bbio, stripe_len);
if (IS_ERR(rbio)) {
- __free_bbio(bbio, 1);
+ put_btrfs_bio(bbio);
return PTR_ERR(rbio);
}
bio_list_add(&rbio->bio_list, bio);
@@ -1990,8 +1967,7 @@ cleanup:
cleanup_io:
if (rbio->operation == BTRFS_RBIO_READ_REBUILD) {
- if (err == 0 &&
- !test_bit(RBIO_HOLD_BBIO_MAP_BIT, &rbio->flags))
+ if (err == 0)
cache_rbio_pages(rbio);
else
clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
@@ -2153,7 +2129,8 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
rbio = alloc_rbio(root, bbio, stripe_len);
if (IS_ERR(rbio)) {
- __free_bbio(bbio, generic_io);
+ if (generic_io)
+ put_btrfs_bio(bbio);
return PTR_ERR(rbio);
}
@@ -2164,7 +2141,8 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
rbio->faila = find_logical_bio_stripe(rbio, bio);
if (rbio->faila == -1) {
BUG();
- __free_bbio(bbio, generic_io);
+ if (generic_io)
+ put_btrfs_bio(bbio);
kfree(rbio);
return -EIO;
}
@@ -2173,9 +2151,10 @@ int raid56_parity_recover(struct btrfs_root *root, struct bio *bio,
btrfs_bio_counter_inc_noblocked(root->fs_info);
rbio->generic_bio_cnt = 1;
} else {
- set_bit(RBIO_HOLD_BBIO_MAP_BIT, &rbio->flags);
+ get_btrfs_bio(bbio);
}
+
/*
* reconstruct from the q stripe if they are
* asking for mirror 3
diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index b63ae20..b346227 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -463,7 +463,7 @@ static struct reada_extent *reada_find_extent(struct btrfs_root *root,
spin_unlock(&fs_info->reada_lock);
btrfs_dev_replace_unlock(&fs_info->dev_replace);
- kfree(bbio);
+ put_btrfs_bio(bbio);
return re;
error:
@@ -488,7 +488,7 @@ error:
kref_put(&zone->refcnt, reada_zone_release);
spin_unlock(&fs_info->reada_lock);
}
- kfree(bbio);
+ put_btrfs_bio(bbio);
kfree(re);
return re_exist;
}
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 8f32520..c4b7876 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -847,7 +847,7 @@ static inline void scrub_get_recover(struct scrub_recover *recover)
static inline void scrub_put_recover(struct scrub_recover *recover)
{
if (atomic_dec_and_test(&recover->refs)) {
- kfree(recover->bbio);
+ put_btrfs_bio(recover->bbio);
kfree(recover);
}
}
@@ -1365,13 +1365,13 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
ret = btrfs_map_sblock(fs_info, REQ_GET_READ_MIRRORS, logical,
&mapped_length, &bbio, 0, 1);
if (ret || !bbio || mapped_length < sublen) {
- kfree(bbio);
+ put_btrfs_bio(bbio);
return -EIO;
}
recover = kzalloc(sizeof(struct scrub_recover), GFP_NOFS);
if (!recover) {
- kfree(bbio);
+ put_btrfs_bio(bbio);
return -ENOMEM;
}
@@ -2740,7 +2740,7 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity)
rbio_out:
bio_put(bio);
bbio_out:
- kfree(bbio);
+ put_btrfs_bio(bbio);
bitmap_or(sparity->ebitmap, sparity->ebitmap, sparity->dbitmap,
sparity->nsectors);
spin_lock(&sctx->stat_lock);
@@ -3871,14 +3871,14 @@ static void scrub_remap_extent(struct btrfs_fs_info *fs_info,
&mapped_length, &bbio, 0);
if (ret || !bbio || mapped_length < extent_len ||
!bbio->stripes[0].dev->bdev) {
- kfree(bbio);
+ put_btrfs_bio(bbio);
return;
}
*extent_physical = bbio->stripes[0].physical;
*extent_mirror_num = bbio->mirror_num;
*extent_dev = bbio->stripes[0].dev;
- kfree(bbio);
+ put_btrfs_bio(bbio);
}
static int scrub_setup_wr_ctx(struct scrub_ctx *sctx,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b364e14..8671400 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4901,6 +4901,37 @@ static void sort_parity_stripes(struct btrfs_bio *bbio, int num_stripes)
}
}
+static struct btrfs_bio *alloc_btrfs_bio(int total_stripes, int real_stripes)
+{
+ struct btrfs_bio *bbio = kzalloc(
+ sizeof(struct btrfs_bio) +
+ sizeof(struct btrfs_bio_stripe) * (total_stripes) +
+ sizeof(int) * (real_stripes) +
+ sizeof(u64) * (real_stripes),
+ GFP_NOFS);
+ if (!bbio)
+ return NULL;
+
+ atomic_set(&bbio->error, 0);
+ atomic_set(&bbio->ref_count, 1);
+
+ return bbio;
+}
+
+void get_btrfs_bio(struct btrfs_bio *bbio)
+{
+ WARN_ON(!atomic_read(&bbio->ref_count));
+ atomic_inc(&bbio->ref_count);
+}
+
+void put_btrfs_bio(struct btrfs_bio *bbio)
+{
+ if (!bbio)
+ return;
+ if (atomic_dec_and_test(&bbio->ref_count))
+ kfree(bbio);
+}
+
static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
struct btrfs_bio **bbio_ret,
@@ -5052,7 +5083,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
* is not left of the left cursor
*/
ret = -EIO;
- kfree(tmp_bbio);
+ put_btrfs_bio(tmp_bbio);
goto out;
}
@@ -5087,11 +5118,11 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
} else {
WARN_ON(1);
ret = -EIO;
- kfree(tmp_bbio);
+ put_btrfs_bio(tmp_bbio);
goto out;
}
- kfree(tmp_bbio);
+ put_btrfs_bio(tmp_bbio);
} else if (mirror_num > map->num_stripes) {
mirror_num = 0;
}
@@ -5213,13 +5244,11 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
tgtdev_indexes = num_stripes;
}
- bbio = kzalloc(btrfs_bio_size(num_alloc_stripes, tgtdev_indexes),
- GFP_NOFS);
+ bbio = alloc_btrfs_bio(num_alloc_stripes, tgtdev_indexes);
if (!bbio) {
ret = -ENOMEM;
goto out;
}
- atomic_set(&bbio->error, 0);
if (dev_replace_is_ongoing)
bbio->tgtdev_map = (int *)(bbio->stripes + num_alloc_stripes);
@@ -5558,7 +5587,7 @@ static inline void btrfs_end_bbio(struct btrfs_bio *bbio, struct bio *bio, int e
bio_endio_nodec(bio, err);
else
bio_endio(bio, err);
- kfree(bbio);
+ put_btrfs_bio(bbio);
}
static void btrfs_end_bio(struct bio *bio, int err)
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index fb0e8c3..db195f0 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -295,6 +295,7 @@ typedef void (btrfs_bio_end_io_t) (struct btrfs_bio *bio, int err);
#define BTRFS_BIO_ORIG_BIO_SUBMITTED (1 << 0)
struct btrfs_bio {
+ atomic_t ref_count;
atomic_t stripes_pending;
struct btrfs_fs_info *fs_info;
bio_end_io_t *end_io;
@@ -394,13 +395,8 @@ struct btrfs_balance_control {
int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start,
u64 end, u64 *length);
-
-#define btrfs_bio_size(total_stripes, real_stripes) \
- (sizeof(struct btrfs_bio) + \
- (sizeof(struct btrfs_bio_stripe) * (total_stripes)) + \
- (sizeof(int) * (real_stripes)) + \
- (sizeof(u64) * (real_stripes)))
-
+void get_btrfs_bio(struct btrfs_bio *bbio);
+void put_btrfs_bio(struct btrfs_bio *bbio);
int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
u64 logical, u64 *length,
struct btrfs_bio **bbio_ret, int mirror_num);
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
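The ref_count scheme patch 04 introduces is the usual get/put idiom: the allocator hands the object back holding one reference, get_btrfs_bio() takes another, and the last put_btrfs_bio() frees it (put is NULL-safe, which is what lets the kfree() call sites above convert mechanically). A minimal userspace sketch of the same idiom, using C11 atomics in place of the kernel's atomic_t — all names here are illustrative, not the kernel's:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Stand-in for struct btrfs_bio with the new ref_count field. */
struct bbio_like {
    atomic_int ref_count;
};

/* Like alloc_btrfs_bio(): the caller owns the initial reference. */
static struct bbio_like *alloc_bbio_like(void)
{
    struct bbio_like *b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    atomic_store(&b->ref_count, 1);
    return b;
}

/* Like get_btrfs_bio(): take an extra reference. */
static void get_bbio_like(struct bbio_like *b)
{
    atomic_fetch_add(&b->ref_count, 1);
}

/* Like put_btrfs_bio(): NULL-safe; frees on the last put.
 * Returns 1 if this put freed the object (for illustration only). */
static int put_bbio_like(struct bbio_like *b)
{
    if (!b)
        return 0;
    if (atomic_fetch_sub(&b->ref_count, 1) == 1) {
        free(b);
        return 1;
    }
    return 0;
}
```

This is why raid56_parity_recover() above can replace the RBIO_HOLD_BBIO_MAP_BIT flag with a plain get_btrfs_bio(): the non-generic caller simply keeps its own reference instead of telling the rbio not to free.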
* [PATCH 05/15] Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON()
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (3 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 06/15] Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace Zhaolei
` (9 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
if (sctx->is_dev_replace && !is_metadata && !have_csum) {
...
goto nodatasum_case;
}
...
nodatasum_case:
WARN_ON(sctx->is_dev_replace);
In the code above, the nodatasum_case label should be moved after the
WARN_ON(): the goto is taken precisely when sctx->is_dev_replace is set,
so jumping to a label placed before the WARN_ON() triggers it spuriously.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index c4b7876..f45399a 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1028,9 +1028,10 @@ static int scrub_handle_errored_block(struct scrub_block *sblock_to_check)
if (!is_metadata && !have_csum) {
struct scrub_fixup_nodatasum *fixup_nodatasum;
-nodatasum_case:
WARN_ON(sctx->is_dev_replace);
+nodatasum_case:
+
/*
* !is_metadata and !have_csum, this means that the data
* might not be COW'ed, that it might be modified
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
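The effect of the label move in patch 05 can be seen in a tiny userspace sketch: with the label placed before the sanity check, the goto path trips the check even though that path is legitimate; with the label after it, only the fall-through path is checked. Here warn_fired stands in for WARN_ON() and all names are made up, not btrfs code:

```c
/* Set by the stand-in for WARN_ON(sctx->is_dev_replace). */
static int warn_fired;

/* Buggy ordering: the label sits above the check, so the
 * dev-replace goto path hits the warning it was meant to skip. */
static void handle_block_buggy(int is_dev_replace, int have_csum)
{
    warn_fired = 0;
    if (is_dev_replace && !have_csum)
        goto nodatasum_case;

    if (!have_csum) {
nodatasum_case:
        if (is_dev_replace)
            warn_fired = 1;   /* WARN_ON() fires on the goto path too */
    }
}

/* Fixed ordering, as in the patch: the check comes first, the
 * label after it, so the goto path bypasses the warning. */
static void handle_block_fixed(int is_dev_replace, int have_csum)
{
    warn_fired = 0;
    if (is_dev_replace && !have_csum)
        goto nodatasum_case;

    if (!have_csum) {
        if (is_dev_replace)
            warn_fired = 1;   /* only reachable by falling through */
nodatasum_case:
        ;                     /* nodatasum handling would go here */
    }
}
```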
* [PATCH 06/15] Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (4 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 05/15] Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON() Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 07/15] Btrfs: Cleanup btrfs_bio_counter_inc_blocked() Zhaolei
` (8 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
force_write is always 1 at this point, because the !force_write case
has already branched away earlier in the code.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index f45399a..8ea5ce7 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -250,8 +250,7 @@ static void scrub_recheck_block_checksum(struct btrfs_fs_info *fs_info,
const u8 *csum, u64 generation,
u16 csum_size);
static int scrub_repair_block_from_good_copy(struct scrub_block *sblock_bad,
- struct scrub_block *sblock_good,
- int force_write);
+ struct scrub_block *sblock_good);
static int scrub_repair_page_from_good_copy(struct scrub_block *sblock_bad,
struct scrub_block *sblock_good,
int page_num, int force_write);
@@ -1090,15 +1089,13 @@ nodatasum_case:
sblock_other->no_io_error_seen) {
if (sctx->is_dev_replace) {
scrub_write_block_to_dev_replace(sblock_other);
+ goto corrected_error;
} else {
- int force_write = is_metadata || have_csum;
-
ret = scrub_repair_block_from_good_copy(
- sblock_bad, sblock_other,
- force_write);
+ sblock_bad, sblock_other);
+ if (!ret)
+ goto corrected_error;
}
- if (0 == ret)
- goto corrected_error;
}
}
@@ -1611,8 +1608,7 @@ static void scrub_recheck_block_checksum(struct btrfs_fs_info *fs_info,
}
static int scrub_repair_block_from_good_copy(struct scrub_block *sblock_bad,
- struct scrub_block *sblock_good,
- int force_write)
+ struct scrub_block *sblock_good)
{
int page_num;
int ret = 0;
@@ -1622,8 +1618,7 @@ static int scrub_repair_block_from_good_copy(struct scrub_block *sblock_bad,
ret_sub = scrub_repair_page_from_good_copy(sblock_bad,
sblock_good,
- page_num,
- force_write);
+ page_num, 1);
if (ret_sub)
ret = ret_sub;
}
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 07/15] Btrfs: Cleanup btrfs_bio_counter_inc_blocked()
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (5 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 06/15] Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 08/15] Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event() Zhaolei
` (7 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
1: Remove the unneeded DEFINE_WAIT(wait)
2: Add likely() to the BTRFS_FS_STATE_DEV_REPLACING condition
3: Use a while loop instead of goto
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/dev-replace.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index ca6a3a3..92109b7 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -932,15 +932,15 @@ void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info)
{
- DEFINE_WAIT(wait);
-again:
- percpu_counter_inc(&fs_info->bio_counter);
- if (test_bit(BTRFS_FS_STATE_DEV_REPLACING, &fs_info->fs_state)) {
+ while (1) {
+ percpu_counter_inc(&fs_info->bio_counter);
+ if (likely(!test_bit(BTRFS_FS_STATE_DEV_REPLACING,
+ &fs_info->fs_state)))
+ break;
+
btrfs_bio_counter_dec(fs_info);
wait_event(fs_info->replace_wait,
!test_bit(BTRFS_FS_STATE_DEV_REPLACING,
&fs_info->fs_state));
- goto again;
}
-
}
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
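What the cleanup in patch 07 preserves is the optimistic-increment pattern: take the counter reference first, check the replacing flag, and only on the (unlikely) contended path drop the reference and wait before retrying, so no reference is held while sleeping. A single-threaded userspace sketch of that shape — the globals and the wait_for_replace() stand-in for wait_event() are made up for illustration:

```c
/* Stand-ins for fs_info->bio_counter and the DEV_REPLACING state bit. */
static int bio_counter;
static int replacing;

/* Stands in for wait_event(); here the "replace" simply finishes
 * while we wait, so the sketch stays single-threaded. */
static void wait_for_replace(void)
{
    replacing = 0;
}

static void bio_counter_inc_blocked(void)
{
    while (1) {
        bio_counter++;        /* optimistic: take the reference first */
        if (!replacing)
            break;            /* common case: no replace in progress */
        bio_counter--;        /* back off: never hold a ref while waiting */
        wait_for_replace();
    }
}
```

The while (1) form expresses the retry directly, which is what the patch swaps in for the goto-based version.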
* [PATCH 08/15] Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event()
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (6 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 07/15] Btrfs: Cleanup btrfs_bio_counter_inc_blocked() Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 09/15] Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block() Zhaolei
` (6 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/dev-replace.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 92109b7..5ec03d9 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -440,18 +440,9 @@ leave:
*/
static void btrfs_rm_dev_replace_blocked(struct btrfs_fs_info *fs_info)
{
- s64 writers;
- DEFINE_WAIT(wait);
-
set_bit(BTRFS_FS_STATE_DEV_REPLACING, &fs_info->fs_state);
- do {
- prepare_to_wait(&fs_info->replace_wait, &wait,
- TASK_UNINTERRUPTIBLE);
- writers = percpu_counter_sum(&fs_info->bio_counter);
- if (writers)
- schedule();
- finish_wait(&fs_info->replace_wait, &wait);
- } while (writers);
+ wait_event(fs_info->replace_wait, !percpu_counter_sum(
+ &fs_info->bio_counter));
}
/*
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 09/15] Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block()
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (7 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 08/15] Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event() Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 10/15] Btrfs: Avoid trustless page-level-repair in dev-replace Zhaolei
` (5 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Using break instead of uselessly continuing the loop is more suitable
in this case.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 8ea5ce7..22779dd 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1386,7 +1386,7 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
struct scrub_page *page;
if (mirror_index >= BTRFS_MAX_MIRRORS)
- continue;
+ break;
sblock = sblocks_for_recheck + mirror_index;
sblock->sctx = sctx;
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 10/15] Btrfs: Avoid trustless page-level-repair in dev-replace
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (8 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 09/15] Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block() Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 11/15] Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block() Zhaolei
` (4 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
The current page-level repair code for dev-replace only supports
I/O errors; it cannot be used in the checksum-failure case.
We can skip this kind of repair in dev-replace, just as we do in scrub.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 22779dd..8d6c29a 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1099,6 +1099,10 @@ nodatasum_case:
}
}
+ /* can only fix I/O errors from here on */
+ if (sblock_bad->no_io_error_seen)
+ goto did_not_correct_error;
+
/*
* for dev_replace, pick good pages and write to the target device.
*/
@@ -1181,10 +1185,6 @@ nodatasum_case:
* area are unreadable.
*/
- /* can only fix I/O errors from here on */
- if (sblock_bad->no_io_error_seen)
- goto did_not_correct_error;
-
success = 1;
for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
struct scrub_page *page_bad = sblock_bad->pagev[page_num];
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 11/15] Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block()
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (9 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 10/15] Btrfs: Avoid trustless page-level-repair in dev-replace Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 12/15] Btrfs: Combine per-page recover in dev-replace and scrub Zhaolei
` (3 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
In the current code, the find-the-right-mirror logic and the
write-to-target logic are intermixed: if we find a right mirror but
fail to write it to the target, this is treated as "no right block
found", and the target is filled with sblock_bad.
Actually, "failed to write to the target" does not mean "the source
block is wrong". This patch separates the two conditions in the logic,
and does some cleanup to make the code clean.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 44 +++++++++++++++++---------------------------
1 file changed, 17 insertions(+), 27 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 8d6c29a..d1d681f 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1110,35 +1110,21 @@ nodatasum_case:
success = 1;
for (page_num = 0; page_num < sblock_bad->page_count;
page_num++) {
- int sub_success;
+ struct scrub_block *sblock_other = NULL;
- sub_success = 0;
for (mirror_index = 0;
mirror_index < BTRFS_MAX_MIRRORS &&
sblocks_for_recheck[mirror_index].page_count > 0;
mirror_index++) {
- struct scrub_block *sblock_other =
- sblocks_for_recheck + mirror_index;
- struct scrub_page *page_other =
- sblock_other->pagev[page_num];
-
- if (!page_other->io_error) {
- ret = scrub_write_page_to_dev_replace(
- sblock_other, page_num);
- if (ret == 0) {
- /* succeeded for this page */
- sub_success = 1;
- break;
- } else {
- btrfs_dev_replace_stats_inc(
- &sctx->dev_root->
- fs_info->dev_replace.
- num_write_errors);
- }
+ if (!sblocks_for_recheck[mirror_index].
+ pagev[page_num]->io_error) {
+ sblock_other = sblocks_for_recheck +
+ mirror_index;
+ break;
}
}
- if (!sub_success) {
+ if (!sblock_other) {
/*
* did not find a mirror to fetch the page
* from. scrub_write_page_to_dev_replace()
@@ -1146,13 +1132,17 @@ nodatasum_case:
* filling the block with zeros before
* submitting the write request
*/
+ sblock_other = sblock_bad;
+ success = 0;
+ }
+
+ if (scrub_write_page_to_dev_replace(sblock_other,
+ page_num) != 0) {
+ btrfs_dev_replace_stats_inc(
+ &sctx->dev_root->
+ fs_info->dev_replace.
+ num_write_errors);
success = 0;
- ret = scrub_write_page_to_dev_replace(
- sblock_bad, page_num);
- if (ret)
- btrfs_dev_replace_stats_inc(
- &sctx->dev_root->fs_info->
- dev_replace.num_write_errors);
}
}
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 12/15] Btrfs: Combine per-page recover in dev-replace and scrub
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (10 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 11/15] Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block() Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 13/15] Btrfs: Simplify scrub_setup_recheck_block()'s argument Zhaolei
` (2 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
The two code paths are similar; combine them to make the code clean and
easy to maintain.
Some previously missing conditions are also handled as a benefit of this
combination.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 110 ++++++++++++++++++++++++++++---------------------------
1 file changed, 57 insertions(+), 53 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index d1d681f..000fa59 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1099,19 +1099,47 @@ nodatasum_case:
}
}
+ /*
+ * In case of I/O errors in the area that is supposed to be
+ * repaired, continue by picking good copies of those pages.
+ * Select the good pages from mirrors to rewrite bad pages from
+ * the area to fix. Afterwards verify the checksum of the block
+ * that is supposed to be repaired. This verification step is
+ * only done for the purpose of statistic counting and for the
+ * final scrub report, whether errors remain.
+ * A perfect algorithm could make use of the checksum and try
+ * all possible combinations of pages from the different mirrors
+ * until the checksum verification succeeds. For example, when
+ * the 2nd page of mirror #1 faces I/O errors, and the 2nd page
+ * of mirror #2 is readable but the final checksum test fails,
+ * then the 2nd page of mirror #3 could be tried, whether now
+ * the final checksum succeedes. But this would be a rare
+ * exception and is therefore not implemented. At least it is
+ * avoided that the good copy is overwritten.
+ * A more useful improvement would be to pick the sectors
+ * without I/O error based on sector sizes (512 bytes on legacy
+ * disks) instead of on PAGE_SIZE. Then maybe 512 byte of one
+ * mirror could be repaired by taking 512 byte of a different
+ * mirror, even if other 512 byte sectors in the same PAGE_SIZE
+ * area are unreadable.
+ */
+
/* can only fix I/O errors from here on */
if (sblock_bad->no_io_error_seen)
goto did_not_correct_error;
- /*
- * for dev_replace, pick good pages and write to the target device.
- */
- if (sctx->is_dev_replace) {
- success = 1;
- for (page_num = 0; page_num < sblock_bad->page_count;
- page_num++) {
- struct scrub_block *sblock_other = NULL;
+ success = 1;
+ for (page_num = 0; page_num < sblock_bad->page_count;
+ page_num++) {
+ struct scrub_page *page_bad = sblock_bad->pagev[page_num];
+ struct scrub_block *sblock_other = NULL;
+ /* skip no-io-error page in scrub */
+ if (!page_bad->io_error && !sctx->is_dev_replace)
+ continue;
+
+ /* try to find no-io-error page in mirrors */
+ if (page_bad->io_error) {
for (mirror_index = 0;
mirror_index < BTRFS_MAX_MIRRORS &&
sblocks_for_recheck[mirror_index].page_count > 0;
@@ -1123,18 +1151,20 @@ nodatasum_case:
break;
}
}
+ if (!sblock_other)
+ success = 0;
+ }
- if (!sblock_other) {
- /*
- * did not find a mirror to fetch the page
- * from. scrub_write_page_to_dev_replace()
- * handles this case (page->io_error), by
- * filling the block with zeros before
- * submitting the write request
- */
+ if (sctx->is_dev_replace) {
+ /*
+ * did not find a mirror to fetch the page
+ * from. scrub_write_page_to_dev_replace()
+ * handles this case (page->io_error), by
+ * filling the block with zeros before
+ * submitting the write request
+ */
+ if (!sblock_other)
sblock_other = sblock_bad;
- success = 0;
- }
if (scrub_write_page_to_dev_replace(sblock_other,
page_num) != 0) {
@@ -1144,9 +1174,15 @@ nodatasum_case:
num_write_errors);
success = 0;
}
+ } else if (sblock_other) {
+ ret = scrub_repair_page_from_good_copy(sblock_bad,
+ sblock_other,
+ page_num, 0);
+ if (0 == ret)
+ page_bad->io_error = 0;
+ else
+ success = 0;
}
-
- goto out;
}
/*
@@ -1175,39 +1211,7 @@ nodatasum_case:
* area are unreadable.
*/
- success = 1;
- for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
- struct scrub_page *page_bad = sblock_bad->pagev[page_num];
-
- if (!page_bad->io_error)
- continue;
-
- for (mirror_index = 0;
- mirror_index < BTRFS_MAX_MIRRORS &&
- sblocks_for_recheck[mirror_index].page_count > 0;
- mirror_index++) {
- struct scrub_block *sblock_other = sblocks_for_recheck +
- mirror_index;
- struct scrub_page *page_other = sblock_other->pagev[
- page_num];
-
- if (!page_other->io_error) {
- ret = scrub_repair_page_from_good_copy(
- sblock_bad, sblock_other, page_num, 0);
- if (0 == ret) {
- page_bad->io_error = 0;
- break; /* succeeded for this page */
- }
- }
- }
-
- if (page_bad->io_error) {
- /* did not find a mirror to copy the page from */
- success = 0;
- }
- }
-
- if (success) {
+ if (success && !sctx->is_dev_replace) {
if (is_metadata || have_csum) {
/*
* need to verify the checksum now that all
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 13/15] Btrfs: Simplify scrub_setup_recheck_block()'s argument
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (11 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 12/15] Btrfs: Combine per-page recover in dev-replace and scrub Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 14/15] Btrfs: Include map_type in raid_bio Zhaolei
2015-01-13 12:34 ` [PATCH 15/15] Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply Zhaolei
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
scrub_setup_recheck_block() has many arguments, but most of them can be
derived from one of them, so we can remove the rest to make the code clean.
Some other cleanup for this function is also included in this patch.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/scrub.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 000fa59..ce86807 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -235,10 +235,7 @@ static void scrub_pending_bio_dec(struct scrub_ctx *sctx);
static void scrub_pending_trans_workers_inc(struct scrub_ctx *sctx);
static void scrub_pending_trans_workers_dec(struct scrub_ctx *sctx);
static int scrub_handle_errored_block(struct scrub_block *sblock_to_check);
-static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
- struct btrfs_fs_info *fs_info,
- struct scrub_block *original_sblock,
- u64 length, u64 logical,
+static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
struct scrub_block *sblocks_for_recheck);
static void scrub_recheck_block(struct btrfs_fs_info *fs_info,
struct scrub_block *sblock, int is_metadata,
@@ -952,8 +949,7 @@ static int scrub_handle_errored_block(struct scrub_block *sblock_to_check)
}
/* setup the context, map the logical blocks and alloc the pages */
- ret = scrub_setup_recheck_block(sctx, fs_info, sblock_to_check, length,
- logical, sblocks_for_recheck);
+ ret = scrub_setup_recheck_block(sblock_to_check, sblocks_for_recheck);
if (ret) {
spin_lock(&sctx->stat_lock);
sctx->stat.read_errors++;
@@ -1321,19 +1317,20 @@ static inline void scrub_stripe_index_and_offset(u64 logical, u64 *raid_map,
}
}
-static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
- struct btrfs_fs_info *fs_info,
- struct scrub_block *original_sblock,
- u64 length, u64 logical,
+static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
struct scrub_block *sblocks_for_recheck)
{
+ struct scrub_ctx *sctx = original_sblock->sctx;
+ struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
+ u64 length = original_sblock->page_count * PAGE_SIZE;
+ u64 logical = original_sblock->pagev[0]->logical;
struct scrub_recover *recover;
struct btrfs_bio *bbio;
u64 sublen;
u64 mapped_length;
u64 stripe_offset;
int stripe_index;
- int page_index;
+ int page_index = 0;
int mirror_index;
int nmirrors;
int ret;
@@ -1344,7 +1341,6 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
* the recheck procedure
*/
- page_index = 0;
while (length > 0) {
sublen = min_t(u64, length, PAGE_SIZE);
mapped_length = sublen;
@@ -1373,15 +1369,12 @@ static int scrub_setup_recheck_block(struct scrub_ctx *sctx,
BUG_ON(page_index >= SCRUB_PAGES_PER_RD_BIO);
- nmirrors = scrub_nr_raid_mirrors(bbio);
+ nmirrors = min(scrub_nr_raid_mirrors(bbio), BTRFS_MAX_MIRRORS);
for (mirror_index = 0; mirror_index < nmirrors;
mirror_index++) {
struct scrub_block *sblock;
struct scrub_page *page;
- if (mirror_index >= BTRFS_MAX_MIRRORS)
- break;
-
sblock = sblocks_for_recheck + mirror_index;
sblock->sctx = sctx;
page = kzalloc(sizeof(*page), GFP_NOFS);
--
1.8.5.1
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 14/15] Btrfs: Include map_type in raid_bio
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (12 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 13/15] Btrfs: Simplify scrub_setup_recheck_block()'s argument Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
2015-01-13 12:34 ` [PATCH 15/15] Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply Zhaolei
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
Current code uses several kinds of "clever" ways to determine the
operation target's RAID type, such as:
raid_map != NULL
or
raid_map[MAX_NR] == RAID[56]_Q_STRIPE
To make the code easy to maintain, this patch puts the raid type into
bbio, so we can always get the raid type from bbio in a "stupid"
(direct) way.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/raid56.c | 10 +++++-----
fs/btrfs/scrub.c | 28 +++++++++++++++-------------
fs/btrfs/volumes.c | 1 +
fs/btrfs/volumes.h | 1 +
4 files changed, 22 insertions(+), 18 deletions(-)
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index b1dadf7..5d13ac7 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -994,10 +994,12 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_root *root,
rbio->bio_pages = p + sizeof(struct page *) * num_pages;
rbio->dbitmap = p + sizeof(struct page *) * num_pages * 2;
- if (bbio->raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
+ if (bbio->map_type & BTRFS_BLOCK_GROUP_RAID5)
+ nr_data = real_stripes - 1;
+ else if (bbio->map_type & BTRFS_BLOCK_GROUP_RAID6)
nr_data = real_stripes - 2;
else
- nr_data = real_stripes - 1;
+ BUG();
rbio->nr_data = nr_data;
return rbio;
@@ -1850,9 +1852,7 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
}
/* all raid6 handling here */
- if (rbio->bbio->raid_map[rbio->real_stripes - 1] ==
- RAID6_Q_STRIPE) {
-
+ if (rbio->bbio->map_type & BTRFS_BLOCK_GROUP_RAID6) {
/*
* single failure, rebuild from parity raid5
* style
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ce86807..68033d2 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1276,19 +1276,16 @@ out:
static inline int scrub_nr_raid_mirrors(struct btrfs_bio *bbio)
{
- if (bbio->raid_map) {
- int real_stripes = bbio->num_stripes - bbio->num_tgtdevs;
-
- if (bbio->raid_map[real_stripes - 1] == RAID6_Q_STRIPE)
- return 3;
- else
- return 2;
- } else {
+ if (bbio->map_type & BTRFS_BLOCK_GROUP_RAID5)
+ return 2;
+ else if (bbio->map_type & BTRFS_BLOCK_GROUP_RAID6)
+ return 3;
+ else
return (int)bbio->num_stripes;
- }
}
-static inline void scrub_stripe_index_and_offset(u64 logical, u64 *raid_map,
+static inline void scrub_stripe_index_and_offset(u64 logical, u64 map_type,
+ u64 *raid_map,
u64 mapped_length,
int nstripes, int mirror,
int *stripe_index,
@@ -1296,7 +1293,7 @@ static inline void scrub_stripe_index_and_offset(u64 logical, u64 *raid_map,
{
int i;
- if (raid_map) {
+ if (map_type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)) {
/* RAID5/6 */
for (i = 0; i < nstripes; i++) {
if (raid_map[i] == RAID6_Q_STRIPE ||
@@ -1370,6 +1367,7 @@ static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
BUG_ON(page_index >= SCRUB_PAGES_PER_RD_BIO);
nmirrors = min(scrub_nr_raid_mirrors(bbio), BTRFS_MAX_MIRRORS);
+
for (mirror_index = 0; mirror_index < nmirrors;
mirror_index++) {
struct scrub_block *sblock;
@@ -1390,7 +1388,9 @@ leave_nomem:
sblock->pagev[page_index] = page;
page->logical = logical;
- scrub_stripe_index_and_offset(logical, bbio->raid_map,
+ scrub_stripe_index_and_offset(logical,
+ bbio->map_type,
+ bbio->raid_map,
mapped_length,
bbio->num_stripes -
bbio->num_tgtdevs,
@@ -1439,7 +1439,9 @@ static void scrub_bio_wait_endio(struct bio *bio, int error)
static inline int scrub_is_page_on_raid56(struct scrub_page *page)
{
- return page->recover && page->recover->bbio->raid_map;
+ return page->recover &&
+ (page->recover->bbio->map_type & (BTRFS_BLOCK_GROUP_RAID5 |
+ BTRFS_BLOCK_GROUP_RAID6));
}
static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 8671400..1b0de3a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5453,6 +5453,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
}
*bbio_ret = bbio;
+ bbio->map_type = map->type;
bbio->num_stripes = num_stripes;
bbio->max_errors = max_errors;
bbio->mirror_num = mirror_num;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index db195f0..b3921ad 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -298,6 +298,7 @@ struct btrfs_bio {
atomic_t ref_count;
atomic_t stripes_pending;
struct btrfs_fs_info *fs_info;
+ u64 map_type; /* get from map_lookup->type */
bio_end_io_t *end_io;
struct bio *orig_bio;
unsigned long flags;
--
1.8.5.1
* [PATCH 15/15] Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
` (13 preceding siblings ...)
2015-01-13 12:34 ` [PATCH 14/15] Btrfs: Include map_type in raid_bio Zhaolei
@ 2015-01-13 12:34 ` Zhaolei
14 siblings, 0 replies; 18+ messages in thread
From: Zhaolei @ 2015-01-13 12:34 UTC (permalink / raw)
To: linux-btrfs; +Cc: Zhao Lei, Miao Xie
From: Zhao Lei <zhaolei@cn.fujitsu.com>
So we can check raid56 with:
(map->type & BTRFS_BLOCK_GROUP_RAID56_MASK)
instead of the longer:
(map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6))
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
fs/btrfs/ctree.h | 3 +++
fs/btrfs/inode.c | 3 +--
fs/btrfs/scrub.c | 17 ++++++-----------
fs/btrfs/volumes.c | 24 +++++++++---------------
4 files changed, 19 insertions(+), 28 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7e60741..f6caf77 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1020,6 +1020,9 @@ enum btrfs_raid_types {
BTRFS_BLOCK_GROUP_RAID6 | \
BTRFS_BLOCK_GROUP_DUP | \
BTRFS_BLOCK_GROUP_RAID10)
+#define BTRFS_BLOCK_GROUP_RAID56_MASK (BTRFS_BLOCK_GROUP_RAID5 | \
+ BTRFS_BLOCK_GROUP_RAID6)
+
/*
* We need a bit for restriper to be able to tell when chunks of type
* SINGLE are available. This "extended" profile format is used in
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8a036ed..54295a6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7808,8 +7808,7 @@ static int btrfs_submit_direct_hook(int rw, struct btrfs_dio_private *dip,
}
/* async crcs make it difficult to collect full stripe writes. */
- if (btrfs_get_alloc_profile(root, 1) &
- (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6))
+ if (btrfs_get_alloc_profile(root, 1) & BTRFS_BLOCK_GROUP_RAID56_MASK)
async_submit = 0;
else
async_submit = 1;
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 68033d2..d34b971 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1293,7 +1293,7 @@ static inline void scrub_stripe_index_and_offset(u64 logical, u64 map_type,
{
int i;
- if (map_type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
/* RAID5/6 */
for (i = 0; i < nstripes; i++) {
if (raid_map[i] == RAID6_Q_STRIPE ||
@@ -1440,8 +1440,7 @@ static void scrub_bio_wait_endio(struct bio *bio, int error)
static inline int scrub_is_page_on_raid56(struct scrub_page *page)
{
return page->recover &&
- (page->recover->bbio->map_type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6));
+ (page->recover->bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK);
}
static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
@@ -3014,8 +3013,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
} else if (map->type & BTRFS_BLOCK_GROUP_DUP) {
increment = map->stripe_len;
mirror_num = num % map->num_stripes + 1;
- } else if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ } else if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
get_raid56_logic_offset(physical, num, map, &offset, NULL);
increment = map->stripe_len * nr_data_stripes(map);
mirror_num = 1;
@@ -3049,8 +3047,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
*/
logical = base + offset;
physical_end = physical + nstripes * map->stripe_len;
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
get_raid56_logic_offset(physical_end, num,
map, &logic_end, NULL);
logic_end += base;
@@ -3096,8 +3093,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
ret = 0;
while (physical < physical_end) {
/* for raid56, we skip parity stripe */
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
ret = get_raid56_logic_offset(physical, num,
map, &logical, &stripe_logical);
logical += base;
@@ -3255,8 +3251,7 @@ again:
scrub_free_csums(sctx);
if (extent_logical + extent_len <
key.objectid + bytes) {
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
/*
* loop until we find next data stripe
* or we have finished all stripes.
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1b0de3a..02a7b66 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4196,7 +4196,7 @@ static u32 find_raid56_stripe_len(u32 data_devices, u32 dev_stripe_target)
static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type)
{
- if (!(type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)))
+ if (!(type & BTRFS_BLOCK_GROUP_RAID56_MASK))
return;
btrfs_set_fs_incompat(info, RAID56);
@@ -4803,10 +4803,8 @@ unsigned long btrfs_full_stripe_len(struct btrfs_root *root,
BUG_ON(em->start > logical || em->start + em->len < logical);
map = (struct map_lookup *)em->bdev;
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK)
len = map->stripe_len * nr_data_stripes(map);
- }
free_extent_map(em);
return len;
}
@@ -4826,8 +4824,7 @@ int btrfs_is_parity_mirror(struct btrfs_mapping_tree *map_tree,
BUG_ON(em->start > logical || em->start + em->len < logical);
map = (struct map_lookup *)em->bdev;
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6))
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK)
ret = 1;
free_extent_map(em);
return ret;
@@ -4998,7 +4995,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
stripe_offset = offset - stripe_offset;
/* if we're here for raid56, we need to know the stripe aligned start */
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
unsigned long full_stripe_len = stripe_len * nr_data_stripes(map);
raid56_full_stripe_start = offset;
@@ -5011,8 +5008,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
if (rw & REQ_DISCARD) {
/* we don't discard raid56 yet */
- if (map->type &
- (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6)) {
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
ret = -EOPNOTSUPP;
goto out;
}
@@ -5022,7 +5018,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
/* For writes to RAID[56], allow a full stripeset across all disks.
For other RAID types and for RAID[56] reads, just allow a single
stripe (on a single disk). */
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6) &&
+ if ((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
(rw & REQ_WRITE)) {
max_len = stripe_len * nr_data_stripes(map) -
(offset - raid56_full_stripe_start);
@@ -5188,8 +5184,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
mirror_num = stripe_index - old_stripe_index + 1;
}
- } else if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ } else if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
if (need_raid_map &&
((rw & (REQ_WRITE | REQ_GET_READ_MIRRORS)) ||
mirror_num > 1)) {
@@ -5253,7 +5248,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
bbio->tgtdev_map = (int *)(bbio->stripes + num_alloc_stripes);
/* build raid_map */
- if (map->type & (BTRFS_BLOCK_GROUP_RAID5 | BTRFS_BLOCK_GROUP_RAID6) &&
+ if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK &&
need_raid_map && ((rw & (REQ_WRITE | REQ_GET_READ_MIRRORS)) ||
mirror_num > 1)) {
u64 tmp;
@@ -5534,8 +5529,7 @@ int btrfs_rmap_block(struct btrfs_mapping_tree *map_tree,
do_div(length, map->num_stripes / map->sub_stripes);
else if (map->type & BTRFS_BLOCK_GROUP_RAID0)
do_div(length, map->num_stripes);
- else if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
- BTRFS_BLOCK_GROUP_RAID6)) {
+ else if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
do_div(length, nr_data_stripes(map));
rmap_len = map->stripe_len * nr_data_stripes(map);
}
--
1.8.5.1
* Re: [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio
2015-01-13 12:34 ` [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio Zhaolei
@ 2015-01-15 12:52 ` David Sterba
2015-01-16 1:38 ` Zhao Lei
0 siblings, 1 reply; 18+ messages in thread
From: David Sterba @ 2015-01-15 12:52 UTC (permalink / raw)
To: Zhaolei; +Cc: linux-btrfs, Miao Xie
The cleanups look good in general, some minor nitpicks below.
On Tue, Jan 13, 2015 at 08:34:37PM +0800, Zhaolei wrote:
> - kfree(bbio);
> + put_btrfs_bio(bbio);
Please rename it to btrfs_put_bbio, this is more consistent with other
*_put_* helpers and 'bbio' distinguishes btrfs_bio from regular 'bio'.
>
> static void btrfs_end_bio(struct bio *bio, int err)
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index fb0e8c3..db195f0 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -295,6 +295,7 @@ typedef void (btrfs_bio_end_io_t) (struct btrfs_bio *bio, int err);
> #define BTRFS_BIO_ORIG_BIO_SUBMITTED (1 << 0)
>
> struct btrfs_bio {
> + atomic_t ref_count;
atomic_t refs;
> atomic_t stripes_pending;
> struct btrfs_fs_info *fs_info;
> bio_end_io_t *end_io;
> @@ -394,13 +395,8 @@ struct btrfs_balance_control {
>
> int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start,
> u64 end, u64 *length);
> -
> -#define btrfs_bio_size(total_stripes, real_stripes) \
> - (sizeof(struct btrfs_bio) + \
> - (sizeof(struct btrfs_bio_stripe) * (total_stripes)) + \
> - (sizeof(int) * (real_stripes)) + \
> - (sizeof(u64) * (real_stripes)))
> -
> +void get_btrfs_bio(struct btrfs_bio *bbio);
btrfs_get_bbio
> +void put_btrfs_bio(struct btrfs_bio *bbio);
> int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
> u64 logical, u64 *length,
> struct btrfs_bio **bbio_ret, int mirror_num);
* RE: [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio
2015-01-15 12:52 ` David Sterba
@ 2015-01-16 1:38 ` Zhao Lei
0 siblings, 0 replies; 18+ messages in thread
From: Zhao Lei @ 2015-01-16 1:38 UTC (permalink / raw)
To: dsterba; +Cc: linux-btrfs, 'Miao Xie'
Hi, David Sterba
* From: David Sterba [mailto:dsterba@suse.cz]
> The cleanups look good in general, some minor nitpicks below.
>
> On Tue, Jan 13, 2015 at 08:34:37PM +0800, Zhaolei wrote:
> > - kfree(bbio);
> > + put_btrfs_bio(bbio);
>
> Please rename it to btrfs_put_bbio, this is more consistent with other
> *_put_* helpers and 'bbio' distinguishes btrfs_bio from regular 'bio'.
>
Good suggestion, I like this unified naming style.
> >
> > static void btrfs_end_bio(struct bio *bio, int err)
> > diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> > index fb0e8c3..db195f0 100644
> > --- a/fs/btrfs/volumes.h
> > +++ b/fs/btrfs/volumes.h
> > @@ -295,6 +295,7 @@ typedef void (btrfs_bio_end_io_t) (struct btrfs_bio *bio, int err);
> > #define BTRFS_BIO_ORIG_BIO_SUBMITTED (1 << 0)
> >
> > struct btrfs_bio {
> > + atomic_t ref_count;
>
> atomic_t refs;
>
Ok.
> > atomic_t stripes_pending;
> > struct btrfs_fs_info *fs_info;
> > bio_end_io_t *end_io;
> > @@ -394,13 +395,8 @@ struct btrfs_balance_control {
> >
> > int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start,
> > u64 end, u64 *length);
> > -
> > -#define btrfs_bio_size(total_stripes, real_stripes) \
> > - (sizeof(struct btrfs_bio) + \
> > - (sizeof(struct btrfs_bio_stripe) * (total_stripes)) + \
> > - (sizeof(int) * (real_stripes)) + \
> > - (sizeof(u64) * (real_stripes)))
> > -
> > +void get_btrfs_bio(struct btrfs_bio *bbio);
>
> btrfs_get_bbio
>
Thanks for your suggestion, I'll include the above changes in v2.
Thanks
Zhaolei
> > +void put_btrfs_bio(struct btrfs_bio *bbio);
> > int btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
> > u64 logical, u64 *length,
> > struct btrfs_bio **bbio_ret, int mirror_num);
Thread overview: 18+ messages
2015-01-13 12:34 [PATCH 00/15] Btrfs: Cleanup for raid56 scrib Zhaolei
2015-01-13 12:34 ` [PATCH 01/15] Btrfs: fix a out-of-bound access of raid_map Zhaolei
2015-01-13 12:34 ` [PATCH 02/15] Btrfs: sort raid_map before adding tgtdev stripes Zhaolei
2015-01-13 12:34 ` [PATCH 03/15] Btrfs: Make raid_map array be inlined in btrfs_bio structure Zhaolei
2015-01-13 12:34 ` [PATCH 04/15] Btrfs: add ref_count and free function for btrfs_bio Zhaolei
2015-01-15 12:52 ` David Sterba
2015-01-16 1:38 ` Zhao Lei
2015-01-13 12:34 ` [PATCH 05/15] Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON() Zhaolei
2015-01-13 12:34 ` [PATCH 06/15] Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace Zhaolei
2015-01-13 12:34 ` [PATCH 07/15] Btrfs: Cleanup btrfs_bio_counter_inc_blocked() Zhaolei
2015-01-13 12:34 ` [PATCH 08/15] Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event() Zhaolei
2015-01-13 12:34 ` [PATCH 09/15] Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block() Zhaolei
2015-01-13 12:34 ` [PATCH 10/15] Btrfs: Avoid trustless page-level-repair in dev-replace Zhaolei
2015-01-13 12:34 ` [PATCH 11/15] Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block() Zhaolei
2015-01-13 12:34 ` [PATCH 12/15] Btrfs: Combine per-page recover in dev-replace and scrub Zhaolei
2015-01-13 12:34 ` [PATCH 13/15] Btrfs: Simplify scrub_setup_recheck_block()'s argument Zhaolei
2015-01-13 12:34 ` [PATCH 14/15] Btrfs: Include map_type in raid_bio Zhaolei
2015-01-13 12:34 ` [PATCH 15/15] Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply Zhaolei