[PATCH v4 0/6] Chunk level degradable check

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v4 0/6] Chunk level degradable check
@ 2017-06-28  5:43 Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount Qu Wenruo
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

The patchset can be fetched from my github repo:
https://github.com/adam900710/linux/tree/degradable

The patchset is based on David's for-4.13-part1 branch.

Btrfs currently uses num_tolerated_disk_barrier_failures to do global
check for tolerated missing device.

Although the one-size-fit-all solution is quite safe, it's too strict
if data and metadata has different duplication level.

For example, if one use Single data and RAID1 metadata for 2 disks, it
means any missing device will make the fs unable to be degraded
mounted.

But in fact, some times all single chunks may be in the existing
device and in that case, we should allow it to be rw degraded mounted.

Such case can be easily reproduced using the following script:
 # mkfs.btrfs -f -m raid1 -d sing /dev/sdb /dev/sdc
 # wipefs -f /dev/sdc
 # mount /dev/sdb -o degraded,rw

If using btrfs-debug-tree to check /dev/sdb, one should find that the
data chunk is only in sdb, so in fact it should allow degraded mount.

This patchset will introduce a new per-chunk degradable check for
btrfs, allow above case to succeed, and it's quite small anyway.

And enhance kernel error message for missing device, at least user
can know what's making mount failed, other than meaningless
"failed to read system chunk/chunk tree -5".

v2:
  Update after almost 2 years.
  Add the last patch to enhance the kernel output, so user can know
  it's missing devices that prevents btrfs to be mounted.
v3:
  Remove one duplicated missing device output
  Use the advice from Anand Jain, not to add new members in btrfs_device,
  but use a new structure extra_rw_degrade_errors, to record error when
  sending down/waiting device.
v3.1:
  Reduce the critical section in btrfs_check_rw_degradable(), follow other
  caller to only acquire the lock when searching, as extent_map has
  refcount to avoid concurrency already.
  The modification itself won't affect the behavior, so tested-by tags are
  added to each patch.
v4:
  Thanks Anand for this dev flush work, which makes us more easier to
  detect flush error in previous transaction.
  Now this patchset won't need to alloc memory, and can just use
  btrfs_device->last_flush_error to check if last flush finished
  correctly.
  New rebase, so old tested by tags are all removed, sorry guys.

Qu Wenruo (6):
  btrfs: Introduce a function to check if all chunks a OK for degraded
    rw mount
  btrfs: Do chunk level rw degrade check at mount time
  btrfs: Do chunk level degradation check for remount
  btrfs: Allow barrier_all_devices to do chunk level device check
  btrfs: Cleanup num_tolerated_disk_barrier_failures
  btrfs: Enhance missing device kernel message

 fs/btrfs/ctree.h   |  2 --
 fs/btrfs/disk-io.c | 81 ++++----------------------------------------
 fs/btrfs/disk-io.h |  2 --
 fs/btrfs/super.c   |  3 +-
 fs/btrfs/volumes.c | 99 +++++++++++++++++++++++++++++++++++++++++-------------
 fs/btrfs/volumes.h |  3 ++
 6 files changed, 85 insertions(+), 105 deletions(-)

-- 
2.13.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-07-14  7:44   ` Nikolay Borisov
  2017-06-28  5:43 ` [PATCH v4 2/6] btrfs: Do chunk level rw degrade check at mount time Qu Wenruo
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro, Anand Jain

Introduce a new function, btrfs_check_rw_degradable(), to check if all
chunks in btrfs is OK for degraded rw mount.

It provides the new basis for accurate btrfs mount/remount and even
runtime degraded mount check other than old one-size-fit-all method.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.h |  1 +
 2 files changed, 59 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c95f018d4a1e..7a72fbdb8262 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6817,6 +6817,64 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
 	return -EIO;
 }
 
+/*
+ * Check if all chunks in the fs is OK for read-write degraded mount
+ *
+ * Return true if all chunks meet the minimal RW mount requirement.
+ * Return false if any chunk doesn't meet the minimal RW mount requirement.
+ */
+bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
+	struct extent_map *em;
+	u64 next_start = 0;
+	bool ret = true;
+
+	read_lock(&map_tree->map_tree.lock);
+	em = lookup_extent_mapping(&map_tree->map_tree, 0, (u64)-1);
+	read_unlock(&map_tree->map_tree.lock);
+	/* No chunk at all? Return false anyway */
+	if (!em) {
+		ret = false;
+		goto out;
+	}
+	while (em) {
+		struct map_lookup *map;
+		int missing = 0;
+		int max_tolerated;
+		int i;
+
+		map = em->map_lookup;
+		max_tolerated =
+			btrfs_get_num_tolerated_disk_barrier_failures(
+					map->type);
+		for (i = 0; i < map->num_stripes; i++) {
+			struct btrfs_device *dev = map->stripes[i].dev;
+
+			if (!dev || !dev->bdev || dev->missing ||
+			    dev->last_flush_error)
+				missing++;
+		}
+		if (missing > max_tolerated) {
+			ret = false;
+			btrfs_warn(fs_info,
+	"chunk %llu missing %d devices, max tolerance is %d for writeble mount",
+				   em->start, missing, max_tolerated);
+			free_extent_map(em);
+			goto out;
+		}
+		next_start = extent_map_end(em);
+		free_extent_map(em);
+
+		read_lock(&map_tree->map_tree.lock);
+		em = lookup_extent_mapping(&map_tree->map_tree, next_start,
+					   (u64)(-1) - next_start);
+		read_unlock(&map_tree->map_tree.lock);
+	}
+out:
+	return ret;
+}
+
 int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_root *root = fs_info->chunk_root;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 6f45fd60d15a..a5897c7a7e86 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -543,4 +543,5 @@ struct list_head *btrfs_get_fs_uuids(void);
 void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
 
+bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
 #endif
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v4 2/6] btrfs: Do chunk level rw degrade check at mount time
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 3/6] btrfs: Do chunk level degradation check for remount Qu Wenruo
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

Now use the btrfs_check_rw_degradable() to do mount time degration check.

With this patch, now we can mount with the following case:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdc
 # mount /dev/sdb /mnt/btrfs -o degraded
 As the single data chunk is only in sdb, so it's OK to mount as
 degraded, as missing one device is OK for RAID1.

But still fail with the following case as expected:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdb
 # mount /dev/sdc /mnt/btrfs -o degraded
 As the data chunk is only in sdb, so it's not OK to mount it as
 degraded.

Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Reported-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/disk-io.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b6758892874f..6ba2a53999f9 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3034,15 +3034,10 @@ int open_ctree(struct super_block *sb,
 		btrfs_err(fs_info, "failed to read block groups: %d", ret);
 		goto fail_sysfs;
 	}
-	fs_info->num_tolerated_disk_barrier_failures =
-		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
-	if (fs_info->fs_devices->missing_devices >
-	     fs_info->num_tolerated_disk_barrier_failures &&
-	    !(sb->s_flags & MS_RDONLY)) {
+
+	if (!(sb->s_flags & MS_RDONLY) && !btrfs_check_rw_degradable(fs_info)) {
 		btrfs_warn(fs_info,
-"missing devices (%llu) exceeds the limit (%d), writeable mount is not allowed",
-			fs_info->fs_devices->missing_devices,
-			fs_info->num_tolerated_disk_barrier_failures);
+		"writeable mount is not allowed due to too many missing devices");
 		goto fail_sysfs;
 	}
 
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v4 3/6] btrfs: Do chunk level degradation check for remount
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 2/6] btrfs: Do chunk level rw degrade check at mount time Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 4/6] btrfs: Allow barrier_all_devices to do chunk level device check Qu Wenruo
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

Just the same for mount time check, use btrfs_check_rw_degradable() to
check if we are OK to be remounted rw.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/super.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 74e47794e63f..03fa52008404 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1781,8 +1781,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			goto restore;
 		}
 
-		if (fs_info->fs_devices->missing_devices >
-		     fs_info->num_tolerated_disk_barrier_failures) {
+		if (!btrfs_check_rw_degradable(fs_info)) {
 			btrfs_warn(fs_info,
 				"too many missing devices, writeable remount is not allowed");
 			ret = -EACCES;
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v4 4/6] btrfs: Allow barrier_all_devices to do chunk level device check
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (2 preceding siblings ...)
  2017-06-28  5:43 ` [PATCH v4 3/6] btrfs: Do chunk level degradation check for remount Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 5/6] btrfs: Cleanup num_tolerated_disk_barrier_failures Qu Wenruo
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

The last user of num_tolerated_disk_barrier_failures is
barrier_all_devices().
But it's can be easily changed to new per-chunk degradable check
framework.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/disk-io.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6ba2a53999f9..7ec766d8510f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3518,20 +3518,10 @@ static int wait_dev_flush(struct btrfs_device *device)
 	return bio->bi_error;
 }
 
-static int check_barrier_error(struct btrfs_fs_devices *fsdevs)
+static int check_barrier_error(struct btrfs_fs_info *fs_info)
 {
-	int dev_flush_error = 0;
-	struct btrfs_device *dev;
-
-	list_for_each_entry_rcu(dev, &fsdevs->devices, dev_list) {
-		if (!dev->bdev || dev->last_flush_error)
-			dev_flush_error++;
-	}
-
-	if (dev_flush_error >
-	    fsdevs->fs_info->num_tolerated_disk_barrier_failures)
+	if (!btrfs_check_rw_degradable(fs_info))
 		return -EIO;
-
 	return 0;
 }
 
@@ -3586,7 +3576,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 		 * to arrive at the volume status. So error checking
 		 * is being pushed to a separate loop.
 		 */
-		return check_barrier_error(info->fs_devices);
+		return check_barrier_error(info);
 	}
 	return 0;
 }
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v4 5/6] btrfs: Cleanup num_tolerated_disk_barrier_failures
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (3 preceding siblings ...)
  2017-06-28  5:43 ` [PATCH v4 4/6] btrfs: Allow barrier_all_devices to do chunk level device check Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-06-28  5:43 ` [PATCH v4 6/6] btrfs: Enhance missing device kernel message Qu Wenruo
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

As we use per-chunk degradable check, now the global
num_tolerated_disk_barrier_failures is of no use.

So cleanup it.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
 fs/btrfs/ctree.h   |  2 --
 fs/btrfs/disk-io.c | 54 ------------------------------------------------------
 fs/btrfs/disk-io.h |  2 --
 fs/btrfs/volumes.c | 17 -----------------
 4 files changed, 75 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5bdd36664421..99c55580ddeb 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1072,8 +1072,6 @@ struct btrfs_fs_info {
 	/* next backup root to be overwritten */
 	int backup_root_index;
 
-	int num_tolerated_disk_barrier_failures;
-
 	/* device replace state */
 	struct btrfs_dev_replace dev_replace;
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7ec766d8510f..1a2e0b43ef2a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3610,60 +3610,6 @@ int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags)
 	return min_tolerated;
 }
 
-int btrfs_calc_num_tolerated_disk_barrier_failures(
-	struct btrfs_fs_info *fs_info)
-{
-	struct btrfs_ioctl_space_info space;
-	struct btrfs_space_info *sinfo;
-	u64 types[] = {BTRFS_BLOCK_GROUP_DATA,
-		       BTRFS_BLOCK_GROUP_SYSTEM,
-		       BTRFS_BLOCK_GROUP_METADATA,
-		       BTRFS_BLOCK_GROUP_DATA | BTRFS_BLOCK_GROUP_METADATA};
-	int i;
-	int c;
-	int num_tolerated_disk_barrier_failures =
-		(int)fs_info->fs_devices->num_devices;
-
-	for (i = 0; i < ARRAY_SIZE(types); i++) {
-		struct btrfs_space_info *tmp;
-
-		sinfo = NULL;
-		rcu_read_lock();
-		list_for_each_entry_rcu(tmp, &fs_info->space_info, list) {
-			if (tmp->flags == types[i]) {
-				sinfo = tmp;
-				break;
-			}
-		}
-		rcu_read_unlock();
-
-		if (!sinfo)
-			continue;
-
-		down_read(&sinfo->groups_sem);
-		for (c = 0; c < BTRFS_NR_RAID_TYPES; c++) {
-			u64 flags;
-
-			if (list_empty(&sinfo->block_groups[c]))
-				continue;
-
-			btrfs_get_block_group_info(&sinfo->block_groups[c],
-						   &space);
-			if (space.total_bytes == 0 || space.used_bytes == 0)
-				continue;
-			flags = space.flags;
-
-			num_tolerated_disk_barrier_failures = min(
-				num_tolerated_disk_barrier_failures,
-				btrfs_get_num_tolerated_disk_barrier_failures(
-					flags));
-		}
-		up_read(&sinfo->groups_sem);
-	}
-
-	return num_tolerated_disk_barrier_failures;
-}
-
 int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
 {
 	struct list_head *head;
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 4654d129aa76..6c39d969c84e 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -142,8 +142,6 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 int btree_lock_page_hook(struct page *page, void *data,
 				void (*flush_fn)(void *));
 int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags);
-int btrfs_calc_num_tolerated_disk_barrier_failures(
-	struct btrfs_fs_info *fs_info);
 int __init btrfs_end_io_wq_init(void);
 void btrfs_end_io_wq_exit(void);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7a72fbdb8262..2cc68c74f701 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1987,9 +1987,6 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 		free_fs_devices(cur_devices);
 	}
 
-	fs_info->num_tolerated_disk_barrier_failures =
-		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
-
 out:
 	mutex_unlock(&uuid_mutex);
 	return ret;
@@ -2487,8 +2484,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 				   "sysfs: failed to create fsid for sprout");
 	}
 
-	fs_info->num_tolerated_disk_barrier_failures =
-		btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
 	ret = btrfs_commit_transaction(trans);
 
 	if (seeding_dev) {
@@ -3898,13 +3893,6 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 			   meta_target, data_target);
 	}
 
-	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
-		fs_info->num_tolerated_disk_barrier_failures = min(
-			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info),
-			btrfs_get_num_tolerated_disk_barrier_failures(
-				bctl->sys.target));
-	}
-
 	ret = insert_balance_item(fs_info, bctl);
 	if (ret && ret != -EEXIST)
 		goto out;
@@ -3927,11 +3915,6 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 	mutex_lock(&fs_info->balance_mutex);
 	atomic_dec(&fs_info->balance_running);
 
-	if (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
-		fs_info->num_tolerated_disk_barrier_failures =
-			btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
-	}
-
 	if (bargs) {
 		memset(bargs, 0, sizeof(*bargs));
 		update_ioctl_balance_args(fs_info, 0, bargs);
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v4 6/6] btrfs: Enhance missing device kernel message
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (4 preceding siblings ...)
  2017-06-28  5:43 ` [PATCH v4 5/6] btrfs: Cleanup num_tolerated_disk_barrier_failures Qu Wenruo
@ 2017-06-28  5:43 ` Qu Wenruo
  2017-06-28 13:54 ` [PATCH v4 0/6] Chunk level degradable check David Sterba
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-06-28  5:43 UTC (permalink / raw)
  To: linux-btrfs, dsterba; +Cc: ahferroin7, kilobyte, demfloro

For missing device, btrfs will just refuse to mount with almost
meaningless kernel message like:

 BTRFS info (device vdb6): disk space caching is enabled
 BTRFS info (device vdb6): has skinny extents
 BTRFS error (device vdb6): failed to read the system array: -5
 BTRFS error (device vdb6): open_ctree failed

This patch will add extra device missing output, making the result to:

 BTRFS info (device vdb6): disk space caching is enabled
 BTRFS info (device vdb6): has skinny extents
 BTRFS warning (device vdb6): devid 2 uuid 80470722-cad2-4b90-b7c3-fee294552f1b is missing
 BTRFS error (device vdb6): failed to read the system array: -5
 BTRFS error (device vdb6): open_ctree failed

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c | 24 +++++++++++++++++-------
 fs/btrfs/volumes.h |  2 ++
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2cc68c74f701..a47e362de0eb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6496,6 +6496,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
 		if (!map->stripes[i].dev &&
 		    !btrfs_test_opt(fs_info, DEGRADED)) {
 			free_extent_map(em);
+			btrfs_report_missing_device(fs_info, devid, uuid);
 			return -EIO;
 		}
 		if (!map->stripes[i].dev) {
@@ -6506,8 +6507,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
 				free_extent_map(em);
 				return -EIO;
 			}
-			btrfs_warn(fs_info, "devid %llu uuid %pU is missing",
-				   devid, uuid);
+			btrfs_report_missing_device(fs_info, devid, uuid);
 		}
 		map->stripes[i].dev->in_fs_metadata = 1;
 	}
@@ -6624,17 +6624,21 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 
 	device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
 	if (!device) {
-		if (!btrfs_test_opt(fs_info, DEGRADED))
+		if (!btrfs_test_opt(fs_info, DEGRADED)) {
+			btrfs_report_missing_device(fs_info, devid, dev_uuid);
 			return -EIO;
+		}
 
 		device = add_missing_dev(fs_devices, devid, dev_uuid);
 		if (!device)
 			return -ENOMEM;
-		btrfs_warn(fs_info, "devid %llu uuid %pU missing",
-				devid, dev_uuid);
+		btrfs_report_missing_device(fs_info, devid, dev_uuid);
 	} else {
-		if (!device->bdev && !btrfs_test_opt(fs_info, DEGRADED))
-			return -EIO;
+		if (!device->bdev) {
+			btrfs_report_missing_device(fs_info, devid, dev_uuid);
+			if (!btrfs_test_opt(fs_info, DEGRADED))
+				return -EIO;
+		}
 
 		if(!device->bdev && !device->missing) {
 			/*
@@ -6800,6 +6804,12 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
 	return -EIO;
 }
 
+void btrfs_report_missing_device(struct btrfs_fs_info *fs_info, u64 devid,
+				 u8 *uuid)
+{
+	btrfs_warn_rl(fs_info, "devid %llu uuid %pU is missing", devid, uuid);
+}
+
 /*
  * Check if all chunks in the fs is OK for read-write degraded mount
  *
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index a5897c7a7e86..9bd0447cffa8 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -544,4 +544,6 @@ void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
 
 bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
+void btrfs_report_missing_device(struct btrfs_fs_info *fs_info, u64 devid,
+				 u8 *uuid);
 #endif
-- 
2.13.1




^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (5 preceding siblings ...)
  2017-06-28  5:43 ` [PATCH v4 6/6] btrfs: Enhance missing device kernel message Qu Wenruo
@ 2017-06-28 13:54 ` David Sterba
  2017-07-10 18:11 ` Dmitrii Tcvetkov
  2017-07-12 15:24 ` David Sterba
  8 siblings, 0 replies; 17+ messages in thread
From: David Sterba @ 2017-06-28 13:54 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, dsterba, ahferroin7, kilobyte, demfloro

On Wed, Jun 28, 2017 at 01:43:29PM +0800, Qu Wenruo wrote:
> The patchset can be fetched from my github repo:
> https://github.com/adam900710/linux/tree/degradable
> 
> The patchset is based on David's for-4.13-part1 branch.

Thanks. After a quick skim of the patchset, looks good to me for
inclusion to for-next.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (6 preceding siblings ...)
  2017-06-28 13:54 ` [PATCH v4 0/6] Chunk level degradable check David Sterba
@ 2017-07-10 18:11 ` Dmitrii Tcvetkov
  2017-07-13  0:50   ` David Sterba
  2017-07-12 15:24 ` David Sterba
  8 siblings, 1 reply; 17+ messages in thread
From: Dmitrii Tcvetkov @ 2017-07-10 18:11 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, dsterba, ahferroin7, kilobyte

On Wed, 28 Jun 2017 13:43:29 +0800
Qu Wenruo <quwenruo@cn.fujitsu.com> wrote:

> The patchset can be fetched from my github repo:
> https://github.com/adam900710/linux/tree/degradable
> 
> The patchset is based on David's for-4.13-part1 branch.
> 
> Btrfs currently uses num_tolerated_disk_barrier_failures to do global
> check for tolerated missing device.
> 
> Although the one-size-fit-all solution is quite safe, it's too strict
> if data and metadata has different duplication level.
> 
> For example, if one use Single data and RAID1 metadata for 2 disks, it
> means any missing device will make the fs unable to be degraded
> mounted.
> 
> But in fact, some times all single chunks may be in the existing
> device and in that case, we should allow it to be rw degraded mounted.
> 
> Such case can be easily reproduced using the following script:
>  # mkfs.btrfs -f -m raid1 -d sing /dev/sdb /dev/sdc
>  # wipefs -f /dev/sdc
>  # mount /dev/sdb -o degraded,rw
> 
> If using btrfs-debug-tree to check /dev/sdb, one should find that the
> data chunk is only in sdb, so in fact it should allow degraded mount.
> 
> This patchset will introduce a new per-chunk degradable check for
> btrfs, allow above case to succeed, and it's quite small anyway.
> 
> And enhance kernel error message for missing device, at least user
> can know what's making mount failed, other than meaningless
> "failed to read system chunk/chunk tree -5".
> 
> v2:
>   Update after almost 2 years.
>   Add the last patch to enhance the kernel output, so user can know
>   it's missing devices that prevents btrfs to be mounted.
> v3:
>   Remove one duplicated missing device output
>   Use the advice from Anand Jain, not to add new members in
> btrfs_device, but use a new structure extra_rw_degrade_errors, to
> record error when sending down/waiting device.
> v3.1:
>   Reduce the critical section in btrfs_check_rw_degradable(), follow
> other caller to only acquire the lock when searching, as extent_map
> has refcount to avoid concurrency already.
>   The modification itself won't affect the behavior, so tested-by
> tags are added to each patch.
> v4:
>   Thanks Anand for this dev flush work, which makes us more easier to
>   detect flush error in previous transaction.
>   Now this patchset won't need to alloc memory, and can just use
>   btrfs_device->last_flush_error to check if last flush finished
>   correctly.
>   New rebase, so old tested by tags are all removed, sorry guys.
> 
> Qu Wenruo (6):
>   btrfs: Introduce a function to check if all chunks a OK for degraded
>     rw mount
>   btrfs: Do chunk level rw degrade check at mount time
>   btrfs: Do chunk level degradation check for remount
>   btrfs: Allow barrier_all_devices to do chunk level device check
>   btrfs: Cleanup num_tolerated_disk_barrier_failures
>   btrfs: Enhance missing device kernel message
> 
>  fs/btrfs/ctree.h   |  2 --
>  fs/btrfs/disk-io.c | 81 ++++----------------------------------------
>  fs/btrfs/disk-io.h |  2 --
>  fs/btrfs/super.c   |  3 +-
>  fs/btrfs/volumes.c | 99
> +++++++++++++++++++++++++++++++++++++++++-------------
> fs/btrfs/volumes.h |  3 ++ 6 files changed, 85 insertions(+), 105
> deletions(-)
> 

Tested on top of current mainline master (commit af3c8d98508d37541d4bf57f13a984a7f73a328c). Didn't find any regressions.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
                   ` (7 preceding siblings ...)
  2017-07-10 18:11 ` Dmitrii Tcvetkov
@ 2017-07-12 15:24 ` David Sterba
  2017-07-12 23:53   ` Qu Wenruo
  8 siblings, 1 reply; 17+ messages in thread
From: David Sterba @ 2017-07-12 15:24 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, dsterba, ahferroin7, kilobyte, demfloro

On Wed, Jun 28, 2017 at 01:43:29PM +0800, Qu Wenruo wrote:
> The patchset can be fetched from my github repo:
> https://github.com/adam900710/linux/tree/degradable
> 
> The patchset is based on David's for-4.13-part1 branch.
> 
> Btrfs currently uses num_tolerated_disk_barrier_failures to do global
> check for tolerated missing device.
> 
> Although the one-size-fit-all solution is quite safe, it's too strict
> if data and metadata has different duplication level.
> 
> For example, if one use Single data and RAID1 metadata for 2 disks, it
> means any missing device will make the fs unable to be degraded
> mounted.
> 
> But in fact, some times all single chunks may be in the existing
> device and in that case, we should allow it to be rw degraded mounted.
> 
> Such case can be easily reproduced using the following script:
>  # mkfs.btrfs -f -m raid1 -d sing /dev/sdb /dev/sdc
>  # wipefs -f /dev/sdc
>  # mount /dev/sdb -o degraded,rw

I've seen wider testing coverage in replies to the previous patchset
iterations. Can we have that added to fstests?

I'm going to add this patchset to the devel queue (ie. not a separate
for-next branch anymore).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-07-12 15:24 ` David Sterba
@ 2017-07-12 23:53   ` Qu Wenruo
  0 siblings, 0 replies; 17+ messages in thread
From: Qu Wenruo @ 2017-07-12 23:53 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs, ahferroin7, kilobyte, demfloro



On 2017年07月12日 23:24, David Sterba wrote:
> On Wed, Jun 28, 2017 at 01:43:29PM +0800, Qu Wenruo wrote:
>> The patchset can be fetched from my github repo:
>> https://github.com/adam900710/linux/tree/degradable
>>
>> The patchset is based on David's for-4.13-part1 branch.
>>
>> Btrfs currently uses num_tolerated_disk_barrier_failures to do global
>> check for tolerated missing device.
>>
>> Although the one-size-fit-all solution is quite safe, it's too strict
>> if data and metadata has different duplication level.
>>
>> For example, if one use Single data and RAID1 metadata for 2 disks, it
>> means any missing device will make the fs unable to be degraded
>> mounted.
>>
>> But in fact, some times all single chunks may be in the existing
>> device and in that case, we should allow it to be rw degraded mounted.
>>
>> Such case can be easily reproduced using the following script:
>>   # mkfs.btrfs -f -m raid1 -d sing /dev/sdb /dev/sdc
>>   # wipefs -f /dev/sdc
>>   # mount /dev/sdb -o degraded,rw
> 
> I've seen wider testing coverage in replies to the previous patchset
> iterations. Can we have that added to fstests?

I'm completely OK to add fstests test case.
While the concern is still the same: we need better wrapper to detect 
chunk layout.

Or we can only have static test case to test chunk-level degradable check.

Thanks,
Qu

> 
> I'm going to add this patchset to the devel queue (ie. not a separate
> for-next branch anymore).
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-07-10 18:11 ` Dmitrii Tcvetkov
@ 2017-07-13  0:50   ` David Sterba
  2017-07-13  1:09     ` Adam Borowski
  0 siblings, 1 reply; 17+ messages in thread
From: David Sterba @ 2017-07-13  0:50 UTC (permalink / raw)
  To: Dmitrii Tcvetkov; +Cc: Qu Wenruo, linux-btrfs, dsterba, ahferroin7, kilobyte

On Mon, Jul 10, 2017 at 09:11:50PM +0300, Dmitrii Tcvetkov wrote:
> Tested on top of current mainline master (commit
> af3c8d98508d37541d4bf57f13a984a7f73a328c). Didn't find any
> regressions.

Thanks for testing.

If anybody wants to get their Tested-by in the patches, please let me
know, I'll add the tags (can be done until rc6). Your efforts are
appreciated so you get the credit in the undying history log of linux
kernel.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-07-13  0:50   ` David Sterba
@ 2017-07-13  1:09     ` Adam Borowski
  2017-07-13 12:12       ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 17+ messages in thread
From: Adam Borowski @ 2017-07-13  1:09 UTC (permalink / raw)
  To: dsterba, Dmitrii Tcvetkov, Qu Wenruo, linux-btrfs, ahferroin7

On Thu, Jul 13, 2017 at 02:50:10AM +0200, David Sterba wrote:
> On Mon, Jul 10, 2017 at 09:11:50PM +0300, Dmitrii Tcvetkov wrote:
> > Tested on top of current mainline master (commit
> > af3c8d98508d37541d4bf57f13a984a7f73a328c). Didn't find any
> > regressions.

I've retested this yet again.  No regressions as well.

> Thanks for testing.
> 
> If anybody wants to get their Tested-by in the patches, please let me
> know, I'll add the tags (can be done until rc6). Your efforts are
> appreciated so you get the credit in the undying history log of linux
> kernel.

Heh.  Let's not play such games but finally get this patch set in, it's by
far the biggest problem for multi-device.

Any issues with degraded mounts which I reported before are not related to
this patch set, I merely found them at the time of testing.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
⠈⠳⣄⠀⠀⠀⠀ A master species delegates.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 0/6] Chunk level degradable check
  2017-07-13  1:09     ` Adam Borowski
@ 2017-07-13 12:12       ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 17+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-13 12:12 UTC (permalink / raw)
  To: Adam Borowski, dsterba, Dmitrii Tcvetkov, Qu Wenruo, linux-btrfs

On 2017-07-12 21:09, Adam Borowski wrote:
> On Thu, Jul 13, 2017 at 02:50:10AM +0200, David Sterba wrote:
>> On Mon, Jul 10, 2017 at 09:11:50PM +0300, Dmitrii Tcvetkov wrote:
>>> Tested on top of current mainline master (commit
>>> af3c8d98508d37541d4bf57f13a984a7f73a328c). Didn't find any
>>> regressions.
> 
> I've retested this yet again.  No regressions as well.
Same here, including checking on a number of MIPS, PPC, and ARM variants 
in QEMU.
> 
>> Thanks for testing.
>>
>> If anybody wants to get their Tested-by in the patches, please let me
>> know, I'll add the tags (can be done until rc6). Your efforts are
>> appreciated so you get the credit in the undying history log of linux
>> kernel.
> 
> Heh.  Let's not play such games but finally get this patch set in, it's by
> far the biggest problem for multi-device.
Agreed, this should have gone in back when the first version was posted...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount
  2017-06-28  5:43 ` [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount Qu Wenruo
@ 2017-07-14  7:44   ` Nikolay Borisov
  2017-07-14  8:19     ` Qu Wenruo
  0 siblings, 1 reply; 17+ messages in thread
From: Nikolay Borisov @ 2017-07-14  7:44 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, dsterba
  Cc: ahferroin7, kilobyte, demfloro, Anand Jain



On 28.06.2017 08:43, Qu Wenruo wrote:
> Introduce a new function, btrfs_check_rw_degradable(), to check if all
> chunks in btrfs is OK for degraded rw mount.
> 
> It provides the new basis for accurate btrfs mount/remount and even
> runtime degraded mount check other than old one-size-fit-all method.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/volumes.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/volumes.h |  1 +
>  2 files changed, 59 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index c95f018d4a1e..7a72fbdb8262 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6817,6 +6817,64 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
>  	return -EIO;
>  }
>  
> +/*
> + * Check if all chunks in the fs is OK for read-write degraded mount
> + *
> + * Return true if all chunks meet the minimal RW mount requirement.
> + * Return false if any chunk doesn't meet the minimal RW mount requirement.
> + */
> +bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
> +{
> +	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
> +	struct extent_map *em;
> +	u64 next_start = 0;
> +	bool ret = true;
> +
> +	read_lock(&map_tree->map_tree.lock);
> +	em = lookup_extent_mapping(&map_tree->map_tree, 0, (u64)-1);
> +	read_unlock(&map_tree->map_tree.lock);
> +	/* No chunk at all? Return false anyway */
> +	if (!em) {
> +		ret = false;
> +		goto out;
> +	}
> +	while (em) {
> +		struct map_lookup *map;
> +		int missing = 0;
> +		int max_tolerated;
> +		int i;
> +
> +		map = em->map_lookup;
> +		max_tolerated =
> +			btrfs_get_num_tolerated_disk_barrier_failures(
> +					map->type);
> +		for (i = 0; i < map->num_stripes; i++) {
> +			struct btrfs_device *dev = map->stripes[i].dev;
> +
> +			if (!dev || !dev->bdev || dev->missing ||
> +			    dev->last_flush_error)
> +				missing++;
> +		}
> +		if (missing > max_tolerated) {
> +			ret = false;
> +			btrfs_warn(fs_info,
> +	"chunk %llu missing %d devices, max tolerance is %d for writeble mount",
> +				   em->start, missing, max_tolerated);
> +			free_extent_map(em);
> +			goto out;
> +		}
> +		next_start = extent_map_end(em);
> +		free_extent_map(em);
> +
> +		read_lock(&map_tree->map_tree.lock);
> +		em = lookup_extent_mapping(&map_tree->map_tree, next_start,
> +					   (u64)(-1) - next_start);
> +		read_unlock(&map_tree->map_tree.lock);
> +	}
> +out:
> +	return ret;
> +}
> +

Nit but I think in this function it would be best to directly return
true/false based on context rather than having the superfluous goto.

>  int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info)
>  {
>  	struct btrfs_root *root = fs_info->chunk_root;
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 6f45fd60d15a..a5897c7a7e86 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -543,4 +543,5 @@ struct list_head *btrfs_get_fs_uuids(void);
>  void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
>  void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
>  
> +bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
>  #endif
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount
  2017-07-14  7:44   ` Nikolay Borisov
@ 2017-07-14  8:19     ` Qu Wenruo
  2017-07-18 16:29       ` David Sterba
  0 siblings, 1 reply; 17+ messages in thread
From: Qu Wenruo @ 2017-07-14  8:19 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs, dsterba
  Cc: ahferroin7, kilobyte, demfloro, Anand Jain



On 2017年07月14日 15:44, Nikolay Borisov wrote:
> 
> 
> On 28.06.2017 08:43, Qu Wenruo wrote:
>> Introduce a new function, btrfs_check_rw_degradable(), to check if all
>> chunks in btrfs is OK for degraded rw mount.
>>
>> It provides the new basis for accurate btrfs mount/remount and even
>> runtime degraded mount check other than old one-size-fit-all method.
>>
>> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/volumes.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/btrfs/volumes.h |  1 +
>>   2 files changed, 59 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index c95f018d4a1e..7a72fbdb8262 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -6817,6 +6817,64 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
>>   	return -EIO;
>>   }
>>   
>> +/*
>> + * Check if all chunks in the fs is OK for read-write degraded mount
>> + *
>> + * Return true if all chunks meet the minimal RW mount requirement.
>> + * Return false if any chunk doesn't meet the minimal RW mount requirement.
>> + */
>> +bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
>> +{
>> +	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
>> +	struct extent_map *em;
>> +	u64 next_start = 0;
>> +	bool ret = true;
>> +
>> +	read_lock(&map_tree->map_tree.lock);
>> +	em = lookup_extent_mapping(&map_tree->map_tree, 0, (u64)-1);
>> +	read_unlock(&map_tree->map_tree.lock);
>> +	/* No chunk at all? Return false anyway */
>> +	if (!em) {
>> +		ret = false;
>> +		goto out;
>> +	}
>> +	while (em) {
>> +		struct map_lookup *map;
>> +		int missing = 0;
>> +		int max_tolerated;
>> +		int i;
>> +
>> +		map = em->map_lookup;
>> +		max_tolerated =
>> +			btrfs_get_num_tolerated_disk_barrier_failures(
>> +					map->type);
>> +		for (i = 0; i < map->num_stripes; i++) {
>> +			struct btrfs_device *dev = map->stripes[i].dev;
>> +
>> +			if (!dev || !dev->bdev || dev->missing ||
>> +			    dev->last_flush_error)
>> +				missing++;
>> +		}
>> +		if (missing > max_tolerated) {
>> +			ret = false;
>> +			btrfs_warn(fs_info,
>> +	"chunk %llu missing %d devices, max tolerance is %d for writeble mount",
>> +				   em->start, missing, max_tolerated);
>> +			free_extent_map(em);
>> +			goto out;
>> +		}
>> +		next_start = extent_map_end(em);
>> +		free_extent_map(em);
>> +
>> +		read_lock(&map_tree->map_tree.lock);
>> +		em = lookup_extent_mapping(&map_tree->map_tree, next_start,
>> +					   (u64)(-1) - next_start);
>> +		read_unlock(&map_tree->map_tree.lock);
>> +	}
>> +out:
>> +	return ret;
>> +}
>> +
> 
> Nit but I think in this function it would be best to directly return
> true/false based on context rather than having the superfluous goto.

Right, the goto out is not necessary as it's original design to handle 
extent map unlock.
But the final patch uses the current method to free extent map.

Indeed just returning true and false will be better, but goto out also 
seems fine to me.

Thanks,
Qu

> 
>>   int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info)
>>   {
>>   	struct btrfs_root *root = fs_info->chunk_root;
>> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
>> index 6f45fd60d15a..a5897c7a7e86 100644
>> --- a/fs/btrfs/volumes.h
>> +++ b/fs/btrfs/volumes.h
>> @@ -543,4 +543,5 @@ struct list_head *btrfs_get_fs_uuids(void);
>>   void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
>>   void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
>>   
>> +bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
>>   #endif
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount
  2017-07-14  8:19     ` Qu Wenruo
@ 2017-07-18 16:29       ` David Sterba
  0 siblings, 0 replies; 17+ messages in thread
From: David Sterba @ 2017-07-18 16:29 UTC (permalink / raw)
  To: Qu Wenruo
  Cc: Nikolay Borisov, Qu Wenruo, linux-btrfs, dsterba, ahferroin7,
	kilobyte, demfloro, Anand Jain

On Fri, Jul 14, 2017 at 04:19:07PM +0800, Qu Wenruo wrote:
> On 2017年07月14日 15:44, Nikolay Borisov wrote:
> > On 28.06.2017 08:43, Qu Wenruo wrote:
> >> Introduce a new function, btrfs_check_rw_degradable(), to check if all
> >> chunks in btrfs is OK for degraded rw mount.
> >>
> >> It provides the new basis for accurate btrfs mount/remount and even
> >> runtime degraded mount check other than old one-size-fit-all method.
> >>
> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>   fs/btrfs/volumes.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>   fs/btrfs/volumes.h |  1 +
> >>   2 files changed, 59 insertions(+)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index c95f018d4a1e..7a72fbdb8262 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -6817,6 +6817,64 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
> >>   	return -EIO;
> >>   }
> >>   
> >> +/*
> >> + * Check if all chunks in the fs is OK for read-write degraded mount
> >> + *
> >> + * Return true if all chunks meet the minimal RW mount requirement.
> >> + * Return false if any chunk doesn't meet the minimal RW mount requirement.
> >> + */
> >> +bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
> >> +{
> >> +	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
> >> +	struct extent_map *em;
> >> +	u64 next_start = 0;
> >> +	bool ret = true;
> >> +
> >> +	read_lock(&map_tree->map_tree.lock);
> >> +	em = lookup_extent_mapping(&map_tree->map_tree, 0, (u64)-1);
> >> +	read_unlock(&map_tree->map_tree.lock);
> >> +	/* No chunk at all? Return false anyway */
> >> +	if (!em) {
> >> +		ret = false;
> >> +		goto out;
> >> +	}
> >> +	while (em) {
> >> +		struct map_lookup *map;
> >> +		int missing = 0;
> >> +		int max_tolerated;
> >> +		int i;
> >> +
> >> +		map = em->map_lookup;
> >> +		max_tolerated =
> >> +			btrfs_get_num_tolerated_disk_barrier_failures(
> >> +					map->type);
> >> +		for (i = 0; i < map->num_stripes; i++) {
> >> +			struct btrfs_device *dev = map->stripes[i].dev;
> >> +
> >> +			if (!dev || !dev->bdev || dev->missing ||
> >> +			    dev->last_flush_error)
> >> +				missing++;
> >> +		}
> >> +		if (missing > max_tolerated) {
> >> +			ret = false;
> >> +			btrfs_warn(fs_info,
> >> +	"chunk %llu missing %d devices, max tolerance is %d for writeble mount",
> >> +				   em->start, missing, max_tolerated);
> >> +			free_extent_map(em);
> >> +			goto out;
> >> +		}
> >> +		next_start = extent_map_end(em);
> >> +		free_extent_map(em);
> >> +
> >> +		read_lock(&map_tree->map_tree.lock);
> >> +		em = lookup_extent_mapping(&map_tree->map_tree, next_start,
> >> +					   (u64)(-1) - next_start);
> >> +		read_unlock(&map_tree->map_tree.lock);
> >> +	}
> >> +out:
> >> +	return ret;
> >> +}
> >> +
> > 
> > Nit but I think in this function it would be best to directly return
> > true/false based on context rather than having the superfluous goto.
> 
> Right, the goto out is not necessary as it's original design to handle 
> extent map unlock.
> But the final patch uses the current method to free extent map.
> 
> Indeed just returning true and false will be better, but goto out also 
> seems fine to me.

Yeah, it conforms to the pattern of single return point, though this
usually is best in functions with multiple jumps sources and some
non-trivial cleanup code.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-07-18 16:30 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-28  5:43 [PATCH v4 0/6] Chunk level degradable check Qu Wenruo
2017-06-28  5:43 ` [PATCH v4 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount Qu Wenruo
2017-07-14  7:44   ` Nikolay Borisov
2017-07-14  8:19     ` Qu Wenruo
2017-07-18 16:29       ` David Sterba
2017-06-28  5:43 ` [PATCH v4 2/6] btrfs: Do chunk level rw degrade check at mount time Qu Wenruo
2017-06-28  5:43 ` [PATCH v4 3/6] btrfs: Do chunk level degradation check for remount Qu Wenruo
2017-06-28  5:43 ` [PATCH v4 4/6] btrfs: Allow barrier_all_devices to do chunk level device check Qu Wenruo
2017-06-28  5:43 ` [PATCH v4 5/6] btrfs: Cleanup num_tolerated_disk_barrier_failures Qu Wenruo
2017-06-28  5:43 ` [PATCH v4 6/6] btrfs: Enhance missing device kernel message Qu Wenruo
2017-06-28 13:54 ` [PATCH v4 0/6] Chunk level degradable check David Sterba
2017-07-10 18:11 ` Dmitrii Tcvetkov
2017-07-13  0:50   ` David Sterba
2017-07-13  1:09     ` Adam Borowski
2017-07-13 12:12       ` Austin S. Hemmelgarn
2017-07-12 15:24 ` David Sterba
2017-07-12 23:53   ` Qu Wenruo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).