linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag
@ 2023-07-07 15:52 Anand Jain
  2023-07-07 15:52 ` [PATCH 01/11] btrfs-progs: fix duplicate missing device Anand Jain
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

Btrfstune currently lacks support for fixing incomplete or broken
previous runs. Instead, it relies on the kernel to assemble the
correct set of devices based on metadata_uuid, generation, and
the CHANGE_FSID_V2 flag.

However, depending on the kernel to handle this interim change_fsid
state may not be suitable for all situations. For instance, if
there are other fsids sharing the same metadata_uuid but are not
part of the incomplete fsid, the assembly process in the kernel
may not be transparent or controllable to the user. So, this patch
set bring the ability to fix incomplete fsid changes to the userland
using btrfstune.

This patch set fixes bugs as in the individual change log,
introduces new fs_devices members and helper functions, and
the last patch provides the feature discussed.

To ensure proper device assembly and to mitigate any potential
incorrect assembly this feature is behind the --noscan and
--device options. And calls automatically, however I am ok to
add another option like --fix-changing_fsid if you think it
makes sense. However, I don't think it is needed as of now
the kernel does it with all normal options.

This patch set depends on:

 --noscan and --device option:
  [PATCH 00/10] btrfs-progs: check and tune: add device and noscan options


Testing:

The btrfs-progs tests/misc-tests/034-metadata_uuid include a series of 4
disk images that contain incomplete fsid states (changing_fsid).

To evaluate this patch set, the images were tested using a local script,
which is not yet prepared for submission.

The typical testing steps involve:

	xz --uncompress --keep <imgs>

	btrfstune -m --noscan --device=disk1.raw disk2.raw
	btrfstune -m --noscan --device=disk2.raw disk1.raw

	btrfstune -m --noscan --device=disk3.raw disk4.raw
	btrfstune -m --noscan --device=disk4.raw disk3.raw

	and so on.

I plan to migrate misc-test/034-metadata_uuid test cases to assess
btrfstune, as the kernel's support for fixing the changing_fsid state
will be removed.

This patch set has successfully passed the btrfs-progs test cases.

Thanks.

Anand Jain (11):
  btrfs-progs: fix duplicate missing device
  btrfs-progs: call warn() for missing device
  btrfs-progs: track missing device counter
  btrfs-progs: NULL initialize device name for missing
  btrfs-progs: tune: check for missing device
  btrfs-progs: track changing_fsid flag in fs_devices
  btrfs-progs: track num_devices per fs_devices
  btrfs-progs: track total_devs in fs devices
  btrfs-progs: track active metadata_uuid per fs_devices
  btrfs-progs: add helper to reunite devices with same metadata_uuid
  btrfs-progs: tune: fix incomplete fsid_change

 common/device-scan.c        | 42 ++++++++++++++++
 common/device-scan.h        |  1 +
 kernel-shared/volumes.c     | 99 +++++++++++++++++++++++++++++++++++--
 kernel-shared/volumes.h     |  8 +++
 tune/change-metadata-uuid.c |  4 +-
 tune/change-uuid.c          |  4 +-
 tune/main.c                 | 16 +++++-
 7 files changed, 166 insertions(+), 8 deletions(-)

-- 
2.39.3


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/11] btrfs-progs: fix duplicate missing device
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 02/11] btrfs-progs: call warn() for " Anand Jain
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

btrfs_read_sys_array() adds a device with only devid, but without the
device UUID. As a result, any subsequent
btrfs_find_device(..devid, uuid) calls will still fail, resulting in the
addition of another struct btrfs_device to the list, as shown below.

open_ctree_fd()
::
 btrfs_setup_chunk_tree_and_device_map()
  btrfs_read_sys_array()
   read_one_chunk()
    btrfs_find_device()
    fill_missing_device() <--- dev_uuid wasn't updated
    list_add()
  btrfs_read_chunk_tree
    read_one_dev()
       btrfs_find_device(..,devid, dev_uuid,..); <-- fails
       list_add
       fill_device_from_item(leaf, dev_item, device);

This ends up having two btrfs_device for the same missing devid.

Before:

There are two device with devid 1.

$ ./btrfstune -m --noscan  /tdev/disk4.raw
warning, device 1 is missing
[fsid: 95bbc163-671a-4a0a-bd34-03a65e4b338c]
	size:			120
	metadata_uuid:		95bbc163-671a-4a0a-bd34-03a65e4b338c
	fs_devs_addr:		0xdb64e0
	total_rw_bytes:		1048576000
	[[UUID: 703a4cac-bca0-47e4-98f6-55e530800172]]
		sb_flags: 0x0
		sb_incompact_flags: 0x0
		dev_addr:	0xdb69a0
		device:		(null)
		devid:		1
		generation:	0
		total_bytes:	524288000
		bytes_used:	127926272
		type:		0
		io_align:	4096
		io_width:	4096
		sector_size:	4096
	[[UUID: 00000000-0000-0000-0000-000000000000]]
		sb_flags: 0x0
		sb_incompact_flags: 0x0
		dev_addr:	0xdb3060
		device:		(null)
		devid:		1
		generation:	0
		total_bytes:	0
		bytes_used:	0
		type:		0
		io_align:	0
		io_width:	0
		sector_size:	0
	[[UUID: 1db7564f-e53b-46ff-8a33-a8b2d00d86d1]]
		sb_flags: 0x1000000001
		sb_incompact_flags: 0x141
		dev_addr:	0xdb6e90
		device:		/tdev/disk4.raw
		devid:		2
		generation:	6
		total_bytes:	524288000
		bytes_used:	127926272
		type:		0
		io_align:	4096
		io_width:	4096
		sector_size:	4096

Fix this issue by adding the UUID to the missing device created in
fill_missing_device().

After:

$ ./btrfstune -m --noscan  /tdev/disk4.raw
warning, device 1 is missing
[fsid: 95bbc163-671a-4a0a-bd34-03a65e4b338c]
        size:                   120
        metadata_uuid:          95bbc163-671a-4a0a-bd34-03a65e4b338c
        fs_devs_addr:           0x161f380
        total_rw_bytes:         1048576000
        [[UUID: 703a4cac-bca0-47e4-98f6-55e530800172]]
                sb_flags: 0x0
                sb_incompact_flags: 0x0
                dev_addr:       0x161c060
                device:         (null)
                devid:          1
                generation:     0
                total_bytes:    524288000
                bytes_used:     127926272
                type:           0
                io_align:       4096
                io_width:       4096
                sector_size:    4096
        [[UUID: 1db7564f-e53b-46ff-8a33-a8b2d00d86d1]]
                sb_flags: 0x1000000001
                sb_incompact_flags: 0x141
                dev_addr:       0x161fe90
                device:         /tdev/disk4.raw
                devid:          2
                generation:     6
                total_bytes:    524288000
                bytes_used:     127926272
                type:           0
                io_align:       4096
                io_width:       4096
                sector_size:    4096

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 81abda3f7d1c..92282524867d 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -2077,12 +2077,13 @@ int btrfs_chunk_readonly(struct btrfs_fs_info *fs_info, u64 chunk_offset)
 	return readonly;
 }
 
-static struct btrfs_device *fill_missing_device(u64 devid)
+static struct btrfs_device *fill_missing_device(u64 devid, u8 *uuid)
 {
 	struct btrfs_device *device;
 
 	device = kzalloc(sizeof(*device), GFP_NOFS);
 	device->devid = devid;
+	memcpy(device->uuid, uuid, BTRFS_UUID_SIZE);
 	device->fd = -1;
 	return device;
 }
@@ -2150,7 +2151,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
 		map->stripes[i].dev = btrfs_find_device(fs_info, devid, uuid,
 							NULL);
 		if (!map->stripes[i].dev) {
-			map->stripes[i].dev = fill_missing_device(devid);
+			map->stripes[i].dev = fill_missing_device(devid, uuid);
 			printf("warning, device %llu is missing\n",
 			       (unsigned long long)devid);
 			list_add(&map->stripes[i].dev->dev_list,
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 02/11] btrfs-progs: call warn() for missing device
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
  2023-07-07 15:52 ` [PATCH 01/11] btrfs-progs: fix duplicate missing device Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-13 18:48   ` David Sterba
  2023-07-07 15:52 ` [PATCH 03/11] btrfs-progs: track missing device counter Anand Jain
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

When we add a struct btrfs_device for the missing device, announce a
warning indicating that a device is missing.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 92282524867d..d20cb3177a34 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -2252,6 +2252,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 
 	device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
 	if (!device) {
+		warning("device id %llu is missing", devid);
 		device = kzalloc(sizeof(*device), GFP_NOFS);
 		if (!device)
 			return -ENOMEM;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 03/11] btrfs-progs: track missing device counter
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
  2023-07-07 15:52 ` [PATCH 01/11] btrfs-progs: fix duplicate missing device Anand Jain
  2023-07-07 15:52 ` [PATCH 02/11] btrfs-progs: call warn() for " Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 04/11] btrfs-progs: NULL initialize device name for missing Anand Jain
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

Maintain the btrfs_fs_devices::missing counter to track the number of
missing devices.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 2 ++
 kernel-shared/volumes.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index d20cb3177a34..671396dba689 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -2156,6 +2156,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
 			       (unsigned long long)devid);
 			list_add(&map->stripes[i].dev->dev_list,
 				 &fs_info->fs_devices->devices);
+			fs_info->fs_devices->missing++;
 		}
 
 	}
@@ -2259,6 +2260,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 		device->fd = -1;
 		list_add(&device->dev_list,
 			 &fs_info->fs_devices->devices);
+		fs_info->fs_devices->missing++;
 	}
 
 	fill_device_from_item(leaf, dev_item, device);
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index ab5ac40269bb..d2915681e6de 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -90,6 +90,7 @@ struct btrfs_fs_devices {
 
 	u64 total_rw_bytes;
 
+	int missing;
 	int latest_bdev;
 	int lowest_bdev;
 	struct list_head devices;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 04/11] btrfs-progs: NULL initialize device name for missing
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (2 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 03/11] btrfs-progs: track missing device counter Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 05/11] btrfs-progs: tune: check for missing device Anand Jain
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

When we add an entry for the  missing device it has no device name,
set the btrfs_device::name parameter to NULL.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 671396dba689..4a8c559d4b20 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -2082,6 +2082,7 @@ static struct btrfs_device *fill_missing_device(u64 devid, u8 *uuid)
 	struct btrfs_device *device;
 
 	device = kzalloc(sizeof(*device), GFP_NOFS);
+	device->name = NULL;
 	device->devid = devid;
 	memcpy(device->uuid, uuid, BTRFS_UUID_SIZE);
 	device->fd = -1;
@@ -2257,6 +2258,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 		device = kzalloc(sizeof(*device), GFP_NOFS);
 		if (!device)
 			return -ENOMEM;
+		device->name = NULL;
 		device->fd = -1;
 		list_add(&device->dev_list,
 			 &fs_info->fs_devices->devices);
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 05/11] btrfs-progs: tune: check for missing device
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (3 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 04/11] btrfs-progs: NULL initialize device name for missing Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-13 18:49   ` David Sterba
  2023-07-07 15:52 ` [PATCH 06/11] btrfs-progs: track changing_fsid flag in fs_devices Anand Jain
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

If btrfstune is executed on a filesystem that contains a missing device,
the command will now fail.

It is ok fail when any of the options supported by btrfstune are used.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 tune/main.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tune/main.c b/tune/main.c
index 570e3908ef8a..dc72944a2b67 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -330,6 +330,13 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 		goto free_out;
 	}
 
+	if (root->fs_info->fs_devices->missing) {
+		error("missing %d device(s), failing the command",
+		       root->fs_info->fs_devices->missing);
+		ret = 1;
+		goto free_out;
+	}
+
  	if (to_bg_tree) {
 		if (to_extent_tree) {
 			error("option --convert-to-block-group-tree conflicts with --convert-from-block-group-tree");
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 06/11] btrfs-progs: track changing_fsid flag in fs_devices
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (4 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 05/11] btrfs-progs: tune: check for missing device Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 07/11] btrfs-progs: track num_devices per fs_devices Anand Jain
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

To prepare for reuniting separated devices due to an incomplete
fsid change task, consolidate and monitor the per device's
changing_fsid flag in the struct btrfs_fs_devices.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 9 +++++++++
 kernel-shared/volumes.h | 2 ++
 tune/change-uuid.c      | 4 +---
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 4a8c559d4b20..51b3a16a39af 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -342,6 +342,9 @@ static int device_list_add(const char *path,
 	u64 devid = btrfs_stack_device_id(&disk_super->dev_item);
 	bool metadata_uuid = (btrfs_super_incompat_flags(disk_super) &
 		BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+	bool changing_fsid = (btrfs_super_flags(disk_super) &
+			      (BTRFS_SUPER_FLAG_CHANGING_FSID |
+			       BTRFS_SUPER_FLAG_CHANGING_FSID_V2));
 
 	if (metadata_uuid)
 		fs_devices = find_fsid(disk_super->fsid,
@@ -424,6 +427,12 @@ static int device_list_add(const char *path,
                 device->name = name;
         }
 
+	/*
+	 * If changing_fsid the fs_devices will still hold the status from
+	 * the other devices.
+	 */
+	if (changing_fsid)
+		fs_devices->changing_fsid = true;
 
 	if (found_transid > fs_devices->latest_trans) {
 		fs_devices->latest_devid = devid;
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index d2915681e6de..9763c677a7cc 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -99,6 +99,8 @@ struct btrfs_fs_devices {
 	struct btrfs_fs_devices *seed;
 
 	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
+
+	bool changing_fsid;
 };
 
 struct btrfs_bio_stripe {
diff --git a/tune/change-uuid.c b/tune/change-uuid.c
index cbfc8634168b..30cfb145459f 100644
--- a/tune/change-uuid.c
+++ b/tune/change-uuid.c
@@ -214,10 +214,8 @@ int check_unfinished_fsid_change(struct btrfs_fs_info *fs_info,
 				 uuid_t fsid_ret, uuid_t chunk_id_ret)
 {
 	struct btrfs_root *tree_root = fs_info->tree_root;
-	u64 flags = btrfs_super_flags(fs_info->super_copy);
 
-	if (flags & (BTRFS_SUPER_FLAG_CHANGING_FSID |
-		     BTRFS_SUPER_FLAG_CHANGING_FSID_V2)) {
+	if (fs_info->fs_devices->changing_fsid) {
 		memcpy(fsid_ret, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 		read_extent_buffer(tree_root->node, chunk_id_ret,
 				btrfs_header_chunk_tree_uuid(tree_root->node),
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 07/11] btrfs-progs: track num_devices per fs_devices
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (5 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 06/11] btrfs-progs: track changing_fsid flag in fs_devices Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 08/11] btrfs-progs: track total_devs in fs devices Anand Jain
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

Similar to the kernel we need to track the number of devices scanned
per fs_devices. A preparation patch to fix incomplete fsid changing.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 1 +
 kernel-shared/volumes.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 51b3a16a39af..1e42e4fe105e 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -405,6 +405,7 @@ static int device_list_add(const char *path,
 			btrfs_stack_device_bytes_used(&disk_super->dev_item);
 		list_add(&device->dev_list, &fs_devices->devices);
 		device->fs_devices = fs_devices;
+		fs_devices->num_devices++;
 	} else if (!device->name || strcmp(device->name, path)) {
 		char *name;
 
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index 9763c677a7cc..93e02a901d31 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -90,6 +90,7 @@ struct btrfs_fs_devices {
 
 	u64 total_rw_bytes;
 
+	int num_devices;
 	int missing;
 	int latest_bdev;
 	int lowest_bdev;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/11] btrfs-progs: track total_devs in fs devices
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (6 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 07/11] btrfs-progs: track num_devices per fs_devices Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-13 18:58   ` David Sterba
  2023-07-07 15:52 ` [PATCH 09/11] btrfs-progs: track active metadata_uuid per fs_devices Anand Jain
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

Similar to the kernel, introduce the btrfs_fs_devices::total_devs
attribute to know the overall count of devices that are expected
to be present per filesystem.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 4 +++-
 kernel-shared/volumes.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index 1e42e4fe105e..fd5890d033c8 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -367,7 +367,8 @@ static int device_list_add(const char *path,
 			       BTRFS_FSID_SIZE);
 
 		fs_devices->latest_devid = devid;
-		fs_devices->latest_trans = found_transid;
+		/* Below we would set this to found_transid */
+		fs_devices->latest_trans = 0;
 		fs_devices->lowest_devid = (u64)-1;
 		fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
 		device = NULL;
@@ -438,6 +439,7 @@ static int device_list_add(const char *path,
 	if (found_transid > fs_devices->latest_trans) {
 		fs_devices->latest_devid = devid;
 		fs_devices->latest_trans = found_transid;
+		fs_devices->total_devices = device->total_devs;
 	}
 	if (fs_devices->lowest_devid > devid) {
 		fs_devices->lowest_devid = devid;
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index 93e02a901d31..09964f96ca37 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -90,6 +90,7 @@ struct btrfs_fs_devices {
 
 	u64 total_rw_bytes;
 
+	int total_devices;
 	int num_devices;
 	int missing;
 	int latest_bdev;
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/11] btrfs-progs: track active metadata_uuid per fs_devices
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (7 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 08/11] btrfs-progs: track total_devs in fs devices Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 10/11] btrfs-progs: add helper to reunite devices with same metadata_uuid Anand Jain
  2023-07-07 15:52 ` [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change Anand Jain
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

When the fsid does not match the metadata_uuid, the METADATA_UUID flag is
set in the superblock.

Changing the fsid using the btrfstune -U|-u option is not possible on a
filesystem with the METADATA_UUID flag set. But we are checking the
METADATA_UUID only from the super_copy, and not from the other scanned
device.

To fix this bug, track the metadata_uuid at the fs_devices level instead
of checking it only on the specified device in the argument, and use it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 kernel-shared/volumes.c | 2 ++
 kernel-shared/volumes.h | 1 +
 tune/main.c             | 3 ++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index fd5890d033c8..ac9e711a994f 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -435,6 +435,8 @@ static int device_list_add(const char *path,
 	 */
 	if (changing_fsid)
 		fs_devices->changing_fsid = true;
+	if (metadata_uuid)
+		fs_devices->active_metadata_uuid = true;
 
 	if (found_transid > fs_devices->latest_trans) {
 		fs_devices->latest_devid = devid;
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index 09964f96ca37..13d08cc7eea5 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -103,6 +103,7 @@ struct btrfs_fs_devices {
 	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
 
 	bool changing_fsid;
+	bool active_metadata_uuid;
 };
 
 struct btrfs_bio_stripe {
diff --git a/tune/main.c b/tune/main.c
index dc72944a2b67..3ca9c5716573 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -444,7 +444,8 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 	}
 
 	if (random_fsid || (new_fsid_str && !change_metadata_uuid)) {
-		if (btrfs_fs_incompat(root->fs_info, METADATA_UUID)) {
+		if (btrfs_fs_incompat(root->fs_info, METADATA_UUID) ||
+		    root->fs_info->fs_devices->active_metadata_uuid) {
 			error(
 		"Cannot rewrite fsid while METADATA_UUID flag is active. \n"
 		"Ensure fsid and metadata_uuid match before retrying.");
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/11] btrfs-progs: add helper to reunite devices with same metadata_uuid
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (8 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 09/11] btrfs-progs: track active metadata_uuid per fs_devices Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-07 15:52 ` [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change Anand Jain
  10 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

The btrfstune -m|M option allows the user to change the fsid of the
filesystem. However, in a multi-device filesystem, if the fsid is not
successfully changed simultaneously on all devices, problems may arise.
In such cases, the user cannot use btrfstune again to resolve the
incomplete fsid change.

This patch provides helper function to find other devices having the
the same metadata_uuid and have them reunited with the given scanned
device so that btrfstune can fix the fsid.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 common/device-scan.c    | 42 ++++++++++++++++++++++++
 common/device-scan.h    |  1 +
 kernel-shared/volumes.c | 73 +++++++++++++++++++++++++++++++++++++++++
 kernel-shared/volumes.h |  2 ++
 4 files changed, 118 insertions(+)

diff --git a/common/device-scan.c b/common/device-scan.c
index 00ce15244a09..512754c01adb 100644
--- a/common/device-scan.c
+++ b/common/device-scan.c
@@ -541,6 +541,48 @@ int btrfs_scan_argv_devices(int dev_optind, int dev_argc, char **dev_argv)
 	return 0;
 }
 
+int scan_reunite_fs_devices(char *path)
+{
+	int ret;
+	int fd;
+	u64 total_devs;
+	struct btrfs_fs_devices *fs_devices;
+
+	ret = check_arg_type(path);
+	if (ret != BTRFS_ARG_BLKDEV && ret != BTRFS_ARG_REG) {
+		if (ret < 0) {
+			errno = -ret;
+			error("invalid argument %s: %m", path);
+		} else {
+			error("not a block device or regular file: %s", path);
+		}
+	}
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		error("cannot open %s: %m", path);
+		return -errno;
+	}
+	ret = btrfs_scan_one_device(fd, path, &fs_devices, &total_devs,
+				    BTRFS_SUPER_INFO_OFFSET, SBREAD_DEFAULT);
+	close(fd);
+	if (ret < 0) {
+		errno = -ret;
+		error("device scan of %s failed: %m", path);
+		return ret;
+	}
+
+	ret = 0;
+	/* Check for missing device */
+	if (fs_devices->num_devices != total_devs)
+		ret = reunite_fs_devices(fs_devices);
+
+	if (!ret)
+		fs_devices->sanitized = true;
+
+	return ret;
+}
+
 bool array_append(char **dest, char *src, int *cnt)
 {
 	char *this_tok = strtok(src, ",");
diff --git a/common/device-scan.h b/common/device-scan.h
index 7a6874298051..a1193e7f7020 100644
--- a/common/device-scan.h
+++ b/common/device-scan.h
@@ -61,5 +61,6 @@ void free_seen_fsid(struct seen_fsid *seen_fsid_hash[]);
 int test_uuid_unique(const char *uuid_str);
 bool array_append(char **dest, char *src, int *cnt);
 void free_array(char **prt, int cnt);
+int scan_reunite_fs_devices(char *path);
 
 #endif
diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index ac9e711a994f..642f7084cf63 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -2958,3 +2958,76 @@ int btrfs_fix_device_and_super_size(struct btrfs_fs_info *fs_info)
 	}
 	return ret;
 }
+
+static int find_unifiable(struct btrfs_fs_devices *fsinfo_fs_devices,
+			  struct btrfs_fs_devices **ret_fs_devices)
+{
+	u8 *orig_uuid = fsinfo_fs_devices->metadata_uuid;
+	u8 *orig_fsid = fsinfo_fs_devices->fsid;
+	struct btrfs_fs_devices *fs_devices = NULL;
+	int ret = 0;
+
+	list_for_each_entry(fs_devices, &fs_uuids, list) {
+		/* skip the same fs_info fsid */
+		if (!memcmp(fs_devices->fsid, orig_fsid, BTRFS_FSID_SIZE))
+			continue;
+
+		/* skip the metadata_uuid which isn't fs_info metadata_uuid */
+		if (memcmp(fs_devices->metadata_uuid, orig_uuid, BTRFS_FSID_SIZE))
+			continue;
+
+		ret++;
+		*ret_fs_devices = fs_devices;
+	}
+
+	return ret;
+}
+
+int reunite_fs_devices(struct btrfs_fs_devices *fs_devices)
+{
+	struct btrfs_fs_devices *other_fs_devices = NULL;
+	struct btrfs_device *tmp_device;
+	struct btrfs_device *device;
+	int other_fsid_cnt = 0;
+	int missing_devs;
+
+	missing_devs = fs_devices->total_devices - fs_devices->num_devices;
+	other_fsid_cnt = find_unifiable(fs_devices, &other_fs_devices);
+
+	if (other_fsid_cnt == 0) {
+		error("No missing device(s) found");
+		return -EINVAL;
+	} else if (other_fsid_cnt > 1) {
+		error("Found more than one fsid with the same metadata_uuid");
+		error("Try use --device and --noscan options");
+		return -EINVAL;
+	}
+
+	/* Missing count in the fs_info should match with the scanned devices */
+	if (missing_devs != other_fs_devices->num_devices) {
+		error("Missing device(s) found %d expected %d",
+		      other_fs_devices->num_devices, missing_devs);
+		return -EINVAL;
+	}
+
+	list_for_each_entry_safe(device, tmp_device, &other_fs_devices->devices,
+				 dev_list) {
+		/* We have found the missing device, bring it in */
+		list_move(&device->dev_list, &fs_devices->devices);
+		fs_devices->num_devices++;
+		missing_devs--;
+	}
+
+	if (!list_empty(&other_fs_devices->devices) || missing_devs != 0 ||
+	    fs_devices->total_devices != fs_devices->num_devices) {
+		error("Found more or fewer missing devices");
+		return -EINVAL;
+	}
+
+	if (other_fs_devices->changing_fsid)
+		fs_devices->changing_fsid = true;
+	if (other_fs_devices->active_metadata_uuid)
+		fs_devices->active_metadata_uuid = true;
+
+	return 0;
+}
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index 13d08cc7eea5..9f755dfa9015 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -104,6 +104,7 @@ struct btrfs_fs_devices {
 
 	bool changing_fsid;
 	bool active_metadata_uuid;
+	bool sanitized;
 };
 
 struct btrfs_bio_stripe {
@@ -318,5 +319,6 @@ int btrfs_bg_type_to_nparity(u64 flags);
 int btrfs_bg_type_to_sub_stripes(u64 flags);
 u64 btrfs_bg_flags_for_device_num(int number);
 bool btrfs_bg_type_is_stripey(u64 flags);
+int reunite_fs_devices(struct btrfs_fs_devices *fs_devices);
 
 #endif
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change
  2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
                   ` (9 preceding siblings ...)
  2023-07-07 15:52 ` [PATCH 10/11] btrfs-progs: add helper to reunite devices with same metadata_uuid Anand Jain
@ 2023-07-07 15:52 ` Anand Jain
  2023-07-13 18:57   ` David Sterba
  10 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-07 15:52 UTC (permalink / raw)
  To: linux-btrfs

An incomplete fsid state occurs when devices have two or more fsids
associated with the same metadata_uuid. As it can be confusing to
determine which devices should be assembled together, the fix only
works when both the --noscan and --device options are used. This means
the user will have to manually select and assemble the devices with the
same metadata_uuids.

With this fix, the fsid can continue to be changed in the user-spce using
either a new fsid or a user-provided fsid.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 tune/change-metadata-uuid.c | 4 +++-
 tune/main.c                 | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tune/change-metadata-uuid.c b/tune/change-metadata-uuid.c
index 295308f299aa..c291324ca4c4 100644
--- a/tune/change-metadata-uuid.c
+++ b/tune/change-metadata-uuid.c
@@ -21,6 +21,7 @@
 #include "kernel-shared/uapi/btrfs.h"
 #include "kernel-shared/ctree.h"
 #include "kernel-shared/transaction.h"
+#include "kernel-shared/volumes.h"
 #include "common/messages.h"
 #include "tune/tune.h"
 
@@ -45,7 +46,8 @@ int set_metadata_uuid(struct btrfs_root *root, const char *uuid_string)
 		return 1;
 	}
 
-	if (check_unfinished_fsid_change(root->fs_info, unused1, unused2)) {
+	if (!root->fs_info->fs_devices->sanitized &&
+	    check_unfinished_fsid_change(root->fs_info, unused1, unused2)) {
 		error("UUID rewrite in progress, cannot change metadata_uuid");
 		return 1;
 	}
diff --git a/tune/main.c b/tune/main.c
index 3ca9c5716573..f6d56af80dbf 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -323,6 +323,12 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
 			goto free_out;
 	}
 
+	if (argv_devices != NULL && noscan) {
+		ret = scan_reunite_fs_devices(device);
+		if (ret)
+			goto free_out;
+	}
+
 	root = open_ctree_fd(fd, device, 0, ctree_flags);
 	if (!root) {
 		error("open ctree failed");
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 02/11] btrfs-progs: call warn() for missing device
  2023-07-07 15:52 ` [PATCH 02/11] btrfs-progs: call warn() for " Anand Jain
@ 2023-07-13 18:48   ` David Sterba
  2023-07-17 18:22     ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: David Sterba @ 2023-07-13 18:48 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Fri, Jul 07, 2023 at 11:52:32PM +0800, Anand Jain wrote:
> When we add a struct btrfs_device for the missing device, announce a
> warning indicating that a device is missing.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  kernel-shared/volumes.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
> index 92282524867d..d20cb3177a34 100644
> --- a/kernel-shared/volumes.c
> +++ b/kernel-shared/volumes.c
> @@ -2252,6 +2252,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
>  
>  	device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
>  	if (!device) {
> +		warning("device id %llu is missing", devid);

read_one_dev() is a low level helper that should not print any messages,
the calling context is not known. If you really want to print such
message then please move it to the appropriate caller.

>  		device = kzalloc(sizeof(*device), GFP_NOFS);
>  		if (!device)
>  			return -ENOMEM;
> -- 
> 2.39.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/11] btrfs-progs: tune: check for missing device
  2023-07-07 15:52 ` [PATCH 05/11] btrfs-progs: tune: check for missing device Anand Jain
@ 2023-07-13 18:49   ` David Sterba
  2023-07-17 19:02     ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: David Sterba @ 2023-07-13 18:49 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Fri, Jul 07, 2023 at 11:52:35PM +0800, Anand Jain wrote:
> If btrfstune is executed on a filesystem that contains a missing device,
> the command will now fail.
> 
> It is ok fail when any of the options supported by btrfstune are used.

Why? btrfstune has options that only change the superblock, like
feature enable/disable -e, -r, -x and -S. A missing device could be
problem for others but it should be checked only when it's relevant.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change
  2023-07-07 15:52 ` [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change Anand Jain
@ 2023-07-13 18:57   ` David Sterba
  2023-07-17 19:25     ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: David Sterba @ 2023-07-13 18:57 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Fri, Jul 07, 2023 at 11:52:41PM +0800, Anand Jain wrote:
> An incomplete fsid state occurs when devices have two or more fsids
> associated with the same metadata_uuid. As it can be confusing to
> determine which devices should be assembled together, the fix only
> works when both the --noscan and --device options are used. This means
> the user will have to manually select and assemble the devices with the
> same metadata_uuids.

This is not a good usability. If the user can determine which devices
should be in the --noscan and --device options why can't we do that
programaticaly? If the found devices can be reliably identified as part
of the filesystem (and there are e.g. no ambiguities due to duplicated
devices) then the command should work it out by itself.

The btrfstune commands are supposed to be restartable, namely the uuid
conversions, basically with the same command that was used to begin the
conversion. If there's a problem that needs user interactions then there
should be specific instructions what to check and how to resolve it
manually.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/11] btrfs-progs: track total_devs in fs devices
  2023-07-07 15:52 ` [PATCH 08/11] btrfs-progs: track total_devs in fs devices Anand Jain
@ 2023-07-13 18:58   ` David Sterba
  2023-07-17 19:28     ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: David Sterba @ 2023-07-13 18:58 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs

On Fri, Jul 07, 2023 at 11:52:38PM +0800, Anand Jain wrote:
> Similar to the kernel, introduce the btrfs_fs_devices::total_devs
> attribute to know the overall count of devices that are expected
> to be present per filesystem.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  kernel-shared/volumes.c | 4 +++-
>  kernel-shared/volumes.h | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
> index 1e42e4fe105e..fd5890d033c8 100644
> --- a/kernel-shared/volumes.c
> +++ b/kernel-shared/volumes.c
> @@ -367,7 +367,8 @@ static int device_list_add(const char *path,
>  			       BTRFS_FSID_SIZE);
>  
>  		fs_devices->latest_devid = devid;
> -		fs_devices->latest_trans = found_transid;
> +		/* Below we would set this to found_transid */
> +		fs_devices->latest_trans = 0;
>  		fs_devices->lowest_devid = (u64)-1;
>  		fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
>  		device = NULL;
> @@ -438,6 +439,7 @@ static int device_list_add(const char *path,
>  	if (found_transid > fs_devices->latest_trans) {
>  		fs_devices->latest_devid = devid;
>  		fs_devices->latest_trans = found_transid;
> +		fs_devices->total_devices = device->total_devs;
>  	}
>  	if (fs_devices->lowest_devid > devid) {
>  		fs_devices->lowest_devid = devid;
> diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
> index 93e02a901d31..09964f96ca37 100644
> --- a/kernel-shared/volumes.h
> +++ b/kernel-shared/volumes.h
> @@ -90,6 +90,7 @@ struct btrfs_fs_devices {
>  
>  	u64 total_rw_bytes;
>  
> +	int total_devices;
>  	int num_devices;
>  	int missing;

All the device counters could be added separately, that's for the
kernel/userspace parity.

>  	int latest_bdev;
> -- 
> 2.39.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 02/11] btrfs-progs: call warn() for missing device
  2023-07-13 18:48   ` David Sterba
@ 2023-07-17 18:22     ` Anand Jain
  2023-07-19 14:34       ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-17 18:22 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs



On 14/07/2023 02:48, David Sterba wrote:
> On Fri, Jul 07, 2023 at 11:52:32PM +0800, Anand Jain wrote:
>> When we add a struct btrfs_device for the missing device, announce a
>> warning indicating that a device is missing.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   kernel-shared/volumes.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
>> index 92282524867d..d20cb3177a34 100644
>> --- a/kernel-shared/volumes.c
>> +++ b/kernel-shared/volumes.c
>> @@ -2252,6 +2252,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
>>   
>>   	device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
>>   	if (!device) {
>> +		warning("device id %llu is missing", devid);
> 
> read_one_dev() is a low level helper that should not print any messages,
> the calling context is not known. If you really want to print such
> message then please move it to the appropriate caller.

The motivation for adding the warning comes from the following test:
Despite the missing device, btrfstune -m was still able to change
the fsid.

     $ mkfs.btrfs -f -dsingle -msingle /dev/sdb1 /dev/sdb2
     $ wipefs -a /dev/sdb2
     $ btrfstune -m /dev/sdb1

This series fixes the bug; that is, it makes btrfstune fail in this
test case and this particular patch reports the missing device.

Which is similar to the behavior in the kernel code.

read_one_dev() function is only used by btrfs_read_chunk_tree(),
which is not aware of the missing device through the returned
error code.

Perhaps we can make an exception here?

Thanks.
> 
>>   		device = kzalloc(sizeof(*device), GFP_NOFS);
>>   		if (!device)
>>   			return -ENOMEM;
>> -- 
>> 2.39.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/11] btrfs-progs: tune: check for missing device
  2023-07-13 18:49   ` David Sterba
@ 2023-07-17 19:02     ` Anand Jain
  0 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-17 19:02 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs



On 14/07/2023 02:49, David Sterba wrote:
> On Fri, Jul 07, 2023 at 11:52:35PM +0800, Anand Jain wrote:
>> If btrfstune is executed on a filesystem that contains a missing device,
>> the command will now fail.
>>
>> It is ok fail when any of the options supported by btrfstune are used.
> 
> Why? btrfstune has options that only change the superblock, like
> feature enable/disable -e, -r, -x and -S. A missing device could be
> problem for others but it should be checked only when it's relevant.


If the missing device generation number is greater than the one
provided here, the btrfstune may not be effective in the kernel
if devices are assembled. So just block those scenario.

Or we can be keep it open and block it based on the use case feedback.

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change
  2023-07-13 18:57   ` David Sterba
@ 2023-07-17 19:25     ` Anand Jain
  2023-07-20 14:33       ` Anand Jain
  0 siblings, 1 reply; 22+ messages in thread
From: Anand Jain @ 2023-07-17 19:25 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs



On 14/07/2023 02:57, David Sterba wrote:
> On Fri, Jul 07, 2023 at 11:52:41PM +0800, Anand Jain wrote:
>> An incomplete fsid state occurs when devices have two or more fsids
>> associated with the same metadata_uuid. As it can be confusing to
>> determine which devices should be assembled together, the fix only
>> works when both the --noscan and --device options are used. This means
>> the user will have to manually select and assemble the devices with the
>> same metadata_uuids.
> 
> This is not a good usability. If the user can determine which devices
> should be in the --noscan and --device options why can't we do that
> programaticaly?

Technically, there can be many fsids with the same metadata_uuid.
I'm afraid that in all split-brain scenarios, we may not be able
to successfully identify device partners, and assembling the
filesystem with the wrong device could be disastrous.

> If the found devices can be reliably identified as part
> of the filesystem (and there are e.g. no ambiguities due to duplicated
> devices) then the command should work it out by itself.

I expect btrfstune -m to be used more for modifying btrfs file images
rather than block devices. Users can simply copy the image and change
the fsid, which the system won't be aware of unless the user informs
us.

> The btrfstune commands are supposed to be restartable, namely the uuid
> conversions, basically with the same command that was used to begin the
> conversion. If there's a problem that needs user interactions then there
> should be specific instructions what to check and how to resolve it
> manually.

I was trying to avoid yet another cmd option. If the command is
restarted and the superblock has the CHANGING_FSID flag set, then
the command will fail. In this state, we can add specific
instructions on how to continue (i.e., --noscan and --device options
are mandatory). Or any suggestion?

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/11] btrfs-progs: track total_devs in fs devices
  2023-07-13 18:58   ` David Sterba
@ 2023-07-17 19:28     ` Anand Jain
  0 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-17 19:28 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs



On 14/07/2023 02:58, David Sterba wrote:
> On Fri, Jul 07, 2023 at 11:52:38PM +0800, Anand Jain wrote:
>> Similar to the kernel, introduce the btrfs_fs_devices::total_devs
>> attribute to know the overall count of devices that are expected
>> to be present per filesystem.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   kernel-shared/volumes.c | 4 +++-
>>   kernel-shared/volumes.h | 1 +
>>   2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
>> index 1e42e4fe105e..fd5890d033c8 100644
>> --- a/kernel-shared/volumes.c
>> +++ b/kernel-shared/volumes.c
>> @@ -367,7 +367,8 @@ static int device_list_add(const char *path,
>>   			       BTRFS_FSID_SIZE);
>>   
>>   		fs_devices->latest_devid = devid;
>> -		fs_devices->latest_trans = found_transid;
>> +		/* Below we would set this to found_transid */
>> +		fs_devices->latest_trans = 0;
>>   		fs_devices->lowest_devid = (u64)-1;
>>   		fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
>>   		device = NULL;
>> @@ -438,6 +439,7 @@ static int device_list_add(const char *path,
>>   	if (found_transid > fs_devices->latest_trans) {
>>   		fs_devices->latest_devid = devid;
>>   		fs_devices->latest_trans = found_transid;
>> +		fs_devices->total_devices = device->total_devs;
>>   	}
>>   	if (fs_devices->lowest_devid > devid) {
>>   		fs_devices->lowest_devid = devid;
>> diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
>> index 93e02a901d31..09964f96ca37 100644
>> --- a/kernel-shared/volumes.h
>> +++ b/kernel-shared/volumes.h
>> @@ -90,6 +90,7 @@ struct btrfs_fs_devices {
>>   
>>   	u64 total_rw_bytes;
>>   
>> +	int total_devices;
>>   	int num_devices;
>>   	int missing;
> 
> All the device counters could be added separately, that's for the
> kernel/userspace parity.

Hm. I didn't get this. Could you please elaborate on this?

Thanks.

> 
>>   	int latest_bdev;
>> -- 
>> 2.39.3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 02/11] btrfs-progs: call warn() for missing device
  2023-07-17 18:22     ` Anand Jain
@ 2023-07-19 14:34       ` Anand Jain
  0 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-19 14:34 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs

On 18/07/2023 02:22, Anand Jain wrote:
> 
> 
> On 14/07/2023 02:48, David Sterba wrote:
>> On Fri, Jul 07, 2023 at 11:52:32PM +0800, Anand Jain wrote:
>>> When we add a struct btrfs_device for the missing device, announce a
>>> warning indicating that a device is missing.
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>   kernel-shared/volumes.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
>>> index 92282524867d..d20cb3177a34 100644
>>> --- a/kernel-shared/volumes.c
>>> +++ b/kernel-shared/volumes.c
>>> @@ -2252,6 +2252,7 @@ static int read_one_dev(struct btrfs_fs_info 
>>> *fs_info,
>>>       device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
>>>       if (!device) {
>>> +        warning("device id %llu is missing", devid);
>>
>> read_one_dev() is a low level helper that should not print any messages,
>> the calling context is not known. If you really want to print such
>> message then please move it to the appropriate caller.

Furthermore, there is only one caller from the top function:
open_ctree context.

__open_ctree_fd()
  btrfs_setup_chunk_tree_and_device_map()
   btrfs_read_chunk_tree()
    read_one_dev()

Do you remember any failing test cases or potential concerns
if we print the identified missing device information here?


Thanks.

> 
> The motivation for adding the warning comes from the following test:
> Despite the missing device, btrfstune -m was still able to change
> the fsid.
> 
>      $ mkfs.btrfs -f -dsingle -msingle /dev/sdb1 /dev/sdb2
>      $ wipefs -a /dev/sdb2
>      $ btrfstune -m /dev/sdb1
> 
> This series fixes the bug; that is, it makes btrfstune fail in this
> test case and this particular patch reports the missing device.
> 
> Which is similar to the behavior in the kernel code.
> 
> read_one_dev() function is only used by btrfs_read_chunk_tree(),
> which is not aware of the missing device through the returned
> error code.
> 
> Perhaps we can make an exception here?
> 



> Thanks.
>>
>>>           device = kzalloc(sizeof(*device), GFP_NOFS);
>>>           if (!device)
>>>               return -ENOMEM;
>>> -- 
>>> 2.39.3


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change
  2023-07-17 19:25     ` Anand Jain
@ 2023-07-20 14:33       ` Anand Jain
  0 siblings, 0 replies; 22+ messages in thread
From: Anand Jain @ 2023-07-20 14:33 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs

On 18/07/2023 03:25, Anand Jain wrote:
> 
> 
> On 14/07/2023 02:57, David Sterba wrote:
>> On Fri, Jul 07, 2023 at 11:52:41PM +0800, Anand Jain wrote:
>>> An incomplete fsid state occurs when devices have two or more fsids
>>> associated with the same metadata_uuid. As it can be confusing to
>>> determine which devices should be assembled together, the fix only
>>> works when both the --noscan and --device options are used. This means
>>> the user will have to manually select and assemble the devices with the
>>> same metadata_uuids.
>>
>> This is not a good usability. If the user can determine which devices
>> should be in the --noscan and --device options why can't we do that
>> programaticaly?
> 
> Technically, there can be many fsids with the same metadata_uuid.
> I'm afraid that in all split-brain scenarios, we may not be able
> to successfully identify device partners, and assembling the
> filesystem with the wrong device could be disastrous.
> 
>> If the found devices can be reliably identified as part
>> of the filesystem (and there are e.g. no ambiguities due to duplicated
>> devices) then the command should work it out by itself.
> 
> I expect btrfstune -m to be used more for modifying btrfs file images
> rather than block devices. Users can simply copy the image and change
> the fsid, which the system won't be aware of unless the user informs
> us.
> 
>> The btrfstune commands are supposed to be restartable, namely the uuid
>> conversions, basically with the same command that was used to begin the
>> conversion. If there's a problem that needs user interactions then there
>> should be specific instructions what to check and how to resolve it
>> manually.
> 
> I was trying to avoid yet another cmd option. If the command is
> restarted and the superblock has the CHANGING_FSID flag set, then
> the command will fail. In this state, we can add specific
> instructions on how to continue (i.e., --noscan and --device options
> are mandatory). Or any suggestion?
> 

  OR I think it would be better to combine the functionality of both
  --noscan and --device into one option, such as --reunite.
  Better? What do  you think?

> Thanks.


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-07-20 14:34 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-07 15:52 [PATCH 00/11] btrfs-progs: fix bugs and CHANGING_FSID_V2 flag Anand Jain
2023-07-07 15:52 ` [PATCH 01/11] btrfs-progs: fix duplicate missing device Anand Jain
2023-07-07 15:52 ` [PATCH 02/11] btrfs-progs: call warn() for " Anand Jain
2023-07-13 18:48   ` David Sterba
2023-07-17 18:22     ` Anand Jain
2023-07-19 14:34       ` Anand Jain
2023-07-07 15:52 ` [PATCH 03/11] btrfs-progs: track missing device counter Anand Jain
2023-07-07 15:52 ` [PATCH 04/11] btrfs-progs: NULL initialize device name for missing Anand Jain
2023-07-07 15:52 ` [PATCH 05/11] btrfs-progs: tune: check for missing device Anand Jain
2023-07-13 18:49   ` David Sterba
2023-07-17 19:02     ` Anand Jain
2023-07-07 15:52 ` [PATCH 06/11] btrfs-progs: track changing_fsid flag in fs_devices Anand Jain
2023-07-07 15:52 ` [PATCH 07/11] btrfs-progs: track num_devices per fs_devices Anand Jain
2023-07-07 15:52 ` [PATCH 08/11] btrfs-progs: track total_devs in fs devices Anand Jain
2023-07-13 18:58   ` David Sterba
2023-07-17 19:28     ` Anand Jain
2023-07-07 15:52 ` [PATCH 09/11] btrfs-progs: track active metadata_uuid per fs_devices Anand Jain
2023-07-07 15:52 ` [PATCH 10/11] btrfs-progs: add helper to reunite devices with same metadata_uuid Anand Jain
2023-07-07 15:52 ` [PATCH 11/11] btrfs-progs: tune: fix incomplete fsid_change Anand Jain
2023-07-13 18:57   ` David Sterba
2023-07-17 19:25     ` Anand Jain
2023-07-20 14:33       ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).