* [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel
@ 2023-09-15 4:08 Anand Jain
2023-09-15 4:08 ` [PATCH 1/4] btrfs-progs: tune use the latest bdev in fs_devices for super_copy Anand Jain
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Anand Jain @ 2023-09-15 4:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: dsterba
v4:
Remove the patch that has already been merged.
Update the commit log of 1/4 as per David's review comment (Thanks).
No code changes.
v3:
This current patchset contains the remaining unmerged patches and
addresses the reported bug:
bug report: https://github.com/kdave/btrfs-progs/actions/runs/5956097489/job/16156138260
In v3 of this patchset, the btrfs_fs_devices::inconsistent_super variable
was added, which helps determine whether all the devices in the fs_devices
share the same fsid and metadata_uuid.
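To illustrate the question that inconsistent_super answers, here is a minimal sketch, not the patch's implementation (the patch sets the flag during device scan when a CHANGING_FSID flag is seen). The struct sb_ids type and the supers_inconsistent() helper are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FSID_SIZE 16	/* mirrors BTRFS_FSID_SIZE */

/* Hypothetical, simplified stand-in for the relevant superblock fields. */
struct sb_ids {
	unsigned char fsid[FSID_SIZE];
	unsigned char metadata_uuid[FSID_SIZE];
};

/* Return true if any device disagrees with the first one on either id. */
static bool supers_inconsistent(const struct sb_ids *sbs, size_t n)
{
	assert(sbs != NULL);

	for (size_t i = 1; i < n; i++) {
		if (memcmp(sbs[i].fsid, sbs[0].fsid, FSID_SIZE) != 0 ||
		    memcmp(sbs[i].metadata_uuid, sbs[0].metadata_uuid,
			   FSID_SIZE) != 0)
			return true;
	}
	return false;
}
```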
v2:
The earlier revision, v2, of this patchset consisted of 16 patches, out of
which 12 have already been merged into the devel branch.
v2: https://patchwork.kernel.org/project/linux-btrfs/list/?series=776027
Anand Jain (4):
btrfs-progs: tune use the latest bdev in fs_devices for super_copy
btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag
btrfs-progs: recover from the failed btrfstune -m|M
btrfs-progs: test btrfstune -m|M ability to fix previous failures
kernel-shared/volumes.c | 193 +++++++++++++++++++--
kernel-shared/volumes.h | 1 +
tests/misc-tests/034-metadata-uuid/test.sh | 70 ++++++--
tune/change-metadata-uuid.c | 48 ++++-
tune/change-uuid.c | 4 +-
tune/main.c | 3 +
tune/tune.h | 2 -
7 files changed, 280 insertions(+), 41 deletions(-)
--
2.38.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] btrfs-progs: tune use the latest bdev in fs_devices for super_copy
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
@ 2023-09-15 4:08 ` Anand Jain
2023-09-15 4:08 ` [PATCH 2/4] btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag Anand Jain
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2023-09-15 4:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: dsterba
Currently, btrfstune relies on the superblock of the device specified
in the btrfstune argument for fs_info::super_copy. However, it should
use fs_devices::latest_bdev, as it points to the device with the highest
fs_devices::generation number. This will contain the superblock updates
that other devices may have missed and we can now support reuniting
devices following failures of btrfstune -m|M|u|U as in the patches:
btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag
btrfs-progs: recover from the failed btrfstune -m|M
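The selection rule described above can be sketched as follows. This is a simplified illustration under stated assumptions: struct scanned_dev and pick_latest_dev() are hypothetical names, not btrfs-progs API. The strict comparison mirrors the "found_transid > latest_generation" check used during device scan, so the first device wins on a tie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a scanned device. */
struct scanned_dev {
	uint64_t devid;
	uint64_t generation;	/* superblock generation (transid) */
};

/*
 * Return the index of the device with the highest generation.
 * On a tie, the device seen first is kept.
 */
static size_t pick_latest_dev(const struct scanned_dev *devs, size_t n)
{
	size_t latest = 0;

	assert(devs != NULL && n > 0);

	for (size_t i = 1; i < n; i++) {
		if (devs[i].generation > devs[latest].generation)
			latest = i;
	}
	return latest;
}
```

The superblock of the device picked this way is the one that would back super_copy, regardless of which device was named on the command line.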
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
tune/main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tune/main.c b/tune/main.c
index 902c9d97956e..2d8f2b012caf 100644
--- a/tune/main.c
+++ b/tune/main.c
@@ -345,6 +345,9 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[])
goto free_out;
}
+ if (change_metadata_uuid || random_fsid || new_fsid_str)
+ ctree_flags |= OPEN_CTREE_USE_LATEST_BDEV;
+
root = open_ctree_fd(fd, device, 0, ctree_flags);
if (!root) {
--
2.38.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
2023-09-15 4:08 ` [PATCH 1/4] btrfs-progs: tune use the latest bdev in fs_devices for super_copy Anand Jain
@ 2023-09-15 4:08 ` Anand Jain
2023-09-15 4:08 ` [PATCH 3/4] btrfs-progs: recover from the failed btrfstune -m|M Anand Jain
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2023-09-15 4:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: dsterba
Most of the code and functions in this patch are copied from the kernel.
With this patch applied, there is no need to mount the device to
complete an interrupted 'btrfstune -m|M' command (CHANGING_FSID_V2 flag).
Instead, the same command can be run again, and it will complete
the operation.
Currently, 'tests/misc-tests/034-metadata-uuid' tests the kernel using
four sets of disk images with CHANGING_FSID_V2. This test case is
updated (in patch 4/4) to test the progs part as well.
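As a reading aid for the diff below, here is a standalone sketch of two pieces of its logic: the "fsid changed" predicate and the choice of lookup routine in device_list_add(). The enum and the function names fsid_changed() and choose_lookup() are made up for this example; only the conditions are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FSID_SIZE 16	/* mirrors BTRFS_FSID_SIZE */

/* Which lookup device_list_add() would try first (illustrative names). */
enum lookup {
	LOOKUP_INPROGRESS,		/* find_fsid_inprogress() */
	LOOKUP_CHANGED,			/* find_fsid_changed() */
	LOOKUP_WITH_METADATA_UUID,	/* find_fsid_with_metadata_uuid() */
	LOOKUP_REVERTED_THEN_PLAIN,	/* find_fsid_reverted_metadata(), then find_fsid() */
};

/*
 * Same test as check_fsid_changed(): the fs_devices has distinct
 * fsid/metadata_uuid, and its metadata_uuid equals the candidate id.
 */
static bool fsid_changed(const unsigned char *fs_fsid,
			 const unsigned char *fs_metadata_uuid,
			 const unsigned char *candidate)
{
	assert(fs_fsid && fs_metadata_uuid && candidate);

	return memcmp(fs_fsid, fs_metadata_uuid, FSID_SIZE) != 0 &&
	       memcmp(fs_metadata_uuid, candidate, FSID_SIZE) == 0;
}

/* Pick the lookup from the two superblock-derived flags. */
static enum lookup choose_lookup(bool changing_fsid, bool has_metadata_uuid)
{
	if (changing_fsid)
		return has_metadata_uuid ? LOOKUP_CHANGED : LOOKUP_INPROGRESS;
	if (has_metadata_uuid)
		return LOOKUP_WITH_METADATA_UUID;
	return LOOKUP_REVERTED_THEN_PLAIN;
}
```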
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
kernel-shared/volumes.c | 193 +++++++++++++++++++++++++++++++++++++---
kernel-shared/volumes.h | 1 +
2 files changed, 182 insertions(+), 12 deletions(-)
diff --git a/kernel-shared/volumes.c b/kernel-shared/volumes.c
index e3f8ee3e0242..6a1ec0d907e0 100644
--- a/kernel-shared/volumes.c
+++ b/kernel-shared/volumes.c
@@ -338,6 +338,159 @@ static struct btrfs_fs_devices *find_fsid(u8 *fsid, u8 *metadata_uuid)
return NULL;
}
+static u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb)
+{
+ bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) &
+ BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+
+ return has_metadata_uuid ? sb->metadata_uuid : sb->fsid;
+}
+
+static bool match_fsid_fs_devices(const struct btrfs_fs_devices *fs_devices,
+ const u8 *fsid, const u8 *metadata_fsid)
+{
+ if (memcmp(fsid, fs_devices->fsid, BTRFS_FSID_SIZE) != 0)
+ return false;
+
+ if (!metadata_fsid)
+ return true;
+
+ if (memcmp(metadata_fsid, fs_devices->metadata_uuid, BTRFS_FSID_SIZE) != 0)
+ return false;
+
+ return true;
+}
+
+/*
+ * First check if the metadata_uuid is different from the fsid in the given
+ * fs_devices. Then check if the given fsid is the same as the metadata_uuid
+ * in the fs_devices. If it is, return true; otherwise, return false.
+ */
+static inline bool check_fsid_changed(const struct btrfs_fs_devices *fs_devices,
+ const u8 *fsid)
+{
+ return memcmp(fs_devices->fsid, fs_devices->metadata_uuid,
+ BTRFS_FSID_SIZE) != 0 &&
+ memcmp(fs_devices->metadata_uuid, fsid, BTRFS_FSID_SIZE) == 0;
+}
+
+static struct btrfs_fs_devices *find_fsid_with_metadata_uuid(
+ struct btrfs_super_block *disk_super)
+{
+
+ struct btrfs_fs_devices *fs_devices;
+
+ /*
+ * Handle scanned device having completed its fsid change but
+ * belonging to a fs_devices that was created by first scanning
+ * a device which didn't have its fsid/metadata_uuid changed
+ * at all and the CHANGING_FSID_V2 flag set.
+ */
+ list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+ if (!fs_devices->changing_fsid)
+ continue;
+
+ if (match_fsid_fs_devices(fs_devices, disk_super->metadata_uuid,
+ fs_devices->fsid))
+ return fs_devices;
+ }
+
+ /*
+ * Handle scanned device having completed its fsid change but
+ * belonging to a fs_devices that was created by a device that
+ * has an outdated pair of fsid/metadata_uuid and
+ * CHANGING_FSID_V2 flag set.
+ */
+ list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+ if (!fs_devices->changing_fsid)
+ continue;
+
+ if (check_fsid_changed(fs_devices, disk_super->metadata_uuid))
+ return fs_devices;
+ }
+
+ return find_fsid(disk_super->fsid, disk_super->metadata_uuid);
+}
+
+/*
+ * Handle scanned device having its CHANGING_FSID_V2 flag set and the fs_devices
+ * being created with a disk that has already completed its fsid change. Such
+ * disk can belong to an fs which has its FSID changed or to one which doesn't.
+ * Handle both cases here.
+ */
+static struct btrfs_fs_devices *find_fsid_inprogress(
+ struct btrfs_super_block *disk_super)
+{
+ struct btrfs_fs_devices *fs_devices;
+
+ list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+ if (fs_devices->changing_fsid)
+ continue;
+
+ if (check_fsid_changed(fs_devices, disk_super->fsid))
+ return fs_devices;
+ }
+
+ return find_fsid(disk_super->fsid, NULL);
+}
+
+static struct btrfs_fs_devices *find_fsid_changed(
+ struct btrfs_super_block *disk_super)
+{
+ struct btrfs_fs_devices *fs_devices;
+
+ /*
+ * Handles the case where scanned device is part of an fs that had
+ * multiple successful changes of FSID but currently device didn't
+ * observe it. Meaning our fsid will be different than theirs. We need
+ * to handle two subcases :
+ * 1 - The fs still continues to have different METADATA/FSID uuids.
+ * 2 - The fs is switched back to its original FSID (METADATA/FSID
+ * are equal).
+ */
+ list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+ /* Changed UUIDs */
+ if (check_fsid_changed(fs_devices, disk_super->metadata_uuid) &&
+ memcmp(fs_devices->fsid, disk_super->fsid,
+ BTRFS_FSID_SIZE) != 0)
+ return fs_devices;
+
+ /* Unchanged UUIDs */
+ if (memcmp(fs_devices->metadata_uuid, fs_devices->fsid,
+ BTRFS_FSID_SIZE) == 0 &&
+ memcmp(fs_devices->fsid, disk_super->metadata_uuid,
+ BTRFS_FSID_SIZE) == 0)
+ return fs_devices;
+ }
+
+ return NULL;
+}
+
+static struct btrfs_fs_devices *find_fsid_reverted_metadata(
+ struct btrfs_super_block *disk_super)
+{
+ struct btrfs_fs_devices *fs_devices;
+
+ /*
+ * Handle the case where the scanned device is part of an fs whose last
+ * metadata UUID change reverted it to the original FSID. At the same
+ * time fs_devices was first created by another constituent device
+ * which didn't fully observe the operation. This results in an
+ * btrfs_fs_devices created with metadata/fsid different AND
+ * btrfs_fs_devices::fsid_change set AND the metadata_uuid of the
+ * fs_devices equal to the FSID of the disk.
+ */
+ list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
+ if (!fs_devices->changing_fsid)
+ continue;
+
+ if (check_fsid_changed(fs_devices, disk_super->fsid))
+ return fs_devices;
+ }
+
+ return NULL;
+}
+
static int device_list_add(const char *path,
struct btrfs_super_block *disk_super,
struct btrfs_fs_devices **fs_devices_ret)
@@ -352,11 +505,18 @@ static int device_list_add(const char *path,
(BTRFS_SUPER_FLAG_CHANGING_FSID |
BTRFS_SUPER_FLAG_CHANGING_FSID_V2));
- if (metadata_uuid)
- fs_devices = find_fsid(disk_super->fsid,
- disk_super->metadata_uuid);
- else
- fs_devices = find_fsid(disk_super->fsid, NULL);
+ if (changing_fsid) {
+ if (!metadata_uuid)
+ fs_devices = find_fsid_inprogress(disk_super);
+ else
+ fs_devices = find_fsid_changed(disk_super);
+ } else if (metadata_uuid) {
+ fs_devices = find_fsid_with_metadata_uuid(disk_super);
+ } else {
+ fs_devices = find_fsid_reverted_metadata(disk_super);
+ if (!fs_devices)
+ fs_devices = find_fsid(disk_super->fsid, NULL);
+ }
if (!fs_devices) {
fs_devices = kzalloc(sizeof(*fs_devices), GFP_NOFS);
@@ -381,7 +541,20 @@ static int device_list_add(const char *path,
} else {
device = find_device(fs_devices, devid,
disk_super->dev_item.uuid);
+ /*
+ * If this disk has been pulled into an fs devices created by
+ * a device which had the CHANGING_FSID_V2 flag then replace the
+ * metadata_uuid/fsid values of the fs_devices.
+ */
+ if (fs_devices->changing_fsid &&
+ found_transid > fs_devices->latest_generation) {
+ memcpy(fs_devices->fsid, disk_super->fsid,
+ BTRFS_FSID_SIZE);
+ memcpy(fs_devices->metadata_uuid,
+ btrfs_sb_fsid_ptr(disk_super), BTRFS_FSID_SIZE);
+ }
}
+
if (!device) {
device = kzalloc(sizeof(*device), GFP_NOFS);
if (!device) {
@@ -435,19 +608,15 @@ static int device_list_add(const char *path,
device->name = name;
}
- /*
- * If changing_fsid the fs_devices will still hold the status from
- * the other devices.
- */
if (changing_fsid)
- fs_devices->changing_fsid = true;
- if (metadata_uuid)
- fs_devices->active_metadata_uuid = true;
+ fs_devices->inconsistent_super = changing_fsid;
if (found_transid > fs_devices->latest_generation) {
fs_devices->latest_devid = devid;
fs_devices->latest_generation = found_transid;
fs_devices->total_devices = device->total_devs;
+ fs_devices->active_metadata_uuid = metadata_uuid;
+ fs_devices->changing_fsid = changing_fsid;
}
if (fs_devices->lowest_devid > devid) {
fs_devices->lowest_devid = devid;
diff --git a/kernel-shared/volumes.h b/kernel-shared/volumes.h
index d68ef0dc221a..ac9775647e12 100644
--- a/kernel-shared/volumes.h
+++ b/kernel-shared/volumes.h
@@ -111,6 +111,7 @@ struct btrfs_fs_devices {
bool changing_fsid;
bool active_metadata_uuid;
+ bool inconsistent_super;
};
struct btrfs_bio_stripe {
--
2.38.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] btrfs-progs: recover from the failed btrfstune -m|M
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
2023-09-15 4:08 ` [PATCH 1/4] btrfs-progs: tune use the latest bdev in fs_devices for super_copy Anand Jain
2023-09-15 4:08 ` [PATCH 2/4] btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag Anand Jain
@ 2023-09-15 4:08 ` Anand Jain
2023-09-15 4:08 ` [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures Anand Jain
2023-10-02 17:00 ` [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel David Sterba
4 siblings, 0 replies; 11+ messages in thread
From: Anand Jain @ 2023-09-15 4:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: dsterba
Currently, to fix devices following a write failure on one or more devices
during btrfstune -m|M, we rely on the kernel's ability to reassemble devices,
even when they possess distinct fsids.
The kernel hinges on combinations of metadata_uuid and generation number, with
additional cues taken from the fsid and the BTRFS_SUPER_FLAG_CHANGING_FSID_V2
flag. This patch adds this logic to btrfs-progs.
In complex scenarios (such as multiple fsids with the same metadata_uuid and
matching generation), user intervention becomes necessary to resolve the
situation, which btrfs-progs is better placed to handle.
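The decision that set_metadata_uuid() makes in the diff below can be sketched in isolation. This is a simplified model under stated assumptions: resolve_target_fsid() is a hypothetical helper, ids are raw 16-byte arrays instead of uuid_t, and the freshly generated id is passed in rather than produced by uuid_generate().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define FSID_SIZE 16	/* mirrors BTRFS_FSID_SIZE */

/*
 * Decide which fsid a 'btrfstune -m|M' run should continue with.
 *
 * unfinished: fsid left behind by an interrupted change, or NULL if none
 * requested:  fsid given by the user, or NULL if none was given
 * generated:  freshly generated fsid, used only as a last resort
 *
 * Returns 0 and fills out, or -EINVAL if the user asks for an fsid that
 * differs from the one the interrupted change was moving to.
 */
static int resolve_target_fsid(const unsigned char *unfinished,
			       const unsigned char *requested,
			       const unsigned char *generated,
			       unsigned char *out)
{
	assert(generated != NULL && out != NULL);

	if (unfinished) {
		if (requested &&
		    memcmp(requested, unfinished, FSID_SIZE) != 0)
			return -EINVAL;
		/* Finish the interrupted change with its original target. */
		memcpy(out, unfinished, FSID_SIZE);
		return 0;
	}

	/* No interrupted change: honour the request, else use a new id. */
	memcpy(out, requested ? requested : generated, FSID_SIZE);
	return 0;
}
```

The point of the mismatch check is that a rerun may only complete the change that was already started; it cannot redirect it to a different fsid.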
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
tune/change-metadata-uuid.c | 48 ++++++++++++++++++++++++++++++-------
tune/change-uuid.c | 4 ++--
tune/tune.h | 2 --
3 files changed, 42 insertions(+), 12 deletions(-)
diff --git a/tune/change-metadata-uuid.c b/tune/change-metadata-uuid.c
index 11c6b4669949..c118d06fae10 100644
--- a/tune/change-metadata-uuid.c
+++ b/tune/change-metadata-uuid.c
@@ -23,9 +23,31 @@
#include "kernel-shared/uapi/btrfs.h"
#include "kernel-shared/ctree.h"
#include "kernel-shared/transaction.h"
+#include "kernel-shared/volumes.h"
#include "common/messages.h"
#include "tune/tune.h"
+/*
+ * Return 0 for no unfinished metadata_uuid change.
+ * Return >0 for unfinished metadata_uuid change, and restore unfinished
+ * fsid/metadata_uuid into fsid_ret/metadata_uuid_ret.
+ */
+static int check_unfinished_metadata_uuid(struct btrfs_fs_info *fs_info,
+ uuid_t fsid_ret,
+ uuid_t metadata_uuid_ret)
+{
+ struct btrfs_root *tree_root = fs_info->tree_root;
+
+ if (fs_info->fs_devices->inconsistent_super) {
+ memcpy(fsid_ret, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
+ read_extent_buffer(tree_root->node, metadata_uuid_ret,
+ btrfs_header_chunk_tree_uuid(tree_root->node),
+ BTRFS_UUID_SIZE);
+ return 1;
+ }
+ return 0;
+}
+
int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string)
{
struct btrfs_super_block *disk_super;
@@ -47,15 +69,25 @@ int set_metadata_uuid(struct btrfs_root *root, const char *new_fsid_string)
return 1;
}
- if (check_unfinished_fsid_change(root->fs_info, fsid, metadata_uuid)) {
- error("UUID rewrite in progress, cannot change metadata_uuid");
- return 1;
- }
+ if (check_unfinished_metadata_uuid(root->fs_info, fsid,
+ metadata_uuid)) {
+ if (new_fsid_string) {
+ uuid_t tmp;
- if (new_fsid_string)
- uuid_parse(new_fsid_string, fsid);
- else
- uuid_generate(fsid);
+ uuid_parse(new_fsid_string, tmp);
+ if (memcmp(tmp, fsid, BTRFS_FSID_SIZE)) {
+ error(
+ "new fsid %s is not the same with unfinished fsid change",
+ new_fsid_string);
+ return -EINVAL;
+ }
+ }
+ } else {
+ if (new_fsid_string)
+ uuid_parse(new_fsid_string, fsid);
+ else
+ uuid_generate(fsid);
+ }
new_fsid = (memcmp(fsid, disk_super->fsid, BTRFS_FSID_SIZE) != 0);
diff --git a/tune/change-uuid.c b/tune/change-uuid.c
index 4030bd523bce..810e85e1af45 100644
--- a/tune/change-uuid.c
+++ b/tune/change-uuid.c
@@ -211,8 +211,8 @@ static int change_fsid_done(struct btrfs_fs_info *fs_info)
* Return >0 for unfinished fsid change, and restore unfinished fsid/
* chunk_tree_id into fsid_ret/chunk_id_ret.
*/
-int check_unfinished_fsid_change(struct btrfs_fs_info *fs_info,
- uuid_t fsid_ret, uuid_t chunk_id_ret)
+static int check_unfinished_fsid_change(struct btrfs_fs_info *fs_info,
+ uuid_t fsid_ret, uuid_t chunk_id_ret)
{
struct btrfs_root *tree_root = fs_info->tree_root;
diff --git a/tune/tune.h b/tune/tune.h
index 0ef249d89eee..e84cc336846c 100644
--- a/tune/tune.h
+++ b/tune/tune.h
@@ -24,8 +24,6 @@ struct btrfs_fs_info;
int update_seeding_flag(struct btrfs_root *root, const char *device, int set_flag, int force);
-int check_unfinished_fsid_change(struct btrfs_fs_info *fs_info,
- uuid_t fsid_ret, uuid_t chunk_id_ret);
int change_uuid(struct btrfs_fs_info *fs_info, const char *new_fsid_str);
int set_metadata_uuid(struct btrfs_root *root, const char *uuid_string);
--
2.38.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
` (2 preceding siblings ...)
2023-09-15 4:08 ` [PATCH 3/4] btrfs-progs: recover from the failed btrfstune -m|M Anand Jain
@ 2023-09-15 4:08 ` Anand Jain
2023-10-02 17:16 ` David Sterba
2023-10-02 17:19 ` David Sterba
2023-10-02 17:00 ` [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel David Sterba
4 siblings, 2 replies; 11+ messages in thread
From: Anand Jain @ 2023-09-15 4:08 UTC (permalink / raw)
To: linux-btrfs; +Cc: dsterba
The misc-tests/034-metadata-uuid test case has four sets of disk images to
simulate failed writes during btrfstune -m|M operations. As of now, it
tests the kernel only. Update the test case to also verify btrfstune -m|M's
ability to recover from the same scenarios.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
tests/misc-tests/034-metadata-uuid/test.sh | 70 ++++++++++++++++------
1 file changed, 53 insertions(+), 17 deletions(-)
diff --git a/tests/misc-tests/034-metadata-uuid/test.sh b/tests/misc-tests/034-metadata-uuid/test.sh
index 479c7da7a5b2..0b06f1266f57 100755
--- a/tests/misc-tests/034-metadata-uuid/test.sh
+++ b/tests/misc-tests/034-metadata-uuid/test.sh
@@ -195,13 +195,42 @@ check_multi_fsid_unchanged() {
check_flag_cleared "$1" "$2"
}
-failure_recovery() {
+failure_recovery_progs() {
+ local image1
+ local image2
+ local loop1
+ local loop2
+ local devcount
+
+ image1=$(extract_image "$1")
+ image2=$(extract_image "$2")
+ loop1=$(run_check_stdout $SUDO_HELPER losetup --find --show "$image1")
+ loop2=$(run_check_stdout $SUDO_HELPER losetup --find --show "$image2")
+
+ run_check $SUDO_HELPER udevadm settle
+
+ # Scan to make sure btrfs detects both devices before trying to mount
+ #run_check "$TOP/btrfstune" -m --noscan --device="$loop1" "$loop2"
+ run_check "$TOP/btrfstune" -m "$loop2"
+
+ # perform any specific check
+ "$3" "$loop1" "$loop2"
+
+ # cleanup
+ run_check $SUDO_HELPER losetup -d "$loop1"
+ run_check $SUDO_HELPER losetup -d "$loop2"
+ rm -f -- "$image1" "$image2"
+}
+
+failure_recovery_kernel() {
local image1
local image2
local loop1
local loop2
local devcount
+ reload_btrfs
+
image1=$(extract_image "$1")
image2=$(extract_image "$2")
loop1=$(run_check_stdout $SUDO_HELPER losetup --find --show "$image1")
@@ -226,47 +255,55 @@ failure_recovery() {
rm -f -- "$image1" "$image2"
}
+failure_recovery() {
+ failure_recovery_progs $@
+ failure_recovery_kernel $@
+}
+
reload_btrfs() {
run_check $SUDO_HELPER rmmod btrfs
run_check $SUDO_HELPER modprobe btrfs
}
-# for full coverage we need btrfs to actually be a module
-modinfo btrfs > /dev/null 2>&1 || _not_run "btrfs must be a module"
-run_mayfail $SUDO_HELPER modprobe -r btrfs || _not_run "btrfs must be unloadable"
-run_mayfail $SUDO_HELPER modprobe btrfs || _not_run "loading btrfs module failed"
+test_progs() {
+ run_check_mkfs_test_dev
+ check_btrfstune
+
+ run_check_mkfs_test_dev
+ check_dump_super_output
-run_check_mkfs_test_dev
-check_btrfstune
+ run_check_mkfs_test_dev
+ check_image_restore
+}
+
+check_kernel_reloadable() {
+ # for full coverage we need btrfs to actually be a module
+ modinfo btrfs > /dev/null 2>&1 || _not_run "btrfs must be a module"
+ run_mayfail $SUDO_HELPER modprobe -r btrfs || _not_run "btrfs must be unloadable"
+ run_mayfail $SUDO_HELPER modprobe btrfs || _not_run "loading btrfs module failed"
+}
-run_check_mkfs_test_dev
-check_dump_super_output
+check_kernel_reloadable
-run_check_mkfs_test_dev
-check_image_restore
+test_progs
# disk1 is an image which has no metadata uuid flags set and disk2 is part of
# the same fs but has the in-progress flag set. Test that whicever is scanned
# first will result in consistent filesystem.
failure_recovery "./disk1.raw.xz" "./disk2.raw.xz" check_inprogress_flag
-reload_btrfs
failure_recovery "./disk2.raw.xz" "./disk1.raw.xz" check_inprogress_flag
# disk4 contains an image in with the in-progress flag set and disk 3 is part
# of the same filesystem but has both METADATA_UUID incompat and a new
# metadata uuid set. So disk 3 must always take precedence
-reload_btrfs
failure_recovery "./disk3.raw.xz" "./disk4.raw.xz" check_completed
-reload_btrfs
failure_recovery "./disk4.raw.xz" "./disk3.raw.xz" check_completed
# disk5 contains an image which has undergone a successful fsid change more
# than once, disk6 on the other hand is member of the same filesystem but
# hasn't completed its last change. Thus it has both the FSID_CHANGING flag set
# and METADATA_UUID flag set.
-reload_btrfs
failure_recovery "./disk5.raw.xz" "./disk6.raw.xz" check_multi_fsid_change
-reload_btrfs
failure_recovery "./disk6.raw.xz" "./disk5.raw.xz" check_multi_fsid_change
# disk7 contains an image which has undergone a successful fsid change once to
@@ -275,5 +312,4 @@ failure_recovery "./disk6.raw.xz" "./disk5.raw.xz" check_multi_fsid_change
# during the process change. So disk 7 looks as if it never underwent fsid change
# and disk 8 has FSID_CHANGING_FLAG and METADATA_UUID but is stale.
failure_recovery "./disk7.raw.xz" "./disk8.raw.xz" check_multi_fsid_unchanged
-reload_btrfs
failure_recovery "./disk8.raw.xz" "./disk7.raw.xz" check_multi_fsid_unchanged
--
2.38.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
` (3 preceding siblings ...)
2023-09-15 4:08 ` [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures Anand Jain
@ 2023-10-02 17:00 ` David Sterba
4 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2023-10-02 17:00 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, dsterba
On Fri, Sep 15, 2023 at 12:08:55PM +0800, Anand Jain wrote:
> v4:
> Remove the patch that has already been merged.
> Update the commit log of 1/4 as per David's review comment (Thanks).
> No code changes.
>
> v3:
> This current patchset contains the remaining unmerged patches and
> addresses the reported bug:
>
> bug report: https://github.com/kdave/btrfs-progs/actions/runs/5956097489/job/16156138260
>
> In v3 of this patchset, btrfs_fs_devices::inconsistent_super variable
> added, which helps determine whether all the devices in the fs_devices
> share the same fsid and metadata_uuid.
>
> v2:
> The earlier revision, v2, of this patchset consisted of 16 patches, out of
> which 12 have already been merged into the devel branch.
>
> v2: https://patchwork.kernel.org/project/linux-btrfs/list/?series=776027
>
> Anand Jain (4):
> btrfs-progs: tune use the latest bdev in fs_devices for super_copy
> btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag
> btrfs-progs: recover from the failed btrfstune -m|M
> btrfs-progs: test btrfstune -m|M ability to fix previous failures
Added to devel, thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-09-15 4:08 ` [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures Anand Jain
@ 2023-10-02 17:16 ` David Sterba
2023-10-02 17:19 ` David Sterba
1 sibling, 0 replies; 11+ messages in thread
From: David Sterba @ 2023-10-02 17:16 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, dsterba
On Fri, Sep 15, 2023 at 12:08:59PM +0800, Anand Jain wrote:
> The misc-test/034-metadata_uuid test case, has four sets of disk images to
> simulate failed writes during btrfstune -m|M operations. As of now, this
> tests kernel only. Update the test case to verify btrfstune -m|M's
> capacity to recover from the same scenarios.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> tests/misc-tests/034-metadata-uuid/test.sh | 70 ++++++++++++++++------
> 1 file changed, 53 insertions(+), 17 deletions(-)
>
> diff --git a/tests/misc-tests/034-metadata-uuid/test.sh b/tests/misc-tests/034-metadata-uuid/test.sh
> index 479c7da7a5b2..0b06f1266f57 100755
> --- a/tests/misc-tests/034-metadata-uuid/test.sh
> +++ b/tests/misc-tests/034-metadata-uuid/test.sh
> @@ -195,13 +195,42 @@ check_multi_fsid_unchanged() {
> check_flag_cleared "$1" "$2"
> }
>
> -failure_recovery() {
> +failure_recovery_progs() {
> + local image1
> + local image2
> + local loop1
> + local loop2
> + local devcount
> +
> + image1=$(extract_image "$1")
> + image2=$(extract_image "$2")
> + loop1=$(run_check_stdout $SUDO_HELPER losetup --find --show "$image1")
> + loop2=$(run_check_stdout $SUDO_HELPER losetup --find --show "$image2")
> +
> + run_check $SUDO_HELPER udevadm settle
> +
> + # Scan to make sure btrfs detects both devices before trying to mount
> + #run_check "$TOP/btrfstune" -m --noscan --device="$loop1" "$loop2"
> + run_check "$TOP/btrfstune" -m "$loop2"
This lacks $SUDO_HELPER, so it fails when the whole testsuite is not
run by a root user. Please make sure that 'make test-...' actually works
before sending the patches.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-09-15 4:08 ` [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures Anand Jain
2023-10-02 17:16 ` David Sterba
@ 2023-10-02 17:19 ` David Sterba
2023-10-03 8:00 ` Anand Jain
1 sibling, 1 reply; 11+ messages in thread
From: David Sterba @ 2023-10-02 17:19 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, dsterba
On Fri, Sep 15, 2023 at 12:08:59PM +0800, Anand Jain wrote:
> The misc-test/034-metadata_uuid test case, has four sets of disk images to
> simulate failed writes during btrfstune -m|M operations. As of now, this
> tests kernel only. Update the test case to verify btrfstune -m|M's
> capacity to recover from the same scenarios.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
With all the problems fixed, the test still fails. I'm not sure which case it
is:
====== RUN CHECK root_helper losetup --find --show ./disk1.raw.restored
/dev/loop0
====== RUN CHECK root_helper losetup --find --show ./disk2.raw.restored
/dev/loop1
====== RUN CHECK root_helper udevadm settle
====== RUN CHECK root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m /dev/loop1
parent transid verify failed on 30425088 wanted 6 found 4
parent transid verify failed on 30441472 wanted 6 found 4
Error writing to device 1
ERROR: failed to write tree block 30457856: Operation not permitted
ERROR: btrfstune failed
failed: root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m /dev/loop1
test failed for case 034-metadata-uuid
Looks like a write that's beyond the device limit. I'll keep the patches
and tests in devel so you can have a look.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-10-02 17:19 ` David Sterba
@ 2023-10-03 8:00 ` Anand Jain
2023-10-03 8:38 ` Anand Jain
0 siblings, 1 reply; 11+ messages in thread
From: Anand Jain @ 2023-10-03 8:00 UTC (permalink / raw)
To: dsterba; +Cc: linux-btrfs, dsterba
On 3/10/23 01:19, David Sterba wrote:
> On Fri, Sep 15, 2023 at 12:08:59PM +0800, Anand Jain wrote:
>> The misc-test/034-metadata_uuid test case, has four sets of disk images to
>> simulate failed writes during btrfstune -m|M operations. As of now, this
>> tests kernel only. Update the test case to verify btrfstune -m|M's
>> capacity to recover from the same scenarios.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>
> With all the problems fixed, the test still fails. I'm not sure which case it
> is:
>
> ====== RUN CHECK root_helper losetup --find --show ./disk1.raw.restored
> /dev/loop0
> ====== RUN CHECK root_helper losetup --find --show ./disk2.raw.restored
> /dev/loop1
> ====== RUN CHECK root_helper udevadm settle
> ====== RUN CHECK root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m /dev/loop1
> parent transid verify failed on 30425088 wanted 6 found 4
> parent transid verify failed on 30441472 wanted 6 found 4
> Error writing to device 1
> ERROR: failed to write tree block 30457856: Operation not permitted
> ERROR: btrfstune failed
> failed: root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m /dev/loop1
> test failed for case 034-metadata-uuid
>
> Looks like a write that's beyond the device limit. I'll keep the patches
> and tests in devel so you can have a look.
As a root user, your devel branch passes here.
(Generally, I have been using the following command as root:)
$ make TEST=034* test-misc
[LD] fssum
[LD] fsstress
[TEST] misc-tests.sh
[TEST/misc] 034-metadata-uuid
Scanning /btrfs-progs/tests/misc-tests-results.txt
Let me try as a non-root user.
Also, could you please make sure that all the
'tests/misc-tests/034-metadata-uuid/*.restored' files are removed before
starting the test case?
Thanks, Anand
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-10-03 8:00 ` Anand Jain
@ 2023-10-03 8:38 ` Anand Jain
2023-10-03 17:36 ` David Sterba
0 siblings, 1 reply; 11+ messages in thread
From: Anand Jain @ 2023-10-03 8:38 UTC (permalink / raw)
To: dsterba; +Cc: linux-btrfs, dsterba
On 3/10/23 16:00, Anand Jain wrote:
>
>
> On 3/10/23 01:19, David Sterba wrote:
>> On Fri, Sep 15, 2023 at 12:08:59PM +0800, Anand Jain wrote:
>>> The misc-test/034-metadata_uuid test case, has four sets of disk
>>> images to
>>> simulate failed writes during btrfstune -m|M operations. As of now, this
>>> tests kernel only. Update the test case to verify btrfstune -m|M's
>>> capacity to recover from the same scenarios.
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>
>> With all the problems fixed, the test still fails. I'm not sure which
>> case it
>> is:
>>
>> ====== RUN CHECK root_helper losetup --find --show ./disk1.raw.restored
>> /dev/loop0
>> ====== RUN CHECK root_helper losetup --find --show ./disk2.raw.restored
>> /dev/loop1
>> ====== RUN CHECK root_helper udevadm settle
>> ====== RUN CHECK root_helper /labs/dsterba/gits/btrfs-progs/btrfstune
>> -m /dev/loop1
>> parent transid verify failed on 30425088 wanted 6 found 4
>> parent transid verify failed on 30441472 wanted 6 found 4
>> Error writing to device 1
>> ERROR: failed to write tree block 30457856: Operation not permitted
>> ERROR: btrfstune failed
>> failed: root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m
>> /dev/loop1
>> test failed for case 034-metadata-uuid
>>
>> Looks like a write that's beyond the device limit. I'll keep the patches
>> and tests in devel so you can have a look.
>
>
> As a root user, your devel branch passes here.
>
> (Generally, I have been using the following command as root:)
>
> $ make TEST=034* test-misc
> [LD] fssum
> [LD] fsstress
> [TEST] misc-tests.sh
> [TEST/misc] 034-metadata-uuid
> Scanning /btrfs-progs/tests/misc-tests-results.txt
>
> Let me try as a non-root user.
>
> Also, could you please make sure that all the
> 'tests/misc-tests/034-metadata-uuid/*.restored' files are removed before
> starting the test case?

This passes as non-root.

$ sudo make TEST=034* test-misc
[LD] fssum
[LD] fsstress
[TEST] misc-tests.sh
[TEST/misc] 034-metadata-uuid
Scanning /btrfs-progs/tests/misc-tests-results.txt

So I think there might be some stale *.restored images; could you please check?

Thanks, Anand
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures
2023-10-03 8:38 ` Anand Jain
@ 2023-10-03 17:36 ` David Sterba
0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2023-10-03 17:36 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, dsterba
On Tue, Oct 03, 2023 at 04:38:49PM +0800, Anand Jain wrote:
> On 3/10/23 16:00, Anand Jain wrote:
> >
> >
> > On 3/10/23 01:19, David Sterba wrote:
> >> On Fri, Sep 15, 2023 at 12:08:59PM +0800, Anand Jain wrote:
> >>> The misc-test/034-metadata_uuid test case, has four sets of disk
> >>> images to
> >>> simulate failed writes during btrfstune -m|M operations. As of now, this
> >>> tests kernel only. Update the test case to verify btrfstune -m|M's
> >>> capacity to recover from the same scenarios.
> >>>
> >>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >>
> >> With all the problems fixed, the test still fails. I'm not sure which
> >> case it
> >> is:
> >>
> >> ====== RUN CHECK root_helper losetup --find --show ./disk1.raw.restored
> >> /dev/loop0
> >> ====== RUN CHECK root_helper losetup --find --show ./disk2.raw.restored
> >> /dev/loop1
> >> ====== RUN CHECK root_helper udevadm settle
> >> ====== RUN CHECK root_helper /labs/dsterba/gits/btrfs-progs/btrfstune
> >> -m /dev/loop1
> >> parent transid verify failed on 30425088 wanted 6 found 4
> >> parent transid verify failed on 30441472 wanted 6 found 4
> >> Error writing to device 1
> >> ERROR: failed to write tree block 30457856: Operation not permitted
> >> ERROR: btrfstune failed
> >> failed: root_helper /labs/dsterba/gits/btrfs-progs/btrfstune -m
> >> /dev/loop1
> >> test failed for case 034-metadata-uuid
> >>
> >> Looks like a write that's beyond the device limit. I'll keep the patches
> >> and tests in devel so you can have a look.
> >
> >
> > As a root user, your devel branch passes here.
> >
> > (Generally, I have been using the following command as root:)
> >
> > $ make TEST=034* test-misc
> > [LD] fssum
> > [LD] fsstress
> > [TEST] misc-tests.sh
> > [TEST/misc] 034-metadata-uuid
> > Scanning /btrfs-progs/tests/misc-tests-results.txt
> >
> > Let me try as a non-root user.
> >
> > Also, could you please make sure that all the
> > 'tests/misc-tests/034-metadata-uuid/*.restored' files are removed before
> > starting the test case?
>
> This passes as non-root.
>
> $ sudo make TEST=034* test-misc
> [LD] fssum
> [LD] fsstress
> [TEST] misc-tests.sh
> [TEST/misc] 034-metadata-uuid
> Scanning /btrfs-progs/tests/misc-tests-results.txt
>
> So I think there might be some stale *.restored images; could you please check?

It was indeed something on my side; the test now passes, including in CI.
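
For anyone hitting the same failure, the cleanup requested in this thread can be sketched as a small shell helper. This is only an illustration: the function name `clean_restored_images` is hypothetical and not part of btrfs-progs, and the default directory is simply the path quoted in the discussion.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): remove stale '*.restored'
# images left behind by an earlier run of the 034-metadata-uuid test.
clean_restored_images() {
	# $1: test directory; the default is the path quoted in this thread,
	# relative to the top of the btrfs-progs tree (an assumption).
	dir="${1:-tests/misc-tests/034-metadata-uuid}"

	for f in "$dir"/*.restored; do
		# When nothing matches, the glob stays literal; skip that case.
		if [ -e "$f" ]; then
			rm -f -- "$f"
		fi
	done
}
```

It would be called from the top of the source tree as `clean_restored_images` before running `make TEST=034* test-misc`. If the earlier run was done as root, the leftover files may be root-owned, so the cleanup may need the same privileges.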
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-10-03 17:43 UTC | newest]
Thread overview: 11+ messages
2023-09-15 4:08 [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel Anand Jain
2023-09-15 4:08 ` [PATCH 1/4] btrfs-progs: tune use the latest bdev in fs_devices for super_copy Anand Jain
2023-09-15 4:08 ` [PATCH 2/4] btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag Anand Jain
2023-09-15 4:08 ` [PATCH 3/4] btrfs-progs: recover from the failed btrfstune -m|M Anand Jain
2023-09-15 4:08 ` [PATCH 4/4] btrfs-progs: test btrfstune -m|M ability to fix previous failures Anand Jain
2023-10-02 17:16 ` David Sterba
2023-10-02 17:19 ` David Sterba
2023-10-03 8:00 ` Anand Jain
2023-10-03 8:38 ` Anand Jain
2023-10-03 17:36 ` David Sterba
2023-10-02 17:00 ` [PATCH 0/4 v4] btrfs-progs: recover from failed metadata_uuid port kernel David Sterba