* [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum
@ 2024-06-02 3:45 Qu Wenruo
2024-06-02 3:45 ` [PATCH 1/2] btrfs-progs: print-tree: add support for dev-replace item Qu Wenruo
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Qu Wenruo @ 2024-06-02 3:45 UTC (permalink / raw)
To: linux-btrfs
There is a bug report that if a btrfs has experienced any dev-replace,
"btrfstune --csum" would always report a running dev-replace and refuse
to continue.
It turns out that, DEV_REPLACE item is not a transient one (but not
created properly as mkfs time either), after a dev-replace the
DEV_REPLACE item would exist forever, recording the last dev-replace
timestamp.
Although I really hate such behavior (especially when balance item would
be gone after a balance is finished/canceled), at least fix the problem
first.
The first patch enhances the print-tree to properly output the
contents of the dev-replace item, then the second patch fixes the
problem.
I'd like to add test cases for it, but it turns out there is no csum
change test cases.
My guess is we do not have proper way to skip the test if it's
experiemental feature?
Maybe it's time to move csum change out of experiemental features?
Qu Wenruo (2):
btrfs-progs: print-tree: add support for dev-replace item
btrfs-progs: change-csum: handle finished dev-replace correctly
kernel-shared/print-tree.c | 79 ++++++++++++++++++++++++++++++++++++++
tune/change-csum.c | 18 +++++++--
2 files changed, 94 insertions(+), 3 deletions(-)
--
2.45.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 1/2] btrfs-progs: print-tree: add support for dev-replace item
2024-06-02 3:45 [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum Qu Wenruo
@ 2024-06-02 3:45 ` Qu Wenruo
2024-06-02 3:45 ` [PATCH 2/2] btrfs-progs: change-csum: handle finished dev-replace correctly Qu Wenruo
2024-06-03 19:57 ` [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum David Sterba
2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2024-06-02 3:45 UTC (permalink / raw)
To: linux-btrfs
This is inspired by a recent bug that csum change doesn't detect
finished dev-replace.
At the time of that csum change patch, there is no print-tree to
show the content of btrfs_dev_replace_item thus contributes to the bug.
This patch adds the new output for btrfs_dev_replace_item, and the
example looks like this:
item 1 key (0 DEV_REPLACE 0) itemoff 16171 itemsize 72
src devid -1 cursor left 1179648000 cursor right 1179648000 mode ALWAYS
state FINISHED write errors 0 uncorrectable read errors 0
start time 1717282771 (2024-06-02 08:29:31)
stop time 1717282771 (2024-06-02 08:29:31)
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
kernel-shared/print-tree.c | 79 ++++++++++++++++++++++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/kernel-shared/print-tree.c b/kernel-shared/print-tree.c
index a4ddbb2ae9cb..a9a4abaa2456 100644
--- a/kernel-shared/print-tree.c
+++ b/kernel-shared/print-tree.c
@@ -36,6 +36,7 @@
#include "common/defs.h"
#include "common/internal.h"
#include "common/messages.h"
+#include "uapi/btrfs.h"
static void print_dir_item_type(struct extent_buffer *eb,
struct btrfs_dir_item *di)
@@ -1389,6 +1390,81 @@ static void print_header_info(struct extent_buffer *eb, unsigned int mode)
fflush(stdout);
}
+#define DEV_REPLACE_STRING_LEN 64
+#define DEV_REPLACE_MODE_ENTRY(dest, name) \
+ case BTRFS_DEV_REPLACE_ITEM_CONT_READING_FROM_SRCDEV_MODE_##name: \
+ strncpy((dest), #name, DEV_REPLACE_STRING_LEN); \
+ break;
+
+static void replace_mode_to_str(u64 flags, char *ret)
+{
+ ret[0] = '\0';
+ switch(flags) {
+ DEV_REPLACE_MODE_ENTRY(ret, ALWAYS);
+ DEV_REPLACE_MODE_ENTRY(ret, AVOID);
+ default:
+ snprintf(ret, DEV_REPLACE_STRING_LEN, "unknown value(%llu)",
+ flags);
+ }
+}
+#undef DEV_REPLACE_MODE_ENTRY
+
+#define DEV_REPLACE_STATE_ENTRY(dest, name) \
+ case BTRFS_IOCTL_DEV_REPLACE_STATE_##name: \
+ strncpy((dest), #name, DEV_REPLACE_STRING_LEN); \
+ break;
+
+static void replace_state_to_str(u64 flags, char *ret)
+{
+ ret[0] = '\0';
+ switch(flags) {
+ DEV_REPLACE_STATE_ENTRY(ret, NEVER_STARTED);
+ DEV_REPLACE_STATE_ENTRY(ret, FINISHED);
+ DEV_REPLACE_STATE_ENTRY(ret, CANCELED);
+ DEV_REPLACE_STATE_ENTRY(ret, STARTED);
+ DEV_REPLACE_STATE_ENTRY(ret, SUSPENDED);
+ default:
+ snprintf(ret, DEV_REPLACE_STRING_LEN, "unknown value(%llu)",
+ flags);
+ }
+}
+#undef DEV_REPLACE_STATE_ENTRY
+
+static void print_u64_timespec(u64 timespec, const char *prefix)
+{
+ char time_str[256];
+ struct tm tm;
+ time_t time = timespec;
+
+ localtime_r(&time, &tm);
+ strftime(time_str, sizeof(time_str), "%Y-%m-%d %H:%M:%S", &tm);
+ printf("%s%llu (%s)\n", prefix, timespec, time_str);
+}
+
+static void print_dev_replace_item(struct extent_buffer *eb,
+ struct btrfs_dev_replace_item *ptr)
+{
+ char mode_str[DEV_REPLACE_STRING_LEN] = { 0 };
+ char state_str[DEV_REPLACE_STRING_LEN] = { 0 };
+
+ replace_mode_to_str(
+ btrfs_dev_replace_cont_reading_from_srcdev_mode(eb, ptr),
+ mode_str);
+ replace_state_to_str(
+ btrfs_dev_replace_replace_state(eb, ptr),
+ state_str);
+ printf("\t\tsrc devid %lld cursor left %llu cursor right %llu mode %s\n",
+ btrfs_dev_replace_src_devid(eb, ptr),
+ btrfs_dev_replace_cursor_left(eb, ptr),
+ btrfs_dev_replace_cursor_right(eb, ptr),
+ mode_str);
+ printf("\t\tstate %s write errors %llu uncorrectable read errors %llu\n",
+ state_str, btrfs_dev_replace_num_write_errors(eb, ptr),
+ btrfs_dev_replace_num_uncorrectable_read_errors(eb, ptr));
+ print_u64_timespec(btrfs_dev_replace_time_started(eb, ptr), "\t\tstart time ");
+ print_u64_timespec(btrfs_dev_replace_time_started(eb, ptr), "\t\tstop time ");
+}
+
void __btrfs_print_leaf(struct extent_buffer *eb, unsigned int mode)
{
struct btrfs_disk_key disk_key;
@@ -1563,6 +1639,9 @@ void __btrfs_print_leaf(struct extent_buffer *eb, unsigned int mode)
case BTRFS_RAID_STRIPE_KEY:
print_raid_stripe_key(eb, item_size, ptr);
break;
+ case BTRFS_DEV_REPLACE_KEY:
+ print_dev_replace_item(eb, ptr);
+ break;
};
fflush(stdout);
}
--
2.45.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] btrfs-progs: change-csum: handle finished dev-replace correctly
2024-06-02 3:45 [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum Qu Wenruo
2024-06-02 3:45 ` [PATCH 1/2] btrfs-progs: print-tree: add support for dev-replace item Qu Wenruo
@ 2024-06-02 3:45 ` Qu Wenruo
2024-06-03 19:57 ` [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum David Sterba
2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2024-06-02 3:45 UTC (permalink / raw)
To: linux-btrfs
[BUG]
If a btrfs experienced dev-replace, even it's already finished,
btrfstune would refuse to change its csum:
WARNING: Experimental build with unstable or unfinished features
WARNING: Switching checksums is experimental, do not use for valuable data!
Proceed to switch checksums
ERROR: running dev-replace detected, please finish or cancel it.
ERROR: btrfstune failed
[CAUSE]
The current dev-replace detection is only checking if we have
DEV_REPLACE item in device tree.
However DEV_REPLACE item will also exist even if a dev-replace finished,
so the existing check can not handle such case at all.
[FIX]
If an dev-replace item is found, further check the state of the item to
prevent false alerts.
Fixes: #798
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
tune/change-csum.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/tune/change-csum.c b/tune/change-csum.c
index 0780a18b090b..c2972a509914 100644
--- a/tune/change-csum.c
+++ b/tune/change-csum.c
@@ -73,16 +73,28 @@ static int check_csum_change_requreiment(struct btrfs_fs_info *fs_info, u16 new_
key.type = BTRFS_DEV_REPLACE_KEY;
key.offset = 0;
ret = btrfs_search_slot(NULL, dev_root, &key, &path, 0, 0);
- btrfs_release_path(&path);
if (ret < 0) {
+ btrfs_release_path(&path);
errno = -ret;
error("failed to check the dev-reaplce status: %m");
return ret;
}
if (ret == 0) {
- error("running dev-replace detected, please finish or cancel it.");
- return -EINVAL;
+ struct btrfs_dev_replace_item *ptr;
+ u64 state;
+
+ ptr = btrfs_item_ptr(path.nodes[0], path.slots[0],
+ struct btrfs_dev_replace_item);
+ state = btrfs_dev_replace_replace_state(path.nodes[0], ptr);
+ if (state == BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED ||
+ state == BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED) {
+ btrfs_release_path(&path);
+ error(
+ "running/suspended dev-replace detected, please finish or cancel it");
+ return -EINVAL;
+ }
}
+ btrfs_release_path(&path);
if (fs_info->csum_type == new_csum_type) {
error("the fs is already using csum type %s (%u)",
--
2.45.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum
2024-06-02 3:45 [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum Qu Wenruo
2024-06-02 3:45 ` [PATCH 1/2] btrfs-progs: print-tree: add support for dev-replace item Qu Wenruo
2024-06-02 3:45 ` [PATCH 2/2] btrfs-progs: change-csum: handle finished dev-replace correctly Qu Wenruo
@ 2024-06-03 19:57 ` David Sterba
2 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2024-06-03 19:57 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Sun, Jun 02, 2024 at 01:15:31PM +0930, Qu Wenruo wrote:
> There is a bug report that if a btrfs has experienced any dev-replace,
> "btrfstune --csum" would always report a running dev-replace and refuse
> to continue.
>
> It turns out that, DEV_REPLACE item is not a transient one (but not
> created properly as mkfs time either), after a dev-replace the
> DEV_REPLACE item would exist forever, recording the last dev-replace
> timestamp.
I don't think we want to create the item at mkfs time, this would be
confusing and not related to any previous dev-replace operation.
> Although I really hate such behavior (especially when balance item would
> be gone after a balance is finished/canceled), at least fix the problem
> first.
It's handled differently yeah, though the two operations are bit
different, one would run balance more often that dev-replace. But it's
still only for internal tracking, other than that it's not handled the
same way I don't se any problem.
> The first patch enhances the print-tree to properly output the
> contents of the dev-replace item, then the second patch fixes the
> problem.
>
> I'd like to add test cases for it, but it turns out there is no csum
> change test cases.
I have tests locally from the time I developed it but yeah they should
be in the testsuite.
> My guess is we do not have proper way to skip the test if it's
> experiemental feature?
Currently there's no way to detect it. For some things we can use the
help text, like is done for some features in fstests.
> Maybe it's time to move csum change out of experiemental features?
This depends on creating the separate command group 'tune', this is in
the phase of interface review.
> Qu Wenruo (2):
> btrfs-progs: print-tree: add support for dev-replace item
> btrfs-progs: change-csum: handle finished dev-replace correctly
Added to devel, thanks.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-06-03 19:57 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-02 3:45 [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum Qu Wenruo
2024-06-02 3:45 ` [PATCH 1/2] btrfs-progs: print-tree: add support for dev-replace item Qu Wenruo
2024-06-02 3:45 ` [PATCH 2/2] btrfs-progs: change-csum: handle finished dev-replace correctly Qu Wenruo
2024-06-03 19:57 ` [PATCH 0/2] btrfs-progs: btrfstune: fix false alerts on dev-replace when changing csum David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox