* [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
@ 2022-08-02 7:52 Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 1/3] btrfs-progs: avoid repeated data write for metadata Qu Wenruo
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs
[CHANGELOG]
v2:
- Separate the fixes from the initial patch
- Fix a bug in BUG_ON() condition which causes mkfs test failure
There is a bug report from Shinichiro that for zoned device mkfs -m DUP
(using RST) doesn't work.
It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
write_data_to_disk() to replace write_extent_to_disk()"), which I
wrongly assumed that write_data_to_disk() will only write the data to
one mirror.
In fact, write_data_to_disk() writes data to all mirrors, thus the
@mirror argument is completely unnecessary.
The first patch will fix the problem and cleanup the unnecessary
argument to avoid confusion.
Then the 2nd patch will fix a BUG_ON() condition.
Finally the last patch will cleanup write_and_map_eb() to completely
rely on write_data_to_disk(), without manually handling RAID56.
Qu Wenruo (3):
btrfs-progs: avoid repeated data write for metadata
btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
btrfs-progs: use write_data_to_disk() to handle RAID56 in
write_and_map_eb()
image/main.c | 2 +-
kernel-shared/disk-io.c | 39 +++------------------------------------
kernel-shared/extent_io.c | 12 +++++++++---
kernel-shared/extent_io.h | 2 +-
4 files changed, 14 insertions(+), 41 deletions(-)
--
2.37.0
* [PATCH v2 1/3] btrfs-progs: avoid repeated data write for metadata
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
@ 2022-08-02 7:52 ` Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 2/3] btrfs-progs: fix a BUG_ON() condition for write_data_to_disk() Qu Wenruo
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs; +Cc: Shinichiro Kawasaki
[BUG]
Shinichiro reported that "mkfs.btrfs -m DUP" writes the same metadata
to the device repeatedly.
For non-zoned devices this is not a big deal, but for zoned devices it
is critical, as zoned devices do not support overwrites at all.
[CAUSE]
The problem is in the write_and_map_eb() call path: since commit
2a93728391a1 ("btrfs-progs: use write_data_to_disk() to replace
write_extent_to_disk()"), we call write_data_to_disk() for metadata
writeback.
But write_data_to_disk() calls btrfs_map_block() with rw = WRITE.
In that mode btrfs_map_block() always returns all stripes, and
write_data_to_disk() iterates through each mirror of the range, while
write_and_map_eb() additionally calls it once per mirror.
This results in the repeated writeback described above.
[FIX]
Fix this problem by completely removing the @mirror argument
from write_data_to_disk(), and add comments that explicitly state
that the function writes to all mirrors.
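The double write can be illustrated with a small toy model. This is only a
sketch under stated assumptions: struct toy_chunk and all toy_* functions
are hypothetical stand-ins, not btrfs-progs code. The model only assumes
what is described above, namely that a WRITE mapping covers every stripe
regardless of the mirror number passed in.

```c
#include <assert.h>

#define MAX_STRIPES 4

/* Hypothetical toy chunk: one write counter per mirror (stripe). */
struct toy_chunk {
	int num_stripes;          /* 2 for a DUP-like profile */
	int writes[MAX_STRIPES];  /* number of writes issued to each stripe */
};

/*
 * Toy stand-in for a WRITE mapping: the mirror number is ignored and
 * every stripe of the chunk is returned.
 */
int toy_map_for_write(const struct toy_chunk *chunk, int mirror)
{
	(void)mirror;
	return chunk->num_stripes;
}

/* Toy stand-in for the data writer: writes to every mapped stripe. */
int toy_write_data(struct toy_chunk *chunk, int mirror)
{
	int nr = toy_map_for_write(chunk, mirror);

	assert(nr <= MAX_STRIPES);
	for (int i = 0; i < nr; i++)
		chunk->writes[i]++;
	return 0;
}

/* Old caller pattern: one call per mirror, so each stripe gets N writes. */
void toy_writeback_per_mirror(struct toy_chunk *chunk)
{
	for (int mirror = 1; mirror <= chunk->num_stripes; mirror++)
		toy_write_data(chunk, mirror);
}

/* Fixed caller pattern: a single call already reaches every stripe. */
void toy_writeback_once(struct toy_chunk *chunk)
{
	toy_write_data(chunk, 0);
}
```

With num_stripes = 2, toy_writeback_per_mirror() leaves each counter at 2,
while toy_writeback_once() leaves each counter at 1, which is the only
pattern a device without overwrite support can accept.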
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 2a93728391a1 ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
image/main.c | 2 +-
kernel-shared/disk-io.c | 24 ++++++++++--------------
kernel-shared/extent_io.c | 10 ++++++++--
kernel-shared/extent_io.h | 2 +-
4 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/image/main.c b/image/main.c
index 5bcd10f021d7..6793fe4b9076 100644
--- a/image/main.c
+++ b/image/main.c
@@ -1495,7 +1495,7 @@ static int restore_one_work(struct mdrestore_struct *mdres,
}
} else if (async->start != BTRFS_SUPER_INFO_OFFSET) {
ret = write_data_to_disk(mdres->info, buffer,
- async->start, out_len, 0);
+ async->start, out_len);
if (ret) {
error("failed to write data");
exit(1);
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 26b1c9aa192a..a6e66aee7bf7 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -452,8 +452,6 @@ struct extent_buffer* read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
{
int ret;
- int mirror_num;
- int max_mirror;
u64 length;
u64 *raid_map = NULL;
struct btrfs_multi_bio *multi = NULL;
@@ -483,18 +481,16 @@ int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
goto out;
}
- /* For non-RAID56, we just writeback data to each mirror */
- max_mirror = btrfs_num_copies(fs_info, eb->start, eb->len);
- for (mirror_num = 1; mirror_num <= max_mirror; mirror_num++) {
- ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len,
- mirror_num);
- if (ret < 0) {
- errno = -ret;
- error(
- "failed to write bytenr %llu length %u to mirror %d: %m",
- eb->start, eb->len, mirror_num);
- goto out;
- }
+ /*
+ * For non-RAID56, we just writeback data to all mirrors using
+ * write_data_to_disk().
+ */
+ ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len);
+ if (ret < 0) {
+ errno = -ret;
+ error("failed to write bytenr %llu length %u: %m",
+ eb->start, eb->len);
+ goto out;
}
out:
diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index d6326ab2dc52..a05249815bb1 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -938,8 +938,14 @@ int read_data_from_disk(struct btrfs_fs_info *info, void *buf, u64 logical,
return 0;
}
+/*
+ * Write the data in @buf to logical bytenr @offset.
+ *
+ * Such data will be written to all mirrors and RAID56 P/Q will also be
+ * properly handled.
+ */
int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
- u64 bytes, int mirror)
+ u64 bytes)
{
struct btrfs_multi_bio *multi = NULL;
struct btrfs_device *device;
@@ -956,7 +962,7 @@ int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
dev_nr = 0;
ret = btrfs_map_block(info, WRITE, offset, &this_len, &multi,
- mirror, &raid_map);
+ 0, &raid_map);
if (ret) {
fprintf(stderr, "Couldn't map the block %llu\n",
offset);
diff --git a/kernel-shared/extent_io.h b/kernel-shared/extent_io.h
index fc2e4cc455d6..2148a8112428 100644
--- a/kernel-shared/extent_io.h
+++ b/kernel-shared/extent_io.h
@@ -162,7 +162,7 @@ int clear_extent_buffer_dirty(struct extent_buffer *eb);
int read_data_from_disk(struct btrfs_fs_info *info, void *buf, u64 logical,
u64 *len, int mirror);
int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
- u64 bytes, int mirror);
+ u64 bytes);
void extent_buffer_bitmap_clear(struct extent_buffer *eb, unsigned long start,
unsigned long pos, unsigned long len);
void extent_buffer_bitmap_set(struct extent_buffer *eb, unsigned long start,
--
2.37.0
* [PATCH v2 2/3] btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 1/3] btrfs-progs: avoid repeated data write for metadata Qu Wenruo
@ 2022-08-02 7:52 ` Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 3/3] btrfs-progs: use write_data_to_disk() to handle RAID56 in write_and_map_eb() Qu Wenruo
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs
The BUG_ON() condition in write_data_to_disk() is no longer correct.
write_raid56_with_parity() now returns the number of bytes written for
the last stripe.
Thus a successful writeback can trigger BUG_ON(ret).
Fix the condition to (ret < 0).
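The difference between the two conditions can be sketched as follows. The
functions below are hypothetical illustrations, not btrfs-progs code; they
only assume the return convention described above (negative errno on
failure, a positive byte count on success).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical writer following the convention described above:
 * negative errno on failure, number of bytes written on success.
 */
int toy_write_stripe(int bytes, int fail)
{
	assert(bytes > 0);
	return fail ? -EIO : bytes;
}

/* Old check, equivalent to BUG_ON(ret): any non-zero value trips it. */
int old_check_would_trip(int ret)
{
	return ret != 0;
}

/* New check, equivalent to BUG_ON(ret < 0): only real errors trip it. */
int new_check_would_trip(int ret)
{
	return ret < 0;
}
```

A successful 64KiB write returns 65536 here, which trips the old check but
not the new one; a failed write trips both.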
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
kernel-shared/extent_io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index a05249815bb1..48bcf2cf2f96 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -990,7 +990,7 @@ int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
memcpy(eb->data, buf + total_write, this_len);
ret = write_raid56_with_parity(info, eb, multi,
stripe_len, raid_map);
- BUG_ON(ret);
+ BUG_ON(ret < 0);
free(eb);
kfree(raid_map);
--
2.37.0
* [PATCH v2 3/3] btrfs-progs: use write_data_to_disk() to handle RAID56 in write_and_map_eb()
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 1/3] btrfs-progs: avoid repeated data write for metadata Qu Wenruo
2022-08-02 7:52 ` [PATCH v2 2/3] btrfs-progs: fix a BUG_ON() condition for write_data_to_disk() Qu Wenruo
@ 2022-08-02 7:52 ` Qu Wenruo
2022-08-02 8:04 ` [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Johannes Thumshirn
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs
Function write_data_to_disk() can handle RAID56 writes without any
problem.
So just call write_data_to_disk() inside write_and_map_eb() instead of
manually doing the RAID56 write.
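The shape of this cleanup can be sketched with a toy example. All names
below (toy_profile, toy_stats, toy_write_any(), toy_write_eb()) are
hypothetical and do not exist in btrfs-progs; the sketch only shows the
idea that the profile-specific branch lives in one writer, so the caller
needs no RAID56 special case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical profile kinds for the toy model. */
enum toy_profile { TOY_MIRRORED, TOY_PARITY };

/* Counts which internal write path was taken. */
struct toy_stats {
	int mirror_writes;
	int parity_writes;
};

/*
 * Toy writer: the only place that knows about the two paths.
 * Returns 0 on success or a negative errno on failure.
 */
int toy_write_any(enum toy_profile profile, size_t len, struct toy_stats *st)
{
	assert(st != NULL);
	if (len == 0)
		return -EINVAL;
	if (profile == TOY_PARITY)
		st->parity_writes++;  /* parity-aware path */
	else
		st->mirror_writes++;  /* write-to-all-mirrors path */
	return 0;
}

/* Toy caller: no profile check, it just forwards to the writer. */
int toy_write_eb(enum toy_profile profile, size_t len, struct toy_stats *st)
{
	int ret = toy_write_any(profile, len, st);

	return ret < 0 ? ret : 0;
}
```

The caller stays the same for both profiles, which is the simplification
this patch aims for in write_and_map_eb().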
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
kernel-shared/disk-io.c | 31 +------------------------------
1 file changed, 1 insertion(+), 30 deletions(-)
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index a6e66aee7bf7..d276e52df060 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -452,39 +452,10 @@ struct extent_buffer* read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
{
int ret;
- u64 length;
u64 *raid_map = NULL;
struct btrfs_multi_bio *multi = NULL;
- length = eb->len;
- ret = btrfs_map_block(fs_info, WRITE, eb->start, &length,
- &multi, 0, &raid_map);
- if (ret < 0) {
- errno = -ret;
- error("failed to map bytenr %llu length %u: %m",
- eb->start, eb->len);
- goto out;
- }
-
- /* RAID56 write back need RMW */
- if (raid_map) {
- ret = write_raid56_with_parity(fs_info, eb, multi,
- length, raid_map);
- if (ret < 0) {
- errno = -ret;
- error(
- "failed to write raid56 stripe for bytenr %llu length %llu: %m",
- eb->start, length);
- } else {
- ret = 0;
- }
- goto out;
- }
-
- /*
- * For non-RAID56, we just writeback data to all mirrors using
- * write_data_to_disk().
- */
+ /* write_data_to_disk() will handle all mirrors and RAID56. */
ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len);
if (ret < 0) {
errno = -ret;
--
2.37.0
* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
` (2 preceding siblings ...)
2022-08-02 7:52 ` [PATCH v2 3/3] btrfs-progs: use write_data_to_disk() to handle RAID56 in write_and_map_eb() Qu Wenruo
@ 2022-08-02 8:04 ` Johannes Thumshirn
2022-08-02 8:06 ` Qu Wenruo
2022-08-02 8:34 ` Shinichiro Kawasaki
2022-08-03 19:25 ` David Sterba
5 siblings, 1 reply; 8+ messages in thread
From: Johannes Thumshirn @ 2022-08-02 8:04 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs@vger.kernel.org
On 02.08.22 09:53, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.
Nit, it's without RST.
Anyways, for the series:
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
2022-08-02 8:04 ` [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Johannes Thumshirn
@ 2022-08-02 8:06 ` Qu Wenruo
0 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2022-08-02 8:06 UTC (permalink / raw)
To: Johannes Thumshirn, Qu Wenruo, linux-btrfs@vger.kernel.org
On 2022/8/2 16:04, Johannes Thumshirn wrote:
> On 02.08.22 09:53, Qu Wenruo wrote:
>> [CHANGELOG]
>> v2:
>> - Separate the fixes from the initial patch
>> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>>
>> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
>> (using RST) doesn't work.
>
> Nit, it's without RST.
My bad, for RST it should be -d DUP.
Thanks,
Qu
>
> Anyways, for the series:
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
` (3 preceding siblings ...)
2022-08-02 8:04 ` [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Johannes Thumshirn
@ 2022-08-02 8:34 ` Shinichiro Kawasaki
2022-08-03 19:25 ` David Sterba
5 siblings, 0 replies; 8+ messages in thread
From: Shinichiro Kawasaki @ 2022-08-02 8:34 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs@vger.kernel.org
On Aug 02, 2022 / 15:52, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.
>
> It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
> write_data_to_disk() to replace write_extent_to_disk()"), which I
> wrongly assumed that write_data_to_disk() will only write the data to
> one mirror.
>
> In fact, write_data_to_disk() writes data to all mirrors, thus the
> @mirror argument is completely unnecessary.
>
> The first patch will fix the problem and cleanup the unnecessary
> argument to avoid confusion.
>
> Then the 2nd patch will fix a BUG_ON() condition.
>
> Finally the last patch will cleanup write_and_map_eb() to completely
> rely on write_data_to_disk(), without manually handling RAID56.
I reconfirmed that this v2 series avoids the mkfs.btrfs failure. It
also avoids the duplicate write. For the series:
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
--
Shin'ichiro Kawasaki
* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
2022-08-02 7:52 [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup Qu Wenruo
` (4 preceding siblings ...)
2022-08-02 8:34 ` Shinichiro Kawasaki
@ 2022-08-03 19:25 ` David Sterba
5 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2022-08-03 19:25 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Aug 02, 2022 at 03:52:40PM +0800, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.
>
> It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
> write_data_to_disk() to replace write_extent_to_disk()"), which I
> wrongly assumed that write_data_to_disk() will only write the data to
> one mirror.
>
> In fact, write_data_to_disk() writes data to all mirrors, thus the
> @mirror argument is completely unnecessary.
>
> The first patch will fix the problem and cleanup the unnecessary
> argument to avoid confusion.
>
> Then the 2nd patch will fix a BUG_ON() condition.
>
> Finally the last patch will cleanup write_and_map_eb() to completely
> rely on write_data_to_disk(), without manually handling RAID56.
>
>
> Qu Wenruo (3):
> btrfs-progs: avoid repeated data write for metadata
> btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
> btrfs-progs: use write_data_to_disk() to handle RAID56 in
> write_and_map_eb()
Added to devel, thanks.