* [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
@ 2022-08-02 7:52 Qu Wenruo
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs

[CHANGELOG]
v2:
- Separate the fixes from the initial patch
- Fix a bug in BUG_ON() condition which causes mkfs test failure

There is a bug report from Shinichiro that for zoned devices mkfs -m DUP
(using RST) doesn't work.

It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
write_data_to_disk() to replace write_extent_to_disk()"), in which I
wrongly assumed that write_data_to_disk() would only write the data to
one mirror.

In fact, write_data_to_disk() writes data to all mirrors, thus the
@mirror argument is completely unnecessary.

The first patch will fix the problem and clean up the unnecessary
argument to avoid confusion.

Then the 2nd patch will fix a BUG_ON() condition.

Finally the last patch will clean up write_and_map_eb() to completely
rely on write_data_to_disk(), without manually handling RAID56.

Qu Wenruo (3):
btrfs-progs: avoid repeated data write for metadata
btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
btrfs-progs: use write_data_to_disk() to handle RAID56 in
write_and_map_eb()
image/main.c | 2 +-
kernel-shared/disk-io.c | 39 +++------------------------------------
kernel-shared/extent_io.c | 12 +++++++++---
kernel-shared/extent_io.h | 2 +-
4 files changed, 14 insertions(+), 41 deletions(-)
--
2.37.0

* [PATCH v2 1/3] btrfs-progs: avoid repeated data write for metadata
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs; +Cc: Shinichiro Kawasaki

[BUG]
Shinichiro reported that "mkfs.btrfs -m DUP" writes the same data into
the device repeatedly.

For non-zoned devices this is not a big deal, but for zoned devices it
is critical, as zoned devices don't support overwrites at all.

[CAUSE]
The problem is related to the write_and_map_eb() call. Since commit
2a93728391a1 ("btrfs-progs: use write_data_to_disk() to replace
write_extent_to_disk()"), we call write_data_to_disk() for metadata
write back.

But the problem is, write_data_to_disk() will call btrfs_map_block()
with rw = WRITE.

With that, btrfs_map_block() will always return all stripes, while in
write_data_to_disk() we also iterate through each mirror of the range.

This results in the repeated writeback described above.

[FIX]
Fix this problem by completely removing the @mirror argument from
write_data_to_disk(), with extra comments to explicitly show that the
function will write to all mirrors.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 2a93728391a1 ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 image/main.c              |  2 +-
 kernel-shared/disk-io.c   | 24 ++++++++++--------------
 kernel-shared/extent_io.c | 10 ++++++++--
 kernel-shared/extent_io.h |  2 +-
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/image/main.c b/image/main.c
index 5bcd10f021d7..6793fe4b9076 100644
--- a/image/main.c
+++ b/image/main.c
@@ -1495,7 +1495,7 @@ static int restore_one_work(struct mdrestore_struct *mdres,
 			}
 		} else if (async->start != BTRFS_SUPER_INFO_OFFSET) {
 			ret = write_data_to_disk(mdres->info, buffer,
-						 async->start, out_len, 0);
+						 async->start, out_len);
 			if (ret) {
 				error("failed to write data");
 				exit(1);
diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index 26b1c9aa192a..a6e66aee7bf7 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -452,8 +452,6 @@ struct extent_buffer* read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
 {
 	int ret;
-	int mirror_num;
-	int max_mirror;
 	u64 length;
 	u64 *raid_map = NULL;
 	struct btrfs_multi_bio *multi = NULL;
@@ -483,18 +481,16 @@ int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
 		goto out;
 	}
 
-	/* For non-RAID56, we just writeback data to each mirror */
-	max_mirror = btrfs_num_copies(fs_info, eb->start, eb->len);
-	for (mirror_num = 1; mirror_num <= max_mirror; mirror_num++) {
-		ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len,
-					 mirror_num);
-		if (ret < 0) {
-			errno = -ret;
-			error(
-	"failed to write bytenr %llu length %u to mirror %d: %m",
-			      eb->start, eb->len, mirror_num);
-			goto out;
-		}
+	/*
+	 * For non-RAID56, we just writeback data to all mirrors using
+	 * write_data_to_disk().
+	 */
+	ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len);
+	if (ret < 0) {
+		errno = -ret;
+		error("failed to write bytenr %llu length %u: %m",
+		      eb->start, eb->len);
+		goto out;
 	}
 
 out:
diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index d6326ab2dc52..a05249815bb1 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -938,8 +938,14 @@ int read_data_from_disk(struct btrfs_fs_info *info, void *buf, u64 logical,
 	return 0;
 }
 
+/*
+ * Write the data in @buf to logical bytenr @offset.
+ *
+ * Such data will be written to all mirrors and RAID56 P/Q will also be
+ * properly handled.
+ */
 int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
-		       u64 bytes, int mirror)
+		       u64 bytes)
 {
 	struct btrfs_multi_bio *multi = NULL;
 	struct btrfs_device *device;
@@ -956,7 +962,7 @@ int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
 		dev_nr = 0;
 
 		ret = btrfs_map_block(info, WRITE, offset, &this_len, &multi,
-				      mirror, &raid_map);
+				      0, &raid_map);
 		if (ret) {
 			fprintf(stderr, "Couldn't map the block %llu\n",
 				offset);
diff --git a/kernel-shared/extent_io.h b/kernel-shared/extent_io.h
index fc2e4cc455d6..2148a8112428 100644
--- a/kernel-shared/extent_io.h
+++ b/kernel-shared/extent_io.h
@@ -162,7 +162,7 @@ int clear_extent_buffer_dirty(struct extent_buffer *eb);
 int read_data_from_disk(struct btrfs_fs_info *info, void *buf, u64 logical,
 			u64 *len, int mirror);
 int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
-		       u64 bytes, int mirror);
+		       u64 bytes);
 void extent_buffer_bitmap_clear(struct extent_buffer *eb, unsigned long start,
 				unsigned long pos, unsigned long len);
 void extent_buffer_bitmap_set(struct extent_buffer *eb, unsigned long start,
--
2.37.0
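
For readers who want to see the effect of the bug in isolation, here is a
minimal toy model. It is only a sketch and not btrfs-progs code: struct
toy_dev and all toy_* functions are made up for illustration. The callee
writes every copy on each call, the old caller still loops once per mirror,
and the fixed caller makes a single call.

```c
#include <assert.h>

#define TOY_MAX_COPIES 4

/* Hypothetical stand-in for a chunk with several copies (2 for DUP). */
struct toy_dev {
	int num_copies;              /* number of copies of the range */
	int writes[TOY_MAX_COPIES];  /* how often each copy was written */
};

/* Models the callee as described above: one call writes every copy. */
int toy_write_all_mirrors(struct toy_dev *dev)
{
	assert(dev->num_copies <= TOY_MAX_COPIES);
	for (int i = 0; i < dev->num_copies; i++)
		dev->writes[i]++;
	return 0;
}

/* Old caller: loops over the mirrors although the callee already does. */
int toy_writeback_old(struct toy_dev *dev)
{
	for (int mirror = 1; mirror <= dev->num_copies; mirror++) {
		int ret = toy_write_all_mirrors(dev);

		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Fixed caller: a single call covers all copies. */
int toy_writeback_fixed(struct toy_dev *dev)
{
	return toy_write_all_mirrors(dev);
}

/* Returns 1 if any copy was written more than once, 0 otherwise. */
int toy_has_overwrite(const struct toy_dev *dev)
{
	for (int i = 0; i < dev->num_copies; i++)
		if (dev->writes[i] > 1)
			return 1;
	return 0;
}
```

With num_copies = 2, toy_writeback_old() leaves each copy written twice,
which is the overwrite a zoned device cannot accept, while
toy_writeback_fixed() writes each copy exactly once.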

* [PATCH v2 2/3] btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs

The BUG_ON() condition in write_data_to_disk() is no longer correct.

Now write_raid56_with_parity() will return the number of bytes written
for the last stripe.

Thus a successful writeback can trigger the BUG_ON(ret).

Fix the condition to (ret < 0).

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 kernel-shared/extent_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel-shared/extent_io.c b/kernel-shared/extent_io.c
index a05249815bb1..48bcf2cf2f96 100644
--- a/kernel-shared/extent_io.c
+++ b/kernel-shared/extent_io.c
@@ -990,7 +990,7 @@ int write_data_to_disk(struct btrfs_fs_info *info, void *buf, u64 offset,
 			memcpy(eb->data, buf + total_write, this_len);
 			ret = write_raid56_with_parity(info, eb, multi,
 						       stripe_len, raid_map);
-			BUG_ON(ret);
+			BUG_ON(ret < 0);
 
 			free(eb);
 			kfree(raid_map);
--
2.37.0
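
The difference between the two conditions can be shown with a small
stand-alone sketch. Nothing here is btrfs-progs code: toy_raid56_write() is a
hypothetical stand-in that, like the function described above, returns a
positive byte count on success and a negative errno on failure, and the two
helpers only model what BUG_ON(ret) and BUG_ON(ret < 0) would decide.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a writer that returns the number of bytes
 * written on success and a negative errno value on failure.
 */
int toy_raid56_write(int stripe_len, bool fail)
{
	assert(stripe_len > 0);
	return fail ? -EIO : stripe_len;
}

/* Models the old check, BUG_ON(ret): any non-zero value aborts. */
bool toy_old_check_aborts(int ret)
{
	return ret != 0;
}

/* Models the new check, BUG_ON(ret < 0): only real errors abort. */
bool toy_new_check_aborts(int ret)
{
	return ret < 0;
}
```

For a successful write of 65536 bytes, toy_old_check_aborts() is true while
toy_new_check_aborts() is false. For a failed write both are true, so the new
condition still catches errors.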

* [PATCH v2 3/3] btrfs-progs: use write_data_to_disk() to handle RAID56 in write_and_map_eb()
From: Qu Wenruo @ 2022-08-02 7:52 UTC (permalink / raw)
To: linux-btrfs

Function write_data_to_disk() can handle RAID56 writes without any
problem.

So just call write_data_to_disk() inside write_and_map_eb() instead of
manually doing the RAID56 write.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 kernel-shared/disk-io.c | 31 +------------------------------
 1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/kernel-shared/disk-io.c b/kernel-shared/disk-io.c
index a6e66aee7bf7..d276e52df060 100644
--- a/kernel-shared/disk-io.c
+++ b/kernel-shared/disk-io.c
@@ -452,39 +452,10 @@ struct extent_buffer* read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
 int write_and_map_eb(struct btrfs_fs_info *fs_info, struct extent_buffer *eb)
 {
 	int ret;
-	u64 length;
 	u64 *raid_map = NULL;
 	struct btrfs_multi_bio *multi = NULL;
 
-	length = eb->len;
-	ret = btrfs_map_block(fs_info, WRITE, eb->start, &length,
-			      &multi, 0, &raid_map);
-	if (ret < 0) {
-		errno = -ret;
-		error("failed to map bytenr %llu length %u: %m",
-			eb->start, eb->len);
-		goto out;
-	}
-
-	/* RAID56 write back need RMW */
-	if (raid_map) {
-		ret = write_raid56_with_parity(fs_info, eb, multi,
-					       length, raid_map);
-		if (ret < 0) {
-			errno = -ret;
-			error(
-		"failed to write raid56 stripe for bytenr %llu length %llu: %m",
-				eb->start, length);
-		} else {
-			ret = 0;
-		}
-		goto out;
-	}
-
-	/*
-	 * For non-RAID56, we just writeback data to all mirrors using
-	 * write_data_to_disk().
-	 */
+	/* write_data_to_disk() will handle all mirrors and RAID56. */
 	ret = write_data_to_disk(fs_info, eb->data, eb->start, eb->len);
 	if (ret < 0) {
 		errno = -ret;
--
2.37.0
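
The remaining error path in write_and_map_eb() uses the "errno = -ret" plus
"%m" idiom, where "%m" is a GNU printf extension that expands to the message
for the current errno. As a rough illustration of what that produces, here is
a sketch using strerror() instead. toy_format_write_error() is a made-up
helper for this example and does not exist in btrfs-progs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the same kind of message as the error() call
 * in the diff, using strerror() in place of "%m".
 * @ret must be a negative errno value. Returns the snprintf() result.
 */
int toy_format_write_error(char *out, size_t out_len,
			   unsigned long long bytenr, unsigned int len,
			   int ret)
{
	assert(ret < 0);
	return snprintf(out, out_len,
			"failed to write bytenr %llu length %u: %s",
			bytenr, len, strerror(-ret));
}
```

The text after the colon comes from the C library and can differ between
systems, so callers should not try to match it exactly.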

* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
From: Johannes Thumshirn @ 2022-08-02 8:04 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs@vger.kernel.org

On 02.08.22 09:53, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.

Nit, it's without RST.

Anyways, for the series:
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
From: Qu Wenruo @ 2022-08-02 8:06 UTC (permalink / raw)
To: Johannes Thumshirn, Qu Wenruo, linux-btrfs@vger.kernel.org

On 2022/8/2 16:04, Johannes Thumshirn wrote:
> On 02.08.22 09:53, Qu Wenruo wrote:
>> [CHANGELOG]
>> v2:
>> - Separate the fixes from the initial patch
>> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>>
>> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
>> (using RST) doesn't work.
>
> Nit, it's without RST.

My bad, for RST it should be -d DUP.

Thanks,
Qu

>
> Anyways, for the series:
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
From: Shinichiro Kawasaki @ 2022-08-02 8:34 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs@vger.kernel.org

On Aug 02, 2022 / 15:52, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.
>
> It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
> write_data_to_disk() to replace write_extent_to_disk()"), which I
> wrongly assumed that write_data_to_disk() will only write the data to
> one mirror.
>
> In fact, write_data_to_disk() writes data to all mirrors, thus the
> @mirror argument is completely unnecessary.
>
> The first patch will fix the problem and cleanup the unnecessary
> argument to avoid confusion.
>
> Then the 2nd patch will fix a BUG_ON() condition.
>
> Finally the last patch will cleanup write_and_map_eb() to completely
> rely on write_data_to_disk(), without manually handling RAID56.

I reconfirmed that this v2 series avoids the mkfs.btrfs failure. It also avoids
the duplicate write. For the series:

Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

--
Shin'ichiro Kawasaki

* Re: [PATCH v2 0/3] btrfs-progs: avoid repeated data write for metadata and a small cleanup
From: David Sterba @ 2022-08-03 19:25 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Aug 02, 2022 at 03:52:40PM +0800, Qu Wenruo wrote:
> [CHANGELOG]
> v2:
> - Separate the fixes from the initial patch
> - Fix a bug in BUG_ON() condition which causes mkfs test failure
>
> There is a bug report from Shinichiro that for zoned device mkfs -m DUP
> (using RST) doesn't work.
>
> It turns out to be a bug in commit 2a93728391a1 ("btrfs-progs: use
> write_data_to_disk() to replace write_extent_to_disk()"), which I
> wrongly assumed that write_data_to_disk() will only write the data to
> one mirror.
>
> In fact, write_data_to_disk() writes data to all mirrors, thus the
> @mirror argument is completely unnecessary.
>
> The first patch will fix the problem and cleanup the unnecessary
> argument to avoid confusion.
>
> Then the 2nd patch will fix a BUG_ON() condition.
>
> Finally the last patch will cleanup write_and_map_eb() to completely
> rely on write_data_to_disk(), without manually handling RAID56.
>
>
> Qu Wenruo (3):
>   btrfs-progs: avoid repeated data write for metadata
>   btrfs-progs: fix a BUG_ON() condition for write_data_to_disk()
>   btrfs-progs: use write_data_to_disk() to handle RAID56 in
>     write_and_map_eb()

Added to devel, thanks.