* [PATCH 0/2] btrfs: migrate the remaining functions exposed by a full fstests with larger metadata folios
@ 2023-12-12 5:24 Qu Wenruo
2023-12-12 5:24 ` [PATCH 1/2] btrfs: migrate eb_bitmap_offset() to folio interfaces Qu Wenruo
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Qu Wenruo @ 2023-12-12 5:24 UTC (permalink / raw)
To: linux-btrfs
[REPO]
This patchset along with all the previous migrations (and the final
enablement patch) can be found here:
https://github.com/adam900710/linux/tree/eb_memory
With all the previous migrations (although only tested without larger
folios), we are finally just one step away from enabling larger folio
support for btrfs metadata.
During my local full fstests runs with larger metadata folios, only two
bugs were hit, both related to code paths not yet handling folios
correctly:
- eb_bitmap_offset()
- btrfs_repair_eb_io_failure()
Otherwise my local branch can already pass local fstests without new
regressions.
So here are the final (and, I hope, the last) migrations for the involved
metadata code paths, before the final patch enabling larger folio
support.
Qu Wenruo (2):
btrfs: migrate eb_bitmap_offset() to folio interfaces
btrfs: migrate btrfs_repair_io_failure() to folio interfaces
fs/btrfs/bio.c | 15 +++++++++++----
fs/btrfs/bio.h | 4 ++--
fs/btrfs/disk-io.c | 13 +++++++------
fs/btrfs/extent_io.c | 22 ++++++++++------------
4 files changed, 30 insertions(+), 24 deletions(-)
--
2.43.0
* [PATCH 1/2] btrfs: migrate eb_bitmap_offset() to folio interfaces
From: Qu Wenruo @ 2023-12-12 5:24 UTC (permalink / raw)
To: linux-btrfs
[BUG]
Test case btrfs/002 would fail if larger folios are enabled for
metadata:
assertion failed: folio, in fs/btrfs/extent_io.c:4358
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:4358!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 30916 Comm: fsstress Tainted: G OE 6.7.0-rc3-custom+ #128
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
RIP: 0010:assert_eb_folio_uptodate+0x98/0xe0 [btrfs]
Call Trace:
<TASK>
extent_buffer_test_bit+0x3c/0x70 [btrfs]
free_space_test_bit+0xcd/0x140 [btrfs]
modify_free_space_bitmap+0x27a/0x430 [btrfs]
add_to_free_space_tree+0x8d/0x160 [btrfs]
__btrfs_free_extent.isra.0+0xef1/0x13c0 [btrfs]
__btrfs_run_delayed_refs+0x786/0x13c0 [btrfs]
btrfs_run_delayed_refs+0x33/0x120 [btrfs]
btrfs_commit_transaction+0xa2/0x1350 [btrfs]
iterate_supers+0x77/0xe0
ksys_sync+0x60/0xa0
__do_sys_sync+0xa/0x20
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
</TASK>
[CAUSE]
The function extent_buffer_test_bit() is not larger folio compatible.
It still assumes the old fixed page size. When an extent buffer backed by
a larger folio is passed in, only eb->folios[0] is populated.
If the target bit then falls in the 2nd page of the folio, we would
check eb->folios[1] and trigger the ASSERT().
[FIX]
Just migrate eb_bitmap_offset() to folio interfaces, using
folio_shift() and offset_in_folio() to replace PAGE_SHIFT and
offset_in_page().
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 614d10655991..dfd9f2d6e3fe 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4466,22 +4466,22 @@ void copy_extent_buffer(const struct extent_buffer *dst,
}
/*
- * Calculate the page and offset of the byte containing the given bit number.
+ * Calculate the folio and offset of the byte containing the given bit number.
*
* @eb: the extent buffer
* @start: offset of the bitmap item in the extent buffer
* @nr: bit number
- * @page_index: return index of the page in the extent buffer that contains
+ * @folio_index: return index of the folio in the extent buffer that contains
* the given bit number
- * @page_offset: return offset into the page given by page_index
+ * @folio_offset: return offset into the folio given by folio_index
*
* This helper hides the ugliness of finding the byte in an extent buffer which
* contains a given bit.
*/
static inline void eb_bitmap_offset(const struct extent_buffer *eb,
unsigned long start, unsigned long nr,
- unsigned long *page_index,
- size_t *page_offset)
+ unsigned long *folio_index,
+ size_t *folio_offset)
{
size_t byte_offset = BIT_BYTE(nr);
size_t offset;
@@ -4491,10 +4491,10 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb,
* the bitmap item in the extent buffer + the offset of the byte in the
* bitmap item.
*/
- offset = start + offset_in_page(eb->start) + byte_offset;
+ offset = start + offset_in_folio(eb->folios[0], eb->start) + byte_offset;
- *page_index = offset >> PAGE_SHIFT;
- *page_offset = offset_in_page(offset);
+ *folio_index = offset >> folio_shift(eb->folios[0]);
+ *folio_offset = offset_in_folio(eb->folios[0], offset);
}
/*
@@ -4507,15 +4507,13 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb,
int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start,
unsigned long nr)
{
- u8 *kaddr;
- struct page *page;
unsigned long i;
size_t offset;
+ u8 *kaddr;
eb_bitmap_offset(eb, start, nr, &i, &offset);
- page = folio_page(eb->folios[i], 0);
assert_eb_folio_uptodate(eb, i);
- kaddr = page_address(page);
+ kaddr = folio_address(eb->folios[i]);
return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1)));
}
--
2.43.0
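
The index calculation changed by patch 1/2 can be modelled in userspace.
The following is only a sketch, not the btrfs code: struct bitmap_pos,
bitmap_pos() and test_bit_at() are hypothetical names, and unit_shift
stands in for PAGE_SHIFT (old code) or folio_shift() (new code).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/* Hypothetical result type; the kernel helper uses two out-parameters. */
struct bitmap_pos {
	unsigned long index;	/* slot in the backing array (page or folio) */
	size_t offset;		/* byte offset inside that slot */
};

/*
 * Userspace model of the calculation in eb_bitmap_offset().
 * unit_shift is PAGE_SHIFT in the old code, folio_shift() in the new code.
 */
static struct bitmap_pos bitmap_pos(uint64_t eb_start, unsigned long start,
				    unsigned long nr, unsigned int unit_shift)
{
	assert(unit_shift < 32);

	size_t unit_size = (size_t)1 << unit_shift;
	size_t byte_offset = nr / BITS_PER_BYTE;
	/*
	 * Offset of the extent buffer inside its first unit + offset of the
	 * bitmap item + offset of the byte inside the bitmap item.
	 */
	size_t offset = start + (size_t)(eb_start & (unit_size - 1)) + byte_offset;
	struct bitmap_pos pos = {
		.index = offset >> unit_shift,
		.offset = offset & (unit_size - 1),
	};

	return pos;
}

/* Same bit extraction as extent_buffer_test_bit(), on a flat buffer. */
static int test_bit_at(const uint8_t *kaddr, size_t offset, unsigned long nr)
{
	return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1)));
}
```

Taking an extent buffer at bytenr 31686656 with an assumed item offset of
5000 and bit number 83, a shift of 12 (4K pages) yields index 1 and
offset 914, while a shift of 14 (one 16K folio) yields index 0 and
offset 5010. The first result is the one that points at the unpopulated
eb->folios[1] when the extent buffer is backed by a single larger folio.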
* [PATCH 2/2] btrfs: migrate btrfs_repair_io_failure() to folio interfaces
From: Qu Wenruo @ 2023-12-12 5:24 UTC (permalink / raw)
To: linux-btrfs
[BUG]
Test case btrfs/124 fails if larger metadata folios are enabled. The
dying message looks like this:
BTRFS error (device dm-2): bad tree block start, mirror 2 want 31686656 have 0
BTRFS info (device dm-2): read error corrected: ino 0 off 31686656 (dev /dev/mapper/test-scratch2 sector 20928)
BUG: kernel NULL pointer dereference, address: 0000000000000020
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
CPU: 6 PID: 350881 Comm: btrfs Tainted: G OE 6.7.0-rc3-custom+ #128
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 2/2/2022
RIP: 0010:btrfs_read_extent_buffer+0x106/0x180 [btrfs]
PKRU: 55555554
Call Trace:
<TASK>
read_tree_block+0x33/0xb0 [btrfs]
read_block_for_search+0x23e/0x340 [btrfs]
btrfs_search_slot+0x2f9/0xe60 [btrfs]
btrfs_lookup_csum+0x75/0x160 [btrfs]
btrfs_lookup_bio_sums+0x21a/0x560 [btrfs]
btrfs_submit_chunk+0x152/0x680 [btrfs]
btrfs_submit_bio+0x1c/0x50 [btrfs]
submit_one_bio+0x40/0x80 [btrfs]
submit_extent_page+0x158/0x390 [btrfs]
btrfs_do_readpage+0x330/0x740 [btrfs]
extent_readahead+0x38d/0x6c0 [btrfs]
read_pages+0x94/0x2c0
page_cache_ra_unbounded+0x12d/0x190
relocate_file_extent_cluster+0x7c1/0x9d0 [btrfs]
relocate_block_group+0x2d3/0x560 [btrfs]
btrfs_relocate_block_group+0x2c7/0x4b0 [btrfs]
btrfs_relocate_chunk+0x4c/0x1a0 [btrfs]
btrfs_balance+0x925/0x13c0 [btrfs]
btrfs_ioctl+0x19f1/0x25d0 [btrfs]
__x64_sys_ioctl+0x90/0xd0
do_syscall_64+0x3f/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
[CAUSE]
The dying line is the btrfs_repair_io_failure() call inside
btrfs_repair_eb_io_failure().
That function still relies on the extent buffer using page sized
folios.
When the extent buffer uses a larger folio, we go into the 2nd slot of
folios[], which is not populated, and trigger the NULL pointer
dereference.
[FIX]
Migrate btrfs_repair_io_failure() to folio interfaces, so that when we
hit a larger folio, we just submit the whole folio in one go.
This also affects the data repair path through btrfs_end_repair_bio().
Thankfully data is still fully page based, so we can just add an
ASSERT() and use page_folio() to convert the page to a folio.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/bio.c | 15 +++++++++++----
fs/btrfs/bio.h | 4 ++--
fs/btrfs/disk-io.c | 13 +++++++------
3 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 4f3b693a16b1..c6b4b1ba953f 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -194,6 +194,12 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
struct bio_vec *bv = bio_first_bvec_all(&repair_bbio->bio);
int mirror = repair_bbio->mirror_num;
+ /*
+ * We can only hit here for data bio, which doesn't support
+ * larger folios yet.
+ */
+ ASSERT(folio_order(page_folio(bv->bv_page)) == 0);
+
if (repair_bbio->bio.bi_status ||
!btrfs_data_csum_ok(repair_bbio, dev, 0, bv)) {
bio_reset(&repair_bbio->bio, NULL, REQ_OP_READ);
@@ -215,7 +221,7 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
btrfs_repair_io_failure(fs_info, btrfs_ino(inode),
repair_bbio->file_offset, fs_info->sectorsize,
repair_bbio->saved_iter.bi_sector << SECTOR_SHIFT,
- bv->bv_page, bv->bv_offset, mirror);
+ page_folio(bv->bv_page), bv->bv_offset, mirror);
} while (mirror != fbio->bbio->mirror_num);
done:
@@ -767,8 +773,8 @@ void btrfs_submit_bio(struct btrfs_bio *bbio, int mirror_num)
* freeing the bio.
*/
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
- u64 length, u64 logical, struct page *page,
- unsigned int pg_offset, int mirror_num)
+ u64 length, u64 logical, struct folio *folio,
+ unsigned int folio_offset, int mirror_num)
{
struct btrfs_io_stripe smap = { 0 };
struct bio_vec bvec;
@@ -799,7 +805,8 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
bio_init(&bio, smap.dev->bdev, &bvec, 1, REQ_OP_WRITE | REQ_SYNC);
bio.bi_iter.bi_sector = smap.physical >> SECTOR_SHIFT;
- __bio_add_page(&bio, page, length, pg_offset);
+ ret = bio_add_folio(&bio, folio, length, folio_offset);
+ ASSERT(ret);
ret = submit_bio_wait(&bio);
if (ret) {
/* try to remap that extent elsewhere? */
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index ca79decee060..bbaed317161a 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -105,7 +105,7 @@ void btrfs_bio_end_io(struct btrfs_bio *bbio, blk_status_t status);
void btrfs_submit_bio(struct btrfs_bio *bbio, int mirror_num);
void btrfs_submit_repair_write(struct btrfs_bio *bbio, int mirror_num, bool dev_replace);
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
- u64 length, u64 logical, struct page *page,
- unsigned int pg_offset, int mirror_num);
+ u64 length, u64 logical, struct folio *folio,
+ unsigned int folio_offset, int mirror_num);
#endif
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a482ba513a18..369e71677adf 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -183,21 +183,22 @@ static int btrfs_repair_eb_io_failure(const struct extent_buffer *eb,
int mirror_num)
{
struct btrfs_fs_info *fs_info = eb->fs_info;
- int i, num_pages = num_extent_pages(eb);
+ int num_folios = num_extent_folios(eb);
int ret = 0;
if (sb_rdonly(fs_info->sb))
return -EROFS;
- for (i = 0; i < num_pages; i++) {
- u64 start = max_t(u64, eb->start, folio_pos(eb->folios[i]));
+ for (int i = 0; i < num_folios ; i++) {
+ struct folio *folio = eb->folios[i];
+ u64 start = max_t(u64, eb->start, folio_pos(folio));
u64 end = min_t(u64, eb->start + eb->len,
- folio_pos(eb->folios[i]) + PAGE_SIZE);
+ folio_pos(folio) + folio_size(folio));
u32 len = end - start;
ret = btrfs_repair_io_failure(fs_info, 0, start, len,
- start, folio_page(eb->folios[i], 0),
- offset_in_page(start), mirror_num);
+ start, folio, offset_in_folio(folio, start),
+ mirror_num);
if (ret)
break;
}
--
2.43.0
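
The per-folio range computed in the new btrfs_repair_eb_io_failure()
loop can be sketched in userspace as well. This is an illustration under
assumptions, not the kernel code: struct repair_range and
clamp_to_folio() are hypothetical names, and the folio is assumed to
overlap the extent buffer and to be naturally aligned to its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what the loop passes to the repair call. */
struct repair_range {
	uint64_t start;		/* logical start of the range to rewrite */
	uint32_t len;		/* number of bytes to rewrite */
	size_t folio_offset;	/* offset of start inside the folio */
};

/*
 * Clamp the extent buffer [eb_start, eb_start + eb_len) to one folio
 * [folio_pos, folio_pos + folio_size), mirroring the max_t()/min_t()
 * pair in the patch.
 */
static struct repair_range clamp_to_folio(uint64_t eb_start, uint32_t eb_len,
					  uint64_t folio_pos, size_t folio_size)
{
	/* Folio sizes are powers of two, which the mask below relies on. */
	assert(folio_size && (folio_size & (folio_size - 1)) == 0);

	uint64_t eb_end = eb_start + eb_len;
	uint64_t folio_end = folio_pos + folio_size;
	uint64_t start = eb_start > folio_pos ? eb_start : folio_pos;
	uint64_t end = eb_end < folio_end ? eb_end : folio_end;
	struct repair_range r = {
		.start = start,
		.len = (uint32_t)(end - start),
		/* Same arithmetic as offset_in_folio(folio, start). */
		.folio_offset = (size_t)(start & (folio_size - 1)),
	};

	return r;
}
```

For a 16K extent buffer at 31686656 backed by one 16K folio, the result
covers the whole buffer in a single range, which matches the "submit the
whole folio in one go" behaviour. For a 16K extent buffer at 81920
sitting inside a 64K folio at 65536, the range is still 16K long but
starts at offset 16384 inside the folio.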
* Re: [PATCH 0/2] btrfs: migrate the remaining functions exposed by a full fstests with larger metadata folios
From: David Sterba @ 2023-12-13 22:39 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Dec 12, 2023 at 03:54:08PM +1030, Qu Wenruo wrote:
> [REPO]
> This patchset along with all the previous migrations (and the final
> enablement patch) can be found here:
>
> https://github.com/adam900710/linux/tree/eb_memory
>
> With all the previous migrations (although only tested without larger
> folios), we are finally just one step away from enabling larger folio
> support for btrfs metadata.
>
> During my local full fstests runs with larger metadata folios, only two
> bugs were hit, both related to code paths not yet handling folios
> correctly:
>
> - eb_bitmap_offset()
> - btrfs_repair_eb_io_failure()
>
> Otherwise my local branch can already pass local fstests without new
> regressions.
>
> So here are the final (and, I hope, the last) migrations for the involved
> metadata code paths, before the final patch enabling larger folio
> support.
Great, thanks. We'll need to test the first batch of folio conversions,
but so far it seems OK. Enabling the higher order folios can be done
around rc3 time if we want to target the next major release, or we can
postpone it to the following one if needed.