From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 0/5] btrfs: add the missing preparations exposed by initial large data folio support
Date: Sat, 29 Mar 2025 19:49:35 +1030 [thread overview]
Message-ID: <cover.1743239672.git.wqu@suse.com> (raw)
[CHANGELOG]
v2:
- Rebased to the latest misc-next
As there are some conflicts regarding:
* parameter list indent
* iov_iter parameter name
- Fix a busy loop caused by underflowing check_range_has_page() parameters
This is only happening for fs block size < page size cases, e.g.
the fs block size is 4K, page size is 64K.
And the lockend is (4K - 1).
The page lock end in btrfs_punch_hole_lock_range() will be (u64)-1, as
round_down(lockend + 1, PAGE_SIZE) is 0 and subtracting 1 underflows,
causing btrfs_punch_hole_lock_range() to busy loop waiting for all
folios to be purged.
With all the existing preparations in place, I can finally enable the
initial large data folio support and start testing (i.e., short
fsstress loops).
To no one's surprise, it immediately exposed several problems:
- A critical subpage helper is still using offset_in_page()
This causes an incorrect start bit calculation and triggers the
ASSERT() inside subpage locking.
Fixed in the first patch.
- Fix a dead busy loop in btrfs_punch_hole_lock_range()
This is only happening when fs block size < page size.
@page_lockend can underflow to (u64)-1, causing
btrfs_punch_hole_lock_range() to busy loop waiting for all the pages
of the inode to be dropped, while truncate_pagecache_range() is only
called for the range inside the first page (i.e., it drops nothing).
Fixed in the second patch.
- Buffered write can not shrink reserved space properly
Since we're reserving data and metadata space before grabbing the
folio, we have to reserve as much space as possible, rather than just
reserving space for the range inside the first page.
If the folio we got is smaller than what we expect, we have to shrink
the reserved space, but things like btrfs_delalloc_release_extents()
can not handle it.
Fixed in the third patch, with a new helper
btrfs_delalloc_shrink_extents().
This will also be a topic in the iomap migration: iomap uses
valid_folio() callbacks to make sure no extent map is changed during
the buffered write, thus it can reserve a large range of space up
front, instead of our current over-reserve-then-shrink approach.
Our behavior is very safe, but less optimized than iomap's.
- Buffered write is not utilizing large folios
Since buffered write is the main entrance to allocate large folios,
without its support there will be no large folios at all.
Addressed in the fourth patch.
- A dead busy loop inside btrfs_punch_hole_lock_range()
It turns out that the usage of filemap_range_has_page() is never a
good idea for large folios, as we can easily hit the following case:
      start                    end
        |                       |
  |//|//|//|//| | | | | | | | |//|//|
  \           /               \     /
     Folio A                  Folio B
Fixed in the last patch, with a helper check_range_has_page() to do
the check with large folios in mind.
This will also be a topic in the iomap migration, as our zero range
behavior is quite different from iomap's, and the
filemap_range_has_page() behavior looks like overkill to me.
I'm pretty sure more hidden bugs will surface once I throw the whole
fstests suite at my local branch, but these are all the bugs found so
far.
Qu Wenruo (5):
btrfs: subpage: fix a bug that blocks large folios
btrfs: avoid page_lockend underflow in btrfs_punch_hole_lock_range()
btrfs: refactor how we handle reserved space inside copy_one_range()
btrfs: prepare btrfs_buffered_write() for large data folios
btrfs: prepare btrfs_punch_hole_lock_range() for large data folios
fs/btrfs/delalloc-space.c | 24 +++++
fs/btrfs/delalloc-space.h | 3 +-
fs/btrfs/file.c | 203 ++++++++++++++++++++++++++++----------
fs/btrfs/subpage.c | 2 +-
4 files changed, 177 insertions(+), 55 deletions(-)
--
2.49.0
2025-03-29 9:19 Qu Wenruo [this message]
2025-03-29 9:19 ` [PATCH v2 1/5] btrfs: subpage: fix a bug that blocks large folios Qu Wenruo
2025-03-29 9:19 ` [PATCH v2 2/5] btrfs: avoid page_lockend underflow in btrfs_punch_hole_lock_range() Qu Wenruo
2025-03-31 11:24 ` Filipe Manana
2025-03-29 9:19 ` [PATCH v2 3/5] btrfs: refactor how we handle reserved space inside copy_one_range() Qu Wenruo
2025-03-29 9:19 ` [PATCH v2 4/5] btrfs: prepare btrfs_buffered_write() for large data folios Qu Wenruo
2025-03-29 9:19 ` [PATCH v2 5/5] btrfs: prepare btrfs_punch_hole_lock_range() " Qu Wenruo
2025-03-31 11:50 ` Filipe Manana
2025-03-31 21:19 ` Qu Wenruo