From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 4/4] btrfs: prepare btrfs_punch_hole_lock_range() for large data folios
Date: Sat, 29 Mar 2025 13:48:44 +1030 [thread overview]
Message-ID: <4e47199a-8257-457f-933d-d310e67bdd64@gmx.com> (raw)
In-Reply-To: <CAL3q7H4hxQbKfcuW+hPEFkiJk3KX9bRzNbjLms+Z1z_U=9PNPw@mail.gmail.com>
在 2025/3/29 04:27, Filipe Manana 写道:
> On Thu, Mar 27, 2025 at 10:33 PM Qu Wenruo <wqu@suse.com> wrote:
>>
>> The function btrfs_punch_hole_lock_range() needs to make sure there is
>> no other folio in the range, thus it goes with filemap_range_has_page(),
>> which works pretty fine.
>>
>> But if we have large folios, under the following case
>> filemap_range_has_page() will always return true, forcing
>> btrfs_punch_hole_lock_range() to do a very time consuming busy loop:
>>
>> start end
>> | |
>> |//|//|//|//| | | | | | | | |//|//|
>> \ / \ /
>> Folio A Folio B
>>
>> In above case, folio A and B contains our start/end index, and there is
>
> contains -> contain
> index -> indexes
> there is -> there are
>
>> no other folios in the range.
>> Thus there is no other folios and we do not need to retry inside
>
> is -> are.
>
> "Thus there is no other folios" is repeated from the previous
> sentence, so it can be just:
>
> Thus we do not need to retry inside...
>
>> btrfs_punch_hole_lock_range().
>>
>> To prepare for large data folios, introduce a helper,
>> check_range_has_page(), which will:
>>
>> - Grab all the folios inside the range
>>
>> - Skip any large folios that covers the start and end index
>
> covers -> cover
> index -> indexes
>
>>
>> - If any other folios is found return true
>
> is found -> are found
>
>>
>> - Otherwise return false
>>
>> This new helper is going to handle both large folios and regular ones.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 49 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>> index 5d10ae321687..417c90ffc6fa 100644
>> --- a/fs/btrfs/file.c
>> +++ b/fs/btrfs/file.c
>> @@ -2157,6 +2157,54 @@ static int find_first_non_hole(struct btrfs_inode *inode, u64 *start, u64 *len)
>> return ret;
>> }
>>
>> +/*
>> + * The helper to check if there is no folio in the range.
>> + *
>> + * We can not utilized filemap_range_has_page() in a filemap with large folios
>> + * as we can hit the following false postive:
>
> postive -> positive
>
>> + *
>> + * start end
>> + * | |
>> + * |//|//|//|//| | | | | | | | |//|//|
>> + * \ / \ /
>> + * Folio A Folio B
>> + *
>> + * That large folio A and B covers the start and end index.
>
> covers -> cover
> index -> indexes
>
> Anyway, those are minor things, so:
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
Thanks a lot for the review, although there is some more fixes needed in
this patch.
Thus I'll resent with all your comments addressed, but not with RoB tag
in case you find something else wrong in the fix.
>
> Thanks.
>
>
>
>> + * In that case filemap_range_has_page() will always return true, but the above
>> + * case is fine for btrfs_punch_hole_lock_range() usage.
>> + *
>> + * So here we only ensure that no other folio is in the range, excluding the
>> + * head/tail large folio.
>> + */
>> +static bool check_range_has_page(struct inode *inode, u64 start, u64 end)
>> +{
Fsstress exposed rare cases where btrfs_punch_hole_lock_range() can be
called with a range inside the same folio:
btrfs_punch_hole_lock_range: r/i=5/2745 start=4096(65536)
end=20479(18446744073709551615) enter
And since both @lockstart and @lockend are inside the same folio 0, the
page_lockend calculation will be -1, causing us to check to the end of
the inode.
But then we can hit a cases some one else is still holding the folio at 64K:
check_range_has_page: r/i=5/2745 start=65536(1)
end=18446744073709551615(281474976710655) enter
check_range_has_page: found folio=65536(1) size=65536(1)
Then we will hit a deadloop waiting for folio 64K meanwhile our zero
range doesn't even need that folio.
The original filemap_range_has_page() has a basic check to skip such
case completely, which is missing in the new helper.
Will get this fixed and resend.
Thanks,
Qu>> + struct folio_batch fbatch;
>> + bool ret = false;
>> + const pgoff_t start_index = start >> PAGE_SHIFT;
>> + const pgoff_t end_index = end >> PAGE_SHIFT;
>> + pgoff_t tmp = start_index;
>> + int found_folios;
>> +
>> + folio_batch_init(&fbatch);
>> + found_folios = filemap_get_folios(inode->i_mapping, &tmp, end_index,
>> + &fbatch);
>> + for (int i = 0; i < found_folios; i++) {
>> + struct folio *folio = fbatch.folios[i];
>> +
>> + /* A large folio begins before the start. Not a target. */
>> + if (folio->index < start_index)
>> + continue;
>> + /* A large folio extends beyond the end. Not a target. */
>> + if (folio->index + folio_nr_pages(folio) > end_index)
>> + continue;
>> + /* A folio doesn't cover the head/tail index. Found a target. */
>> + ret = true;
>> + break;
>> + }
>> + folio_batch_release(&fbatch);
>> + return ret;
>> +}
>> +
>> static void btrfs_punch_hole_lock_range(struct inode *inode,
>> const u64 lockstart,
>> const u64 lockend,
>> @@ -2188,8 +2236,7 @@ static void btrfs_punch_hole_lock_range(struct inode *inode,
>> * locking the range check if we have pages in the range, and if
>> * we do, unlock the range and retry.
>> */
>> - if (!filemap_range_has_page(inode->i_mapping, page_lockstart,
>> - page_lockend))
>> + if (!check_range_has_page(inode, page_lockstart, page_lockend))
>> break;
>>
>> unlock_extent(&BTRFS_I(inode)->io_tree, lockstart, lockend,
>> --
>> 2.49.0
>>
>>
>
prev parent reply other threads:[~2025-03-29 3:18 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-27 22:31 [PATCH 0/4] btrfs: add the missing preparations exposed by initial large data folio support Qu Wenruo
2025-03-27 22:31 ` [PATCH 1/4] btrfs: subpage: fix a bug that blocks large folios Qu Wenruo
2025-03-28 17:36 ` Filipe Manana
2025-03-27 22:31 ` [PATCH 2/4] btrfs: refactor how we handle reserved space inside copy_one_range() Qu Wenruo
2025-03-28 17:45 ` Filipe Manana
2025-03-27 22:31 ` [PATCH 3/4] btrfs: prepare btrfs_buffered_write() for large data folios Qu Wenruo
2025-03-28 17:51 ` Filipe Manana
2025-03-27 22:31 ` [PATCH 4/4] btrfs: prepare btrfs_punch_hole_lock_range() " Qu Wenruo
2025-03-28 17:57 ` Filipe Manana
2025-03-29 3:18 ` Qu Wenruo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4e47199a-8257-457f-933d-d310e67bdd64@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox