From: Qu Wenruo <wqu@suse.com>
To: Johannes Thumshirn <jth@kernel.org>, Chris Mason <clm@fb.com>,
Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
Filipe Manana <fdmanana@suse.com>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>
Subject: Re: [PATCH v2 3/3] btrfs: update stripe_extent delete loop assumptions
Date: Thu, 11 Jul 2024 17:25:02 +0930
Message-ID: <5838503b-a4aa-4023-901b-99d637cadac4@suse.com>
In-Reply-To: <ca001842-92f4-46ff-80ee-e7a8a97fc433@suse.com>
On 2024/7/11 17:14, Qu Wenruo wrote:
>
>
> On 2024/7/11 16:25, Qu Wenruo wrote:
>>
>>
>> On 2024/7/11 15:51, Johannes Thumshirn wrote:
>>> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>>
>>> btrfs_delete_raid_extent() was written under the assumption that its
>>> call-chain always passes a start, length tuple that matches a single
>>> extent. But btrfs_delete_raid_extent() is called by
>>> do_free_extent_accounting(), which in turn is called by
>>> __btrfs_free_extent().
>>
>> But at the call site in __btrfs_free_extent(), it is still called for a
>> single extent.
>>
>> Otherwise we would get an error and abort the current transaction.
>
> Or does it mean, one data extent can have multiple RST entries?
>
> Is that a non-zoned RST specific behavior?
> As I still remember that we split ordered extents for zoned devices, so
> that it should always have one extent for each split OE.
OK, it's indeed an RST specific behavior (at least for RST with
non-zoned devices).
I can have the following layout:
    item 15 key (258 EXTENT_DATA 419430400) itemoff 15306 itemsize 53
        generation 10 type 1 (regular)
        extent data disk byte 1808793600 nr 117440512
        extent data offset 0 nr 117440512 ram 117440512
        extent compression 0 (none)

This is a large data extent, 112MiB in length.

Meanwhile, it is covered by 3 separate RST entries:

    item 13 key (1808793600 RAID_STRIPE 33619968) itemoff 15835 itemsize 32
        stripe 0 devid 2 physical 1787822080
        stripe 1 devid 1 physical 1808793600
    item 14 key (1842413568 RAID_STRIPE 58789888) itemoff 15803 itemsize 32
        stripe 0 devid 2 physical 1821442048
        stripe 1 devid 1 physical 1842413568
    item 15 key (1901203456 RAID_STRIPE 25030656) itemoff 15771 itemsize 32
        stripe 0 devid 2 physical 1880231936
        stripe 1 devid 1 physical 1901203456
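
The three RAID_STRIPE keys above tile the data extent exactly (each key's objectid plus offset equals the next key's objectid, and the three lengths sum to 117440512). A minimal userspace sketch of the walk the patched loop has to do, not the kernel code itself: struct rst_entry, count_entries_in_range() and example_entries are hypothetical names made up for illustration, and the linear search merely stands in for the btree lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one RAID_STRIPE key: objectid = start, offset = length. */
struct rst_entry {
	uint64_t start;
	uint64_t length;
};

/*
 * Walk the entries the way the patched delete loop walks the tree:
 * find the entry that begins at 'start', consume it, then advance by
 * its length. Returns the number of entries consumed, or -1 if the
 * range is not exactly tiled by entries.
 */
static int count_entries_in_range(const struct rst_entry *entries, size_t nr,
				  uint64_t start, uint64_t length)
{
	int consumed = 0;

	assert(entries != NULL);

	while (length > 0) {
		const struct rst_entry *found = NULL;

		/* Linear search; the kernel would do a btree lookup here. */
		for (size_t i = 0; i < nr; i++) {
			if (entries[i].start == start) {
				found = &entries[i];
				break;
			}
		}

		/* No entry at this address, or the entry overruns the range. */
		if (!found || found->length == 0 || found->length > length)
			return -1;

		/* Mirrors "start += key.offset; length -= key.offset;" in the patch. */
		start += found->length;
		length -= found->length;
		consumed++;
	}

	return consumed;
}

/* The three RAID_STRIPE keys from the dump above. */
static const struct rst_entry example_entries[] = {
	{ 1808793600ULL, 33619968ULL },
	{ 1842413568ULL, 58789888ULL },
	{ 1901203456ULL, 25030656ULL },
};
```

Calling count_entries_in_range(example_entries, 3, 1808793600ULL, 117440512ULL) with the disk byte and length of the EXTENT_DATA item consumes all three entries, which is the case a single-extent assumption would miss.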
So yes, it's possible to have multiple RST entries for a single data
extent; this is no longer the old zoned behavior.
In that case, the patch looks fine to me.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Thanks,
Qu
>
> Thanks,
> Qu
>>
>>>
>>> But this call-chain passes in a start address and a length that can
>>> possibly match multiple on-disk extents.
>>
>> Would you mind giving a more detailed example of this?
>>
>> Thanks,
>> Qu
>>
>>>
>>> To make this possible, we have to adjust the start and length of each
>>> btree node lookup, to not delete beyond the requested range.
>>>
>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>> ---
>>> fs/btrfs/raid-stripe-tree.c | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
>>> index fd56535b2289..6f65be334637 100644
>>> --- a/fs/btrfs/raid-stripe-tree.c
>>> +++ b/fs/btrfs/raid-stripe-tree.c
>>> @@ -66,6 +66,11 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le
>>>  		if (ret)
>>>  			break;
>>> +		start += key.offset;
>>> +		length -= key.offset;
>>> +		if (length == 0)
>>> +			break;
>>> +
>>>  		btrfs_release_path(path);
>>>  	}
>>>
>>
>