From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org,
Christoph Anton Mitterer <calestyo@scientia.org>
Subject: Re: [PATCH] btrfs: defrag: add under utilized extent to defrag target list
Date: Sat, 13 Jan 2024 13:47:25 +1030
Message-ID: <7bef3393-a1b4-4a18-98cb-508cfb1ca6ee@gmx.com>
In-Reply-To: <20240112155806.GS31555@twin.jikos.cz>
On 2024/1/13 02:28, David Sterba wrote:
> On Thu, Jan 11, 2024 at 04:54:47PM +1030, Qu Wenruo wrote:
>>
>>
>> On 2024/1/11 03:39, David Sterba wrote:
>>> On Fri, Jan 05, 2024 at 06:03:40PM +1030, Qu Wenruo wrote:
>>>> [BUG]
>>>> The following script can lead to a very under utilized extent and we
>>>> have no way to use defrag to properly reclaim its wasted space:
>>>>
>>>> # mkfs.btrfs -f $dev
>>>> # mount $dev $mnt
>>>> # xfs_io -f -c "pwrite 0 128M" $mnt/foobar
>>>> # sync
>>>> # btrfs filesystem defrag $mnt/foobar
>>>> # sync
>>>
>>> I don't see what's wrong with this example, as Filipe noted there's a
>>> truncate missing, but still this should be explained better.
>>
>> Sorry, the full explanation looks like this:
>>
>> After the above truncation, we get the following file extent layout:
>>
>> item 6 key (257 EXTENT_DATA 0) itemoff 15813 itemsize 53
>> generation 7 type 1 (regular)
>> extent data disk byte 298844160 nr 134217728
>> extent data offset 0 nr 4096 ram 134217728
>> extent compression 0 (none)
>>
>> That would be the last 4K referring to that 128M extent, i.e. (128M - 4K)
>> bytes wasted, or about 99.997% of the extent.
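For reference, the wasted fraction can be checked with a few lines of Python. The two sizes are taken from the file extent item quoted above (`nr 4096` and `disk byte ... nr 134217728`); the variable names are only illustrative.

```python
# Values from the file extent item quoted above.
extent_bytes = 134217728   # full size of the on-disk extent
referenced_bytes = 4096    # bytes of that extent the file still references

wasted_bytes = extent_bytes - referenced_bytes
wasted_percent = 100.0 * wasted_bytes / extent_bytes

# prints "wasted: 134213632 bytes (99.997% of the extent)"
print(f"wasted: {wasted_bytes} bytes ({wasted_percent:.3f}% of the extent)")
```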
>
> Ok, so it's the known issue.
>
>> Normally we expect defrag to properly re-dirty the extent so that we can
>> free that 128M extent.
>> But defrag won't touch it at all, mostly because there is no next extent
>> to merge with.
>>
>>> Is this the problem when an overwritten and shared extent is partially
>>> overwritten but still occupying the whole range, aka. bookend extent?
>>> If yes, defrag was never meant to deal with that, though we could use
>>> the interface for that.
>>
>> If we don't go defrag, there is really no good way to do it safely.
>>
>> Sure you can copy the file to another non-btrfs location or dd it.
>> But that's not safe if there is still some process accessing it etc.
>>
>>> As Andrei pointed out, this is more like a garbage collection, get rid
>>> of extent that is partially unreachable. Detecting such extent requires
>>> looking for the unreferenced part of the extent while defragmentation
>>> deals with live data. This could be a new ioctl entirely too. But first
>>> I'd like to know if we're talking about the same thing.
>>
>> Yes, we're talking about the bookend problem.
>> I would expect defrag to free most, if not all, such bookend extents.
>> (And that's exactly what I recommended to the initial reporter.)
>
> Here the defrag can mean two things, the interface (ioctl and command)
> and the implementation. As defrag tries to merge adjacent extents or
> coalesce small extents and move it to a new location, this may not be
> always necessary just to get rid of the unreachable extent parts.
To me, defrag just means re-dirtying the file range.
Whether that results in contiguous extents or in more fragments is not
guaranteed.
(E.g. when defragging a fragmented file on a filesystem that is itself
heavily fragmented, or when very high memory pressure forces frequent
writeback.)
>
> From the interface side, we can add a mode that does only the garbage
> collection, effectively just looking up the unreachable parts, trimming
> the extents but leaving the live data intact.
Another thing is, the same bookend problem leads to very different
behavior depending on whether the file extent has an adjacent extent.
To me, the most basic defrag just re-dirties all extents, no matter
what (and I believe some older filesystems do exactly that?).
We are the ones adding extra checks to avoid wasting IO on extents that
are already in good shape.
>
> The modes of operation:
>
> - current defrag, move if filters and conditions allow that
> - defrag + garbage collect extents
> - just garbage collect extents
Thus I don't really think there is any difference between garbage
collection and "defrag".
They are the same thing: just re-dirtying the file extents.
(And as stated above, a better result is never guaranteed.)
What we really want is just to add extra filters that allow end users
to re-dirty the last file extent.
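
A minimal Python sketch of that idea, not the kernel implementation: every mode ends in the same action (re-dirty the extent) and only the filter differs. All names here (`Mode`, `Extent`, `should_redirty`) and the 25% `util_threshold` are hypothetical and chosen only for illustration, and the existing merge checks are reduced to a single boolean.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DEFRAG = "defrag"          # current behavior: merge-oriented filters only
    DEFRAG_GC = "defrag+gc"    # merge filters plus the utilization filter
    GC_ONLY = "gc"             # utilization filter only


@dataclass
class Extent:
    referenced_bytes: int      # bytes of the on-disk extent the file still uses
    disk_bytes: int            # full size of the on-disk extent
    can_merge_with_next: bool  # stand-in for the existing defrag checks


def is_under_utilized(ext: Extent, util_threshold: float) -> bool:
    """True if the file references less than util_threshold of the extent."""
    return ext.referenced_bytes < ext.disk_bytes * util_threshold


def should_redirty(ext: Extent, mode: Mode, util_threshold: float = 0.25) -> bool:
    """Same action in every mode (re-dirty); only the filter changes."""
    wants_merge = ext.can_merge_with_next
    wants_gc = is_under_utilized(ext, util_threshold)
    if mode is Mode.DEFRAG:
        return wants_merge
    if mode is Mode.GC_ONLY:
        return wants_gc
    return wants_merge or wants_gc


# The extent from the dump: 4K referenced out of 128M, no next extent to merge.
bookend = Extent(referenced_bytes=4096, disk_bytes=134217728,
                 can_merge_with_next=False)
for mode in Mode:
    print(mode.value, should_redirty(bookend, mode))
```

With these assumed filters, the merge-only mode skips the bookend extent while both modes that include the utilization filter pick it up.
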
Thanks,
Qu
>
> The third mode is for use case to let it run on the whole filesystem but
> not try to rewrite everything.
>
> I'm not sure how would it affect send/receive, read-only subvolumes
> should not be touched.
>