From: Reindl Harald <h.reindl@thelounge.net>
To: Lukas Czerner <lczerner@redhat.com>
Cc: Wang Shilong <wangshilong1991@gmail.com>,
linux-ext4@vger.kernel.org, Wang Shilong <wshilong@ddn.com>,
Shuichi Ihara <sihara@ddn.com>,
Andreas Dilger <adilger@dilger.ca>
Subject: Re: [PATCH] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim
Date: Wed, 27 May 2020 12:56:44 +0200 [thread overview]
Message-ID: <ece9cd79-6d97-db36-66bb-f02bd6bf6573@thelounge.net> (raw)
In-Reply-To: <20200527103214.knm2vmnwjt64j55l@work>
Am 27.05.20 um 12:32 schrieb Lukas Czerner:
> On Wed, May 27, 2020 at 12:11:52PM +0200, Reindl Harald wrote:
>>
>> Am 27.05.20 um 11:57 schrieb Lukas Czerner:
>>> On Wed, May 27, 2020 at 11:32:02AM +0200, Reindl Harald wrote:
>>>>
>>>>
>>>> Am 27.05.20 um 11:19 schrieb Lukas Czerner:
>>>>> On Wed, May 27, 2020 at 04:38:50PM +0900, Wang Shilong wrote:
>>>>>> From: Wang Shilong <wshilong@ddn.com>
>>>>>>
>>>>>> Currently WAS_TRIMMED flag is not persistent, whenever filesystem was
>>>>>> remounted, fstrim need walk all block groups again, the problem with
>>>>>> this is FSTRIM could be slow on very large LUN SSD based filesystem.
>>>>>>
>>>>>> To avoid this kind of problem, we introduce a block group flag
>>>>>> EXT4_BG_WAS_TRIMMED, the side effect of this is we need introduce
>>>>>> extra one block group dirty write after trimming block group.
>>>>
>>>> would that also fix the issue that *way too much* is trimmed all the
>>>> time, no matter if it's a thin provisioned vmware disk or a phyiscal
>>>> RAID10 with SSD
>>>
>>> no, the mechanism remains the same, but the proposal is to make it
>>> pesisten across re-mounts.
>>>
>>>>
>>>> no way of 315 MB deletes within 2 hours or so on a system with just 485M
>>>> used
>>>
>>> The reason is that we're working on block group granularity. So if you
>>> have almost free block group, and you free some blocks from it, the flag
>>> gets freed and next time you run fstrim it'll trim all the free space in
>>> the group. Then again if you free some blocks from the group, the flags
>>> gets cleared again ...
>>>
>>> But I don't think this is a problem at all. Certainly not worth tracking
>>> free/trimmed extents to solve it.
>>
>> it is a problem
>>
>> on a daily "fstrim -av" you trim gigabytes of alredy trimmed blocks
>> which for example on a vmware thin provisioned vdisk makes it down to
>> CBT (changed-block-tracking)
>>
>> so instead completly ignore that untouched space thanks to CBT it's
>> considered as changed and verified in the follow up backup run which
>> takes magnitutdes longer than needed
>
> Looks like you identified the problem then ;)
well, in a perfect world.....
> But seriously, trim/discard was always considered advisory and the
> storage is completely free to do whatever it wants to do with the
> information. I might even be the case that the discard requests are
> ignored and we might not even need optimization like this. But
> regardless it does take time to go through the block gropus and as a
> result this optimization is useful in the fs itself.
luckily at least fstrim is non-blocking in a vmware environment, on my
physical box it takes ages
this machine *does nothing* than wait to be cloned, 235 MB pretended
deleted data within 50 minutes is absurd on a completly idle guest
so even when i am all in for optimizations thatÄs way over top
[root@master:~]$ fstrim -av
/boot: 0 B (0 bytes) trimmed on /dev/sda1
/: 235.8 MiB (247201792 bytes) trimmed on /dev/sdb1
[root@master:~]$ df
Filesystem Type Size Used Avail Use% Mounted on
/dev/sdb1 ext4 5.8G 502M 5.3G 9% /
/dev/sda1 ext4 485M 39M 443M 9% /boot
> However it seems to me that the situation you're describing calls for
> optimization on a storage side (TP vdisk in your case), not file system
> side.
>
> And again, for fine grained discard you can use -o discard
with a terrible performance impact at runtime
next prev parent reply other threads:[~2020-05-27 10:56 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-27 7:38 [PATCH] ext4: introduce EXT4_BG_WAS_TRIMMED to optimize trim Wang Shilong
2020-05-27 9:19 ` Lukas Czerner
2020-05-27 9:32 ` Reindl Harald
2020-05-27 9:57 ` Lukas Czerner
2020-05-27 10:11 ` Reindl Harald
2020-05-27 10:32 ` Lukas Czerner
2020-05-27 10:56 ` Reindl Harald [this message]
2020-05-27 19:11 ` Andreas Dilger
2020-05-27 10:06 ` Wang Shilong
2020-05-27 10:21 ` Lukas Czerner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ece9cd79-6d97-db36-66bb-f02bd6bf6573@thelounge.net \
--to=h.reindl@thelounge.net \
--cc=adilger@dilger.ca \
--cc=lczerner@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=sihara@ddn.com \
--cc=wangshilong1991@gmail.com \
--cc=wshilong@ddn.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox