From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Remi Gauvin <remi@georgianit.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs autodefrag wrote 5TB in one day to a 0.5TB SSD without a measurable benefit
Date: Mon, 14 Mar 2022 19:39:51 -0400
Message-ID: <Yi/SR7CNbtDvIsPn@hungrycats.org>
In-Reply-To: <23441a6c-3860-4e99-0e56-43490d8c0ac2@georgianit.com>

On Mon, Mar 14, 2022 at 07:07:44PM -0400, Remi Gauvin wrote:
> On 2022-03-14 6:51 p.m., Zygo Blaxell wrote:
> > If you never use prealloc or defrag, it's usually not a problem.
>
> You're assuming that the file is created from scratch on that media.
> VMs and databases that are restored from images/backups, or re-written
> as some kind of maintenance (shrink VM images, compress database, or
> whatever) become a huge problem.

VM images do sometimes combine sequential and random writes and create
a lot of waste. They are one of the outlier workloads that are a problem
case even with a normal life cycle (as opposed to one interrupted by a
backup restore). A CLUSTER command in SQL can instantly double a DB's size.
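
As an illustration (table and index names are hypothetical): in
PostgreSQL, CLUSTER physically rewrites every row of a table into new
extents, so the filesystem briefly holds both the old and the new copy,
and on btrfs the old extents can stay pinned as long as anything still
references them:

```sql
-- Hypothetical table/index names, for illustration only.
-- CLUSTER rewrites the whole table in index order; until the old
-- extents are released, on-disk usage can be roughly doubled.
CLUSTER accounts USING accounts_pkey;
```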

> In one instance, I had a VM image that was taking up more than 100% of
> its file size due to lack of defrag. For a while I was regularly
> defragmenting those with a target size of 100MB as the only way to
> garbage collect, but that is a shameful waste of write cycles on an
> SSD. Adding compress-force=lzo was the only way for me to solve this
> issue (and it even seems to help performance (on SSD, *not* HDD),
> though probably not for small random reads; I haven't properly
> compared that).
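
The workaround described above can be sketched roughly as follows; the
file and mount-point paths are placeholders, and btrfs-progs must be
installed:

```shell
# Defragment a VM image with a 100MB target extent size; this rewrites
# small extents into larger ones, releasing space pinned by partially
# overwritten extents (at the cost of SSD write cycles).
btrfs filesystem defragment -t 100M /srv/vm/disk.img

# Alternatively, mount with forced LZO compression so writes are split
# into small compressed extents, limiting pinned waste up front.
# Example /etc/fstab line (the UUID is a placeholder):
#   UUID=xxxx-xxxx  /srv/vm  btrfs  compress-force=lzo  0  0
mount -o remount,compress-force=lzo /srv/vm
```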

Ideally we'd have a proper garbage-collection tool for btrfs that ran
defrag _only_ on extents that are holding references to wasted space;
that side effect of defrag is what most people actually want, rather
than what defrag nominally tries to do. I have it on my already
too-long to-do list.
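
Until such a tool exists, one way to estimate how much space pinned
extents are wasting is to compare on-disk usage against referenced
bytes; the `compsize` utility (from btrfs-progs' companion compsize
package) reports both. The path below is a placeholder:

```shell
# Report "Disk usage" vs. "Referenced" for a file or subvolume.
# A large gap between the two suggests extents pinned by partial
# overwrites, which a targeted defrag could release.
compsize /srv/vm/disk.img
```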

If we're adding a mount option for this (I'm not opposed to it; I'm
just pointing out that it's not the first tool to reach for), then
ideally we'd also overload it for the compressed batch size (currently
hardcoded at 512K). I have IO patterns that would benefit from
compress-force writing 128M uncompressed extents, providing enough
extents at a time to keep all the cores busy, each sequentially
compressing a single extent.