From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Jan Ziak <0xe2.0x9a.0x9b@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: Btrfs autodefrag wrote 5TB in one day to a 0.5TB SSD without a measurable benefit
Date: Mon, 7 Mar 2022 08:48:15 +0800
Message-ID: <455d2012-aeaf-42c5-fadb-a5dc67beff35@gmx.com>
In-Reply-To: <CAODFU0rZEy064KkSK1juHA6=r2zC4=Go8Me2V2DqHWb-AirL-Q@mail.gmail.com>
On 2022/3/6 23:59, Jan Ziak wrote:
> I would like to report that btrfs in Linux kernel 5.16.12 mounted with
> the autodefrag option wrote 5TB in a single day to a 1TB SSD that is
> about 50% full.
>
> Defragmenting 0.5TB on a drive that is 50% full should write far less than 5TB.
If you were using the defrag ioctl, that would be a solid expectation.
>
> Benefits to the fragmentation of the most written files over the
> course of the one day (sqlite database files) are nil. Please see the
> data below. Also note that the sqlite file is using up to 10 GB more
> than it should due to fragmentation.
Autodefrag marks any file that receives small writes (<64K) for scanning.
Extents smaller than 64K are then re-dirtied for writeback.
So in theory, if the cleaner is triggered very frequently to run
autodefrag, it can indeed easily amplify the writes.
Are you using the commit= mount option? A small value would reduce the
commit interval and thus trigger autodefrag more frequently.
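One way to answer the commit= question is to look at the mount options of the
filesystem. The following is only a minimal sketch: the helper name
commit_interval and the mount point /mnt/data are made up for illustration,
and the fallback of 30 seconds is the btrfs default commit interval.

```shell
#!/bin/sh
# Print the commit interval (in seconds) implied by a comma-separated
# mount option string. Falls back to 30, the btrfs default, when no
# commit= option is present. (commit_interval is a hypothetical helper.)
commit_interval() {
    opts=$1
    interval=30
    old_ifs=$IFS
    IFS=,
    # Split the option string on commas and look for commit=<seconds>.
    for opt in $opts; do
        case $opt in
            commit=*) interval=${opt#commit=} ;;
        esac
    done
    IFS=$old_ifs
    echo "$interval"
}

# Usage (hypothetical mount point /mnt/data):
#   commit_interval "$(findmnt -n -o OPTIONS /mnt/data)"
```

A value well below 30 would mean commits, and with them autodefrag runs, are
triggered more often than with the default.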
>
> CPU utilization on an otherwise idle machine is approximately 600% all
> the time: btrfs-cleaner 100%, kworkers...btrfs 500%.
The real question is why the cleaner's CPU usage is at 100%.
Would you please apply this patch to your kernel?
https://patchwork.kernel.org/project/linux-btrfs/patch/bf2635d213e0c85251c4cd0391d8fbf274d7d637.1645705266.git.wqu@suse.com/
Then enable the following trace events:
btrfs:defrag_one_locked_range
btrfs:defrag_add_target
btrfs:defrag_file_start
btrfs:defrag_file_end
These trace events should show why we keep doing the same re-dirty again
and again and, more importantly, why the CPU usage is so high.
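For reference, a minimal sketch of enabling those four events through
tracefs. It assumes tracefs is mounted at /sys/kernel/tracing (on some
systems it is under /sys/kernel/debug/tracing instead) and that the patch
above is applied, since the events do not exist otherwise. The function
name enable_defrag_events is only for illustration.

```shell
#!/bin/sh
# Enable the four btrfs defrag trace events under a given tracefs root.
# The default root is an assumption; pass another path as $1 if needed.
enable_defrag_events() {
    tracefs=${1:-/sys/kernel/tracing}
    for ev in defrag_one_locked_range defrag_add_target \
              defrag_file_start defrag_file_end; do
        f="$tracefs/events/btrfs/$ev/enable"
        if [ -e "$f" ]; then
            # Writing 1 to the per-event "enable" file turns the event on.
            echo 1 > "$f"
        else
            echo "missing event: $ev (is the patch applied?)" >&2
            return 1
        fi
    done
}

# Usage (as root):
#   enable_defrag_events && cat /sys/kernel/tracing/trace_pipe
```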
Thanks,
Qu
>
> I am not just asking you to fix this issue - I am asking you how is it
> possible for an algorithm that is significantly worse than O(N*log(N))
> to be merged into the Linux kernel in the first place!?
>
> Please try to avoid discussing no-CoW (chattr +C) in your response,
> because it is beside the point. Thanks.
>
> ----
>
> A day before:
>
> $ smartctl -a /dev/nvme0n1 | grep Units
> Data Units Read: 449,265,485 [230 TB]
> Data Units Written: 406,386,721 [208 TB]
>
> $ compsize file.sqlite
> Processed 1 file, 1757129 regular extents (2934077 refs), 0 inline.
> Type Perc Disk Usage Uncompressed Referenced
> TOTAL 100% 46G 46G 37G
> none 100% 46G 46G 37G
>
> ----
>
> A day after:
>
> $ smartctl -a /dev/nvme0n1 | grep Units
> Data Units Read: 473,211,419 [242 TB]
> Data Units Written: 417,249,915 [213 TB]
>
> $ compsize file.sqlite
> Processed 1 file, 1834778 regular extents (3050838 refs), 0 inline.
> Type Perc Disk Usage Uncompressed Referenced
> TOTAL 100% 47G 47G 37G
> none 100% 47G 47G 37G
>
> $ filefrag file.sqlite
> (Ctrl-C after waiting more than 10 minutes, consuming 100% CPU)
>
> ----
>
> Manual defragmentation decreased the file's size by 7 GB:
>
> $ btrfs-defrag file.sqlite
> $ sync
> $ compsize file.sqlite
> Processed 6 files, 13074 regular extents (20260 refs), 0 inline.
> Type Perc Disk Usage Uncompressed Referenced
> TOTAL 100% 40G 40G 37G
> none 100% 40G 40G 37G
>
> ----
>
> Sincerely
> Jan
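The figures quoted above can be cross-checked with a little arithmetic.
This is only a sketch: the function names are made up for illustration,
NVMe "Data Units" are taken as 512,000 bytes each (thousands of 512-byte
blocks), and the compsize sizes are rounded to whole GiB, so the average
extent sizes are approximate.

```shell
#!/bin/sh
# Decimal GB written between two NVMe "Data Units Written" readings.
# One data unit = 1000 * 512 bytes.
units_to_gb() {
    echo $(( ($2 - $1) * 512000 / 1000000000 ))
}

# Approximate average extent size in KiB, given disk usage in GiB
# (as printed by compsize) and the number of regular extents.
avg_extent_kib() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

units_to_gb 406386721 417249915   # prints 5561 (about 5.5 TB in one day)
avg_extent_kib 47 1834778         # prints 26   (before manual defrag)
avg_extent_kib 40 13074           # prints 3208 (after manual defrag)
```

The average extent of roughly 26 KiB before the manual defrag is below the
64K threshold mentioned above, which is consistent with autodefrag finding
targets in this file over and over.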