From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Christoph Anton Mitterer <calestyo@scientia.org>,
linux-btrfs@vger.kernel.org
Subject: Re: btrfs thinks fs is full, though 11GB should be still free
Date: Tue, 12 Dec 2023 11:28:45 +1030 [thread overview]
Message-ID: <3cfc3cdf-e6f2-400e-ac12-5ddb2840954d@gmx.com> (raw)
In-Reply-To: <f1f3b0f2a48f9092ea54f05b0f6596c58370e0b2.camel@scientia.org>
On 2023/12/12 10:42, Christoph Anton Mitterer wrote:
> On Tue, 2023-12-12 at 10:24 +1030, Qu Wenruo wrote:
>> Then the last thing is extent bookends.
>>
>> COW and small random writes can easily lead to extra space wasted by
>> extent bookends.
>
> Is there a way to check this? Would I just see many extents when I
> look at the files with filefrag?
>
>
> I mean, Prometheus continuously collects metrics from a number of nodes
> and (sooner or later) writes them to disk.
> I don't really know their code, so I have no idea whether they already
> write every tiny metric individually, or only large bunches thereof.
>
> Since they do maintain a WAL, I'd assume the former.
>
> Every now and then, the WAL is written to chunk files which are rather
> large, ~160M or so in my case, though that depends on how many metrics
> one collects. I think they always write data for a period of 2h.
> Later on, they further compact those chunks (I think after 8 hours and
> so on), in which case some larger rewriting would be done.
> Though in my case this doesn't happen, as I run Thanos on top of
> Prometheus, and for that one needs to disable Prometheus' own
> compaction.
>
>
> I had already looked at the extents for these "compacted" chunk files
> previously, but the worst file had only 32 extents (as reported by
> filefrag).
Filefrag doesn't work that well on btrfs AFAIK, since btrfs emits merged
extents through the fiemap ioctl, but for badly fragmented files filefrag
should still be enough to detect them.
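
If you want to see the individual extent sizes rather than just a count,
something like the following reproduces the overwrite pattern and inspects
the layout (a rough sketch; the temp path and sizes are just illustrative,
and the pinned-old-extent behavior only shows up when run on a btrfs mount):

```shell
# Write one 16M extent, then COW-overwrite its trailing 8M, then list extents.
F=$(mktemp)
dd if=/dev/zero of="$F" bs=1M count=16 conv=fsync 2>/dev/null          # single 16M write
dd if=/dev/zero of="$F" bs=1M count=8 seek=8 conv=notrunc,fsync 2>/dev/null  # rewrite the tail
# -v prints each extent's logical offset and length, not just "N extents found"
command -v filefrag >/dev/null && filefrag -v "$F" || true
rm -f "$F"
```

On btrfs you should see (at least) two extents afterwards, while the old
16M extent stays fully allocated on disk as long as its first half is
still referenced.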
>
> Looking at the WAL files:
> /data/main/prometheus/metrics2/wal# filefrag * | grep -v ' 0 extents found'
> 00001030: 82 extents found
> 00001031: 81 extents found
> 00001032: 79 extents found
> 00001033: 82 extents found
> 00001034: 78 extents found
> 00001035: 78 extents found
> 00001036: 81 extents found
> 00001037: 79 extents found
> 00001038: 79 extents found
> 00001039: 89 extents found
> 00001040: 80 extents found
> 00001041: 74 extents found
> 00001042: 81 extents found
> 00001043: 97 extents found
> 00001044: 101 extents found
> 00001045: 316 extents found
> checkpoint.00001029: FIBMAP/FIEMAP unsupported
>
> (I did the grep -v because there were a gazillion of empty WAL files,
> presumably created when the fs was already full.)
>
> The above numbers still don't look too bad, though, do they?
It depends; in my previous 16M example you only got 2 extents, but still
wasted 8M (33.3% of the allocated space).
But a WAL indeed looks like a bad pattern for btrfs.
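
For reference, the arithmetic behind that 16M example, as a small sketch
(the numbers are the ones from above, not measured values):

```shell
# Extent bookend accounting: the old extent stays fully allocated as long
# as any part of it is still referenced by the file.
EXTENT_MB=16                                # original extent, one COW write
OVERWRITE_MB=8                              # trailing half rewritten later
ALLOCATED_MB=$((EXTENT_MB + OVERWRITE_MB))  # old 16M pinned + new 8M = 24M
LIVE_MB=$EXTENT_MB                          # the file itself is still only 16M
WASTED_MB=$((ALLOCATED_MB - LIVE_MB))       # 8M unreachable but not freed
echo "allocated=${ALLOCATED_MB}M live=${LIVE_MB}M wasted=${WASTED_MB}M"
```

So a low extent count can still hide a lot of wasted space.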
>
> And checking all:
> # find /data/main/ -type f -execdir filefrag {} \; | cut -d : -f 2 | sort | uniq -c | sort -V
> 3706 0 extents found
> 450 1 extent found
> 25 3 extents found
> 62 2 extents found
> 1 8 extents found
> 1 9 extents found
> 1 10 extents found
> 1 11 extents found
> 1 32 extents found
> 1 74 extents found
> 1 80 extents found
> 1 89 extents found
> 1 97 extents found
> 1 101 extents found
> 1 316 extents found
> 2 78 extents found
> 2 82 extents found
> 3 5 extents found
> 3 79 extents found
> 3 81 extents found
> 6 4 extents found
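
Note that filefrag only counts extents; it can't show how many bytes those
extents still pin on disk versus how many the files actually reference.
If the btrfs-compsize package is installed, something like this shows that
difference (the path is just the one from your example above):

```shell
# "Disk Usage" noticeably larger than "Referenced" indicates bookend waste.
# -x keeps compsize on one filesystem; falls back gracefully if unavailable.
compsize -x /data/main 2>/dev/null \
    || echo "compsize unavailable or path not on btrfs"
```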
>
>
>
>> E.g. you write a 16M data extent, then overwrite the trailing 8M; now
>> we have two data extents, the old 16M and the new 8M, wasting 8M of
>> space.
>>
>> In that case, you can try defrag, but you still need to delete some
>> data first so that you have room to do the defrag...
>
>
> Well, my main concern is rather how to prevent this from happening in
> the first place... the data is already all backed up into Thanos, so I
> could also just wipe the fs.
> But this seems to occur repeatedly (well, okay, only twice so far O:-) ).
> So that would mean we have some IO pattern that "kills" btrfs.
That's why we have the "autodefrag" mount option for such use cases.
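
E.g. an /etc/fstab entry like the following enables it persistently (the
UUID and mount point are placeholders for your own filesystem):

```
# autodefrag queues files touched by small random writes for background
# defragmentation, which limits bookend buildup over time
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /data  btrfs  defaults,autodefrag  0  0
```

It can also be tested non-persistently with a remount, e.g.
"mount -o remount,autodefrag /data".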
Thanks,
Qu
>
>
> Cheers,
> Chris.