From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Christoph Anton Mitterer <calestyo@scientia.org>,
linux-btrfs@vger.kernel.org
Subject: Re: btrfs thinks fs is full, though 11GB should be still free
Date: Tue, 12 Dec 2023 11:28:45 +1030
Message-ID: <3cfc3cdf-e6f2-400e-ac12-5ddb2840954d@gmx.com>
In-Reply-To: <f1f3b0f2a48f9092ea54f05b0f6596c58370e0b2.camel@scientia.org>
On 2023/12/12 10:42, Christoph Anton Mitterer wrote:
> On Tue, 2023-12-12 at 10:24 +1030, Qu Wenruo wrote:
>> Then the last thing is extent bookends.
>>
>> COW and small random writes can easily lead to extra space wasted by
>> extent bookends.
>
> Is there a way to check this? Would I just see many extents when I
> look at the files with filefrag?
>
>
> I mean, Prometheus continuously collects metrics from a number of nodes
> and (sooner or later) writes them to disk.
> I don't really know their code, so I have no idea whether they write
> every tiny metric individually, or only in large batches.
>
> Since they do maintain a WAL, I'd assume the former.
>
> Every now and then, the WAL is written to chunk files which are rather
> large, well ~160M or so in my case, but that depends on how many
> metrics one collects. I think they always write data for a period of
> 2h.
> Later on, they further compact those chunks (I think after 8 hours and
> so on), in which case some larger rewrites would be done.
> Though in my case this doesn't happen, as I run Thanos on top of
> Prometheus, and for that one needs to disable Prometheus' own
> compaction.
>
>
> I had already looked at the extents for these "compacted" chunk files
> earlier, but the worst file had only 32 extents (as reported by
> filefrag).
Filefrag doesn't work that well on btrfs AFAIK, as btrfs reports merged
extents through the fiemap ioctl, but for heavily fragmented files
filefrag should be enough to detect them.
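A rough sketch of how the filefrag summary lines could be ranked by extent count. The helper name rank_extents is made up for illustration, it only parses lines of the form "name: N extents found", and on btrfs the counts are approximations for the reason given above.

```shell
# Hypothetical helper: read filefrag summary lines on stdin and print
# "<extent count> <file>", largest count first.
# Files with 0 extents and lines that do not end in "N extent(s) found"
# (e.g. "FIBMAP/FIEMAP unsupported") are skipped.
# Assumption: file names do not contain ": ".
rank_extents() {
    awk -F': ' '$NF ~ /^[0-9]+ extents? found$/ {
        split($NF, f, " ")
        if (f[1] + 0 > 0) print f[1], $1
    }' | sort -rn
}

# Usage, from inside the directory of interest:
#   filefrag * | rank_extents | head
```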
>
> Looking at the WAL files:
> /data/main/prometheus/metrics2/wal# filefrag * | grep -v ' 0 extents found'
> 00001030: 82 extents found
> 00001031: 81 extents found
> 00001032: 79 extents found
> 00001033: 82 extents found
> 00001034: 78 extents found
> 00001035: 78 extents found
> 00001036: 81 extents found
> 00001037: 79 extents found
> 00001038: 79 extents found
> 00001039: 89 extents found
> 00001040: 80 extents found
> 00001041: 74 extents found
> 00001042: 81 extents found
> 00001043: 97 extents found
> 00001044: 101 extents found
> 00001045: 316 extents found
> checkpoint.00001029: FIBMAP/FIEMAP unsupported
>
> (I did the grep -v because there were a gazillion empty WAL files,
> presumably created when the fs was already full).
>
> The above numbers still don't look too bad though, do they?
That depends. In my previous 16M example you only get 2 extents, but 8M
is still wasted (33.3% of the 24M allocated).
But a WAL indeed looks like a bad pattern for btrfs.
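To spell out the arithmetic: the old 16M extent stays fully allocated as long as any part of it is still referenced, so overwriting its last 8M leaves 24M on disk for 16M of file data. A small sketch, with an illustrative function name that is not part of any btrfs tool:

```shell
# Illustrative only: percentage of allocated space wasted after overwriting
# part of a single extent under COW.
# $1 = size of the original extent, $2 = size of the overwritten part
# (same unit for both). Only meaningful while $2 < $1; once the whole
# extent is overwritten, the old extent is no longer referenced at all.
bookend_waste_pct() {
    awk -v ext="$1" -v over="$2" 'BEGIN {
        allocated = ext + over   # old extent stays whole, new extent is added
        printf "%.1f\n", 100 * over / allocated
    }'
}

bookend_waste_pct 16 8   # prints 33.3
```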
>
> And checking all:
> # find /data/main/ -type f -execdir filefrag {} \; | cut -d : -f 2 | sort | uniq -c | sort -V
> 3706 0 extents found
> 450 1 extent found
> 25 3 extents found
> 62 2 extents found
> 1 8 extents found
> 1 9 extents found
> 1 10 extents found
> 1 11 extents found
> 1 32 extents found
> 1 74 extents found
> 1 80 extents found
> 1 89 extents found
> 1 97 extents found
> 1 101 extents found
> 1 316 extents found
> 2 78 extents found
> 2 82 extents found
> 3 5 extents found
> 3 79 extents found
> 3 81 extents found
> 6 4 extents found
>
>
>
>> E.g. you write a 16M data extent, then overwrite the trailing 8M. Now
>> we have two data extents, the old 16M and the new 8M, wasting 8M of
>> space.
>>
>> In that case, you can try defrag, but you still need to delete some
>> data first so that you can do the defrag...
>
>
> Well, my main concern is rather how to prevent this from happening in
> the first place... the data is already all backed up into Thanos, so I
> could also just wipe the fs.
> But this seems to occur repeatedly (well, okay, only twice so far O:-) ).
> So that would mean we have some IO pattern that "kills" btrfs.
That is why we have the "autodefrag" mount option for such use cases.
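A minimal sketch of turning it on, assuming the filesystem is mounted at /data/main (guessed from the paths quoted above). The commands that need root are shown as comments only, and has_autodefrag is a made-up helper that checks a mounts table for the option.

```shell
# Enable autodefrag on the mounted filesystem (needs root; the mount point
# /data/main is an assumption):
#   mount -o remount,autodefrag /data/main
# To keep it across reboots, add "autodefrag" to the options field of the
# filesystem's /etc/fstab entry.
# A one-off manual defragmentation (needs some free space first) would be:
#   btrfs filesystem defragment -r /data/main

# Hypothetical helper: succeed if the btrfs mounted at $1 lists autodefrag
# among its mount options. $2 is the mounts table to read and defaults to
# /proc/self/mounts.
has_autodefrag() {
    awk -v mp="$1" '
        $2 == mp && $3 == "btrfs" {
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++) if (opt[i] == "autodefrag") found = 1
        }
        END { exit !found }
    ' "${2:-/proc/self/mounts}"
}

# Usage:
#   has_autodefrag /data/main && echo "autodefrag is active"
```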
Thanks,
Qu
>
>
> Cheers,
> Chris.