public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Christoph Anton Mitterer <calestyo@scientia.org>,
	linux-btrfs@vger.kernel.org
Subject: Re: btrfs thinks fs is full, though 11GB should be still free
Date: Tue, 12 Dec 2023 13:00:41 +1030
Message-ID: <da1fb280-3291-4e01-9f00-e7184c019773@gmx.com>
In-Reply-To: <3cfc3cdf-e6f2-400e-ac12-5ddb2840954d@gmx.com>



On 2023/12/12 11:28, Qu Wenruo wrote:
>
>
> On 2023/12/12 10:42, Christoph Anton Mitterer wrote:
>> On Tue, 2023-12-12 at 10:24 +1030, Qu Wenruo wrote:
>>> Then the last thing is extent bookends.
>>>
>>> COW and small random writes can easily lead to extra space wasted by
>>> extent bookends.
>>
>> Is there a way to check this? Would I just see many extents when I
>> look at the files with filefrag?

IIRC compsize can do it.

https://github.com/kilobyte/compsize

Thanks,
Qu
>>
>>
>> I mean, Prometheus continuously collects metrics from a number of nodes
>> and (sooner or later) writes them to disk.
>> I don't really know their code, so I have no idea whether they already
>> write every tiny metric, or only large bunches thereof.
>>
>> Since they do maintain a WAL, I'd assume the former.
>>
>> Every now and then, the WAL is written to chunk files which are rather
>> large, ~160M or so in my case, but that depends on how many
>> metrics one collects. I think they always write data for a period of
>> 2h.
>> Later on, they further compact those chunks (I think after 8 hours and
>> so on), in which case some larger rewrites would be done.
>> Though in my case this doesn't happen, as I run Thanos on top of
>> Prometheus, and for that one needs to disable Prometheus' own
>> compaction.
>>
>>
>> I had already looked at the extents for these "compacted"
>> chunk files, but the worst file had only 32 extents (as reported by
>> filefrag).
>
> Filefrag doesn't work that well on btrfs AFAIK, as btrfs emits
> merged extents through the fiemap ioctl, but for fragmented files
> filefrag should be enough to detect them.
>>
>> Looking at the WAL files:
>> /data/main/prometheus/metrics2/wal# filefrag * | grep -v ' 0 extents
>> found'
>> 00001030: 82 extents found
>> 00001031: 81 extents found
>> 00001032: 79 extents found
>> 00001033: 82 extents found
>> 00001034: 78 extents found
>> 00001035: 78 extents found
>> 00001036: 81 extents found
>> 00001037: 79 extents found
>> 00001038: 79 extents found
>> 00001039: 89 extents found
>> 00001040: 80 extents found
>> 00001041: 74 extents found
>> 00001042: 81 extents found
>> 00001043: 97 extents found
>> 00001044: 101 extents found
>> 00001045: 316 extents found
>> checkpoint.00001029: FIBMAP/FIEMAP unsupported
>>
>> (I did the grep -v because there were a gazillion empty WAL files,
>> presumably created when the fs was already full.)
>>
>> The above numbers still don't look too bad, though, do they?
>
> Depends; in my previous 16M case you only got 2 extents, but still
> wasted 8M (33.3% of the space).
>
> But a WAL indeed looks like a bad write pattern for btrfs.
>
>>
>> And checking all:
>> # find /data/main/ -type f -execdir filefrag {} \; | cut -d : -f 2 |
>> sort | uniq -c | sort -V
>>     3706  0 extents found
>>      450  1 extent found
>>       25  3 extents found
>>       62  2 extents found
>>        1  8 extents found
>>        1  9 extents found
>>        1  10 extents found
>>        1  11 extents found
>>        1  32 extents found
>>        1  74 extents found
>>        1  80 extents found
>>        1  89 extents found
>>        1  97 extents found
>>        1  101 extents found
>>        1  316 extents found
>>        2  78 extents found
>>        2  82 extents found
>>        3  5 extents found
>>        3  79 extents found
>>        3  81 extents found
>>        6  4 extents found
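The shell pipeline above can be mirrored in a few lines of Python, which may be easier to adapt. This is just a sketch operating on sample filefrag-style lines; the filenames and counts below are made up for illustration, not taken from the filesystem discussed here:

```python
from collections import Counter

# Sample lines in the format filefrag prints: "name: N extents found"
sample = [
    "00001030: 82 extents found",
    "00001031: 82 extents found",
    "00001032: 1 extent found",
    "wal/empty: 0 extents found",
]

# Equivalent of: cut -d : -f 2 | sort | uniq -c
counts = Counter(line.split(":", 1)[1].strip() for line in sample)

for extents, n in sorted(counts.items()):
    print(f"{n:7}  {extents}")
```

Feeding it the real `find … -execdir filefrag` output instead of the sample list would reproduce the histogram shown above.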
>>
>>
>>
>>> E.g. you write a 16M data extent, then overwrite the trailing 8M:
>>> now we have two data extents, the old 16M and the new 8M, wasting 8M
>>> of space.
>>>
>>> In that case, you can try defrag, but you still need to delete some
>>> data
>>> first so that you can do defrag...
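The bookend accounting in the 16M example above can be written out as back-of-the-envelope arithmetic (a sketch using the sizes from that example, not measurements from a real filesystem):

```python
MiB = 1024 * 1024

# Write one 16 MiB extent, then COW-overwrite its trailing 8 MiB.
# The old extent stays fully allocated as long as any part of it is
# still referenced; the new 8 MiB extent is allocated on top of it.
old_extent = 16 * MiB   # fully allocated, but only its first 8 MiB still referenced
new_extent = 8 * MiB    # the overwrite

allocated = old_extent + new_extent   # 24 MiB actually on disk
referenced = 16 * MiB                 # what the file logically contains
wasted = allocated - referenced       # 8 MiB pinned by the bookend

print(f"allocated: {allocated // MiB} MiB, "
      f"referenced: {referenced // MiB} MiB, "
      f"wasted: {wasted // MiB} MiB ({wasted / allocated:.1%})")
```

This matches the 33.3% figure quoted earlier in the thread: a third of the allocated space is unreachable until the old extent is fully dereferenced (e.g. by defrag or deletion).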
>>
>>
>> Well, my main concern is rather how to prevent this from happening in
>> the first place... the data is already all backed up into Thanos, so I
>> could also just wipe the fs.
>> But this seems to occur repeatedly (well, okay, only twice so far
>> O:-) ).
>> So that would mean we have some IO pattern that "kills" btrfs.
>
> Thus we have the "autodefrag" mount option for such use cases.
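For reference, autodefrag is enabled per-mount, either on the command line (`mount -o autodefrag …`) or persistently in /etc/fstab; the device path and mount point below are placeholders, not values from this thread:

```
# /etc/fstab -- device and mount point are examples only
/dev/sdX1  /data  btrfs  defaults,autodefrag  0  0
```

Note that autodefrag targets small random writes and rewrites the affected ranges in the background, so it can increase total write traffic on write-heavy workloads.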
>
> Thanks,
> Qu
>>
>>
>> Cheers,
>> Chris.
>
