Re: Massive overhead even after deleting checkpoints

From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
To: "Felix E. Klee" <felix.klee@inka.de>
Cc: linux-nilfs@vger.kernel.org
Subject: Re: Massive overhead even after deleting checkpoints
Date: Sat, 11 Jan 2025 15:21:58 +0900
Message-ID: <CAKFNMons0oLVgByGXEa4Pv3rgxmgEYP9h4z_fjgMm1qjEJDHFA@mail.gmail.com>
In-Reply-To: <CAKFNMomiYJhNXTTVH5wRuWSBEEYmHcnxqRU8iUPVxFNmfcezMw@mail.gmail.com>

On Sat, Jan 11, 2025 at 2:29 PM Ryusuke Konishi wrote:
>
> On Sat, Jan 11, 2025 at 3:25 AM Felix E. Klee wrote:
> >
> > On Fri, Jan 10, 2025 at 6:37 PM Ryusuke Konishi
> > <konishi.ryusuke@gmail.com> wrote:
> > > Example:
> > > $ sudo nilfs-clean -S 20/0.1
> >
> > Thank you! That improved things. But there is still a lot of overhead.
> > It’s 3.0TB in total vs. 2.5TB actually used by files:
> >
> >     $ sudo nilfs-clean -S 20/0.1
> >     $ df -h /bigstore/
> >     Filesystem            Size  Used Avail Use% Mounted on
> >     /dev/mapper/bigstore  3.5T  3.0T  338G  91% /bigstore
> >     $ du -sh /bigstore/
> >     2.5T    /bigstore/
> >
> > As mentioned in my original email, initially usage according to `df` was
> > 3.3TB. So only 0.3TB have been gained.
> >
> > > $ sudo lssu -l
> >
> > It generates 28 MB of data that starts off like this:
> >
> >           SEGNUM        DATE     TIME STAT     NBLOCKS       NLIVEBLOCKS
> >                3  2025-01-10 12:19:48 -d--        2048       2036 ( 99%)
> >                4  2025-01-10 12:19:48 -d--        2048       2040 ( 99%)
> >                5  2025-01-10 12:19:48 -d--        2048       2036 ( 99%)
> >                6  2025-01-10 12:19:48 -d--        2048       2040 ( 99%)
> >                7  2025-01-10 12:19:48 -d--        2048       2036 ( 99%)
> >
> > I have no idea what to make out of this.
>
> The output seems to be after GC, but by default nilfs considers blocks
> less than an hour old as live (in use), so if you run "lssu -l" again
> or add the "-p 0" option to set the protection period to 0 seconds,
> the results may be different.
>
> $ sudo lssu -l -p 0
>
> Note that the capacity reported by the df command includes the
> reserved space of the file system. By default, NILFS reserves 5% of
> the disk capacity for GC and normal file system operations (the same
> ratio as ext4). Therefore, the effective capacity of a 3.5TiB disk is
> about 3.3TiB.
>
> In addition to that, NILFS has overhead due to various metadata, the
> largest of which are DAT for disk address management (1), segment
> summary for managing segments and logs (2), and B-tree blocks (3).
>
> Of these, (3) should be included in the du output capacity, so (1) and
> (2) are likely to be the main causes.
> (1) is just over 32 bytes per 4KiB block, which is about 0.78%, and
> (2) is at most 1.5% depending on usage, so there is a total overhead
> of just over 2.3%.
> If the effective capacity is 3.3TiB, the calculated overhead is
> 0.076TiB, so the upper limit capacity should be around 3.2TiB
> (theoretically).
>
> Other factors may include the 3600 second protection period, and the
> fact that the NILFS df output is roughly calculated from used segments
> rather than actual used blocks, so this difference may be affecting
> it.
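
As a rough check of the arithmetic quoted above, here is a minimal
sketch. The function name estimate_overhead is only an illustration,
and the ratios are the approximate figures from my previous mail (5%
reserve, about 32 bytes of DAT per 4KiB block, at most 1.5% for
segment summaries), not values read from the file system:

```shell
# Hypothetical helper: estimate_overhead SIZE_TIB
# Applies the approximate ratios from the mail to a disk size in TiB.
estimate_overhead() {
    awk -v size="$1" 'BEGIN {
        reserve = 0.05        # default reserved space (assumed 5%)
        dat     = 32 / 4096   # DAT: about 32 bytes per 4KiB block
        segsum  = 0.015       # segment summary: upper bound, usage dependent
        effective = size * (1 - reserve)
        overhead  = effective * (dat + segsum)
        printf "effective=%.1f TiB overhead=%.3f TiB usable=%.1f TiB\n",
               effective, overhead, effective - overhead
    }'
}

estimate_overhead 3.5
```

For a 3.5TiB disk this prints "effective=3.3 TiB overhead=0.076 TiB
usable=3.2 TiB", which matches the theoretical upper limit mentioned
above.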

Incidentally, the df output (used capacity) of NILFS is calculated
from the number of used segments rather than the number of used blocks
because the blocks in use on NILFS change dynamically depending on the
conditions, which makes it difficult to return an exact figure
immediately. If the discrepancy is large, I think some kind of
algorithm should be introduced to improve it.

For your reference, the number of blocks actually in use can be
calculated from the output of "lssu -l" as follows (assuming a block
size of 4KiB):

$ sudo lssu -l -p 0 | awk 'NR>1{sum+=$6}END{print sum*4096}' | numfmt --to=iec-i
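
If you want to compare the live blocks with what has been written to
the listed segments, a slightly longer variant could look like the
sketch below. The function name summarize_lssu is hypothetical, it
assumes the column layout shown in your "lssu -l" output (NBLOCKS in
column 5, NLIVEBLOCKS in column 6), and it uses printf "%.0f" only so
that large totals are not printed in exponent notation by some awk
implementations:

```shell
# Hypothetical helper: summarize `lssu -l` output read from stdin.
# Optional first argument: block size in bytes (default 4096).
summarize_lssu() {
    awk -v bs="${1:-4096}" '
        # Skip the header line; only count rows that start with a segment number.
        NR > 1 && $1 ~ /^[0-9]+$/ {
            segs++
            nblocks += $5   # NBLOCKS column
            live    += $6   # NLIVEBLOCKS column
        }
        END {
            printf "segments=%d\n", segs
            printf "written_bytes=%.0f\n", nblocks * bs
            printf "live_bytes=%.0f\n", live * bs
        }'
}

# Usage (needs a NILFS device, so it is left as a comment here):
#   sudo lssu -l -p 0 | summarize_lssu 4096
```

The difference between written_bytes and live_bytes gives a rough idea
of how much space in the listed segments is no longer live. Each value
can be piped through "numfmt --to=iec-i" as in the one-liner above.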

Regards,
Ryusuke Konishi

