From: David Sterba <dsterba@suse.cz>
To: Sijie Lan <sijielan@gmail.com>
Cc: linux-btrfs@vger.kernel.org, wqu@suse.com
Subject: Re: Question about btrfs compression
Date: Tue, 28 May 2024 18:06:26 +0200 [thread overview]
Message-ID: <20240528160626.GF8631@twin.jikos.cz> (raw)
In-Reply-To: <CAGAHmYDquz9v1eABjGkYq=ja1vPkwAz9HmcCQVZk0htb5W830w@mail.gmail.com>
On Fri, May 24, 2024 at 02:32:38AM +0800, Sijie Lan wrote:
> As described in
> https://archive.kernel.org/oldwiki/btrfs.wiki.kernel.org/index.php/Compression.html.
> "The compression processes ranges of a file of maximum size 128 KiB
> and compresses each 4 KiB (or page-sized) block separately. Accessing
> a byte in the middle of the given 128 KiB range requires to decompress
> the whole range. This is not optimal and is subject to optimizations
> and further development."
>
> Since Btrfs compresses each 4 KiB block of data separately, my
> question is: when randomly accessing a few bytes within a compressed
> chunk, decompressing only the blocks that contain those bytes seems
> much more efficient than decompressing the entire 128 KiB range, yet
> the current implementation still decompresses the whole 128 KiB chunk
> even when only a small part (e.g., 4 KiB) is accessed. Why has this
> not been optimized? Is it a fundamental implementation difficulty, or
> something else?
Technically it's possible. I'm not sure the gain would be that
noticeable, because page cache readahead reads more memory around what
is accessed, so by the time the request reaches decompression the
range could already be bigger than the bytes actually asked for.
Assuming we're reading a 4K page, we'd need to support seeking into
the compressed stream. This depends on the compression algorithm used;
we'd need to somehow let each decompression start from that point.
For LZO the chunking is done manually, as it needs a contiguous buffer
and refers to previous data directly. We could skip the segments,
though I'm not sure whether we store the compressed size (i.e. the
amount to skip).
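As a sketch of the segment-skipping idea: if each independently
compressed segment were prefixed with its compressed size, a reader
could walk the headers to the wanted segment and decompress only that
one. This is an assumed framing for illustration, not btrfs's actual
on-disk LZO format, and zlib stands in for LZO here since Python's
standard library has no LZO binding:

```python
import struct
import zlib

SEG = 4096  # compress each 4 KiB (page-sized) block separately

def pack_segments(data):
    # Prefix every compressed segment with a 4-byte little-endian length
    # so a reader can skip whole segments without decompressing them.
    out = []
    for off in range(0, len(data), SEG):
        c = zlib.compress(data[off:off + SEG])
        out.append(struct.pack("<I", len(c)) + c)
    return b"".join(out)

def read_segment(stream, index):
    # Walk the stored sizes to reach segment `index`, then decompress
    # only that one segment instead of the whole 128 KiB range.
    pos = 0
    for _ in range(index):
        (clen,) = struct.unpack_from("<I", stream, pos)
        pos += 4 + clen  # skip header + payload, no decompression
    (clen,) = struct.unpack_from("<I", stream, pos)
    return zlib.decompress(stream[pos + 4:pos + 4 + clen])
```

The skip loop only reads the length headers, which is exactly the
"amount to skip" information the mail says would be needed.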
Zstd and zlib keep the state separately and there's no outer
container, so we'd need API support for that: returning the offset in
the compressed stream for a given uncompressed offset in the result.
I'm not familiar with the formats, but to make this work the internal
segments would have to track their compressed size, so that we could
step over them without decompressing.
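One way to get such an uncompressed-to-compressed offset mapping,
sketched here with zlib: insert a full flush after every chunk at
compression time and record the offset pairs. A full flush resets the
dictionary and byte-aligns the stream, so decompression can restart at
any recorded point. This is an illustration of the technique, not how
btrfs currently drives zlib:

```python
import zlib

CHUNK = 4096

def compress_with_index(data):
    # Raw deflate (no header), with Z_FULL_FLUSH after every chunk so
    # each recorded compressed offset is a valid restart point.
    co = zlib.compressobj(wbits=-15)
    out, index, pos = [], [], 0
    for off in range(0, len(data), CHUNK):
        index.append((off, pos))  # uncompressed offset -> compressed offset
        piece = co.compress(data[off:off + CHUNK]) + co.flush(zlib.Z_FULL_FLUSH)
        out.append(piece)
        pos += len(piece)
    out.append(co.flush())
    return b"".join(out), index

def read_at(comp, index, uoff, length):
    # Map the uncompressed offset to the nearest preceding restart point
    # and decompress from there, not from the start of the stream.
    u0, c0 = max((u, c) for u, c in index if u <= uoff)
    d = zlib.decompressobj(wbits=-15)
    buf = d.decompress(comp[c0:], uoff - u0 + length)
    return buf[uoff - u0:uoff - u0 + length]
```

The cost is slightly worse compression (the dictionary is reset at
every chunk boundary) in exchange for random access at chunk
granularity; zstd's seekable-format work takes a similar approach.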
The last thing is probably the interface for the btrfs compression
code to fill just the given range.
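What such a range-fill interface could look like, in a hypothetical
minimal form: given independently compressed segments, decompress only
those overlapping the requested byte range. The list-of-segments
representation and the function name are assumptions for illustration,
not an existing btrfs API:

```python
import zlib

def decompress_range(segments, seg_size, start, length):
    # Hypothetical range-fill entry point: touch only the segments that
    # overlap [start, start + length), then slice out the wanted bytes.
    first = start // seg_size
    last = (start + length - 1) // seg_size
    buf = b"".join(zlib.decompress(segments[i]) for i in range(first, last + 1))
    lo = start - first * seg_size
    return buf[lo:lo + length]
```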
Thread overview: 2+ messages
2024-05-23 18:32 Question about btrfs compression Sijie Lan
2024-05-28 16:06 ` David Sterba [this message]