From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Imran Geriskovan <imran.geriskovan@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: compression disk space saving - what are your results?
Date: Wed, 2 Dec 2015 09:39:08 -0500
Message-ID: <565F028C.6000707@gmail.com>
In-Reply-To: <CAK5rZE6-SMfseAfEu1GxLrgYNOBMhNbiQ8tqWVWUr6dFDL8OKw@mail.gmail.com>


On 2015-12-02 09:03, Imran Geriskovan wrote:
>>> What are your disk space savings when using btrfs with compression?
>
>> * There's the compress vs. compress-force option and discussion.  A
>> number of posters have reported that for mostly text, compress didn't
>> give them expected compression results and they needed to use compress-
>> force.
>
> "compress-force" option compresses regardless of the "compressibility"
> of the file.
>
> "compress" option makes some inference about the "compressibility"
> and decides to compress or not.
>
> I wonder how that inference is done?
> Can anyone provide some pseudo code for it?
I'm not certain how BTRFS does it, but my guess would be trying to 
compress the block, then storing the uncompressed version if the 
compressed one is bigger.
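
As a rough illustration of that guess only (not of what BTRFS actually 
does), here is a minimal Python sketch.  The function names are made up 
for the example, and zlib merely stands in for whatever compressor the 
filesystem would use:

```python
import os
import zlib


def store_block(block: bytes) -> tuple[bool, bytes]:
    """Return (is_compressed, payload) for a single block.

    Hypothetical helper: compress the block, and fall back to the
    raw bytes whenever compression does not make it smaller.
    """
    compressed = zlib.compress(block)
    if len(compressed) < len(block):
        return True, compressed
    return False, block


def load_block(is_compressed: bool, payload: bytes) -> bytes:
    """Undo store_block() using the flag that was stored with the payload."""
    return zlib.decompress(payload) if is_compressed else payload


# Repetitive data shrinks, so it is kept compressed; random data
# does not shrink, so it is kept as-is.
text_flag, text_payload = store_block(b"btrfs " * 1000)
rand_flag, rand_payload = store_block(os.urandom(4096))
```

The cost of this approach is that every block gets compressed once even 
when the result is thrown away.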

The program lrzip has an option to do per-block compression checks kind 
of like this, but its method is to try LZO compression on the block 
(which is fast), and only use the selected compression method (bzip2 by 
default I think, but it can also do zlib and xz) if the LZO compression 
ratio is good enough.  If we went with a similar method, I'd say we 
should integrate LZ4 support first, and use that for the test.  I think 
NTFS compression on Windows might do something similar, but they use an 
old LZ77 derivative for their compression (I think it's referred to as 
LZNT1, and it's designed for speed, and usually doesn't get much better 
than a 30% compression ratio).
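
The two-stage check described above could be sketched roughly like this 
in Python.  This is an illustration under assumptions, not lrzip's actual 
code: zlib at level 1 stands in for the fast LZO (or LZ4) probe because 
neither is in the Python standard library, bz2 plays the role of the 
selected method, and the 0.9 threshold and function names are invented 
for the example:

```python
import bz2
import os
import zlib


def fast_ratio(block: bytes) -> float:
    """Compressed/original size ratio from a cheap probe.

    zlib level 1 is only a stand-in for LZO here (assumption).
    """
    if not block:
        return 1.0
    return len(zlib.compress(block, 1)) / len(block)


def compress_block(block: bytes, threshold: float = 0.9) -> tuple[str, bytes]:
    """Run the slow compressor only if the fast probe looks promising.

    `threshold` is a hypothetical tuning knob: the probe must shrink
    the block to at most this fraction of its original size.
    """
    if fast_ratio(block) <= threshold:
        return "bz2", bz2.compress(block)
    return "raw", block


# Repetitive data passes the probe; random data is stored untouched.
text_method, text_payload = compress_block(b"abc" * 2000)
rand_method, rand_payload = compress_block(os.urandom(4096))
```

The point of the design is that incompressible blocks only pay for the 
cheap probe, never for the expensive compressor.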

On a side note, I really wish BTRFS would just add LZ4 support.  It's a 
lot more deterministic WRT decompression time than LZO, gets a similar 
compression ratio, and runs faster on most processors for both 
compression and decompression.

