linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Nick Terrell <terrelln@fb.com>, Adam Borowski <kilobyte@angband.pl>
Cc: "dsterba@suse.cz" <dsterba@suse.cz>, E V <eliventer@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Yann Collet <cyan@fb.com>
Subject: Re: [PATCH v2 3/4] btrfs: Add zstd support
Date: Wed, 5 Jul 2017 15:57:56 -0400	[thread overview]
Message-ID: <a5672463-f06f-a902-73e3-b101bcb2b3c3@gmail.com> (raw)
In-Reply-To: <B19A0CF8-3CB8-4EA1-AC6A-2383B4BD944A@fb.com>

On 2017-07-05 15:35, Nick Terrell wrote:
> On 7/5/17, 11:45 AM, "Austin S. Hemmelgarn" <ahferroin7@gmail.com> wrote:
>> On 2017-07-05 14:18, Adam Borowski wrote:
>>> On Wed, Jul 05, 2017 at 07:43:27AM -0400, Austin S. Hemmelgarn
>>> wrote:
>>>> On 2017-06-30 19:01, Nick Terrell wrote:
>>>>>> There is also the fact of deciding what to use for the default
>>>>>> when specified without a level.  This is easy for lzo and zlib,
>>>>>> where we can just use the existing level, but for zstd we would
>>>>>> need to decide how to handle a user just specifying 'zstd'
>>>>>> without a level.  I agree with E V that level 3 appears to be
>>>>>> the turnover point, and would suggest using that for the
>>>>>> default.
>>>>>
>>>>> I chose level 1 because I thought if we had to choose one
>>>>> speed/compression trade off, faster would be better. However,
>>>>> with a configerable compression level, level 3 is a great
>>>>> default, and is the default of the zstd CLI.
>>>> Actually, even if it's not configurable, I would prefer 3, as that
>>>> still performs better in both respects (speed and compression
>>>> ratio) than zlib while being sufficiently different from lzo
>>>> performance to make it easy to decide on one or the other.  As far
>>>> as configurable levels for regular usage on a filesystem, there are
>>>> only three levels you benchmarked that I would be interested in,
>>>> namely level 1 (highly active data on slow storage with a fast
>>>> CPU), 3 (stuff I would use zlib for today), and 15 (stuff I would
>>>> use out-of-band compression for today (for example, archival
>>>> storage)).
>>>
>>> If you guys are going to argue between 1 and 3, just go the cracovian
>>> deal and settle at 2. :þ
> 
> I'll change the default to level 3 in the next update.
> 
>>>
>>> But more seriously: zstd looks so much better than lzo and zlib than
>>> I'd suggest making it the default compression in cases where there's
>>> no way to choose, such as chattr +c.  But then, changing the default
>>> before the previous LTS kernel can mount it would be irresponsible --
>>> thus, if you can get it into 4.14, we're looking at 4.19 at soonest
>>> (or later if we consider distro kernels).
>> To be entirely honest, we probably should have switched to LZO as the
>> default a while back to put things more in-line with ZFS (which
>> traditionally favors performance for in-line compression) and Windows
>> (which uses a custom LZ77 derivative that's wicked fast on most modern
>> systems).  I would say that as soon as we get to the point that the last
>> two LTS releases support it, zstd should probably become the default.
>>
>> Also, slightly OT, but I would love to have the ability to set per
>> volume (not subvolume but volume itself) what to use for compression
>> when no algorithm is specified.
>>>
>>> Which means the timing is quite tight: if, per DSterba's request,
>>> /lib/ parts are going via a non-btrfs tree, there'll be not enough
>>> adequate testing in -next.  Thus, would it be possible to have /lib/
>>> patches in btrfs-next but not in for-linus?  That would allow testing
>>> so you can catch the LTS train.
>>>
>>>>>>> So, I don't see any problem making the level configurable.
>>>>>> I would actually love to see this, I regularly make use of
>>>>>> varying compression both on BTRFS (with separate filesystems)
>>>>>> and on the ZFS-based NAS systems we have at work (where it can
>>>>>> be set per-dataset) to allow better compression on less
>>>>>> frequently accessed data.
>>>>>
>>>>> I would love to see configurable compression level as well. Would
>>>>> you want me to add it to my patch set, or should I adapt my patch
>>>>> set to work on top of it when it is ready?
>>>
>>> Note that as zstd is new, there's no backwards compat to care of,
>>> thus you are free to use whatever -o/prop syntax you'd like.  If zstd
>>> ends up being configurable while zlib is not -- oh well, there's no
>>> reason to use zlib anymore other than for mounting with old kernels,
>>> in which case we can't use configurable props anyway.  Unlike zlib,
>>> lzo is not strictly worse than configurable zstd, but it has only one
>>> level so there's nothing to configure as well.
>>>
>>> Thus, I'd suggest skipping the issue and implement levels for zstd
>>> only.
>> I would mostly agree, with one possible exception.  _If_ zlib at the max
>> level gets similar compression ratios to zstd on it's higher levels
>> _and_ it also gets better performance on at least one aspect (I think
>> zlib at level 9 will probably have better compression performance than
>> zstd at level 15, but I'm not sure about the compression ratio), then I
>> would argue that it might be worth implementing levels for zlib.
>> Actually determining that will of course require proper testing, but
>> that's probably better left as a separate discussion.
> 
> For every zlib level, there is one or more level if zstd that performes
> better in all of compression speed, decompression speed, and compression
> level. zstd also has levels that compress better than zlib, but slower
> compression speed and faster decompression speed.
It's the slower compression speed that has me arguing for the 
possibility of configurable levels on zlib.  11MB/s is painfully slow 
considering that most decent HDD's these days can get almost 5-10x that 
speed with no compression.  There are cases (WORM pattern archival 
storage for example) where slow writes to that degree may be acceptable, 
but for most users they won't be, and zlib at level 9 would probably be 
a better choice.  I don't think it can beat zstd at level 15 for 
compression ratio, but if they're even close, then zlib would still be a 
better option at that high of a compression level most of the time.
> 
> These benchmarks are run in userland, but I expect the results to be the
> same in BtrFS. See the lzbench compression benchmark
> https://github.com/inikep/lzbench. There are corner cases where zlib can
> compress better than zstd, but I wouldn't expect to see them often, and
> the gains on other files should offset the small loss on the edge cases.
> 
>>> As for testing: I assume you guys are doing stress testing on amd64
>>> already, right?  I got two crappy arm64 machines but won't be able to
>>> test there before 4.13 (Icenowy has been lazy and didn't push any
>>> near-mainline patch set recently; every single Pine64/Pinebook kernel
>>> pushed by anyone but her didn't work for me so I don't even bother
>>> trying anymore).  On the other hand, the armhf box I'm running Debian
>>> rebuilds on is a gem among cheap SoCs.  After Nick fixed the
>>> flickering workspaces bug, there were no further hiccups; on monday I
>>> switched to 4.12.0 + btrfs-for-4.13 + zstd (thus luckily avoiding
>>> that nowait aio bug), also no explosions yet.
>> Which reminds me, I forgot to mention in my last e-mail, I ran stress
>> tests over the holiday weekend with pretty much the same kernel
>> combination on QEMU instances for amd64, i686, armv7a, armv6 (EABI only
>> on both), arm64, ppc64 (both big and little endian), and 32-bit MIPS
>> (new ABI only, also both big and little endian), and saw no issues
>> relating to zstd or BTRFS (I did run into some other issues, but they
>> were because I still don't have things quite configured properly for
>> this new testing setup).
> 
> Thanks for running the stress tests!
> 
Always glad to help.

  reply	other threads:[~2017-07-05 19:58 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-29 19:41 [PATCH v2 0/4] Add xxhash and zstd modules Nick Terrell
2017-06-29 19:41 ` [PATCH v2 1/4] lib: Add xxhash module Nick Terrell
2017-06-29 19:41 ` [PATCH v2 3/4] btrfs: Add zstd support Nick Terrell
2017-06-30  3:24   ` Adam Borowski
2017-06-30 12:16   ` E V
2017-06-30 14:21     ` David Sterba
2017-06-30 18:25       ` Austin S. Hemmelgarn
2017-06-30 23:01         ` Nick Terrell
2017-07-05 11:43           ` Austin S. Hemmelgarn
2017-07-05 18:18             ` Adam Borowski
2017-07-05 18:45               ` Austin S. Hemmelgarn
2017-07-05 19:35                 ` Nick Terrell
2017-07-05 19:57                   ` Austin S. Hemmelgarn [this message]
2017-07-06  0:25                     ` Nick Terrell
2017-07-06 11:59                       ` Austin S. Hemmelgarn
2017-07-06 12:09                         ` Lionel Bouton
2017-07-06 12:27                           ` Austin S. Hemmelgarn
2017-07-10 21:11     ` Clemens Eisserer
2017-07-06 16:32   ` Adam Borowski
2017-07-07 23:17     ` Nick Terrell
2017-07-07 23:40       ` Adam Borowski
2017-07-08  3:07         ` Adam Borowski
2017-07-10 12:36           ` Austin S. Hemmelgarn
2017-07-10 20:57             ` Nick Terrell
2017-07-11  4:57             ` Nick Terrell
2017-07-11  6:01               ` Nick Terrell
2017-07-12  3:38                 ` Adam Borowski
2017-07-18 18:21   ` David Sterba
2017-06-29 19:41 ` [PATCH v2 4/4] squashfs: " Nick Terrell
2017-06-30  7:36 ` [PATCH v2 0/4] Add xxhash and zstd modules David Sterba
2017-06-30 16:46 ` Timofey Titovets
2017-06-30 19:52   ` Nick Terrell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a5672463-f06f-a902-73e3-b101bcb2b3c3@gmail.com \
    --to=ahferroin7@gmail.com \
    --cc=cyan@fb.com \
    --cc=dsterba@suse.cz \
    --cc=eliventer@gmail.com \
    --cc=kilobyte@angband.pl \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=terrelln@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).