linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Add xxhash and zstd modules
@ 2017-07-20 21:27 Nick Terrell
  2017-07-20 21:27 ` [PATCH v3 1/4] lib: Add xxhash module Nick Terrell
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Nick Terrell @ 2017-07-20 21:27 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, squashfs-devel,
	linux-btrfs, linux-kernel

Hi all,

This patch set adds xxhash, zstd compression, and zstd decompression
modules. It also adds zstd support to BtrFS and SquashFS.

Each patch has relevant summaries, benchmarks, and tests.

Best,
Nick Terrell

Changelog:

v1 -> v2:
- Make pointer in lib/xxhash.c:394 non-const (1/4)
- Use div_u64() for division of u64s (2/4)
- Reduce stack usage of ZSTD_compressSequences(), ZSTD_buildSeqTable(),
  ZSTD_decompressSequencesLong(), FSE_buildDTable(), FSE_decompress_wksp(),
  HUF_writeCTable(), HUF_readStats(), HUF_readCTable(),
  HUF_compressWeights(), HUF_readDTableX2(), and HUF_readDTableX4() (2/4)
- No zstd function uses more than 400 B of stack space (2/4)

v2 -> v3:
- Work around gcc-7 bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388
  (2/4)
- Fix bug in dictionary compression from upstream commit cc1522351f (2/4)
- Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff (3/4)
- Change default compression level for BtrFS to 3 (3/4)

Nick Terrell (4):
  lib: Add xxhash module
  lib: Add zstd modules
  btrfs: Add zstd support
  squashfs: Add zstd support

 fs/btrfs/Kconfig           |    2 +
 fs/btrfs/Makefile          |    2 +-
 fs/btrfs/compression.c     |    1 +
 fs/btrfs/compression.h     |    6 +-
 fs/btrfs/ctree.h           |    1 +
 fs/btrfs/disk-io.c         |    2 +
 fs/btrfs/ioctl.c           |    6 +-
 fs/btrfs/props.c           |    6 +
 fs/btrfs/super.c           |   12 +-
 fs/btrfs/sysfs.c           |    2 +
 fs/btrfs/zstd.c            |  435 ++++++
 fs/squashfs/Kconfig        |   14 +
 fs/squashfs/Makefile       |    1 +
 fs/squashfs/decompressor.c |    7 +
 fs/squashfs/decompressor.h |    4 +
 fs/squashfs/squashfs_fs.h  |    1 +
 fs/squashfs/zstd_wrapper.c |  150 ++
 include/linux/xxhash.h     |  236 +++
 include/linux/zstd.h       | 1157 +++++++++++++++
 include/uapi/linux/btrfs.h |    8 +-
 lib/Kconfig                |   11 +
 lib/Makefile               |    3 +
 lib/xxhash.c               |  500 +++++++
 lib/zstd/Makefile          |   18 +
 lib/zstd/bitstream.h       |  374 +++++
 lib/zstd/compress.c        | 3479 ++++++++++++++++++++++++++++++++++++++++++++
 lib/zstd/decompress.c      | 2526 ++++++++++++++++++++++++++++++++
 lib/zstd/entropy_common.c  |  243 ++++
 lib/zstd/error_private.h   |   53 +
 lib/zstd/fse.h             |  575 ++++++++
 lib/zstd/fse_compress.c    |  795 ++++++++++
 lib/zstd/fse_decompress.c  |  332 +++++
 lib/zstd/huf.h             |  212 +++
 lib/zstd/huf_compress.c    |  770 ++++++++++
 lib/zstd/huf_decompress.c  |  960 ++++++++++++
 lib/zstd/mem.h             |  151 ++
 lib/zstd/zstd_common.c     |   75 +
 lib/zstd/zstd_internal.h   |  250 ++++
 lib/zstd/zstd_opt.h        | 1014 +++++++++++++
 39 files changed, 14382 insertions(+), 12 deletions(-)
 create mode 100644 fs/btrfs/zstd.c
 create mode 100644 fs/squashfs/zstd_wrapper.c
 create mode 100644 include/linux/xxhash.h
 create mode 100644 include/linux/zstd.h
 create mode 100644 lib/xxhash.c
 create mode 100644 lib/zstd/Makefile
 create mode 100644 lib/zstd/bitstream.h
 create mode 100644 lib/zstd/compress.c
 create mode 100644 lib/zstd/decompress.c
 create mode 100644 lib/zstd/entropy_common.c
 create mode 100644 lib/zstd/error_private.h
 create mode 100644 lib/zstd/fse.h
 create mode 100644 lib/zstd/fse_compress.c
 create mode 100644 lib/zstd/fse_decompress.c
 create mode 100644 lib/zstd/huf.h
 create mode 100644 lib/zstd/huf_compress.c
 create mode 100644 lib/zstd/huf_decompress.c
 create mode 100644 lib/zstd/mem.h
 create mode 100644 lib/zstd/zstd_common.c
 create mode 100644 lib/zstd/zstd_internal.h
 create mode 100644 lib/zstd/zstd_opt.h

--
2.9.3

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/4] lib: Add xxhash module
  2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
@ 2017-07-20 21:27 ` Nick Terrell
  2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Nick Terrell @ 2017-07-20 21:27 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, squashfs-devel,
	linux-btrfs, linux-kernel

Adds xxhash kernel module with xxh32 and xxh64 hashes. xxhash is an
extremely fast non-cryptographic hash algorithm for checksumming.
The zstd compression and decompression modules added in the next patch
require xxhash. I extracted it out from zstd since it is useful on its
own. I copied the code from the upstream XXHash source repository and
translated it into kernel style. I ran benchmarks and tests in the kernel
and tests in userland.

I benchmarked xxhash as a special character device. I ran in four modes,
no-op, xxh32, xxh64, and crc32. The no-op mode simply copies the data to
kernel space and ignores it. The xxh32, xxh64, and crc32 modes compute
hashes on the copied data. I also ran it with four different buffer sizes.
The benchmark file is located in the upstream zstd source repository under
`contrib/linux-kernel/xxhash_test.c` [1].

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD. I benchmarked using the file `filesystem.squashfs`
from `ubuntu-16.10-desktop-amd64.iso`, which is 1,536,217,088 B large.
Run the following commands for the benchmark:

    modprobe xxhash_test
    mknod xxhash_test c 245 0
    time cp filesystem.squashfs xxhash_test

The time is reported by the time of the userland `cp`.
The GB/s is computed with

    1,536,217,008 B / time(buffer size, hash)

which includes the time to copy from userland.
The Normalized GB/s is computed with

    1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).


| Buffer Size (B) | Hash  | Time (s) | GB/s | Adjusted GB/s |
|-----------------|-------|----------|------|---------------|
|            1024 | none  |    0.408 | 3.77 |             - |
|            1024 | xxh32 |    0.649 | 2.37 |          6.37 |
|            1024 | xxh64 |    0.542 | 2.83 |         11.46 |
|            1024 | crc32 |    1.290 | 1.19 |          1.74 |
|            4096 | none  |    0.380 | 4.04 |             - |
|            4096 | xxh32 |    0.645 | 2.38 |          5.79 |
|            4096 | xxh64 |    0.500 | 3.07 |         12.80 |
|            4096 | crc32 |    1.168 | 1.32 |          1.95 |
|            8192 | none  |    0.351 | 4.38 |             - |
|            8192 | xxh32 |    0.614 | 2.50 |          5.84 |
|            8192 | xxh64 |    0.464 | 3.31 |         13.60 |
|            8192 | crc32 |    1.163 | 1.32 |          1.89 |
|           16384 | none  |    0.346 | 4.43 |             - |
|           16384 | xxh32 |    0.590 | 2.60 |          6.30 |
|           16384 | xxh64 |    0.466 | 3.30 |         12.80 |
|           16384 | crc32 |    1.183 | 1.30 |          1.84 |

Tested in userland using the test-suite in the zstd repo under
`contrib/linux-kernel/test/XXHashUserlandTest.cpp` [2] by mocking the
kernel functions. A line in each branch of every function in `xxhash.c`
was commented out to ensure that the test-suite fails. Additionally
tested while testing zstd and with SMHasher [3].

[1] https://phabricator.intern.facebook.com/P57526246
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/XXHashUserlandTest.cpp
[3] https://github.com/aappleby/smhasher

zstd source repository: https://github.com/facebook/zstd
XXHash source repository: https://github.com/cyan4973/xxhash

Signed-off-by: Nick Terrell <terrelln@fb.com>
---
v1 -> v2:
- Make pointer in lib/xxhash.c:394 non-const

 include/linux/xxhash.h | 236 +++++++++++++++++++++++
 lib/Kconfig            |   3 +
 lib/Makefile           |   1 +
 lib/xxhash.c           | 500 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 740 insertions(+)
 create mode 100644 include/linux/xxhash.h
 create mode 100644 lib/xxhash.c

diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
new file mode 100644
index 0000000..9e1f42c
--- /dev/null
+++ b/include/linux/xxhash.h
@@ -0,0 +1,236 @@
+/*
+ * xxHash - Extremely Fast Hash algorithm
+ * Copyright (C) 2012-2016, Yann Collet.
+ *
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above
+ *     copyright notice, this list of conditions and the following disclaimer
+ *     in the documentation and/or other materials provided with the
+ *     distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ *
+ * You can contact the author at:
+ * - xxHash homepage: http://cyan4973.github.io/xxHash/
+ * - xxHash source repository: https://github.com/Cyan4973/xxHash
+ */
+
+/*
+ * Notice extracted from xxHash homepage:
+ *
+ * xxHash is an extremely fast Hash algorithm, running at RAM speed limits.
+ * It also successfully passes all tests from the SMHasher suite.
+ *
+ * Comparison (single thread, Windows Seven 32 bits, using SMHasher on a Core 2
+ * Duo @3GHz)
+ *
+ * Name            Speed       Q.Score   Author
+ * xxHash          5.4 GB/s     10
+ * CrapWow         3.2 GB/s      2       Andrew
+ * MumurHash 3a    2.7 GB/s     10       Austin Appleby
+ * SpookyHash      2.0 GB/s     10       Bob Jenkins
+ * SBox            1.4 GB/s      9       Bret Mulvey
+ * Lookup3         1.2 GB/s      9       Bob Jenkins
+ * SuperFastHash   1.2 GB/s      1       Paul Hsieh
+ * CityHash64      1.05 GB/s    10       Pike & Alakuijala
+ * FNV             0.55 GB/s     5       Fowler, Noll, Vo
+ * CRC32           0.43 GB/s     9
+ * MD5-32          0.33 GB/s    10       Ronald L. Rivest
+ * SHA1-32         0.28 GB/s    10
+ *
+ * Q.Score is a measure of quality of the hash function.
+ * It depends on successfully passing SMHasher test set.
+ * 10 is a perfect score.
+ *
+ * A 64-bits version, named xxh64 offers much better speed,
+ * but for 64-bits applications only.
+ * Name     Speed on 64 bits    Speed on 32 bits
+ * xxh64       13.8 GB/s            1.9 GB/s
+ * xxh32        6.8 GB/s            6.0 GB/s
+ */
+
+#ifndef XXHASH_H
+#define XXHASH_H
+
+#include <linux/types.h>
+
+/*-****************************
+ * Simple Hash Functions
+ *****************************/
+
+/**
+ * xxh32() - calculate the 32-bit hash of the input with a given seed.
+ *
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ * @seed:   The seed can be used to alter the result predictably.
+ *
+ * Speed on Core 2 Duo @ 3 GHz (single thread, SMHasher benchmark) : 5.4 GB/s
+ *
+ * Return:  The 32-bit hash of the data.
+ */
+uint32_t xxh32(const void *input, size_t length, uint32_t seed);
+
+/**
+ * xxh64() - calculate the 64-bit hash of the input with a given seed.
+ *
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ * @seed:   The seed can be used to alter the result predictably.
+ *
+ * This function runs 2x faster on 64-bit systems, but slower on 32-bit systems.
+ *
+ * Return:  The 64-bit hash of the data.
+ */
+uint64_t xxh64(const void *input, size_t length, uint64_t seed);
+
+/*-****************************
+ * Streaming Hash Functions
+ *****************************/
+
+/*
+ * These definitions are only meant to allow allocation of XXH state
+ * statically, on stack, or in a struct for example.
+ * Do not use members directly.
+ */
+
+/**
+ * struct xxh32_state - private xxh32 state, do not use members directly
+ */
+struct xxh32_state {
+	uint32_t total_len_32;
+	uint32_t large_len;
+	uint32_t v1;
+	uint32_t v2;
+	uint32_t v3;
+	uint32_t v4;
+	uint32_t mem32[4];
+	uint32_t memsize;
+};
+
+/**
+ * struct xxh32_state - private xxh64 state, do not use members directly
+ */
+struct xxh64_state {
+	uint64_t total_len;
+	uint64_t v1;
+	uint64_t v2;
+	uint64_t v3;
+	uint64_t v4;
+	uint64_t mem64[4];
+	uint32_t memsize;
+};
+
+/**
+ * xxh32_reset() - reset the xxh32 state to start a new hashing operation
+ *
+ * @state: The xxh32 state to reset.
+ * @seed:  Initialize the hash state with this seed.
+ *
+ * Call this function on any xxh32_state to prepare for a new hashing operation.
+ */
+void xxh32_reset(struct xxh32_state *state, uint32_t seed);
+
+/**
+ * xxh32_update() - hash the data given and update the xxh32 state
+ *
+ * @state:  The xxh32 state to update.
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ *
+ * After calling xxh32_reset() call xxh32_update() as many times as necessary.
+ *
+ * Return:  Zero on success, otherwise an error code.
+ */
+int xxh32_update(struct xxh32_state *state, const void *input, size_t length);
+
+/**
+ * xxh32_digest() - produce the current xxh32 hash
+ *
+ * @state: Produce the current xxh32 hash of this state.
+ *
+ * A hash value can be produced at any time. It is still possible to continue
+ * inserting input into the hash state after a call to xxh32_digest(), and
+ * generate new hashes later on, by calling xxh32_digest() again.
+ *
+ * Return: The xxh32 hash stored in the state.
+ */
+uint32_t xxh32_digest(const struct xxh32_state *state);
+
+/**
+ * xxh64_reset() - reset the xxh64 state to start a new hashing operation
+ *
+ * @state: The xxh64 state to reset.
+ * @seed:  Initialize the hash state with this seed.
+ */
+void xxh64_reset(struct xxh64_state *state, uint64_t seed);
+
+/**
+ * xxh64_update() - hash the data given and update the xxh64 state
+ * @state:  The xxh64 state to update.
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ *
+ * After calling xxh64_reset() call xxh64_update() as many times as necessary.
+ *
+ * Return:  Zero on success, otherwise an error code.
+ */
+int xxh64_update(struct xxh64_state *state, const void *input, size_t length);
+
+/**
+ * xxh64_digest() - produce the current xxh64 hash
+ *
+ * @state: Produce the current xxh64 hash of this state.
+ *
+ * A hash value can be produced at any time. It is still possible to continue
+ * inserting input into the hash state after a call to xxh64_digest(), and
+ * generate new hashes later on, by calling xxh64_digest() again.
+ *
+ * Return: The xxh64 hash stored in the state.
+ */
+uint64_t xxh64_digest(const struct xxh64_state *state);
+
+/*-**************************
+ * Utils
+ ***************************/
+
+/**
+ * xxh32_copy_state() - copy the source state into the destination state
+ *
+ * @src: The source xxh32 state.
+ * @dst: The destination xxh32 state.
+ */
+void xxh32_copy_state(struct xxh32_state *dst, const struct xxh32_state *src);
+
+/**
+ * xxh64_copy_state() - copy the source state into the destination state
+ *
+ * @src: The source xxh64 state.
+ * @dst: The destination xxh64 state.
+ */
+void xxh64_copy_state(struct xxh64_state *dst, const struct xxh64_state *src);
+
+#endif /* XXHASH_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 6762529..5e7541f 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -192,6 +192,9 @@ config CRC8
 	  when they need to do cyclic redundancy check according CRC8
 	  algorithm. Module will be called crc8.

+config XXHASH
+	tristate
+
 config AUDIT_GENERIC
 	bool
 	depends on AUDIT && !AUDIT_ARCH
diff --git a/lib/Makefile b/lib/Makefile
index 40c1837..d06b68a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -102,6 +102,7 @@ obj-$(CONFIG_CRC4)	+= crc4.o
 obj-$(CONFIG_CRC7)	+= crc7.o
 obj-$(CONFIG_LIBCRC32C)	+= libcrc32c.o
 obj-$(CONFIG_CRC8)	+= crc8.o
+obj-$(CONFIG_XXHASH)	+= xxhash.o
 obj-$(CONFIG_GENERIC_ALLOCATOR) += genalloc.o

 obj-$(CONFIG_842_COMPRESS) += 842/
diff --git a/lib/xxhash.c b/lib/xxhash.c
new file mode 100644
index 0000000..aa61e2a
--- /dev/null
+++ b/lib/xxhash.c
@@ -0,0 +1,500 @@
+/*
+ * xxHash - Extremely Fast Hash algorithm
+ * Copyright (C) 2012-2016, Yann Collet.
+ *
+ * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ *     notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above
+ *     copyright notice, this list of conditions and the following disclaimer
+ *     in the documentation and/or other materials provided with the
+ *     distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ *
+ * You can contact the author at:
+ * - xxHash homepage: http://cyan4973.github.io/xxHash/
+ * - xxHash source repository: https://github.com/Cyan4973/xxHash
+ */
+
+#include <asm/unaligned.h>
+#include <linux/errno.h>
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+#include <linux/xxhash.h>
+
+/*-*************************************
+ * Macros
+ **************************************/
+#define xxh_rotl32(x, r) ((x << r) | (x >> (32 - r)))
+#define xxh_rotl64(x, r) ((x << r) | (x >> (64 - r)))
+
+#ifdef __LITTLE_ENDIAN
+# define XXH_CPU_LITTLE_ENDIAN 1
+#else
+# define XXH_CPU_LITTLE_ENDIAN 0
+#endif
+
+/*-*************************************
+ * Constants
+ **************************************/
+static const uint32_t PRIME32_1 = 2654435761U;
+static const uint32_t PRIME32_2 = 2246822519U;
+static const uint32_t PRIME32_3 = 3266489917U;
+static const uint32_t PRIME32_4 =  668265263U;
+static const uint32_t PRIME32_5 =  374761393U;
+
+static const uint64_t PRIME64_1 = 11400714785074694791ULL;
+static const uint64_t PRIME64_2 = 14029467366897019727ULL;
+static const uint64_t PRIME64_3 =  1609587929392839161ULL;
+static const uint64_t PRIME64_4 =  9650029242287828579ULL;
+static const uint64_t PRIME64_5 =  2870177450012600261ULL;
+
+/*-**************************
+ *  Utils
+ ***************************/
+void xxh32_copy_state(struct xxh32_state *dst, const struct xxh32_state *src)
+{
+	memcpy(dst, src, sizeof(*dst));
+}
+EXPORT_SYMBOL(xxh32_copy_state);
+
+void xxh64_copy_state(struct xxh64_state *dst, const struct xxh64_state *src)
+{
+	memcpy(dst, src, sizeof(*dst));
+}
+EXPORT_SYMBOL(xxh64_copy_state);
+
+/*-***************************
+ * Simple Hash Functions
+ ****************************/
+static uint32_t xxh32_round(uint32_t seed, const uint32_t input)
+{
+	seed += input * PRIME32_2;
+	seed = xxh_rotl32(seed, 13);
+	seed *= PRIME32_1;
+	return seed;
+}
+
+uint32_t xxh32(const void *input, const size_t len, const uint32_t seed)
+{
+	const uint8_t *p = (const uint8_t *)input;
+	const uint8_t *b_end = p + len;
+	uint32_t h32;
+
+	if (len >= 16) {
+		const uint8_t *const limit = b_end - 16;
+		uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
+		uint32_t v2 = seed + PRIME32_2;
+		uint32_t v3 = seed + 0;
+		uint32_t v4 = seed - PRIME32_1;
+
+		do {
+			v1 = xxh32_round(v1, get_unaligned_le32(p));
+			p += 4;
+			v2 = xxh32_round(v2, get_unaligned_le32(p));
+			p += 4;
+			v3 = xxh32_round(v3, get_unaligned_le32(p));
+			p += 4;
+			v4 = xxh32_round(v4, get_unaligned_le32(p));
+			p += 4;
+		} while (p <= limit);
+
+		h32 = xxh_rotl32(v1, 1) + xxh_rotl32(v2, 7) +
+			xxh_rotl32(v3, 12) + xxh_rotl32(v4, 18);
+	} else {
+		h32 = seed + PRIME32_5;
+	}
+
+	h32 += (uint32_t)len;
+
+	while (p + 4 <= b_end) {
+		h32 += get_unaligned_le32(p) * PRIME32_3;
+		h32 = xxh_rotl32(h32, 17) * PRIME32_4;
+		p += 4;
+	}
+
+	while (p < b_end) {
+		h32 += (*p) * PRIME32_5;
+		h32 = xxh_rotl32(h32, 11) * PRIME32_1;
+		p++;
+	}
+
+	h32 ^= h32 >> 15;
+	h32 *= PRIME32_2;
+	h32 ^= h32 >> 13;
+	h32 *= PRIME32_3;
+	h32 ^= h32 >> 16;
+
+	return h32;
+}
+EXPORT_SYMBOL(xxh32);
+
+static uint64_t xxh64_round(uint64_t acc, const uint64_t input)
+{
+	acc += input * PRIME64_2;
+	acc = xxh_rotl64(acc, 31);
+	acc *= PRIME64_1;
+	return acc;
+}
+
+static uint64_t xxh64_merge_round(uint64_t acc, uint64_t val)
+{
+	val = xxh64_round(0, val);
+	acc ^= val;
+	acc = acc * PRIME64_1 + PRIME64_4;
+	return acc;
+}
+
+uint64_t xxh64(const void *input, const size_t len, const uint64_t seed)
+{
+	const uint8_t *p = (const uint8_t *)input;
+	const uint8_t *const b_end = p + len;
+	uint64_t h64;
+
+	if (len >= 32) {
+		const uint8_t *const limit = b_end - 32;
+		uint64_t v1 = seed + PRIME64_1 + PRIME64_2;
+		uint64_t v2 = seed + PRIME64_2;
+		uint64_t v3 = seed + 0;
+		uint64_t v4 = seed - PRIME64_1;
+
+		do {
+			v1 = xxh64_round(v1, get_unaligned_le64(p));
+			p += 8;
+			v2 = xxh64_round(v2, get_unaligned_le64(p));
+			p += 8;
+			v3 = xxh64_round(v3, get_unaligned_le64(p));
+			p += 8;
+			v4 = xxh64_round(v4, get_unaligned_le64(p));
+			p += 8;
+		} while (p <= limit);
+
+		h64 = xxh_rotl64(v1, 1) + xxh_rotl64(v2, 7) +
+			xxh_rotl64(v3, 12) + xxh_rotl64(v4, 18);
+		h64 = xxh64_merge_round(h64, v1);
+		h64 = xxh64_merge_round(h64, v2);
+		h64 = xxh64_merge_round(h64, v3);
+		h64 = xxh64_merge_round(h64, v4);
+
+	} else {
+		h64  = seed + PRIME64_5;
+	}
+
+	h64 += (uint64_t)len;
+
+	while (p + 8 <= b_end) {
+		const uint64_t k1 = xxh64_round(0, get_unaligned_le64(p));
+
+		h64 ^= k1;
+		h64 = xxh_rotl64(h64, 27) * PRIME64_1 + PRIME64_4;
+		p += 8;
+	}
+
+	if (p + 4 <= b_end) {
+		h64 ^= (uint64_t)(get_unaligned_le32(p)) * PRIME64_1;
+		h64 = xxh_rotl64(h64, 23) * PRIME64_2 + PRIME64_3;
+		p += 4;
+	}
+
+	while (p < b_end) {
+		h64 ^= (*p) * PRIME64_5;
+		h64 = xxh_rotl64(h64, 11) * PRIME64_1;
+		p++;
+	}
+
+	h64 ^= h64 >> 33;
+	h64 *= PRIME64_2;
+	h64 ^= h64 >> 29;
+	h64 *= PRIME64_3;
+	h64 ^= h64 >> 32;
+
+	return h64;
+}
+EXPORT_SYMBOL(xxh64);
+
+/*-**************************************************
+ * Advanced Hash Functions
+ ***************************************************/
+void xxh32_reset(struct xxh32_state *statePtr, const uint32_t seed)
+{
+	/* use a local state for memcpy() to avoid strict-aliasing warnings */
+	struct xxh32_state state;
+
+	memset(&state, 0, sizeof(state));
+	state.v1 = seed + PRIME32_1 + PRIME32_2;
+	state.v2 = seed + PRIME32_2;
+	state.v3 = seed + 0;
+	state.v4 = seed - PRIME32_1;
+	memcpy(statePtr, &state, sizeof(state));
+}
+EXPORT_SYMBOL(xxh32_reset);
+
+void xxh64_reset(struct xxh64_state *statePtr, const uint64_t seed)
+{
+	/* use a local state for memcpy() to avoid strict-aliasing warnings */
+	struct xxh64_state state;
+
+	memset(&state, 0, sizeof(state));
+	state.v1 = seed + PRIME64_1 + PRIME64_2;
+	state.v2 = seed + PRIME64_2;
+	state.v3 = seed + 0;
+	state.v4 = seed - PRIME64_1;
+	memcpy(statePtr, &state, sizeof(state));
+}
+EXPORT_SYMBOL(xxh64_reset);
+
+int xxh32_update(struct xxh32_state *state, const void *input, const size_t len)
+{
+	const uint8_t *p = (const uint8_t *)input;
+	const uint8_t *const b_end = p + len;
+
+	if (input == NULL)
+		return -EINVAL;
+
+	state->total_len_32 += (uint32_t)len;
+	state->large_len |= (len >= 16) | (state->total_len_32 >= 16);
+
+	if (state->memsize + len < 16) { /* fill in tmp buffer */
+		memcpy((uint8_t *)(state->mem32) + state->memsize, input, len);
+		state->memsize += (uint32_t)len;
+		return 0;
+	}
+
+	if (state->memsize) { /* some data left from previous update */
+		const uint32_t *p32 = state->mem32;
+
+		memcpy((uint8_t *)(state->mem32) + state->memsize, input,
+			16 - state->memsize);
+
+		state->v1 = xxh32_round(state->v1, get_unaligned_le32(p32));
+		p32++;
+		state->v2 = xxh32_round(state->v2, get_unaligned_le32(p32));
+		p32++;
+		state->v3 = xxh32_round(state->v3, get_unaligned_le32(p32));
+		p32++;
+		state->v4 = xxh32_round(state->v4, get_unaligned_le32(p32));
+		p32++;
+
+		p += 16-state->memsize;
+		state->memsize = 0;
+	}
+
+	if (p <= b_end - 16) {
+		const uint8_t *const limit = b_end - 16;
+		uint32_t v1 = state->v1;
+		uint32_t v2 = state->v2;
+		uint32_t v3 = state->v3;
+		uint32_t v4 = state->v4;
+
+		do {
+			v1 = xxh32_round(v1, get_unaligned_le32(p));
+			p += 4;
+			v2 = xxh32_round(v2, get_unaligned_le32(p));
+			p += 4;
+			v3 = xxh32_round(v3, get_unaligned_le32(p));
+			p += 4;
+			v4 = xxh32_round(v4, get_unaligned_le32(p));
+			p += 4;
+		} while (p <= limit);
+
+		state->v1 = v1;
+		state->v2 = v2;
+		state->v3 = v3;
+		state->v4 = v4;
+	}
+
+	if (p < b_end) {
+		memcpy(state->mem32, p, (size_t)(b_end-p));
+		state->memsize = (uint32_t)(b_end-p);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(xxh32_update);
+
+uint32_t xxh32_digest(const struct xxh32_state *state)
+{
+	const uint8_t *p = (const uint8_t *)state->mem32;
+	const uint8_t *const b_end = (const uint8_t *)(state->mem32) +
+		state->memsize;
+	uint32_t h32;
+
+	if (state->large_len) {
+		h32 = xxh_rotl32(state->v1, 1) + xxh_rotl32(state->v2, 7) +
+			xxh_rotl32(state->v3, 12) + xxh_rotl32(state->v4, 18);
+	} else {
+		h32 = state->v3 /* == seed */ + PRIME32_5;
+	}
+
+	h32 += state->total_len_32;
+
+	while (p + 4 <= b_end) {
+		h32 += get_unaligned_le32(p) * PRIME32_3;
+		h32 = xxh_rotl32(h32, 17) * PRIME32_4;
+		p += 4;
+	}
+
+	while (p < b_end) {
+		h32 += (*p) * PRIME32_5;
+		h32 = xxh_rotl32(h32, 11) * PRIME32_1;
+		p++;
+	}
+
+	h32 ^= h32 >> 15;
+	h32 *= PRIME32_2;
+	h32 ^= h32 >> 13;
+	h32 *= PRIME32_3;
+	h32 ^= h32 >> 16;
+
+	return h32;
+}
+EXPORT_SYMBOL(xxh32_digest);
+
+int xxh64_update(struct xxh64_state *state, const void *input, const size_t len)
+{
+	const uint8_t *p = (const uint8_t *)input;
+	const uint8_t *const b_end = p + len;
+
+	if (input == NULL)
+		return -EINVAL;
+
+	state->total_len += len;
+
+	if (state->memsize + len < 32) { /* fill in tmp buffer */
+		memcpy(((uint8_t *)state->mem64) + state->memsize, input, len);
+		state->memsize += (uint32_t)len;
+		return 0;
+	}
+
+	if (state->memsize) { /* tmp buffer is full */
+		uint64_t *p64 = state->mem64;
+
+		memcpy(((uint8_t *)p64) + state->memsize, input,
+			32 - state->memsize);
+
+		state->v1 = xxh64_round(state->v1, get_unaligned_le64(p64));
+		p64++;
+		state->v2 = xxh64_round(state->v2, get_unaligned_le64(p64));
+		p64++;
+		state->v3 = xxh64_round(state->v3, get_unaligned_le64(p64));
+		p64++;
+		state->v4 = xxh64_round(state->v4, get_unaligned_le64(p64));
+
+		p += 32 - state->memsize;
+		state->memsize = 0;
+	}
+
+	if (p + 32 <= b_end) {
+		const uint8_t *const limit = b_end - 32;
+		uint64_t v1 = state->v1;
+		uint64_t v2 = state->v2;
+		uint64_t v3 = state->v3;
+		uint64_t v4 = state->v4;
+
+		do {
+			v1 = xxh64_round(v1, get_unaligned_le64(p));
+			p += 8;
+			v2 = xxh64_round(v2, get_unaligned_le64(p));
+			p += 8;
+			v3 = xxh64_round(v3, get_unaligned_le64(p));
+			p += 8;
+			v4 = xxh64_round(v4, get_unaligned_le64(p));
+			p += 8;
+		} while (p <= limit);
+
+		state->v1 = v1;
+		state->v2 = v2;
+		state->v3 = v3;
+		state->v4 = v4;
+	}
+
+	if (p < b_end) {
+		memcpy(state->mem64, p, (size_t)(b_end-p));
+		state->memsize = (uint32_t)(b_end - p);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(xxh64_update);
+
+uint64_t xxh64_digest(const struct xxh64_state *state)
+{
+	const uint8_t *p = (const uint8_t *)state->mem64;
+	const uint8_t *const b_end = (const uint8_t *)state->mem64 +
+		state->memsize;
+	uint64_t h64;
+
+	if (state->total_len >= 32) {
+		const uint64_t v1 = state->v1;
+		const uint64_t v2 = state->v2;
+		const uint64_t v3 = state->v3;
+		const uint64_t v4 = state->v4;
+
+		h64 = xxh_rotl64(v1, 1) + xxh_rotl64(v2, 7) +
+			xxh_rotl64(v3, 12) + xxh_rotl64(v4, 18);
+		h64 = xxh64_merge_round(h64, v1);
+		h64 = xxh64_merge_round(h64, v2);
+		h64 = xxh64_merge_round(h64, v3);
+		h64 = xxh64_merge_round(h64, v4);
+	} else {
+		h64  = state->v3 + PRIME64_5;
+	}
+
+	h64 += (uint64_t)state->total_len;
+
+	while (p + 8 <= b_end) {
+		const uint64_t k1 = xxh64_round(0, get_unaligned_le64(p));
+
+		h64 ^= k1;
+		h64 = xxh_rotl64(h64, 27) * PRIME64_1 + PRIME64_4;
+		p += 8;
+	}
+
+	if (p + 4 <= b_end) {
+		h64 ^= (uint64_t)(get_unaligned_le32(p)) * PRIME64_1;
+		h64 = xxh_rotl64(h64, 23) * PRIME64_2 + PRIME64_3;
+		p += 4;
+	}
+
+	while (p < b_end) {
+		h64 ^= (*p) * PRIME64_5;
+		h64 = xxh_rotl64(h64, 11) * PRIME64_1;
+		p++;
+	}
+
+	h64 ^= h64 >> 33;
+	h64 *= PRIME64_2;
+	h64 ^= h64 >> 29;
+	h64 *= PRIME64_3;
+	h64 ^= h64 >> 32;
+
+	return h64;
+}
+EXPORT_SYMBOL(xxh64_digest);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("xxHash");
--
2.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 3/4] btrfs: Add zstd support
  2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
  2017-07-20 21:27 ` [PATCH v3 1/4] lib: Add xxhash module Nick Terrell
@ 2017-07-20 21:27 ` Nick Terrell
  2017-07-23 19:28   ` kbuild test robot
                     ` (2 more replies)
  2017-07-20 21:27 ` [PATCH v3 4/4] squashfs: " Nick Terrell
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 16+ messages in thread
From: Nick Terrell @ 2017-07-20 21:27 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, squashfs-devel,
	linux-btrfs, linux-kernel

Add zstd compression and decompression support to BtrFS. zstd at its
fastest level compresses almost as well as zlib, while offering much
faster compression and decompression, approaching lzo speeds.

I benchmarked btrfs with zstd compression against no compression, lzo
compression, and zlib compression. I benchmarked two scenarios. Copying
a set of files to btrfs, and then reading the files. Copying a tarball
to btrfs, extracting it to btrfs, and then reading the extracted files.
After every operation, I call `sync` and include the sync time.
Between every pair of operations I unmount and remount the filesystem
to avoid caching. The benchmark files can be found in the upstream
zstd source repository under
`contrib/linux-kernel/{btrfs-benchmark.sh,btrfs-extract-benchmark.sh}`
[1] [2].

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.

The first compression benchmark is copying 10 copies of the unzipped
Silesia corpus [3] into a BtrFS filesystem mounted with
`-o compress-force=Method`. The decompression benchmark times how long
it takes to `tar` all 10 copies into `/dev/null`. The compression ratio is
measured by comparing the output of `df` and `du`. See the benchmark file
[1] for details. I benchmarked multiple zstd compression levels, although
the patch uses zstd level 1.

| Method  | Ratio | Compression MB/s | Decompression speed |
|---------|-------|------------------|---------------------|
| None    |  0.99 |              504 |                 686 |
| lzo     |  1.66 |              398 |                 442 |
| zlib    |  2.58 |               65 |                 241 |
| zstd 1  |  2.57 |              260 |                 383 |
| zstd 3  |  2.71 |              174 |                 408 |
| zstd 6  |  2.87 |               70 |                 398 |
| zstd 9  |  2.92 |               43 |                 406 |
| zstd 12 |  2.93 |               21 |                 408 |
| zstd 15 |  3.01 |               11 |                 354 |

The next benchmark first copies `linux-4.11.6.tar` [4] to btrfs. Then it
measures the compression ratio, extracts the tar, and deletes the tar.
Then it measures the compression ratio again, and `tar`s the extracted
files into `/dev/null`. See the benchmark file [2] for details.

| Method | Tar Ratio | Extract Ratio | Copy (s) | Extract (s)| Read (s) |
|--------|-----------|---------------|----------|------------|----------|
| None   |      0.97 |          0.78 |    0.981 |      5.501 |    8.807 |
| lzo    |      2.06 |          1.38 |    1.631 |      8.458 |    8.585 |
| zlib   |      3.40 |          1.86 |    7.750 |     21.544 |   11.744 |
| zstd 1 |      3.57 |          1.85 |    2.579 |     11.479 |    9.389 |

[1] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-benchmark.sh
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-extract-benchmark.sh
[3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
[4] https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.11.6.tar.xz

zstd source repository: https://github.com/facebook/zstd

Signed-off-by: Nick Terrell <terrelln@fb.com>
---
v2 -> v3:
- Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff
- Change default compression level for BtrFS to 3

 fs/btrfs/Kconfig           |   2 +
 fs/btrfs/Makefile          |   2 +-
 fs/btrfs/compression.c     |   1 +
 fs/btrfs/compression.h     |   6 +-
 fs/btrfs/ctree.h           |   1 +
 fs/btrfs/disk-io.c         |   2 +
 fs/btrfs/ioctl.c           |   6 +-
 fs/btrfs/props.c           |   6 +
 fs/btrfs/super.c           |  12 +-
 fs/btrfs/sysfs.c           |   2 +
 fs/btrfs/zstd.c            | 435 +++++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/btrfs.h |   8 +-
 12 files changed, 471 insertions(+), 12 deletions(-)
 create mode 100644 fs/btrfs/zstd.c

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 80e9c18..a26c63b 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -6,6 +6,8 @@ config BTRFS_FS
 	select ZLIB_DEFLATE
 	select LZO_COMPRESS
 	select LZO_DECOMPRESS
+	select ZSTD_COMPRESS
+	select ZSTD_DECOMPRESS
 	select RAID6_PQ
 	select XOR_BLOCKS
 	select SRCU
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 128ce17..962a95a 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,7 +6,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   transaction.o inode.o file.o tree-defrag.o \
 	   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
 	   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
+	   export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o hash.o free-space-tree.o
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index d2ef9ac..4ff42d1 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -704,6 +704,7 @@ static struct {
 static const struct btrfs_compress_op * const btrfs_compress_op[] = {
 	&btrfs_zlib_compress,
 	&btrfs_lzo_compress,
+	&btrfs_zstd_compress,
 };

 void __init btrfs_init_compress(void)
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 87f6d33..2269e00 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -99,8 +99,9 @@ enum btrfs_compression_type {
 	BTRFS_COMPRESS_NONE  = 0,
 	BTRFS_COMPRESS_ZLIB  = 1,
 	BTRFS_COMPRESS_LZO   = 2,
-	BTRFS_COMPRESS_TYPES = 2,
-	BTRFS_COMPRESS_LAST  = 3,
+	BTRFS_COMPRESS_ZSTD  = 3,
+	BTRFS_COMPRESS_TYPES = 3,
+	BTRFS_COMPRESS_LAST  = 4,
 };

 struct btrfs_compress_op {
@@ -128,5 +129,6 @@ struct btrfs_compress_op {

 extern const struct btrfs_compress_op btrfs_zlib_compress;
 extern const struct btrfs_compress_op btrfs_lzo_compress;
+extern const struct btrfs_compress_op btrfs_zstd_compress;

 #endif
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3f3eb7b..845d77c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -270,6 +270,7 @@ struct btrfs_super_block {
 	 BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS |		\
 	 BTRFS_FEATURE_INCOMPAT_BIG_METADATA |		\
 	 BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO |		\
+	 BTRFS_FEATURE_INCOMPAT_COMPRESS_ZSTD |		\
 	 BTRFS_FEATURE_INCOMPAT_RAID56 |		\
 	 BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF |		\
 	 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |	\
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 080e2eb..04632f4 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2828,6 +2828,8 @@ int open_ctree(struct super_block *sb,
 	features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
 	if (fs_info->compress_type == BTRFS_COMPRESS_LZO)
 		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
+	else if (fs_info->compress_type == BTRFS_COMPRESS_ZSTD)
+		features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_ZSTD;

 	if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
 		btrfs_info(fs_info, "has skinny extents");
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index fa1b78c..b9963d9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -327,8 +327,10 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)

 		if (fs_info->compress_type == BTRFS_COMPRESS_LZO)
 			comp = "lzo";
-		else
+		else if (fs_info->compress_type == BTRFS_COMPRESS_ZLIB)
 			comp = "zlib";
+		else
+			comp = "zstd";
 		ret = btrfs_set_prop(inode, "btrfs.compression",
 				     comp, strlen(comp), 0);
 		if (ret)
@@ -1466,6 +1468,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,

 	if (range->compress_type == BTRFS_COMPRESS_LZO) {
 		btrfs_set_fs_incompat(fs_info, COMPRESS_LZO);
+	} else if (range->compress_type == BTRFS_COMPRESS_ZSTD) {
+		btrfs_set_fs_incompat(fs_info, COMPRESS_ZSTD);
 	}

 	ret = defrag_count;
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index 4b23ae5..20631e9 100644
--- a/fs/btrfs/props.c
+++ b/fs/btrfs/props.c
@@ -390,6 +390,8 @@ static int prop_compression_validate(const char *value, size_t len)
 		return 0;
 	else if (!strncmp("zlib", value, len))
 		return 0;
+	else if (!strncmp("zstd", value, len))
+		return 0;

 	return -EINVAL;
 }
@@ -412,6 +414,8 @@ static int prop_compression_apply(struct inode *inode,
 		type = BTRFS_COMPRESS_LZO;
 	else if (!strncmp("zlib", value, len))
 		type = BTRFS_COMPRESS_ZLIB;
+	else if (!strncmp("zstd", value, len))
+		type = BTRFS_COMPRESS_ZSTD;
 	else
 		return -EINVAL;

@@ -429,6 +433,8 @@ static const char *prop_compression_extract(struct inode *inode)
 		return "zlib";
 	case BTRFS_COMPRESS_LZO:
 		return "lzo";
+	case BTRFS_COMPRESS_ZSTD:
+		return "zstd";
 	}

 	return NULL;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 12540b6..c370dea 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -513,6 +513,14 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 				btrfs_clear_opt(info->mount_opt, NODATASUM);
 				btrfs_set_fs_incompat(info, COMPRESS_LZO);
 				no_compress = 0;
+			} else if (strcmp(args[0].from, "zstd") == 0) {
+				compress_type = "zstd";
+				info->compress_type = BTRFS_COMPRESS_ZSTD;
+				btrfs_set_opt(info->mount_opt, COMPRESS);
+				btrfs_clear_opt(info->mount_opt, NODATACOW);
+				btrfs_clear_opt(info->mount_opt, NODATASUM);
+				btrfs_set_fs_incompat(info, COMPRESS_ZSTD);
+				no_compress = 0;
 			} else if (strncmp(args[0].from, "no", 2) == 0) {
 				compress_type = "no";
 				btrfs_clear_opt(info->mount_opt, COMPRESS);
@@ -1227,8 +1235,10 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 	if (btrfs_test_opt(info, COMPRESS)) {
 		if (info->compress_type == BTRFS_COMPRESS_ZLIB)
 			compress_type = "zlib";
-		else
+		else if (info->compress_type == BTRFS_COMPRESS_LZO)
 			compress_type = "lzo";
+		else
+			compress_type = "zstd";
 		if (btrfs_test_opt(info, FORCE_COMPRESS))
 			seq_printf(seq, ",compress-force=%s", compress_type);
 		else
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index c2d5f35..2b6d37c 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -200,6 +200,7 @@ BTRFS_FEAT_ATTR_INCOMPAT(mixed_backref, MIXED_BACKREF);
 BTRFS_FEAT_ATTR_INCOMPAT(default_subvol, DEFAULT_SUBVOL);
 BTRFS_FEAT_ATTR_INCOMPAT(mixed_groups, MIXED_GROUPS);
 BTRFS_FEAT_ATTR_INCOMPAT(compress_lzo, COMPRESS_LZO);
+BTRFS_FEAT_ATTR_INCOMPAT(compress_zstd, COMPRESS_ZSTD);
 BTRFS_FEAT_ATTR_INCOMPAT(big_metadata, BIG_METADATA);
 BTRFS_FEAT_ATTR_INCOMPAT(extended_iref, EXTENDED_IREF);
 BTRFS_FEAT_ATTR_INCOMPAT(raid56, RAID56);
@@ -212,6 +213,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = {
 	BTRFS_FEAT_ATTR_PTR(default_subvol),
 	BTRFS_FEAT_ATTR_PTR(mixed_groups),
 	BTRFS_FEAT_ATTR_PTR(compress_lzo),
+	BTRFS_FEAT_ATTR_PTR(compress_zstd),
 	BTRFS_FEAT_ATTR_PTR(big_metadata),
 	BTRFS_FEAT_ATTR_PTR(extended_iref),
 	BTRFS_FEAT_ATTR_PTR(raid56),
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
new file mode 100644
index 0000000..1822068
--- /dev/null
+++ b/fs/btrfs/zstd.c
@@ -0,0 +1,435 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/sched.h>
+#include <linux/pagemap.h>
+#include <linux/bio.h>
+#include <linux/zstd.h>
+#include "compression.h"
+
+#define ZSTD_BTRFS_MAX_WINDOWLOG 17
+#define ZSTD_BTRFS_MAX_INPUT (1 << ZSTD_BTRFS_MAX_WINDOWLOG)
+#define ZSTD_BTRFS_DEFAULT_LEVEL 3
+
+static ZSTD_parameters zstd_get_btrfs_parameters(size_t src_len)
+{
+	ZSTD_parameters params = ZSTD_getParams(ZSTD_BTRFS_DEFAULT_LEVEL, src_len, 0);
+
+	if (params.cParams.windowLog > ZSTD_BTRFS_MAX_WINDOWLOG)
+		params.cParams.windowLog = ZSTD_BTRFS_MAX_WINDOWLOG;
+	WARN_ON(src_len > ZSTD_BTRFS_MAX_INPUT);
+	return params;
+}
+
+struct workspace {
+	void *mem;
+	size_t size;
+	char *buf;
+	struct list_head list;
+};
+
+static void zstd_free_workspace(struct list_head *ws)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+
+	kvfree(workspace->mem);
+	kfree(workspace->buf);
+	kfree(workspace);
+}
+
+static struct list_head *zstd_alloc_workspace(void)
+{
+	ZSTD_parameters params =
+			zstd_get_btrfs_parameters(ZSTD_BTRFS_MAX_INPUT);
+	struct workspace *workspace;
+
+	workspace = kzalloc(sizeof(*workspace), GFP_KERNEL);
+	if (!workspace)
+		return ERR_PTR(-ENOMEM);
+
+	workspace->size = max_t(size_t,
+			ZSTD_CStreamWorkspaceBound(params.cParams),
+			ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+	workspace->mem = kvmalloc(workspace->size, GFP_KERNEL);
+	workspace->buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!workspace->mem || !workspace->buf)
+		goto fail;
+
+	INIT_LIST_HEAD(&workspace->list);
+
+	return &workspace->list;
+fail:
+	zstd_free_workspace(&workspace->list);
+	return ERR_PTR(-ENOMEM);
+}
+
+static int zstd_compress_pages(struct list_head *ws,
+		struct address_space *mapping,
+		u64 start,
+		struct page **pages,
+		unsigned long *out_pages,
+		unsigned long *total_in,
+		unsigned long *total_out)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+	ZSTD_CStream *stream;
+	int ret = 0;
+	int nr_pages = 0;
+	struct page *in_page = NULL;  /* The current page to read */
+	struct page *out_page = NULL; /* The current page to write to */
+	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
+	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
+	unsigned long tot_in = 0;
+	unsigned long tot_out = 0;
+	unsigned long len = *total_out;
+	const unsigned long nr_dest_pages = *out_pages;
+	unsigned long max_out = nr_dest_pages * PAGE_SIZE;
+	ZSTD_parameters params = zstd_get_btrfs_parameters(len);
+
+	*out_pages = 0;
+	*total_out = 0;
+	*total_in = 0;
+
+	/* Initialize the stream */
+	stream = ZSTD_initCStream(params, len, workspace->mem,
+			workspace->size);
+	if (!stream) {
+		pr_warn("BTRFS: ZSTD_initCStream failed\n");
+		ret = -EIO;
+		goto out;
+	}
+
+	/* map in the first page of input data */
+	in_page = find_get_page(mapping, start >> PAGE_SHIFT);
+	in_buf.src = kmap(in_page);
+	in_buf.pos = 0;
+	in_buf.size = min_t(size_t, len, PAGE_SIZE);
+
+
+	/* Allocate and map in the output buffer */
+	out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+	if (out_page == NULL) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	pages[nr_pages++] = out_page;
+	out_buf.dst = kmap(out_page);
+	out_buf.pos = 0;
+	out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+
+	while (1) {
+		size_t ret2;
+
+		ret2 = ZSTD_compressStream(stream, &out_buf, &in_buf);
+		if (ZSTD_isError(ret2)) {
+			pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
+					ZSTD_getErrorCode(ret2));
+			ret = -EIO;
+			goto out;
+		}
+
+		/* Check to see if we are making it bigger */
+		if (tot_in + in_buf.pos > 8192 &&
+				tot_in + in_buf.pos <
+				tot_out + out_buf.pos) {
+			ret = -E2BIG;
+			goto out;
+		}
+
+		/* We've reached the end of our output range */
+		if (out_buf.pos >= max_out) {
+			tot_out += out_buf.pos;
+			ret = -E2BIG;
+			goto out;
+		}
+
+		/* Check if we need more output space */
+		if (out_buf.pos == out_buf.size) {
+			tot_out += PAGE_SIZE;
+			max_out -= PAGE_SIZE;
+			kunmap(out_page);
+			if (nr_pages == nr_dest_pages) {
+				out_page = NULL;
+				ret = -E2BIG;
+				goto out;
+			}
+			out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+			if (out_page == NULL) {
+				ret = -ENOMEM;
+				goto out;
+			}
+			pages[nr_pages++] = out_page;
+			out_buf.dst = kmap(out_page);
+			out_buf.pos = 0;
+			out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+		}
+
+		/* We've reached the end of the input */
+		if (in_buf.pos >= len) {
+			tot_in += in_buf.pos;
+			break;
+		}
+
+		/* Check if we need more input */
+		if (in_buf.pos == in_buf.size) {
+			tot_in += PAGE_SIZE;
+			kunmap(in_page);
+			put_page(in_page);
+
+			start += PAGE_SIZE;
+			len -= PAGE_SIZE;
+			in_page = find_get_page(mapping, start >> PAGE_SHIFT);
+			in_buf.src = kmap(in_page);
+			in_buf.pos = 0;
+			in_buf.size = min_t(size_t, len, PAGE_SIZE);
+		}
+	}
+	while (1) {
+		size_t ret2;
+
+		ret2 = ZSTD_endStream(stream, &out_buf);
+		if (ZSTD_isError(ret2)) {
+			pr_debug("BTRFS: ZSTD_endStream returned %d\n",
+					ZSTD_getErrorCode(ret2));
+			ret = -EIO;
+			goto out;
+		}
+		if (ret2 == 0) {
+			tot_out += out_buf.pos;
+			break;
+		}
+		if (out_buf.pos >= max_out) {
+			tot_out += out_buf.pos;
+			ret = -E2BIG;
+			goto out;
+		}
+
+		tot_out += PAGE_SIZE;
+		max_out -= PAGE_SIZE;
+		kunmap(out_page);
+		if (nr_pages == nr_dest_pages) {
+			out_page = NULL;
+			ret = -E2BIG;
+			goto out;
+		}
+		out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+		if (out_page == NULL) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		pages[nr_pages++] = out_page;
+		out_buf.dst = kmap(out_page);
+		out_buf.pos = 0;
+		out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+	}
+
+	if (tot_out >= tot_in) {
+		ret = -E2BIG;
+		goto out;
+	}
+
+	ret = 0;
+	*total_in = tot_in;
+	*total_out = tot_out;
+out:
+	*out_pages = nr_pages;
+	/* Cleanup */
+	if (in_page) {
+		kunmap(in_page);
+		put_page(in_page);
+	}
+	if (out_page)
+		kunmap(out_page);
+	return ret;
+}
+
+static int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+	struct page **pages_in = cb->compressed_pages;
+	u64 disk_start = cb->start;
+	struct bio *orig_bio = cb->orig_bio;
+	size_t srclen = cb->compressed_len;
+	ZSTD_DStream *stream;
+	int ret = 0;
+	unsigned long page_in_index = 0;
+	unsigned long total_pages_in = DIV_ROUND_UP(srclen, PAGE_SIZE);
+	unsigned long buf_start;
+	unsigned long total_out = 0;
+	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
+	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
+
+	stream = ZSTD_initDStream(
+			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+	if (!stream) {
+		pr_debug("BTRFS: ZSTD_initDStream failed\n");
+		ret = -EIO;
+		goto done;
+	}
+
+	in_buf.src = kmap(pages_in[page_in_index]);
+	in_buf.pos = 0;
+	in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
+
+	out_buf.dst = workspace->buf;
+	out_buf.pos = 0;
+	out_buf.size = PAGE_SIZE;
+
+	while (1) {
+		size_t ret2;
+
+		ret2 = ZSTD_decompressStream(stream, &out_buf, &in_buf);
+		if (ZSTD_isError(ret2)) {
+			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
+					ZSTD_getErrorCode(ret2));
+			ret = -EIO;
+			goto done;
+		}
+		buf_start = total_out;
+		total_out += out_buf.pos;
+		out_buf.pos = 0;
+
+		ret = btrfs_decompress_buf2page(out_buf.dst, buf_start,
+				total_out, disk_start, orig_bio);
+		if (ret == 0)
+			break;
+
+		if (in_buf.pos >= srclen)
+			break;
+
+		/* Check if we've hit the end of a frame */
+		if (ret2 == 0)
+			break;
+
+		if (in_buf.pos == in_buf.size) {
+			kunmap(pages_in[page_in_index++]);
+			if (page_in_index >= total_pages_in) {
+				in_buf.src = NULL;
+				ret = -EIO;
+				goto done;
+			}
+			srclen -= PAGE_SIZE;
+			in_buf.src = kmap(pages_in[page_in_index]);
+			in_buf.pos = 0;
+			in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
+		}
+	}
+	ret = 0;
+	zero_fill_bio(orig_bio);
+done:
+	if (in_buf.src)
+		kunmap(pages_in[page_in_index]);
+	return ret;
+}
+
+static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
+		struct page *dest_page,
+		unsigned long start_byte,
+		size_t srclen, size_t destlen)
+{
+	struct workspace *workspace = list_entry(ws, struct workspace, list);
+	ZSTD_DStream *stream;
+	int ret = 0;
+	size_t ret2;
+	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
+	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
+	unsigned long total_out = 0;
+	unsigned long pg_offset = 0;
+	char *kaddr;
+
+	stream = ZSTD_initDStream(
+			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+	if (!stream) {
+		pr_warn("BTRFS: ZSTD_initDStream failed\n");
+		ret = -EIO;
+		goto finish;
+	}
+
+	destlen = min_t(size_t, destlen, PAGE_SIZE);
+
+	in_buf.src = data_in;
+	in_buf.pos = 0;
+	in_buf.size = srclen;
+
+	out_buf.dst = workspace->buf;
+	out_buf.pos = 0;
+	out_buf.size = PAGE_SIZE;
+
+	ret2 = 1;
+	while (pg_offset < destlen && in_buf.pos < in_buf.size) {
+		unsigned long buf_start;
+		unsigned long buf_offset;
+		unsigned long bytes;
+
+		/* Check if the frame is over and we still need more input */
+		if (ret2 == 0) {
+			pr_debug("BTRFS: ZSTD_decompressStream ended early\n");
+			ret = -EIO;
+			goto finish;
+		}
+		ret2 = ZSTD_decompressStream(stream, &out_buf, &in_buf);
+		if (ZSTD_isError(ret2)) {
+			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
+					ZSTD_getErrorCode(ret2));
+			ret = -EIO;
+			goto finish;
+		}
+
+		buf_start = total_out;
+		total_out += out_buf.pos;
+		out_buf.pos = 0;
+
+		if (total_out <= start_byte)
+			continue;
+
+		if (total_out > start_byte && buf_start < start_byte)
+			buf_offset = start_byte - buf_start;
+		else
+			buf_offset = 0;
+
+		bytes = min_t(unsigned long, destlen - pg_offset,
+				out_buf.size - buf_offset);
+
+		kaddr = kmap_atomic(dest_page);
+		memcpy(kaddr + pg_offset, out_buf.dst + buf_offset, bytes);
+		kunmap_atomic(kaddr);
+
+		pg_offset += bytes;
+	}
+	ret = 0;
+finish:
+	if (pg_offset < destlen) {
+		kaddr = kmap_atomic(dest_page);
+		memset(kaddr + pg_offset, 0, destlen - pg_offset);
+		kunmap_atomic(kaddr);
+	}
+	return ret;
+}
+
+const struct btrfs_compress_op btrfs_zstd_compress = {
+	.alloc_workspace = zstd_alloc_workspace,
+	.free_workspace = zstd_free_workspace,
+	.compress_pages = zstd_compress_pages,
+	.decompress_bio = zstd_decompress_bio,
+	.decompress = zstd_decompress,
+};
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 9aa74f3..378230c 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -255,13 +255,7 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
 #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
 #define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO	(1ULL << 3)
-/*
- * some patches floated around with a second compression method
- * lets save that incompat here for when they do get in
- * Note we don't actually support it, we're just reserving the
- * number
- */
-#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZOv2	(1ULL << 4)
+#define BTRFS_FEATURE_INCOMPAT_COMPRESS_ZSTD	(1ULL << 4)

 /*
  * older kernels tried to do bigger metadata blocks, but the
--
2.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 4/4] squashfs: Add zstd support
  2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
  2017-07-20 21:27 ` [PATCH v3 1/4] lib: Add xxhash module Nick Terrell
  2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
@ 2017-07-20 21:27 ` Nick Terrell
  2017-07-31  1:50   ` Phillip Lougher
  2017-07-21 11:16 ` [PATCH v3 0/4] Add xxhash and zstd modules Austin S. Hemmelgarn
  2017-07-21 15:56 ` Austin S. Hemmelgarn
  4 siblings, 1 reply; 16+ messages in thread
From: Nick Terrell @ 2017-07-20 21:27 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, squashfs-devel,
	linux-btrfs, linux-kernel, Sean Purcell

Add zstd compression and decompression support to SquashFS. zstd is a
great fit for SquashFS because it can compress at ratios approaching xz,
while decompressing twice as fast as zlib. For SquashFS in particular,
it can decompress as fast as lzo and lz4. It also has the flexibility
to turn down the compression ratio for faster compression times.

The compression benchmark is run on the file tree from the SquashFS archive
found in ubuntu-16.10-desktop-amd64.iso [1]. It uses `mksquashfs` with the
default block size (128 KB) and and various compression algorithms/levels.
xz and zstd are also benchmarked with 256 KB blocks. The decompression
benchmark times how long it takes to `tar` the file tree into `/dev/null`.
See the benchmark file in the upstream zstd source repository located under
`contrib/linux-kernel/squashfs-benchmark.sh` [2] for details.

I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.

| Method         | Ratio | Compression MB/s | Decompression MB/s |
|----------------|-------|------------------|--------------------|
| gzip           |  2.92 |               15 |                128 |
| lzo            |  2.64 |              9.5 |                217 |
| lz4            |  2.12 |               94 |                218 |
| xz             |  3.43 |              5.5 |                 35 |
| xz 256 KB      |  3.53 |              5.4 |                 40 |
| zstd 1         |  2.71 |               96 |                210 |
| zstd 5         |  2.93 |               69 |                198 |
| zstd 10        |  3.01 |               41 |                225 |
| zstd 15        |  3.13 |             11.4 |                224 |
| zstd 16 256 KB |  3.24 |              8.1 |                210 |

This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
taking over the submission process.

[1] http://releases.ubuntu.com/16.10/
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh

zstd source repository: https://github.com/facebook/zstd

Cc: Sean Purcell <me@seanp.xyz>
Signed-off-by: Nick Terrell <terrelln@fb.com>
---
 fs/squashfs/Kconfig        |  14 +++++
 fs/squashfs/Makefile       |   1 +
 fs/squashfs/decompressor.c |   7 +++
 fs/squashfs/decompressor.h |   4 ++
 fs/squashfs/squashfs_fs.h  |   1 +
 fs/squashfs/zstd_wrapper.c | 150 +++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 177 insertions(+)
 create mode 100644 fs/squashfs/zstd_wrapper.c

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index ffb093e..1adb334 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -165,6 +165,20 @@ config SQUASHFS_XZ

 	  If unsure, say N.

+config SQUASHFS_ZSTD
+	bool "Include support for ZSTD compressed file systems"
+	depends on SQUASHFS
+	select ZSTD_DECOMPRESS
+	help
+	  Saying Y here includes support for reading Squashfs file systems
+	  compressed with ZSTD compression.  ZSTD gives better compression than
+	  the default ZLIB compression, while using less CPU.
+
+	  ZSTD is not the standard compression used in Squashfs and so most
+	  file systems will be readable without selecting this option.
+
+	  If unsure, say N.
+
 config SQUASHFS_4K_DEVBLK_SIZE
 	bool "Use 4K device block size?"
 	depends on SQUASHFS
diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
index 246a6f3..6655631 100644
--- a/fs/squashfs/Makefile
+++ b/fs/squashfs/Makefile
@@ -15,3 +15,4 @@ squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
 squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
+squashfs-$(CONFIG_SQUASHFS_ZSTD) += zstd_wrapper.o
diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
index d2bc136..8366398 100644
--- a/fs/squashfs/decompressor.c
+++ b/fs/squashfs/decompressor.c
@@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
 };
 #endif

+#ifndef CONFIG_SQUASHFS_ZSTD
+static const struct squashfs_decompressor squashfs_zstd_comp_ops = {
+	NULL, NULL, NULL, NULL, ZSTD_COMPRESSION, "zstd", 0
+};
+#endif
+
 static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
 	NULL, NULL, NULL, NULL, 0, "unknown", 0
 };
@@ -75,6 +81,7 @@ static const struct squashfs_decompressor *decompressor[] = {
 	&squashfs_lzo_comp_ops,
 	&squashfs_xz_comp_ops,
 	&squashfs_lzma_unsupported_comp_ops,
+	&squashfs_zstd_comp_ops,
 	&squashfs_unknown_comp_ops
 };

diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
index a25713c..0f5a8e4 100644
--- a/fs/squashfs/decompressor.h
+++ b/fs/squashfs/decompressor.h
@@ -58,4 +58,8 @@ extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
 extern const struct squashfs_decompressor squashfs_zlib_comp_ops;
 #endif

+#ifdef CONFIG_SQUASHFS_ZSTD
+extern const struct squashfs_decompressor squashfs_zstd_comp_ops;
+#endif
+
 #endif
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index 506f4ba..24d12fd 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -241,6 +241,7 @@ struct meta_index {
 #define LZO_COMPRESSION		3
 #define XZ_COMPRESSION		4
 #define LZ4_COMPRESSION		5
+#define ZSTD_COMPRESSION	6

 struct squashfs_super_block {
 	__le32			s_magic;
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
new file mode 100644
index 0000000..8cb7c76
--- /dev/null
+++ b/fs/squashfs/zstd_wrapper.c
@@ -0,0 +1,150 @@
+/*
+ * Squashfs - a compressed read only filesystem for Linux
+ *
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2,
+ * or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+ *
+ * zstd_wrapper.c
+ */
+
+#include <linux/mutex.h>
+#include <linux/buffer_head.h>
+#include <linux/slab.h>
+#include <linux/zstd.h>
+#include <linux/vmalloc.h>
+
+#include "squashfs_fs.h"
+#include "squashfs_fs_sb.h"
+#include "squashfs.h"
+#include "decompressor.h"
+#include "page_actor.h"
+
+struct workspace {
+	void *mem;
+	size_t mem_size;
+};
+
+static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
+{
+	struct workspace *wksp = kmalloc(sizeof(*wksp), GFP_KERNEL);
+	if (wksp == NULL)
+		goto failed;
+	wksp->mem_size = ZSTD_DStreamWorkspaceBound(max_t(size_t,
+				msblk->block_size, SQUASHFS_METADATA_SIZE));
+	wksp->mem = vmalloc(wksp->mem_size);
+	if (wksp->mem == NULL)
+		goto failed;
+
+	return wksp;
+
+failed:
+	ERROR("Failed to allocate zstd workspace\n");
+	kfree(wksp);
+	return ERR_PTR(-ENOMEM);
+}
+
+
+static void zstd_free(void *strm)
+{
+	struct workspace *wksp = strm;
+
+	if (wksp)
+		vfree(wksp->mem);
+	kfree(wksp);
+}
+
+
+static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
+	struct buffer_head **bh, int b, int offset, int length,
+	struct squashfs_page_actor *output)
+{
+	struct workspace *wksp = strm;
+	ZSTD_DStream *stream;
+	size_t total_out = 0;
+	size_t zstd_err;
+	int k = 0;
+	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
+	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
+
+	stream = ZSTD_initDStream(wksp->mem_size, wksp->mem, wksp->mem_size);
+
+	if (!stream) {
+		ERROR("Failed to initialize zstd decompressor\n");
+		goto out;
+	}
+
+	out_buf.size = PAGE_SIZE;
+	out_buf.dst = squashfs_first_page(output);
+
+	do {
+		if (in_buf.pos == in_buf.size && k < b) {
+			int avail = min(length, msblk->devblksize - offset);
+			length -= avail;
+			in_buf.src = bh[k]->b_data + offset;
+			in_buf.size = avail;
+			in_buf.pos = 0;
+			offset = 0;
+		}
+
+		if (out_buf.pos == out_buf.size) {
+			out_buf.dst = squashfs_next_page(output);
+			if (out_buf.dst == NULL) {
+				/* shouldn't run out of pages before stream is
+				 * done */
+				squashfs_finish_page(output);
+				goto out;
+			}
+			out_buf.pos = 0;
+			out_buf.size = PAGE_SIZE;
+		}
+
+		total_out -= out_buf.pos;
+		zstd_err = ZSTD_decompressStream(stream, &out_buf, &in_buf);
+		total_out += out_buf.pos; /* add the additional data produced */
+
+		if (in_buf.pos == in_buf.size && k < b)
+			put_bh(bh[k++]);
+	} while (zstd_err != 0 && !ZSTD_isError(zstd_err));
+
+	squashfs_finish_page(output);
+
+	if (ZSTD_isError(zstd_err)) {
+		ERROR("zstd decompression error: %d\n",
+				(int)ZSTD_getErrorCode(zstd_err));
+		goto out;
+	}
+
+	if (k < b)
+		goto out;
+
+	return (int)total_out;
+
+out:
+	for (; k < b; k++)
+		put_bh(bh[k]);
+
+	return -EIO;
+}
+
+const struct squashfs_decompressor squashfs_zstd_comp_ops = {
+	.init = zstd_init,
+	.free = zstd_free,
+	.decompress = zstd_uncompress,
+	.id = ZSTD_COMPRESSION,
+	.name = "zstd",
+	.supported = 1
+};
--
2.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/4] Add xxhash and zstd modules
  2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
                   ` (2 preceding siblings ...)
  2017-07-20 21:27 ` [PATCH v3 4/4] squashfs: " Nick Terrell
@ 2017-07-21 11:16 ` Austin S. Hemmelgarn
  2017-07-21 11:18   ` Austin S. Hemmelgarn
  2017-07-21 15:56 ` Austin S. Hemmelgarn
  4 siblings, 1 reply; 16+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-21 11:16 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Herbert Xu, kernel-team, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

On 2017-07-20 17:27, Nick Terrell wrote:
> Hi all,
> 
> This patch set adds xxhash, zstd compression, and zstd decompression
> modules. It also adds zstd support to BtrFS and SquashFS.
> 
> Each patch has relevant summaries, benchmarks, and tests.
> 
> Best,
> Nick Terrell
> 
> Changelog:
> 
> v1 -> v2:
> - Make pointer in lib/xxhash.c:394 non-const (1/4)
> - Use div_u64() for division of u64s (2/4)
> - Reduce stack usage of ZSTD_compressSequences(), ZSTD_buildSeqTable(),
>    ZSTD_decompressSequencesLong(), FSE_buildDTable(), FSE_decompress_wksp(),
>    HUF_writeCTable(), HUF_readStats(), HUF_readCTable(),
>    HUF_compressWeights(), HUF_readDTableX2(), and HUF_readDTableX4() (2/4)
> - No zstd function uses more than 400 B of stack space (2/4)
> 
> v2 -> v3:
> - Work around gcc-7 bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388
>    (2/4)
> - Fix bug in dictionary compression from upstream commit cc1522351f (2/4)
> - Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff (3/4)
> - Change default compression level for BtrFS to 3 (3/4)
> 
> Nick Terrell (4):
>    lib: Add xxhash module
>    lib: Add zstd modules
>    btrfs: Add zstd support
>    squashfs: Add zstd support
> 
>   fs/btrfs/Kconfig           |    2 +
>   fs/btrfs/Makefile          |    2 +-
>   fs/btrfs/compression.c     |    1 +
>   fs/btrfs/compression.h     |    6 +-
>   fs/btrfs/ctree.h           |    1 +
>   fs/btrfs/disk-io.c         |    2 +
>   fs/btrfs/ioctl.c           |    6 +-
>   fs/btrfs/props.c           |    6 +
>   fs/btrfs/super.c           |   12 +-
>   fs/btrfs/sysfs.c           |    2 +
>   fs/btrfs/zstd.c            |  435 ++++++
>   fs/squashfs/Kconfig        |   14 +
>   fs/squashfs/Makefile       |    1 +
>   fs/squashfs/decompressor.c |    7 +
>   fs/squashfs/decompressor.h |    4 +
>   fs/squashfs/squashfs_fs.h  |    1 +
>   fs/squashfs/zstd_wrapper.c |  150 ++
>   include/linux/xxhash.h     |  236 +++
>   include/linux/zstd.h       | 1157 +++++++++++++++
>   include/uapi/linux/btrfs.h |    8 +-
>   lib/Kconfig                |   11 +
>   lib/Makefile               |    3 +
>   lib/xxhash.c               |  500 +++++++
>   lib/zstd/Makefile          |   18 +
>   lib/zstd/bitstream.h       |  374 +++++
>   lib/zstd/compress.c        | 3479 ++++++++++++++++++++++++++++++++++++++++++++
>   lib/zstd/decompress.c      | 2526 ++++++++++++++++++++++++++++++++
>   lib/zstd/entropy_common.c  |  243 ++++
>   lib/zstd/error_private.h   |   53 +
>   lib/zstd/fse.h             |  575 ++++++++
>   lib/zstd/fse_compress.c    |  795 ++++++++++
>   lib/zstd/fse_decompress.c  |  332 +++++
>   lib/zstd/huf.h             |  212 +++
>   lib/zstd/huf_compress.c    |  770 ++++++++++
>   lib/zstd/huf_decompress.c  |  960 ++++++++++++
>   lib/zstd/mem.h             |  151 ++
>   lib/zstd/zstd_common.c     |   75 +
>   lib/zstd/zstd_internal.h   |  250 ++++
>   lib/zstd/zstd_opt.h        | 1014 +++++++++++++
>   39 files changed, 14382 insertions(+), 12 deletions(-)
>   create mode 100644 fs/btrfs/zstd.c
>   create mode 100644 fs/squashfs/zstd_wrapper.c
>   create mode 100644 include/linux/xxhash.h
>   create mode 100644 include/linux/zstd.h
>   create mode 100644 lib/xxhash.c
>   create mode 100644 lib/zstd/Makefile
>   create mode 100644 lib/zstd/bitstream.h
>   create mode 100644 lib/zstd/compress.c
>   create mode 100644 lib/zstd/decompress.c
>   create mode 100644 lib/zstd/entropy_common.c
>   create mode 100644 lib/zstd/error_private.h
>   create mode 100644 lib/zstd/fse.h
>   create mode 100644 lib/zstd/fse_compress.c
>   create mode 100644 lib/zstd/fse_decompress.c
>   create mode 100644 lib/zstd/huf.h
>   create mode 100644 lib/zstd/huf_compress.c
>   create mode 100644 lib/zstd/huf_decompress.c
>   create mode 100644 lib/zstd/mem.h
>   create mode 100644 lib/zstd/zstd_common.c
>   create mode 100644 lib/zstd/zstd_internal.h
>   create mode 100644 lib/zstd/zstd_opt.h
> 
> --
> 2.9.3
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/4] Add xxhash and zstd modules
  2017-07-21 11:16 ` [PATCH v3 0/4] Add xxhash and zstd modules Austin S. Hemmelgarn
@ 2017-07-21 11:18   ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 16+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-21 11:18 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Herbert Xu, kernel-team, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

On 2017-07-21 07:16, Austin S. Hemmelgarn wrote:
> On 2017-07-20 17:27, Nick Terrell wrote:
Well this is embarrassing, forgot to type anything before hitting send...
>> Hi all,
>>
>> This patch set adds xxhash, zstd compression, and zstd decompression
>> modules. It also adds zstd support to BtrFS and SquashFS.
>>
>> Each patch has relevant summaries, benchmarks, and tests.
>>
>> Best,
>> Nick Terrell
I've compile tested all of this on my end now, and have gotten some 
preliminary testing of patches 2-4, and everything looks good so far 
here.  I'll reply to this if I end up finding issues, but it doesn't 
look like I will right now considering the testing has run over night 
without issue.
>>
>> Changelog:
>>
>> v1 -> v2:
>> - Make pointer in lib/xxhash.c:394 non-const (1/4)
>> - Use div_u64() for division of u64s (2/4)
>> - Reduce stack usage of ZSTD_compressSequences(), ZSTD_buildSeqTable(),
>>    ZSTD_decompressSequencesLong(), FSE_buildDTable(), 
>> FSE_decompress_wksp(),
>>    HUF_writeCTable(), HUF_readStats(), HUF_readCTable(),
>>    HUF_compressWeights(), HUF_readDTableX2(), and HUF_readDTableX4() 
>> (2/4)
>> - No zstd function uses more than 400 B of stack space (2/4)
>>
>> v2 -> v3:
>> - Work around gcc-7 bug 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388
>>    (2/4)
>> - Fix bug in dictionary compression from upstream commit cc1522351f (2/4)
>> - Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff 
>> (3/4)
>> - Change default compression level for BtrFS to 3 (3/4)
>>
>> Nick Terrell (4):
>>    lib: Add xxhash module
>>    lib: Add zstd modules
>>    btrfs: Add zstd support
>>    squashfs: Add zstd support
>>
>>   fs/btrfs/Kconfig           |    2 +
>>   fs/btrfs/Makefile          |    2 +-
>>   fs/btrfs/compression.c     |    1 +
>>   fs/btrfs/compression.h     |    6 +-
>>   fs/btrfs/ctree.h           |    1 +
>>   fs/btrfs/disk-io.c         |    2 +
>>   fs/btrfs/ioctl.c           |    6 +-
>>   fs/btrfs/props.c           |    6 +
>>   fs/btrfs/super.c           |   12 +-
>>   fs/btrfs/sysfs.c           |    2 +
>>   fs/btrfs/zstd.c            |  435 ++++++
>>   fs/squashfs/Kconfig        |   14 +
>>   fs/squashfs/Makefile       |    1 +
>>   fs/squashfs/decompressor.c |    7 +
>>   fs/squashfs/decompressor.h |    4 +
>>   fs/squashfs/squashfs_fs.h  |    1 +
>>   fs/squashfs/zstd_wrapper.c |  150 ++
>>   include/linux/xxhash.h     |  236 +++
>>   include/linux/zstd.h       | 1157 +++++++++++++++
>>   include/uapi/linux/btrfs.h |    8 +-
>>   lib/Kconfig                |   11 +
>>   lib/Makefile               |    3 +
>>   lib/xxhash.c               |  500 +++++++
>>   lib/zstd/Makefile          |   18 +
>>   lib/zstd/bitstream.h       |  374 +++++
>>   lib/zstd/compress.c        | 3479 
>> ++++++++++++++++++++++++++++++++++++++++++++
>>   lib/zstd/decompress.c      | 2526 ++++++++++++++++++++++++++++++++
>>   lib/zstd/entropy_common.c  |  243 ++++
>>   lib/zstd/error_private.h   |   53 +
>>   lib/zstd/fse.h             |  575 ++++++++
>>   lib/zstd/fse_compress.c    |  795 ++++++++++
>>   lib/zstd/fse_decompress.c  |  332 +++++
>>   lib/zstd/huf.h             |  212 +++
>>   lib/zstd/huf_compress.c    |  770 ++++++++++
>>   lib/zstd/huf_decompress.c  |  960 ++++++++++++
>>   lib/zstd/mem.h             |  151 ++
>>   lib/zstd/zstd_common.c     |   75 +
>>   lib/zstd/zstd_internal.h   |  250 ++++
>>   lib/zstd/zstd_opt.h        | 1014 +++++++++++++
>>   39 files changed, 14382 insertions(+), 12 deletions(-)
>>   create mode 100644 fs/btrfs/zstd.c
>>   create mode 100644 fs/squashfs/zstd_wrapper.c
>>   create mode 100644 include/linux/xxhash.h
>>   create mode 100644 include/linux/zstd.h
>>   create mode 100644 lib/xxhash.c
>>   create mode 100644 lib/zstd/Makefile
>>   create mode 100644 lib/zstd/bitstream.h
>>   create mode 100644 lib/zstd/compress.c
>>   create mode 100644 lib/zstd/decompress.c
>>   create mode 100644 lib/zstd/entropy_common.c
>>   create mode 100644 lib/zstd/error_private.h
>>   create mode 100644 lib/zstd/fse.h
>>   create mode 100644 lib/zstd/fse_compress.c
>>   create mode 100644 lib/zstd/fse_decompress.c
>>   create mode 100644 lib/zstd/huf.h
>>   create mode 100644 lib/zstd/huf_compress.c
>>   create mode 100644 lib/zstd/huf_decompress.c
>>   create mode 100644 lib/zstd/mem.h
>>   create mode 100644 lib/zstd/zstd_common.c
>>   create mode 100644 lib/zstd/zstd_internal.h
>>   create mode 100644 lib/zstd/zstd_opt.h
>>
>> -- 
>> 2.9.3
>>
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/4] Add xxhash and zstd modules
  2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
                   ` (3 preceding siblings ...)
  2017-07-21 11:16 ` [PATCH v3 0/4] Add xxhash and zstd modules Austin S. Hemmelgarn
@ 2017-07-21 15:56 ` Austin S. Hemmelgarn
  2017-07-22 11:35   ` Adam Borowski
  4 siblings, 1 reply; 16+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-21 15:56 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Herbert Xu, kernel-team, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

On 2017-07-20 17:27, Nick Terrell wrote:
> Hi all,
> 
> This patch set adds xxhash, zstd compression, and zstd decompression
> modules. It also adds zstd support to BtrFS and SquashFS.
> 
> Each patch has relevant summaries, benchmarks, and tests.
> 
> Best,
> Nick Terrell

For patches 2-3, I've compile tested and had runtime testing running for 
about 18 hours now with no issues, so you can add:

Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>

For patch 1, I've only compile tested it, but had no issues and got no 
warnings about it when booting to test 2-4.

For patch 4, I've compile tested it and done some really basic runtime 
testing using a couple of hand-crafted SquashFS images.  It appears to 
run fine, but I've not done enough that I feel it warrants a Tested-by 
tag from me.
> 
> Changelog:
> 
> v1 -> v2:
> - Make pointer in lib/xxhash.c:394 non-const (1/4)
> - Use div_u64() for division of u64s (2/4)
> - Reduce stack usage of ZSTD_compressSequences(), ZSTD_buildSeqTable(),
>    ZSTD_decompressSequencesLong(), FSE_buildDTable(), FSE_decompress_wksp(),
>    HUF_writeCTable(), HUF_readStats(), HUF_readCTable(),
>    HUF_compressWeights(), HUF_readDTableX2(), and HUF_readDTableX4() (2/4)
> - No zstd function uses more than 400 B of stack space (2/4)
> 
> v2 -> v3:
> - Work around gcc-7 bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388
>    (2/4)
> - Fix bug in dictionary compression from upstream commit cc1522351f (2/4)
> - Port upstream BtrFS commits e1ddce71d6, 389a6cfc2a, and 6acafd1eff (3/4)
> - Change default compression level for BtrFS to 3 (3/4)
> 
> Nick Terrell (4):
>    lib: Add xxhash module
>    lib: Add zstd modules
>    btrfs: Add zstd support
>    squashfs: Add zstd support
> 
>   fs/btrfs/Kconfig           |    2 +
>   fs/btrfs/Makefile          |    2 +-
>   fs/btrfs/compression.c     |    1 +
>   fs/btrfs/compression.h     |    6 +-
>   fs/btrfs/ctree.h           |    1 +
>   fs/btrfs/disk-io.c         |    2 +
>   fs/btrfs/ioctl.c           |    6 +-
>   fs/btrfs/props.c           |    6 +
>   fs/btrfs/super.c           |   12 +-
>   fs/btrfs/sysfs.c           |    2 +
>   fs/btrfs/zstd.c            |  435 ++++++
>   fs/squashfs/Kconfig        |   14 +
>   fs/squashfs/Makefile       |    1 +
>   fs/squashfs/decompressor.c |    7 +
>   fs/squashfs/decompressor.h |    4 +
>   fs/squashfs/squashfs_fs.h  |    1 +
>   fs/squashfs/zstd_wrapper.c |  150 ++
>   include/linux/xxhash.h     |  236 +++
>   include/linux/zstd.h       | 1157 +++++++++++++++
>   include/uapi/linux/btrfs.h |    8 +-
>   lib/Kconfig                |   11 +
>   lib/Makefile               |    3 +
>   lib/xxhash.c               |  500 +++++++
>   lib/zstd/Makefile          |   18 +
>   lib/zstd/bitstream.h       |  374 +++++
>   lib/zstd/compress.c        | 3479 ++++++++++++++++++++++++++++++++++++++++++++
>   lib/zstd/decompress.c      | 2526 ++++++++++++++++++++++++++++++++
>   lib/zstd/entropy_common.c  |  243 ++++
>   lib/zstd/error_private.h   |   53 +
>   lib/zstd/fse.h             |  575 ++++++++
>   lib/zstd/fse_compress.c    |  795 ++++++++++
>   lib/zstd/fse_decompress.c  |  332 +++++
>   lib/zstd/huf.h             |  212 +++
>   lib/zstd/huf_compress.c    |  770 ++++++++++
>   lib/zstd/huf_decompress.c  |  960 ++++++++++++
>   lib/zstd/mem.h             |  151 ++
>   lib/zstd/zstd_common.c     |   75 +
>   lib/zstd/zstd_internal.h   |  250 ++++
>   lib/zstd/zstd_opt.h        | 1014 +++++++++++++
>   39 files changed, 14382 insertions(+), 12 deletions(-)
>   create mode 100644 fs/btrfs/zstd.c
>   create mode 100644 fs/squashfs/zstd_wrapper.c
>   create mode 100644 include/linux/xxhash.h
>   create mode 100644 include/linux/zstd.h
>   create mode 100644 lib/xxhash.c
>   create mode 100644 lib/zstd/Makefile
>   create mode 100644 lib/zstd/bitstream.h
>   create mode 100644 lib/zstd/compress.c
>   create mode 100644 lib/zstd/decompress.c
>   create mode 100644 lib/zstd/entropy_common.c
>   create mode 100644 lib/zstd/error_private.h
>   create mode 100644 lib/zstd/fse.h
>   create mode 100644 lib/zstd/fse_compress.c
>   create mode 100644 lib/zstd/fse_decompress.c
>   create mode 100644 lib/zstd/huf.h
>   create mode 100644 lib/zstd/huf_compress.c
>   create mode 100644 lib/zstd/huf_decompress.c
>   create mode 100644 lib/zstd/mem.h
>   create mode 100644 lib/zstd/zstd_common.c
>   create mode 100644 lib/zstd/zstd_internal.h
>   create mode 100644 lib/zstd/zstd_opt.h
> 
> --
> 2.9.3
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/4] Add xxhash and zstd modules
  2017-07-21 15:56 ` Austin S. Hemmelgarn
@ 2017-07-22 11:35   ` Adam Borowski
  2017-07-24 13:44     ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 16+ messages in thread
From: Adam Borowski @ 2017-07-22 11:35 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Nick Terrell, Herbert Xu, kernel-team, Chris Mason, Yann Collet,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

On Fri, Jul 21, 2017 at 11:56:21AM -0400, Austin S. Hemmelgarn wrote:
> On 2017-07-20 17:27, Nick Terrell wrote:
> > This patch set adds xxhash, zstd compression, and zstd decompression
> > modules. It also adds zstd support to BtrFS and SquashFS.
> > 
> > Each patch has relevant summaries, benchmarks, and tests.
> 
> For patches 2-3, I've compile tested and had runtime testing running for
> about 18 hours now with no issues, so you can add:
> 
> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>

I assume you haven't tried it on arm64, right?

I had no time to get 'round to it before, and just got the following build
failure:

  CC      fs/btrfs/zstd.o
In file included from fs/btrfs/zstd.c:28:0:
fs/btrfs/compression.h:39:2: error: unknown type name ‘refcount_t’
  refcount_t pending_bios;
  ^~~~~~~~~~
scripts/Makefile.build:302: recipe for target 'fs/btrfs/zstd.o' failed

It's trivially fixably by:
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -24,6 +24,7 @@
 #include <linux/sched.h>
 #include <linux/pagemap.h>
 #include <linux/bio.h>
+#include <linux/refcount.h>
 #include <linux/zstd.h>
 #include "compression.h"

after which it works fine, although half an hour of testing isn't exactly
exhaustive.


Alas, the armhf machine I ran stress tests (Debian archive rebuilds) on
doesn't boot with 4.13-rc1 due to some unrelated regression, bisecting that
would be quite painful so I did not try yet.  I guess re-testing your patch
set on 4.12, even with btrfs-for-4.13 (which it had for a while), wouldn't
be of much help.  So far, previous versions have been running for weeks,
with no issue since you fixed workspace flickering.


On amd64 all is fine.


I haven't tested SquashFS at all.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
⠈⠳⣄⠀⠀⠀⠀ A master species delegates.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/4] btrfs: Add zstd support
  2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
@ 2017-07-23 19:28   ` kbuild test robot
  2017-07-24 17:25   ` kbuild test robot
  2017-07-25 23:19   ` Giovanni Cabiddu
  2 siblings, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2017-07-23 19:28 UTC (permalink / raw)
  To: Nick Terrell
  Cc: kbuild-all, Nick Terrell, Austin S . Hemmelgarn, Herbert Xu,
	kernel-team, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5583 bytes --]

Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc1 next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=parisc 

All warnings (new ones prefixed by >>):

   lib/xxhash.c: In function 'xxh64':
>> lib/xxhash.c:236:1: warning: the frame size of 1688 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^
   lib/xxhash.c: In function 'xxh64_update':
   lib/xxhash.c:441:1: warning: the frame size of 1072 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^

vim +236 lib/xxhash.c

09c83807 Nick Terrell 2017-07-20  171  
09c83807 Nick Terrell 2017-07-20  172  uint64_t xxh64(const void *input, const size_t len, const uint64_t seed)
09c83807 Nick Terrell 2017-07-20  173  {
09c83807 Nick Terrell 2017-07-20  174  	const uint8_t *p = (const uint8_t *)input;
09c83807 Nick Terrell 2017-07-20  175  	const uint8_t *const b_end = p + len;
09c83807 Nick Terrell 2017-07-20  176  	uint64_t h64;
09c83807 Nick Terrell 2017-07-20  177  
09c83807 Nick Terrell 2017-07-20  178  	if (len >= 32) {
09c83807 Nick Terrell 2017-07-20  179  		const uint8_t *const limit = b_end - 32;
09c83807 Nick Terrell 2017-07-20  180  		uint64_t v1 = seed + PRIME64_1 + PRIME64_2;
09c83807 Nick Terrell 2017-07-20  181  		uint64_t v2 = seed + PRIME64_2;
09c83807 Nick Terrell 2017-07-20  182  		uint64_t v3 = seed + 0;
09c83807 Nick Terrell 2017-07-20  183  		uint64_t v4 = seed - PRIME64_1;
09c83807 Nick Terrell 2017-07-20  184  
09c83807 Nick Terrell 2017-07-20  185  		do {
09c83807 Nick Terrell 2017-07-20  186  			v1 = xxh64_round(v1, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  187  			p += 8;
09c83807 Nick Terrell 2017-07-20  188  			v2 = xxh64_round(v2, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  189  			p += 8;
09c83807 Nick Terrell 2017-07-20  190  			v3 = xxh64_round(v3, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  191  			p += 8;
09c83807 Nick Terrell 2017-07-20  192  			v4 = xxh64_round(v4, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  193  			p += 8;
09c83807 Nick Terrell 2017-07-20  194  		} while (p <= limit);
09c83807 Nick Terrell 2017-07-20  195  
09c83807 Nick Terrell 2017-07-20  196  		h64 = xxh_rotl64(v1, 1) + xxh_rotl64(v2, 7) +
09c83807 Nick Terrell 2017-07-20  197  			xxh_rotl64(v3, 12) + xxh_rotl64(v4, 18);
09c83807 Nick Terrell 2017-07-20  198  		h64 = xxh64_merge_round(h64, v1);
09c83807 Nick Terrell 2017-07-20  199  		h64 = xxh64_merge_round(h64, v2);
09c83807 Nick Terrell 2017-07-20  200  		h64 = xxh64_merge_round(h64, v3);
09c83807 Nick Terrell 2017-07-20  201  		h64 = xxh64_merge_round(h64, v4);
09c83807 Nick Terrell 2017-07-20  202  
09c83807 Nick Terrell 2017-07-20  203  	} else {
09c83807 Nick Terrell 2017-07-20  204  		h64  = seed + PRIME64_5;
09c83807 Nick Terrell 2017-07-20  205  	}
09c83807 Nick Terrell 2017-07-20  206  
09c83807 Nick Terrell 2017-07-20  207  	h64 += (uint64_t)len;
09c83807 Nick Terrell 2017-07-20  208  
09c83807 Nick Terrell 2017-07-20  209  	while (p + 8 <= b_end) {
09c83807 Nick Terrell 2017-07-20  210  		const uint64_t k1 = xxh64_round(0, get_unaligned_le64(p));
09c83807 Nick Terrell 2017-07-20  211  
09c83807 Nick Terrell 2017-07-20  212  		h64 ^= k1;
09c83807 Nick Terrell 2017-07-20  213  		h64 = xxh_rotl64(h64, 27) * PRIME64_1 + PRIME64_4;
09c83807 Nick Terrell 2017-07-20  214  		p += 8;
09c83807 Nick Terrell 2017-07-20  215  	}
09c83807 Nick Terrell 2017-07-20  216  
09c83807 Nick Terrell 2017-07-20  217  	if (p + 4 <= b_end) {
09c83807 Nick Terrell 2017-07-20  218  		h64 ^= (uint64_t)(get_unaligned_le32(p)) * PRIME64_1;
09c83807 Nick Terrell 2017-07-20  219  		h64 = xxh_rotl64(h64, 23) * PRIME64_2 + PRIME64_3;
09c83807 Nick Terrell 2017-07-20  220  		p += 4;
09c83807 Nick Terrell 2017-07-20  221  	}
09c83807 Nick Terrell 2017-07-20  222  
09c83807 Nick Terrell 2017-07-20  223  	while (p < b_end) {
09c83807 Nick Terrell 2017-07-20  224  		h64 ^= (*p) * PRIME64_5;
09c83807 Nick Terrell 2017-07-20  225  		h64 = xxh_rotl64(h64, 11) * PRIME64_1;
09c83807 Nick Terrell 2017-07-20  226  		p++;
09c83807 Nick Terrell 2017-07-20  227  	}
09c83807 Nick Terrell 2017-07-20  228  
09c83807 Nick Terrell 2017-07-20  229  	h64 ^= h64 >> 33;
09c83807 Nick Terrell 2017-07-20  230  	h64 *= PRIME64_2;
09c83807 Nick Terrell 2017-07-20  231  	h64 ^= h64 >> 29;
09c83807 Nick Terrell 2017-07-20  232  	h64 *= PRIME64_3;
09c83807 Nick Terrell 2017-07-20  233  	h64 ^= h64 >> 32;
09c83807 Nick Terrell 2017-07-20  234  
09c83807 Nick Terrell 2017-07-20  235  	return h64;
09c83807 Nick Terrell 2017-07-20 @236  }
09c83807 Nick Terrell 2017-07-20  237  EXPORT_SYMBOL(xxh64);
09c83807 Nick Terrell 2017-07-20  238  

:::::: The code at line 236 was first introduced by commit
:::::: 09c83807485b28f2b1813bcd3b7e2c7c3720234e lib: Add xxhash module

:::::: TO: Nick Terrell <terrelln@fb.com>
:::::: CC: 0day robot <fengguang.wu@intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50926 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/4] Add xxhash and zstd modules
  2017-07-22 11:35   ` Adam Borowski
@ 2017-07-24 13:44     ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 16+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-24 13:44 UTC (permalink / raw)
  To: Adam Borowski
  Cc: Nick Terrell, Herbert Xu, kernel-team, Chris Mason, Yann Collet,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

On 2017-07-22 07:35, Adam Borowski wrote:
> On Fri, Jul 21, 2017 at 11:56:21AM -0400, Austin S. Hemmelgarn wrote:
>> On 2017-07-20 17:27, Nick Terrell wrote:
>>> This patch set adds xxhash, zstd compression, and zstd decompression
>>> modules. It also adds zstd support to BtrFS and SquashFS.
>>>
>>> Each patch has relevant summaries, benchmarks, and tests.
>>
>> For patches 2-3, I've compile tested and had runtime testing running for
>> about 18 hours now with no issues, so you can add:
>>
>> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
> 
> I assume you haven't tried it on arm64, right?
> 
> I had no time to get 'round to it before, and just got the following build
> failure:
> 
>    CC      fs/btrfs/zstd.o
> In file included from fs/btrfs/zstd.c:28:0:
> fs/btrfs/compression.h:39:2: error: unknown type name ‘refcount_t’
>    refcount_t pending_bios;
>    ^~~~~~~~~~
> scripts/Makefile.build:302: recipe for target 'fs/btrfs/zstd.o' failed
> 
> It's trivially fixably by:
> --- a/fs/btrfs/zstd.c
> +++ b/fs/btrfs/zstd.c
> @@ -24,6 +24,7 @@
>   #include <linux/sched.h>
>   #include <linux/pagemap.h>
>   #include <linux/bio.h>
> +#include <linux/refcount.h>
>   #include <linux/zstd.h>
>   #include "compression.h"
> 
> after which it works fine, although half an hour of testing isn't exactly
> exhaustive.
I did, and didn't hit this somehow...

Off to go verify my tool-chain and scripts then...
> 
> 
> Alas, the armhf machine I ran stress tests (Debian archive rebuilds) on
> doesn't boot with 4.13-rc1 due to some unrelated regression, bisecting that
> would be quite painful so I did not try yet.  I guess re-testing your patch
> set on 4.12, even with btrfs-for-4.13 (which it had for a while), wouldn't
> be of much help.  So far, previous versions have been running for weeks,
> with no issue since you fixed workspace flickering.
I also didn't see this, but I test on some seriously bare-bones 
configurations for both the 32-bit ARM tests I run.  On further 
inspection, it looks like my scripts decided to use btrfs-for-4.13 as 
the base, not 4.13-rc1 like I thought they did, so I don't know anymore 
how helpful my testing may have been.
> 
> 
> On amd64 all is fine.
> 
> 
> I haven't tested SquashFS at all.
> 
> 
> Meow!
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/4] btrfs: Add zstd support
  2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
  2017-07-23 19:28   ` kbuild test robot
@ 2017-07-24 17:25   ` kbuild test robot
  2017-07-25 23:19   ` Giovanni Cabiddu
  2 siblings, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2017-07-24 17:25 UTC (permalink / raw)
  To: Nick Terrell
  Cc: kbuild-all, Nick Terrell, Austin S . Hemmelgarn, Herbert Xu,
	kernel-team, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel, linux-btrfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7200 bytes --]

Hi Nick,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc2 next-20170724]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nick-Terrell/Add-xxhash-and-zstd-modules/20170723-092845
config: x86_64-randconfig-a0-07242221 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   lib/zstd/decompress.c: In function 'ZSTD_decodeSequence':
>> lib/zstd/decompress.c:1001: warning: 'seq.match' is used uninitialized in this function

vim +1001 lib/zstd/decompress.c

dc22844b Nick Terrell 2017-07-20   930  
dc22844b Nick Terrell 2017-07-20   931  static seq_t ZSTD_decodeSequence(seqState_t *seqState)
dc22844b Nick Terrell 2017-07-20   932  {
dc22844b Nick Terrell 2017-07-20   933  	seq_t seq;
dc22844b Nick Terrell 2017-07-20   934  
dc22844b Nick Terrell 2017-07-20   935  	U32 const llCode = FSE_peekSymbol(&seqState->stateLL);
dc22844b Nick Terrell 2017-07-20   936  	U32 const mlCode = FSE_peekSymbol(&seqState->stateML);
dc22844b Nick Terrell 2017-07-20   937  	U32 const ofCode = FSE_peekSymbol(&seqState->stateOffb); /* <= maxOff, by table construction */
dc22844b Nick Terrell 2017-07-20   938  
dc22844b Nick Terrell 2017-07-20   939  	U32 const llBits = LL_bits[llCode];
dc22844b Nick Terrell 2017-07-20   940  	U32 const mlBits = ML_bits[mlCode];
dc22844b Nick Terrell 2017-07-20   941  	U32 const ofBits = ofCode;
dc22844b Nick Terrell 2017-07-20   942  	U32 const totalBits = llBits + mlBits + ofBits;
dc22844b Nick Terrell 2017-07-20   943  
dc22844b Nick Terrell 2017-07-20   944  	static const U32 LL_base[MaxLL + 1] = {0,  1,  2,  3,  4,  5,  6,  7,  8,    9,     10,    11,    12,    13,     14,     15,     16,     18,
dc22844b Nick Terrell 2017-07-20   945  					       20, 22, 24, 28, 32, 40, 48, 64, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000, 0x2000, 0x4000, 0x8000, 0x10000};
dc22844b Nick Terrell 2017-07-20   946  
dc22844b Nick Terrell 2017-07-20   947  	static const U32 ML_base[MaxML + 1] = {3,  4,  5,  6,  7,  8,  9,  10,   11,    12,    13,    14,    15,     16,     17,     18,     19,     20,
dc22844b Nick Terrell 2017-07-20   948  					       21, 22, 23, 24, 25, 26, 27, 28,   29,    30,    31,    32,    33,     34,     35,     37,     39,     41,
dc22844b Nick Terrell 2017-07-20   949  					       43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803, 0x1003, 0x2003, 0x4003, 0x8003, 0x10003};
dc22844b Nick Terrell 2017-07-20   950  
dc22844b Nick Terrell 2017-07-20   951  	static const U32 OF_base[MaxOff + 1] = {0,       1,	1,	5,	0xD,      0x1D,      0x3D,      0x7D,      0xFD,     0x1FD,
dc22844b Nick Terrell 2017-07-20   952  						0x3FD,   0x7FD,    0xFFD,    0x1FFD,   0x3FFD,   0x7FFD,    0xFFFD,    0x1FFFD,   0x3FFFD,  0x7FFFD,
dc22844b Nick Terrell 2017-07-20   953  						0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD, 0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD};
dc22844b Nick Terrell 2017-07-20   954  
dc22844b Nick Terrell 2017-07-20   955  	/* sequence */
dc22844b Nick Terrell 2017-07-20   956  	{
dc22844b Nick Terrell 2017-07-20   957  		size_t offset;
dc22844b Nick Terrell 2017-07-20   958  		if (!ofCode)
dc22844b Nick Terrell 2017-07-20   959  			offset = 0;
dc22844b Nick Terrell 2017-07-20   960  		else {
dc22844b Nick Terrell 2017-07-20   961  			offset = OF_base[ofCode] + BIT_readBitsFast(&seqState->DStream, ofBits); /* <=  (ZSTD_WINDOWLOG_MAX-1) bits */
dc22844b Nick Terrell 2017-07-20   962  			if (ZSTD_32bits())
dc22844b Nick Terrell 2017-07-20   963  				BIT_reloadDStream(&seqState->DStream);
dc22844b Nick Terrell 2017-07-20   964  		}
dc22844b Nick Terrell 2017-07-20   965  
dc22844b Nick Terrell 2017-07-20   966  		if (ofCode <= 1) {
dc22844b Nick Terrell 2017-07-20   967  			offset += (llCode == 0);
dc22844b Nick Terrell 2017-07-20   968  			if (offset) {
dc22844b Nick Terrell 2017-07-20   969  				size_t temp = (offset == 3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
dc22844b Nick Terrell 2017-07-20   970  				temp += !temp; /* 0 is not valid; input is corrupted; force offset to 1 */
dc22844b Nick Terrell 2017-07-20   971  				if (offset != 1)
dc22844b Nick Terrell 2017-07-20   972  					seqState->prevOffset[2] = seqState->prevOffset[1];
dc22844b Nick Terrell 2017-07-20   973  				seqState->prevOffset[1] = seqState->prevOffset[0];
dc22844b Nick Terrell 2017-07-20   974  				seqState->prevOffset[0] = offset = temp;
dc22844b Nick Terrell 2017-07-20   975  			} else {
dc22844b Nick Terrell 2017-07-20   976  				offset = seqState->prevOffset[0];
dc22844b Nick Terrell 2017-07-20   977  			}
dc22844b Nick Terrell 2017-07-20   978  		} else {
dc22844b Nick Terrell 2017-07-20   979  			seqState->prevOffset[2] = seqState->prevOffset[1];
dc22844b Nick Terrell 2017-07-20   980  			seqState->prevOffset[1] = seqState->prevOffset[0];
dc22844b Nick Terrell 2017-07-20   981  			seqState->prevOffset[0] = offset;
dc22844b Nick Terrell 2017-07-20   982  		}
dc22844b Nick Terrell 2017-07-20   983  		seq.offset = offset;
dc22844b Nick Terrell 2017-07-20   984  	}
dc22844b Nick Terrell 2017-07-20   985  
dc22844b Nick Terrell 2017-07-20   986  	seq.matchLength = ML_base[mlCode] + ((mlCode > 31) ? BIT_readBitsFast(&seqState->DStream, mlBits) : 0); /* <=  16 bits */
dc22844b Nick Terrell 2017-07-20   987  	if (ZSTD_32bits() && (mlBits + llBits > 24))
dc22844b Nick Terrell 2017-07-20   988  		BIT_reloadDStream(&seqState->DStream);
dc22844b Nick Terrell 2017-07-20   989  
dc22844b Nick Terrell 2017-07-20   990  	seq.litLength = LL_base[llCode] + ((llCode > 15) ? BIT_readBitsFast(&seqState->DStream, llBits) : 0); /* <=  16 bits */
dc22844b Nick Terrell 2017-07-20   991  	if (ZSTD_32bits() || (totalBits > 64 - 7 - (LLFSELog + MLFSELog + OffFSELog)))
dc22844b Nick Terrell 2017-07-20   992  		BIT_reloadDStream(&seqState->DStream);
dc22844b Nick Terrell 2017-07-20   993  
dc22844b Nick Terrell 2017-07-20   994  	/* ANS state update */
dc22844b Nick Terrell 2017-07-20   995  	FSE_updateState(&seqState->stateLL, &seqState->DStream); /* <=  9 bits */
dc22844b Nick Terrell 2017-07-20   996  	FSE_updateState(&seqState->stateML, &seqState->DStream); /* <=  9 bits */
dc22844b Nick Terrell 2017-07-20   997  	if (ZSTD_32bits())
dc22844b Nick Terrell 2017-07-20   998  		BIT_reloadDStream(&seqState->DStream);		   /* <= 18 bits */
dc22844b Nick Terrell 2017-07-20   999  	FSE_updateState(&seqState->stateOffb, &seqState->DStream); /* <=  8 bits */
dc22844b Nick Terrell 2017-07-20  1000  
dc22844b Nick Terrell 2017-07-20 @1001  	return seq;
dc22844b Nick Terrell 2017-07-20  1002  }
dc22844b Nick Terrell 2017-07-20  1003  

:::::: The code at line 1001 was first introduced by commit
:::::: dc22844bd66de07e2ff858aaebfe678b94df529c lib: Add zstd modules

:::::: TO: Nick Terrell <terrelln@fb.com>
:::::: CC: 0day robot <fengguang.wu@intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26498 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/4] btrfs: Add zstd support
  2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
  2017-07-23 19:28   ` kbuild test robot
  2017-07-24 17:25   ` kbuild test robot
@ 2017-07-25 23:19   ` Giovanni Cabiddu
  2017-08-18 15:37     ` David Sterba
  2 siblings, 1 reply; 16+ messages in thread
From: Giovanni Cabiddu @ 2017-07-25 23:19 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team@fb.com,
	Chris Mason, Yann Collet, Adam Borowski, David Sterba,
	squashfs-devel@lists.sourceforge.net, linux-btrfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
	Weigang Li, Brian Will, Brian A Keating, Giovanni Cabiddu

Hi Nick,

On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> Add zstd compression and decompression support to BtrFS. zstd at its
> fastest level compresses almost as well as zlib, while offering much
> faster compression and decompression, approaching lzo speeds.
Can we look at integrating the zstd implementation below the acomp API
available in the crypto subsystem?
(https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
Acomp was designed to provide a generic and uniform API for compression
in the kernel which hides algorithm specific details to frameworks.
In future it would be nice to see btrfs using exclusively acomp
for compression. This way when a new compression algorithm is exposed
through acomp, it will be available immediately in btrfs.
Furthermore, any framework in the kernel that will use acomp will be
automatically enabled to use zstd.
What do you think?

Here is a prototype that shows how btrfs can be integrated with
acomp: https://patchwork.kernel.org/patch/9201741/

Regards,

-- 
Giovanni

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 4/4] squashfs: Add zstd support
  2017-07-20 21:27 ` [PATCH v3 4/4] squashfs: " Nick Terrell
@ 2017-07-31  1:50   ` Phillip Lougher
  2017-07-31  2:18     ` Phillip Lougher
  2017-07-31 21:30     ` Nick Terrell
  0 siblings, 2 replies; 16+ messages in thread
From: Phillip Lougher @ 2017-07-31  1:50 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, SquashFS developers,
	linux-btrfs, LKML, Sean Purcell

On Thu, Jul 20, 2017 at 10:27 PM, Nick Terrell <terrelln@fb.com> wrote:
> Add zstd compression and decompression support to SquashFS. zstd is a
> great fit for SquashFS because it can compress at ratios approaching xz,
> while decompressing twice as fast as zlib. For SquashFS in particular,
> it can decompress as fast as lzo and lz4. It also has the flexibility
> to turn down the compression ratio for faster compression times.

Hi Nick,

This patch (and none of the previous versions) is showing up on
squashfs-devel@lists.sourceforge.net.  I also think you should have
emailed me directly as a courtesy, as I'm the Squashfs author and
maintainer.


> | Method         | Ratio | Compression MB/s | Decompression MB/s |
> |----------------|-------|------------------|--------------------|
> | gzip           |  2.92 |               15 |                128 |
> | lzo            |  2.64 |              9.5 |                217 |
> | lz4            |  2.12 |               94 |                218 |
> | xz             |  3.43 |              5.5 |                 35 |
> | xz 256 KB      |  3.53 |              5.4 |                 40 |
> | zstd 1         |  2.71 |               96 |                210 |
> | zstd 5         |  2.93 |               69 |                198 |
> | zstd 10        |  3.01 |               41 |                225 |
> | zstd 15        |  3.13 |             11.4 |                224 |
> | zstd 16 256 KB |  3.24 |              8.1 |                210 |


Those numbers look good to me.

>
> This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
> taking over the submission process.
>
> [1] http://releases.ubuntu.com/16.10/
> [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh

I can't find your patch that adds zstd to the user-land
Squashfs-tools.  That would be handy to do any testing :-)

Phillip

>
> zstd source repository: https://github.com/facebook/zstd
>
> Cc: Sean Purcell <me@seanp.xyz>
> Signed-off-by: Nick Terrell <terrelln@fb.com>
> ---
>  fs/squashfs/Kconfig        |  14 +++++
>  fs/squashfs/Makefile       |   1 +
>  fs/squashfs/decompressor.c |   7 +++
>  fs/squashfs/decompressor.h |   4 ++
>  fs/squashfs/squashfs_fs.h  |   1 +
>  fs/squashfs/zstd_wrapper.c | 150 +++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 177 insertions(+)
>  create mode 100644 fs/squashfs/zstd_wrapper.c
>
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index ffb093e..1adb334 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -165,6 +165,20 @@ config SQUASHFS_XZ
>
>           If unsure, say N.
>
> +config SQUASHFS_ZSTD
> +       bool "Include support for ZSTD compressed file systems"
> +       depends on SQUASHFS
> +       select ZSTD_DECOMPRESS
> +       help
> +         Saying Y here includes support for reading Squashfs file systems
> +         compressed with ZSTD compression.  ZSTD gives better compression than
> +         the default ZLIB compression, while using less CPU.
> +
> +         ZSTD is not the standard compression used in Squashfs and so most
> +         file systems will be readable without selecting this option.
> +
> +         If unsure, say N.
> +
>  config SQUASHFS_4K_DEVBLK_SIZE
>         bool "Use 4K device block size?"
>         depends on SQUASHFS
> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
> index 246a6f3..6655631 100644
> --- a/fs/squashfs/Makefile
> +++ b/fs/squashfs/Makefile
> @@ -15,3 +15,4 @@ squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
>  squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
> +squashfs-$(CONFIG_SQUASHFS_ZSTD) += zstd_wrapper.o
> diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
> index d2bc136..8366398 100644
> --- a/fs/squashfs/decompressor.c
> +++ b/fs/squashfs/decompressor.c
> @@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
>  };
>  #endif
>
> +#ifndef CONFIG_SQUASHFS_ZSTD
> +static const struct squashfs_decompressor squashfs_zstd_comp_ops = {
> +       NULL, NULL, NULL, NULL, ZSTD_COMPRESSION, "zstd", 0
> +};
> +#endif
> +
>  static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
>         NULL, NULL, NULL, NULL, 0, "unknown", 0
>  };
> @@ -75,6 +81,7 @@ static const struct squashfs_decompressor *decompressor[] = {
>         &squashfs_lzo_comp_ops,
>         &squashfs_xz_comp_ops,
>         &squashfs_lzma_unsupported_comp_ops,
> +       &squashfs_zstd_comp_ops,
>         &squashfs_unknown_comp_ops
>  };
>
> diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
> index a25713c..0f5a8e4 100644
> --- a/fs/squashfs/decompressor.h
> +++ b/fs/squashfs/decompressor.h
> @@ -58,4 +58,8 @@ extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
>  extern const struct squashfs_decompressor squashfs_zlib_comp_ops;
>  #endif
>
> +#ifdef CONFIG_SQUASHFS_ZSTD
> +extern const struct squashfs_decompressor squashfs_zstd_comp_ops;
> +#endif
> +
>  #endif
> diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
> index 506f4ba..24d12fd 100644
> --- a/fs/squashfs/squashfs_fs.h
> +++ b/fs/squashfs/squashfs_fs.h
> @@ -241,6 +241,7 @@ struct meta_index {
>  #define LZO_COMPRESSION                3
>  #define XZ_COMPRESSION         4
>  #define LZ4_COMPRESSION                5
> +#define ZSTD_COMPRESSION       6
>
>  struct squashfs_super_block {
>         __le32                  s_magic;
> diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
> new file mode 100644
> index 0000000..8cb7c76
> --- /dev/null
> +++ b/fs/squashfs/zstd_wrapper.c
> @@ -0,0 +1,150 @@
> +/*
> + * Squashfs - a compressed read only filesystem for Linux
> + *
> + * Copyright (c) 2016-present, Facebook, Inc.
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2,
> + * or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
> + *
> + * zstd_wrapper.c
> + */
> +
> +#include <linux/mutex.h>
> +#include <linux/buffer_head.h>
> +#include <linux/slab.h>
> +#include <linux/zstd.h>
> +#include <linux/vmalloc.h>
> +
> +#include "squashfs_fs.h"
> +#include "squashfs_fs_sb.h"
> +#include "squashfs.h"
> +#include "decompressor.h"
> +#include "page_actor.h"
> +
> +struct workspace {
> +       void *mem;
> +       size_t mem_size;
> +};
> +
> +static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
> +{
> +       struct workspace *wksp = kmalloc(sizeof(*wksp), GFP_KERNEL);
> +       if (wksp == NULL)
> +               goto failed;
> +       wksp->mem_size = ZSTD_DStreamWorkspaceBound(max_t(size_t,
> +                               msblk->block_size, SQUASHFS_METADATA_SIZE));
> +       wksp->mem = vmalloc(wksp->mem_size);
> +       if (wksp->mem == NULL)
> +               goto failed;
> +
> +       return wksp;
> +
> +failed:
> +       ERROR("Failed to allocate zstd workspace\n");
> +       kfree(wksp);
> +       return ERR_PTR(-ENOMEM);
> +}
> +
> +
> +static void zstd_free(void *strm)
> +{
> +       struct workspace *wksp = strm;
> +
> +       if (wksp)
> +               vfree(wksp->mem);
> +       kfree(wksp);
> +}
> +
> +
> +static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
> +       struct buffer_head **bh, int b, int offset, int length,
> +       struct squashfs_page_actor *output)
> +{
> +       struct workspace *wksp = strm;
> +       ZSTD_DStream *stream;
> +       size_t total_out = 0;
> +       size_t zstd_err;
> +       int k = 0;
> +       ZSTD_inBuffer in_buf = { NULL, 0, 0 };
> +       ZSTD_outBuffer out_buf = { NULL, 0, 0 };
> +
> +       stream = ZSTD_initDStream(wksp->mem_size, wksp->mem, wksp->mem_size);
> +
> +       if (!stream) {
> +               ERROR("Failed to initialize zstd decompressor\n");
> +               goto out;
> +       }
> +
> +       out_buf.size = PAGE_SIZE;
> +       out_buf.dst = squashfs_first_page(output);
> +
> +       do {
> +               if (in_buf.pos == in_buf.size && k < b) {
> +                       int avail = min(length, msblk->devblksize - offset);
> +                       length -= avail;
> +                       in_buf.src = bh[k]->b_data + offset;
> +                       in_buf.size = avail;
> +                       in_buf.pos = 0;
> +                       offset = 0;
> +               }
> +
> +               if (out_buf.pos == out_buf.size) {
> +                       out_buf.dst = squashfs_next_page(output);
> +                       if (out_buf.dst == NULL) {
> +                               /* shouldn't run out of pages before stream is
> +                                * done */
> +                               squashfs_finish_page(output);
> +                               goto out;
> +                       }
> +                       out_buf.pos = 0;
> +                       out_buf.size = PAGE_SIZE;
> +               }
> +
> +               total_out -= out_buf.pos;
> +               zstd_err = ZSTD_decompressStream(stream, &out_buf, &in_buf);
> +               total_out += out_buf.pos; /* add the additional data produced */
> +
> +               if (in_buf.pos == in_buf.size && k < b)
> +                       put_bh(bh[k++]);
> +       } while (zstd_err != 0 && !ZSTD_isError(zstd_err));
> +
> +       squashfs_finish_page(output);
> +
> +       if (ZSTD_isError(zstd_err)) {
> +               ERROR("zstd decompression error: %d\n",
> +                               (int)ZSTD_getErrorCode(zstd_err));
> +               goto out;
> +       }
> +
> +       if (k < b)
> +               goto out;
> +
> +       return (int)total_out;
> +
> +out:
> +       for (; k < b; k++)
> +               put_bh(bh[k]);
> +
> +       return -EIO;
> +}
> +
> +const struct squashfs_decompressor squashfs_zstd_comp_ops = {
> +       .init = zstd_init,
> +       .free = zstd_free,
> +       .decompress = zstd_uncompress,
> +       .id = ZSTD_COMPRESSION,
> +       .name = "zstd",
> +       .supported = 1
> +};
> --
> 2.9.3

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 4/4] squashfs: Add zstd support
  2017-07-31  1:50   ` Phillip Lougher
@ 2017-07-31  2:18     ` Phillip Lougher
  2017-07-31 21:30     ` Nick Terrell
  1 sibling, 0 replies; 16+ messages in thread
From: Phillip Lougher @ 2017-07-31  2:18 UTC (permalink / raw)
  To: Nick Terrell
  Cc: Austin S . Hemmelgarn, Herbert Xu, kernel-team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, SquashFS developers,
	linux-btrfs, LKML, Sean Purcell

On Mon, Jul 31, 2017 at 2:50 AM, Phillip Lougher
<phillip.lougher@gmail.com> wrote:
> On Thu, Jul 20, 2017 at 10:27 PM, Nick Terrell <terrelln@fb.com> wrote:
>> Add zstd compression and decompression support to SquashFS. zstd is a
>> great fit for SquashFS because it can compress at ratios approaching xz,
>> while decompressing twice as fast as zlib. For SquashFS in particular,
>> it can decompress as fast as lzo and lz4. It also has the flexibility
>> to turn down the compression ratio for faster compression times.
>
> Hi Nick,
>
> This patch (and none of the previous versions) is showing up on
> squashfs-devel@lists.sourceforge.net.  I also think you should have
> emailed me directly as a courtesy, as I'm the Squashfs author and
> maintainer.

OK, you're not subscribed to squashfs-devel ....   I have accepted
your previous posts, but, please subscribe.

Thanks

Phillip

>
>
>> | Method         | Ratio | Compression MB/s | Decompression MB/s |
>> |----------------|-------|------------------|--------------------|
>> | gzip           |  2.92 |               15 |                128 |
>> | lzo            |  2.64 |              9.5 |                217 |
>> | lz4            |  2.12 |               94 |                218 |
>> | xz             |  3.43 |              5.5 |                 35 |
>> | xz 256 KB      |  3.53 |              5.4 |                 40 |
>> | zstd 1         |  2.71 |               96 |                210 |
>> | zstd 5         |  2.93 |               69 |                198 |
>> | zstd 10        |  3.01 |               41 |                225 |
>> | zstd 15        |  3.13 |             11.4 |                224 |
>> | zstd 16 256 KB |  3.24 |              8.1 |                210 |
>
>
> Those numbers look good to me.
>
>>
>> This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
>> taking over the submission process.
>>
>> [1] http://releases.ubuntu.com/16.10/
>> [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh
>
> I can't find your patch that adds zstd to the user-land
> Squashfs-tools.  That would be handy to do any testing :-)
>
> Phillip
>
>>
>> zstd source repository: https://github.com/facebook/zstd
>>
>> Cc: Sean Purcell <me@seanp.xyz>
>> Signed-off-by: Nick Terrell <terrelln@fb.com>
>> ---
>>  fs/squashfs/Kconfig        |  14 +++++
>>  fs/squashfs/Makefile       |   1 +
>>  fs/squashfs/decompressor.c |   7 +++
>>  fs/squashfs/decompressor.h |   4 ++
>>  fs/squashfs/squashfs_fs.h  |   1 +
>>  fs/squashfs/zstd_wrapper.c | 150 +++++++++++++++++++++++++++++++++++++++++++++
>>  6 files changed, 177 insertions(+)
>>  create mode 100644 fs/squashfs/zstd_wrapper.c
>>
>> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
>> index ffb093e..1adb334 100644
>> --- a/fs/squashfs/Kconfig
>> +++ b/fs/squashfs/Kconfig
>> @@ -165,6 +165,20 @@ config SQUASHFS_XZ
>>
>>           If unsure, say N.
>>
>> +config SQUASHFS_ZSTD
>> +       bool "Include support for ZSTD compressed file systems"
>> +       depends on SQUASHFS
>> +       select ZSTD_DECOMPRESS
>> +       help
>> +         Saying Y here includes support for reading Squashfs file systems
>> +         compressed with ZSTD compression.  ZSTD gives better compression than
>> +         the default ZLIB compression, while using less CPU.
>> +
>> +         ZSTD is not the standard compression used in Squashfs and so most
>> +         file systems will be readable without selecting this option.
>> +
>> +         If unsure, say N.
>> +
>>  config SQUASHFS_4K_DEVBLK_SIZE
>>         bool "Use 4K device block size?"
>>         depends on SQUASHFS
>> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
>> index 246a6f3..6655631 100644
>> --- a/fs/squashfs/Makefile
>> +++ b/fs/squashfs/Makefile
>> @@ -15,3 +15,4 @@ squashfs-$(CONFIG_SQUASHFS_LZ4) += lz4_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
>>  squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
>> +squashfs-$(CONFIG_SQUASHFS_ZSTD) += zstd_wrapper.o
>> diff --git a/fs/squashfs/decompressor.c b/fs/squashfs/decompressor.c
>> index d2bc136..8366398 100644
>> --- a/fs/squashfs/decompressor.c
>> +++ b/fs/squashfs/decompressor.c
>> @@ -65,6 +65,12 @@ static const struct squashfs_decompressor squashfs_zlib_comp_ops = {
>>  };
>>  #endif
>>
>> +#ifndef CONFIG_SQUASHFS_ZSTD
>> +static const struct squashfs_decompressor squashfs_zstd_comp_ops = {
>> +       NULL, NULL, NULL, NULL, ZSTD_COMPRESSION, "zstd", 0
>> +};
>> +#endif
>> +
>>  static const struct squashfs_decompressor squashfs_unknown_comp_ops = {
>>         NULL, NULL, NULL, NULL, 0, "unknown", 0
>>  };
>> @@ -75,6 +81,7 @@ static const struct squashfs_decompressor *decompressor[] = {
>>         &squashfs_lzo_comp_ops,
>>         &squashfs_xz_comp_ops,
>>         &squashfs_lzma_unsupported_comp_ops,
>> +       &squashfs_zstd_comp_ops,
>>         &squashfs_unknown_comp_ops
>>  };
>>
>> diff --git a/fs/squashfs/decompressor.h b/fs/squashfs/decompressor.h
>> index a25713c..0f5a8e4 100644
>> --- a/fs/squashfs/decompressor.h
>> +++ b/fs/squashfs/decompressor.h
>> @@ -58,4 +58,8 @@ extern const struct squashfs_decompressor squashfs_lzo_comp_ops;
>>  extern const struct squashfs_decompressor squashfs_zlib_comp_ops;
>>  #endif
>>
>> +#ifdef CONFIG_SQUASHFS_ZSTD
>> +extern const struct squashfs_decompressor squashfs_zstd_comp_ops;
>> +#endif
>> +
>>  #endif
>> diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
>> index 506f4ba..24d12fd 100644
>> --- a/fs/squashfs/squashfs_fs.h
>> +++ b/fs/squashfs/squashfs_fs.h
>> @@ -241,6 +241,7 @@ struct meta_index {
>>  #define LZO_COMPRESSION                3
>>  #define XZ_COMPRESSION         4
>>  #define LZ4_COMPRESSION                5
>> +#define ZSTD_COMPRESSION       6
>>
>>  struct squashfs_super_block {
>>         __le32                  s_magic;
>> diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
>> new file mode 100644
>> index 0000000..8cb7c76
>> --- /dev/null
>> +++ b/fs/squashfs/zstd_wrapper.c
>> @@ -0,0 +1,150 @@
>> +/*
>> + * Squashfs - a compressed read only filesystem for Linux
>> + *
>> + * Copyright (c) 2016-present, Facebook, Inc.
>> + * All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License
>> + * as published by the Free Software Foundation; either version 2,
>> + * or (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
>> + *
>> + * zstd_wrapper.c
>> + */
>> +
>> +#include <linux/mutex.h>
>> +#include <linux/buffer_head.h>
>> +#include <linux/slab.h>
>> +#include <linux/zstd.h>
>> +#include <linux/vmalloc.h>
>> +
>> +#include "squashfs_fs.h"
>> +#include "squashfs_fs_sb.h"
>> +#include "squashfs.h"
>> +#include "decompressor.h"
>> +#include "page_actor.h"
>> +
>> +struct workspace {
>> +       void *mem;
>> +       size_t mem_size;
>> +};
>> +
>> +static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
>> +{
>> +       struct workspace *wksp = kmalloc(sizeof(*wksp), GFP_KERNEL);
>> +       if (wksp == NULL)
>> +               goto failed;
>> +       wksp->mem_size = ZSTD_DStreamWorkspaceBound(max_t(size_t,
>> +                               msblk->block_size, SQUASHFS_METADATA_SIZE));
>> +       wksp->mem = vmalloc(wksp->mem_size);
>> +       if (wksp->mem == NULL)
>> +               goto failed;
>> +
>> +       return wksp;
>> +
>> +failed:
>> +       ERROR("Failed to allocate zstd workspace\n");
>> +       kfree(wksp);
>> +       return ERR_PTR(-ENOMEM);
>> +}
>> +
>> +
>> +static void zstd_free(void *strm)
>> +{
>> +       struct workspace *wksp = strm;
>> +
>> +       if (wksp)
>> +               vfree(wksp->mem);
>> +       kfree(wksp);
>> +}
>> +
>> +
>> +static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
>> +       struct buffer_head **bh, int b, int offset, int length,
>> +       struct squashfs_page_actor *output)
>> +{
>> +       struct workspace *wksp = strm;
>> +       ZSTD_DStream *stream;
>> +       size_t total_out = 0;
>> +       size_t zstd_err;
>> +       int k = 0;
>> +       ZSTD_inBuffer in_buf = { NULL, 0, 0 };
>> +       ZSTD_outBuffer out_buf = { NULL, 0, 0 };
>> +
>> +       stream = ZSTD_initDStream(wksp->mem_size, wksp->mem, wksp->mem_size);
>> +
>> +       if (!stream) {
>> +               ERROR("Failed to initialize zstd decompressor\n");
>> +               goto out;
>> +       }
>> +
>> +       out_buf.size = PAGE_SIZE;
>> +       out_buf.dst = squashfs_first_page(output);
>> +
>> +       do {
>> +               if (in_buf.pos == in_buf.size && k < b) {
>> +                       int avail = min(length, msblk->devblksize - offset);
>> +                       length -= avail;
>> +                       in_buf.src = bh[k]->b_data + offset;
>> +                       in_buf.size = avail;
>> +                       in_buf.pos = 0;
>> +                       offset = 0;
>> +               }
>> +
>> +               if (out_buf.pos == out_buf.size) {
>> +                       out_buf.dst = squashfs_next_page(output);
>> +                       if (out_buf.dst == NULL) {
>> +                               /* shouldn't run out of pages before stream is
>> +                                * done */
>> +                               squashfs_finish_page(output);
>> +                               goto out;
>> +                       }
>> +                       out_buf.pos = 0;
>> +                       out_buf.size = PAGE_SIZE;
>> +               }
>> +
>> +               total_out -= out_buf.pos;
>> +               zstd_err = ZSTD_decompressStream(stream, &out_buf, &in_buf);
>> +               total_out += out_buf.pos; /* add the additional data produced */
>> +
>> +               if (in_buf.pos == in_buf.size && k < b)
>> +                       put_bh(bh[k++]);
>> +       } while (zstd_err != 0 && !ZSTD_isError(zstd_err));
>> +
>> +       squashfs_finish_page(output);
>> +
>> +       if (ZSTD_isError(zstd_err)) {
>> +               ERROR("zstd decompression error: %d\n",
>> +                               (int)ZSTD_getErrorCode(zstd_err));
>> +               goto out;
>> +       }
>> +
>> +       if (k < b)
>> +               goto out;
>> +
>> +       return (int)total_out;
>> +
>> +out:
>> +       for (; k < b; k++)
>> +               put_bh(bh[k]);
>> +
>> +       return -EIO;
>> +}
>> +
>> +const struct squashfs_decompressor squashfs_zstd_comp_ops = {
>> +       .init = zstd_init,
>> +       .free = zstd_free,
>> +       .decompress = zstd_uncompress,
>> +       .id = ZSTD_COMPRESSION,
>> +       .name = "zstd",
>> +       .supported = 1
>> +};
>> --
>> 2.9.3

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 4/4] squashfs: Add zstd support
  2017-07-31  1:50   ` Phillip Lougher
  2017-07-31  2:18     ` Phillip Lougher
@ 2017-07-31 21:30     ` Nick Terrell
  1 sibling, 0 replies; 16+ messages in thread
From: Nick Terrell @ 2017-07-31 21:30 UTC (permalink / raw)
  To: Phillip Lougher
  Cc: Austin S . Hemmelgarn, Herbert Xu, Kernel Team, Chris Mason,
	Yann Collet, Adam Borowski, David Sterba, SquashFS developers,
	linux-btrfs@vger.kernel.org, LKML, Sean Purcell

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2088 bytes --]

On 7/30/17, 6:50 PM, "Phillip Lougher" <phillip.lougher@gmail.com> wrote:
> On Thu, Jul 20, 2017 at 10:27 PM, Nick Terrell <terrelln@fb.com> wrote:
>> Add zstd compression and decompression support to SquashFS. zstd is a
>> great fit for SquashFS because it can compress at ratios approaching xz,
>> while decompressing twice as fast as zlib. For SquashFS in particular,
>> it can decompress as fast as lzo and lz4. It also has the flexibility
>> to turn down the compression ratio for faster compression times.
>
> Hi Nick,
>
> This patch (and none of the previous versions) is showing up on
> squashfs-devel@lists.sourceforge.net.  I also think you should have
> emailed me directly as a courtesy, as I'm the Squashfs author and
> maintainer.

Sorry about that Phillip, it was an oversight on my part. I've added you to
the CC list going forward, and have subscribed to the mailing list.

>> | Method>> | Ratio | Compression MB/s | Decompression MB/s |
>> |----------------|-------|------------------|--------------------|
>> | gzip>>   |  2.92 |>>>   15 |>>>>128 |
>> | lzo>>>|  2.64 |>>>  9.5 |>>>>217 |
>> | lz4>>>|  2.12 |>>>   94 |>>>>218 |
>> | xz>>> |  3.43 |>>>  5.5 |>>>> 35 |
>> | xz 256 KB>  |  3.53 |>>>  5.4 |>>>> 40 |
>> | zstd 1>> |  2.71 |>>>   96 |>>>>210 |
>> | zstd 5>> |  2.93 |>>>   69 |>>>>198 |
>> | zstd 10>>|  3.01 |>>>   41 |>>>>225 |
>> | zstd 15>>|  3.13 |>>> 11.4 |>>>>224 |
>> | zstd 16 256 KB |  3.24 |>>>  8.1 |>>>>210 |
>
>
> Those numbers look good to me.
>
>>
>> This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
>> taking over the submission process.
>>
>> [1] http://releases.ubuntu.com/16.10/
>> [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh
>
> I can't find your patch that adds zstd to the user-land
> Squashfs-tools.  That would be handy to do any testing :-)
>
> Phillip

I'll include the squashfs-tools patch that Sean just posted in the next
version.


ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±ý»k~ÏâžØ^n‡r¡ö¦zË\x1aëh™¨è­Ú&£ûàz¿äz¹Þ—ú+€Ê+zf£¢·hšˆ§~†­†Ûiÿÿïêÿ‘êçz_è®\x0fæj:+v‰¨þ)ߣøm

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/4] btrfs: Add zstd support
  2017-07-25 23:19   ` Giovanni Cabiddu
@ 2017-08-18 15:37     ` David Sterba
  0 siblings, 0 replies; 16+ messages in thread
From: David Sterba @ 2017-08-18 15:37 UTC (permalink / raw)
  To: Giovanni Cabiddu
  Cc: Nick Terrell, Austin S . Hemmelgarn, Herbert Xu,
	kernel-team@fb.com, Chris Mason, Yann Collet, Adam Borowski,
	David Sterba, squashfs-devel@lists.sourceforge.net,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, Weigang Li, Brian Will,
	Brian A Keating, Giovanni Cabiddu

On Wed, Jul 26, 2017 at 12:19:29AM +0100, Giovanni Cabiddu wrote:
> Hi Nick,
> 
> On Thu, Jul 20, 2017 at 10:27:42PM +0100, Nick Terrell wrote:
> > Add zstd compression and decompression support to BtrFS. zstd at its
> > fastest level compresses almost as well as zlib, while offering much
> > faster compression and decompression, approaching lzo speeds.
> Can we look at integrating the zstd implementation below the acomp API
> available in the crypto subsystem?
> (https://github.com/torvalds/linux/blob/master/crypto/acompress.c)
> Acomp was designed to provide a generic and uniform API for compression
> in the kernel which hides algorithm specific details to frameworks.
> In future it would be nice to see btrfs using exclusively acomp
> for compression. This way when a new compression algorithm is exposed
> through acomp, it will be available immediately in btrfs.

The compression algorithm is considered part of the filesystem format
and must be covered by an incompatibility bit. Regarding the acomp API,
I don't see much point adding the extra abstraction layer for the two
currently supported algorithms. We only call 2 functions, for
compression and decompression, no need to allocate the acomp helper
structures and call the functions indirectly.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-08-18 15:38 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-20 21:27 [PATCH v3 0/4] Add xxhash and zstd modules Nick Terrell
2017-07-20 21:27 ` [PATCH v3 1/4] lib: Add xxhash module Nick Terrell
2017-07-20 21:27 ` [PATCH v3 3/4] btrfs: Add zstd support Nick Terrell
2017-07-23 19:28   ` kbuild test robot
2017-07-24 17:25   ` kbuild test robot
2017-07-25 23:19   ` Giovanni Cabiddu
2017-08-18 15:37     ` David Sterba
2017-07-20 21:27 ` [PATCH v3 4/4] squashfs: " Nick Terrell
2017-07-31  1:50   ` Phillip Lougher
2017-07-31  2:18     ` Phillip Lougher
2017-07-31 21:30     ` Nick Terrell
2017-07-21 11:16 ` [PATCH v3 0/4] Add xxhash and zstd modules Austin S. Hemmelgarn
2017-07-21 11:18   ` Austin S. Hemmelgarn
2017-07-21 15:56 ` Austin S. Hemmelgarn
2017-07-22 11:35   ` Adam Borowski
2017-07-24 13:44     ` Austin S. Hemmelgarn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).