Linux cryptographic layer development

Linux cryptographic layer development
 help / color / mirror / Atom feed

* [PATCH] fix itnull.cocci warnings
From: Julia Lawall @ 2017-01-07  9:46 UTC (permalink / raw)
  To: Harsh Jain
  Cc: hariprasad, netdev, herbert, linux-crypto, Atul Gupta, kbuild-all

The first argument to list_for_each_entry cannot be NULL.

Generated by: scripts/coccinelle/iterators/itnull.cocci

CC: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

This code comes from the following git tree:

url:
https://github.com/0day-ci/linux/commits/Harsh-Jain/crypto-chcr-Bug-fixes/20170107-093356
base:
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
master
In-Reply-To:
<8e0086b56d8fb61637d179c32a09a1bca03c4186.1483599449.git.harsh@chelsio.com>

 chcr_core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/crypto/chelsio/chcr_core.c
+++ b/drivers/crypto/chelsio/chcr_core.c
@@ -61,7 +61,7 @@ int assign_chcr_device(struct chcr_dev *
 	 */
 	mutex_lock(&dev_mutex); /* TODO ? */
 	list_for_each_entry(u_ctx, &uld_ctx_list, entry)
-		if (u_ctx && u_ctx->dev) {
+		if (u_ctx->dev) {
 			*dev = u_ctx->dev;
 			ret = 0;
 			break;

^ permalink raw reply

* Re: [PATCH 0/3] crypto: picoxcell - Cleanups removing non-DT code
From: Jamie Iles @ 2017-01-07 15:05 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: linux-kernel, Arnd Bergmann, Jamie Iles, David S. Miller,
	linux-crypto, Herbert Xu, linux-arm-kernel
In-Reply-To: <1483376819-26726-1-git-send-email-javier@osg.samsung.com>

Hi Javier,

On Mon, Jan 02, 2017 at 02:06:56PM -0300, Javier Martinez Canillas wrote:
> Hello,
> 
> This small series contains a couple of cleanups that removes some driver's code
> that isn't needed due the driver being for a DT-only platform.
> 
> The changes were suggested by Arnd Bergmann as a response to a previous patch:
> https://lkml.org/lkml/2017/1/2/342
> 
> Patch #1 allows the driver to be built when the COMPILE_TEST option is enabled.
> Patch #2 removes the platform ID table since isn't needed for DT-only drivers.
> Patch #3 removes a wrapper function that's also not needed if driver is DT-only.
> 
> Best regards,
> 
> 
> Javier Martinez Canillas (3):
>   crypto: picoxcell - Allow driver to build COMPILE_TEST is enabled
>   crypto: picoxcell - Remove platform device ID table
>   crypto: picoxcell - Remove spacc_is_compatible() wrapper function
> 
>  drivers/crypto/Kconfig            |  2 +-
>  drivers/crypto/picoxcell_crypto.c | 28 +++-------------------------
>  2 files changed, 4 insertions(+), 26 deletions(-)

Acked-by: Jamie Iles <jamie@jamieiles.com>

Thanks,

Jamie

^ permalink raw reply

* [PATCH v2 0/4] Update LZ4 compressor module
From: Sven Schmidt @ 2017-01-07 16:55 UTC (permalink / raw)
  To: akpm
  Cc: bongkyu.kim, rsalvaterra, sergey.senozhatsky, gregkh,
	linux-kernel, herbert, davem, linux-crypto, anton, ccross,
	keescook, tony.luck, phillip
In-Reply-To: <1482259992-16680-1-git-send-email-4sschmid@informatik.uni-hamburg.de>


This patchset is for updating the LZ4 compression module to a version based
on LZ4 v1.7.2 allowing to use the fast compression algorithm aka LZ4 fast
which provides an "acceleration" parameter as a tradeoff between
high compression ratio and high compression speed.

We want to use LZ4 fast in order to support compression in lustre
and (mostly, based on that) investigate data reduction techniques in behalf of
storage systems.

Also, it will be useful for other users of LZ4 compression, as with LZ4 fast
it is possible to enable applications to use fast and/or high compression
depending on the usecase.
For instance, ZRAM is offering a LZ4 backend and could benefit from an updated
LZ4 in the kernel.

LZ4 homepage: http://www.lz4.org/
LZ4 source repository: https://github.com/lz4/lz4
Source version: 1.7.2

Benchmark (taken from [1], Core i5-4300U @1.9GHz):
----------------|--------------|----------------|----------
Compressor      | Compression  | Decompression  | Ratio
----------------|--------------|----------------|----------
memcpy          |  4200 MB/s   |  4200 MB/s     | 1.000
LZ4 fast 50     |  1080 MB/s   |  2650 MB/s     | 1.375
LZ4 fast 17     |   680 MB/s   |  2220 MB/s     | 1.607
LZ4 fast 5      |   475 MB/s   |  1920 MB/s     | 1.886
LZ4 default     |   385 MB/s   |  1850 MB/s     | 2.101

[1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

[PATCHv2 1/4] lib: Update LZ4  compressor module based on LZ4 v1.7.2
[PATCHv2 2/4] lib: Update decompress_unlz4 wrapper to work with new LZ4 module
[PATCHv2 3/4] crypto: Change lz4 modules to work with new LZ4 module
[PATCHv2 4/4] fs/pstore: fs/squashfs: Change LZ4 compressor functions to work with new LZ4 module

v2:
- Changed order of the patches since in the initial patchset the lz4.h was in the
  last patch but was referenced by the other ones
- Split lib/decompress_unlz4.c in an own patch
- Fixed errors reported by the buildbot
- Further refactorings
- Added more appropriate copyright note to include/linux/lz4.h

^ permalink raw reply

* [PATCH v2 1/4] lib: Update LZ4 compressor module based on LZ4 v1.7.2.
From: Sven Schmidt @ 2017-01-07 16:55 UTC (permalink / raw)
  To: akpm
  Cc: bongkyu.kim, rsalvaterra, sergey.senozhatsky, gregkh,
	linux-kernel, herbert, davem, linux-crypto, anton, ccross,
	keescook, tony.luck, phillip, Sven Schmidt
In-Reply-To: <1483808145-18417-1-git-send-email-4sschmid@informatik.uni-hamburg.de>

This patch updates LZ4 kernel module to LZ4 v1.7.2 by Yann Collet.
The kernel module is inspired by the previous work by Chanho Min.
The updated LZ4 module will not break existing code since there were alias
methods added to ensure backwards compatibility.

API changes:

New method LZ4_compress_fast which differs from the variant available in kernel
by the new acceleration parameter, allowing to trade compression ratio for
more speedup.

LZ4_decompress_fast is the respective decompression method,
allowing to decompress data compressed with any LZ4 compression method including
LZ4HC.

Also the useful functions LZ4_decompress_safe_partial
LZ4_compress_destsize were added. The latter reverses the logic by trying to
compress as much data as possible from source to dest while the former aims
to decompress partial blocks of data.

The methods lz4_compress and lz4_decompress_unknownoutputsize are now known as
LZ4_compress_default respectivley LZ4_decompress_safe. The old methods are still
available for offering backwards compatibility.

Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
---
 include/linux/lz4.h      |  307 ++++++++++----
 lib/lz4/lz4_compress.c   | 1024 +++++++++++++++++++++++++++------------------
 lib/lz4/lz4_decompress.c |  631 ++++++++++++++--------------
 lib/lz4/lz4defs.h        |  378 +++++++++++------
 lib/lz4/lz4hc_compress.c | 1045 ++++++++++++++++++++++++----------------------
 5 files changed, 1938 insertions(+), 1447 deletions(-)

diff --git a/include/linux/lz4.h b/include/linux/lz4.h
index 6b784c5..20280f8 100644
--- a/include/linux/lz4.h
+++ b/include/linux/lz4.h
@@ -1,87 +1,238 @@
+/* LZ4 Kernel Interface
+
+Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
+Copyright (C) 2016, Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 2 as
+published by the Free Software Foundation.
+
+This file is based on the original header file for LZ4 - Fast LZ compression algorithm.
+
+LZ4 - Fast LZ compression algorithm
+Copyright (C) 2011-2016, Yann Collet.
+BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+    * Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following disclaimer
+in the documentation and/or other materials provided with the
+distribution.
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+You can contact the author at :
+ - LZ4 homepage : http://www.lz4.org
+ - LZ4 source repository : https://github.com/lz4/lz4
+*/
+
 #ifndef __LZ4_H__
 #define __LZ4_H__
-/*
- * LZ4 Kernel Interface
- *
- * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
+
+#include <linux/types.h>
+
+/*-************************************************************************
+*  Constants
+**************************************************************************/
 #define LZ4_MEM_COMPRESS	(16384)
 #define LZ4HC_MEM_COMPRESS	(262144 + (2 * sizeof(unsigned char *)))
 
-/*
- * lz4_compressbound()
- * Provides the maximum size that LZ4 may output in a "worst case" scenario
- * (input data not compressible)
- */
-static inline size_t lz4_compressbound(size_t isize)
-{
-	return isize + (isize / 255) + 16;
+#define LZ4_MAX_INPUT_SIZE        0x7E000000   /* 2 113 929 216 bytes */
+#define LZ4_COMPRESSBOUND(isize)  ((unsigned)(isize) > (unsigned)LZ4_MAX_INPUT_SIZE ? 0 : (isize) + ((isize)/255) + 16)
+
+#define LZ4HC_MIN_CLEVEL        3
+#define LZ4HC_DEFAULT_CLEVEL    9
+#define LZ4HC_MAX_CLEVEL        16
+
+/*-************************************************************************
+*  Compression Functions
+**************************************************************************/
+
+/*!LZ4_compressbound() :
+    Provides the maximum size that LZ4 may output in a "worst case" scenario
+    (input data not compressible)
+*/
+static inline int LZ4_compressBound(int isize) {
+  return LZ4_COMPRESSBOUND(isize);
 }
 
-/*
- * lz4_compress()
- *	src     : source address of the original data
- *	src_len : size of the original data
- *	dst	: output buffer address of the compressed data
- *		This requires 'dst' of size LZ4_COMPRESSBOUND.
- *	dst_len : is the output size, which is returned after compress done
- *	workmem : address of the working memory.
- *		This requires 'workmem' of size LZ4_MEM_COMPRESS.
- *	return  : Success if return 0
- *		  Error if return (< 0)
- *	note :  Destination buffer and workmem must be already allocated with
- *		the defined size.
- */
-int lz4_compress(const unsigned char *src, size_t src_len,
-		unsigned char *dst, size_t *dst_len, void *wrkmem);
-
- /*
-  * lz4hc_compress()
-  *	 src	 : source address of the original data
-  *	 src_len : size of the original data
-  *	 dst	 : output buffer address of the compressed data
-  *		This requires 'dst' of size LZ4_COMPRESSBOUND.
-  *	 dst_len : is the output size, which is returned after compress done
-  *	 workmem : address of the working memory.
-  *		This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
-  *	 return  : Success if return 0
-  *		   Error if return (< 0)
-  *	 note :  Destination buffer and workmem must be already allocated with
-  *		 the defined size.
-  */
-int lz4hc_compress(const unsigned char *src, size_t src_len,
-		unsigned char *dst, size_t *dst_len, void *wrkmem);
-
-/*
- * lz4_decompress()
- *	src     : source address of the compressed data
- *	src_len : is the input size, whcih is returned after decompress done
- *	dest	: output buffer address of the decompressed data
- *	actual_dest_len: is the size of uncompressed data, supposing it's known
- *	return  : Success if return 0
- *		  Error if return (< 0)
- *	note :  Destination buffer must be already allocated.
- *		slightly faster than lz4_decompress_unknownoutputsize()
- */
-int lz4_decompress(const unsigned char *src, size_t *src_len,
-		unsigned char *dest, size_t actual_dest_len);
-
-/*
- * lz4_decompress_unknownoutputsize()
- *	src     : source address of the compressed data
- *	src_len : is the input size, therefore the compressed size
- *	dest	: output buffer address of the decompressed data
- *	dest_len: is the max size of the destination buffer, which is
- *			returned with actual size of decompressed data after
- *			decompress done
- *	return  : Success if return 0
- *		  Error if return (< 0)
- *	note :  Destination buffer must be already allocated.
+/*! lz4_compressbound() :
+    For backward compabitiliby
+*/
+static inline size_t lz4_compressbound(size_t isize) {
+	return (int)LZ4_COMPRESSBOUND(isize);
+}
+
+/*! LZ4_compress_default() :
+    Compresses 'sourceSize' bytes from buffer 'source'
+    into already allocated 'dest' buffer of size 'maxDestSize'.
+    Compression is guaranteed to succeed if 'maxDestSize' >= LZ4_compressBound(sourceSize).
+    It also runs faster, so it's a recommended setting.
+    If the function cannot compress 'source' into a more limited 'dest' budget,
+    compression stops *immediately*, and the function result is zero.
+    As a consequence, 'dest' content is not valid.
+    This function never writes outside 'dest' buffer, nor read outside 'source' buffer.
+        sourceSize  : Max supported value is LZ4_MAX_INPUT_VALUE
+        maxDestSize : full or partial size of buffer 'dest' (which must be already allocated)
+        workmem : address of the working memory. This requires 'workmem' of size LZ4_MEM_COMPRESS.
+        return : the number of bytes written into buffer 'dest' (necessarily <= maxOutputSize)
+              or 0 if compression fails */
+int LZ4_compress_default(const char* source, char* dest, int inputSize, int maxOutputSize, void* wrkmem);
+
+/*!
+LZ4_compress_fast() :
+    Same as LZ4_compress_default(), but allows to select an "acceleration" factor.
+    The larger the acceleration value, the faster the algorithm, but also the lesser the compression.
+    It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed.
+    An acceleration value of "1" is the same as regular LZ4_compress_default()
+    Values <= 0 will be replaced by ACCELERATION_DEFAULT, which is 1.
+*/
+int LZ4_compress_fast(const char* source, char* dest, int inputSize, int maxOutputSize, void* wrkmem, int acceleration);
+
+/*!
+LZ4_compress_destSize() :
+    Reverse the logic, by compressing as much data as possible from 'source' buffer
+    into already allocated buffer 'dest' of size 'targetDestSize'.
+    This function either compresses the entire 'source' content into 'dest' if it's large enough,
+    or fill 'dest' buffer completely with as much data as possible from 'source'.
+        *sourceSizePtr : will be modified to indicate how many bytes where read from 'source' to fill 'dest'.
+                         New value is necessarily <= old value.
+        workmem : address of the working memory. This requires 'workmem' of size LZ4_MEM_COMPRESS.
+        return : Nb bytes written into 'dest' (necessarily <= targetDestSize)
+              or 0 if compression fails
+*/
+int LZ4_compress_destSize (const char* source, char* dest, int* sourceSizePtr, int targetDestSize, void* wrkmem);
+
+/*!
+ * lz4_compress() :
+    For backwards compabitiliby
+        src     : source address of the original data
+        src_len : size of the original data
+        dst     : output buffer address of the compressed data
+                This requires 'dst' of size LZ4_COMPRESSBOUND.
+        dst_len : is the output size, which is returned after compress done
+        workmem : address of the working memory.
+                This requires 'workmem' of size LZ4_MEM_COMPRESS.
+        return  : Success if return 0
+                  Error if return (< 0)
+        note :  Destination buffer and workmem must be already allocated with
+                the defined size.
  */
-int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
-		unsigned char *dest, size_t *dest_len);
+int lz4_compress(const unsigned char *src, size_t src_len, unsigned char *dst, size_t *dst_len, void *wrkmem);
+
+/*-************************************************************************
+*  Decompression Functions
+**************************************************************************/
+
+/*!
+LZ4_decompress_fast() :
+    originalSize : is the original and therefore uncompressed size
+    return : the number of bytes read from the source buffer (in other words, the compressed size)
+             If the source stream is detected malformed, the function will stop decoding and return a negative result.
+             Destination buffer must be already allocated. Its size must be a minimum of 'originalSize' bytes.
+    note : This function fully respect memory boundaries for properly formed compressed data.
+           It is a bit faster than LZ4_decompress_safe().
+           However, it does not provide any protection against intentionally modified data stream (malicious input).
+           Use this function in trusted environment only (data to decode comes from a trusted source).
+*/
+int LZ4_decompress_fast(const char* source, char* dest, int originalSize);
+
+/*!
+LZ4_decompress_safe() :
+    compressedSize : is the precise full size of the compressed block.
+    maxDecompressedSize : is the size of destination buffer, which must be already allocated.
+    return : the number of bytes decompressed into destination buffer (necessarily <= maxDecompressedSize)
+             If destination buffer is not large enough, decoding will stop and output an error code (<0).
+             If the source stream is detected malformed, the function will stop decoding and return a negative result.
+             This function is protected against buffer overflow exploits, including malicious data packets.
+             It never writes outside output buffer, nor reads outside input buffer.
+*/
+int LZ4_decompress_safe(const char* source, char* dest, int compressedSize, int maxDecompressedSize);
+
+/*!
+LZ4_decompress_safe_partial() :
+    This function decompress a compressed block of size 'compressedSize' at position 'source'
+    into destination buffer 'dest' of size 'maxDecompressedSize'.
+    The function tries to stop decompressing operation as soon as 'targetOutputSize' has been reached,
+    reducing decompression time.
+    return : the number of bytes decoded in the destination buffer (necessarily <= maxDecompressedSize)
+       Note : this number can be < 'targetOutputSize' should the compressed block to decode be smaller.
+             Always control how many bytes were decoded.
+             If the source stream is detected malformed, the function will stop decoding and return a negative result.
+             This function never writes outside of output buffer, and never reads outside of input buffer. It is therefore protected against malicious data packets
+*/
+int LZ4_decompress_safe_partial(const char* source, char* dest, int compressedSize, int targetOutputSize, int maxDecompressedSize);
+
+
+/*!
+lz4_decompress_unknownoutputsize() :
+    For backwards compabitiliby
+        src     : source address of the compressed data
+        src_len : is the input size, therefore the compressed size
+        dest    : output buffer address of the decompressed data
+        dest_len: is the max size of the destination buffer, which is
+                        returned with actual size of decompressed data after
+                        decompress done
+        return  : Success if return 0
+                  Error if return (< 0)
+        note :  Destination buffer must be already allocated.
+*/
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len, unsigned char *dest, size_t *dest_len);
+
+/*!
+lz4_decompress() :
+    For backwards compabitiliby
+        src     : source address of the compressed data
+        src_len : is the input size, which is returned after decompress done
+        dest    : output buffer address of the decompressed data
+        actual_dest_len: is the size of uncompressed data, supposing it's known
+        return  : Success if return 0
+                  Error if return (< 0)
+        note :  Destination buffer must be already allocated.
+                slightly faster than lz4_decompress_unknownoutputsize()
+*/
+int lz4_decompress(const unsigned char *src, size_t *src_len, unsigned char *dest, size_t actual_dest_len);
+
+/*-************************************************************************
+ *  LZ4 HC Compression
+ **************************************************************************/
+
+/*! LZ4_compress_HC() :
+    Compress data from `src` into `dst`, using the more powerful but slower "HC" algorithm. dst` must be already allocated.
+        wrkmem : address of the working memory. This requires 'wrkmem' of size LZ4HC_MEM_COMPRESS.
+        Compression is guaranteed to succeed if `dstCapacity >= LZ4_compressBound(srcSize)`
+        Max supported `srcSize` value is LZ4_MAX_INPUT_SIZE
+        compressionLevel` : Recommended values are between 4 and 9, although any value between 1 and LZ4HC_MAX_CLEVEL will work.
+                        Values >LZ4HC_MAX_CLEVEL behave the same as 16.
+        @return : the number of bytes written into 'dst' or 0 if compression fails.
+*/
+int LZ4_compress_HC(const char* src, char* dst, int srcSize, int dstCapacity, int compressionLevel, void* wrkmem);
+
+/*! lz4hc_compress()
+    For backwards compabitiliby
+        src     : source address of the original data
+        src_len : size of the original data
+        dst     : output buffer address of the compressed data
+               This requires 'dst' of size LZ4_COMPRESSBOUND.
+        dst_len : is the output size, which is returned after compress done
+        workmem : address of the working memory.
+               This requires 'workmem' of size LZ4HC_MEM_COMPRESS.
+        return  : Success if return 0
+                  Error if return (< 0)
+        note :  Destination buffer and workmem must be already allocated with
+                the defined size.
+*/
+int lz4hc_compress(const unsigned char *src, size_t src_len, unsigned char *dst, size_t *dst_len, void *wrkmem);
+
 #endif
diff --git a/lib/lz4/lz4_compress.c b/lib/lz4/lz4_compress.c
index 28321d8..1cba405 100644
--- a/lib/lz4/lz4_compress.c
+++ b/lib/lz4/lz4_compress.c
@@ -1,443 +1,639 @@
 /*
- * LZ4 - Fast LZ compression algorithm
- * Copyright (C) 2011-2012, Yann Collet.
- * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met:
- *
- *     * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following disclaimer
- * in the documentation and/or other materials provided with the
- * distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * You can contact the author at :
- * - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
- * - LZ4 source repository : http://code.google.com/p/lz4/
- *
- *  Changed for kernel use by:
- *  Chanho Min <chanho.min@lge.com>
- */
-
+   LZ4 - Fast LZ compression algorithm
+   Copyright (C) 2011-2016, Yann Collet.
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+   You can contact the author at :
+    - LZ4 homepage : http://www.lz4.org
+    - LZ4 source repository : https://github.com/lz4/lz4
+
+    Changed for kernel use by:
+    Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
+*/
+
+/*-************************************
+*  Includes
+**************************************/
+#include "lz4defs.h"
+#include <linux/lz4.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/lz4.h>
 #include <asm/unaligned.h>
-#include "lz4defs.h"
 
 /*
- * LZ4_compressCtx :
- * -----------------
- * Compress 'isize' bytes from 'source' into an output buffer 'dest' of
- * maximum size 'maxOutputSize'.  * If it cannot achieve it, compression
- * will stop, and result of the function will be zero.
- * return : the number of bytes written in buffer 'dest', or 0 if the
- * compression fails
+ * ACCELERATION_DEFAULT :
+ * Select "acceleration" for LZ4_compress_fast() when parameter value <= 0
  */
-static inline int lz4_compressctx(void *ctx,
-		const char *source,
-		char *dest,
-		int isize,
-		int maxoutputsize)
+#define ACCELERATION_DEFAULT 1
+
+/*-******************************
+*  Compression functions
+********************************/
+static U32 LZ4_hashSequence(U32 sequence, tableType_t const tableType)
+{
+    if (tableType == byU16)
+        return (((sequence) * 2654435761U) >> ((MINMATCH*8)-(LZ4_HASHLOG+1)));
+    else
+        return (((sequence) * 2654435761U) >> ((MINMATCH*8)-LZ4_HASHLOG));
+}
+
+static const U64 prime5bytes = 889523592379ULL;
+static U32 LZ4_hashSequence64(size_t sequence, tableType_t const tableType)
+{
+    const U32 hashLog = (tableType == byU16) ? LZ4_HASHLOG+1 : LZ4_HASHLOG;
+    const U32 hashMask = (1<<hashLog) - 1;
+    return ((sequence * prime5bytes) >> (40 - hashLog)) & hashMask;
+}
+
+static U32 LZ4_hashSequenceT(size_t sequence, tableType_t const tableType)
+{
+    if (LZ4_64bits())
+        return LZ4_hashSequence64(sequence, tableType);
+    return LZ4_hashSequence((U32)sequence, tableType);
+}
+
+static U32 LZ4_hashPosition(const void* p, tableType_t tableType) { return LZ4_hashSequenceT(LZ4_read_ARCH(p), tableType); }
+
+static void LZ4_putPositionOnHash(const BYTE* p, U32 h, void* tableBase, tableType_t const tableType, const BYTE* srcBase)
+{
+    switch (tableType)
+    {
+    case byPtr: { const BYTE** hashTable = (const BYTE**)tableBase; hashTable[h] = p; return; }
+    case byU32: { U32* hashTable = (U32*) tableBase; hashTable[h] = (U32)(p-srcBase); return; }
+    case byU16: { U16* hashTable = (U16*) tableBase; hashTable[h] = (U16)(p-srcBase); return; }
+    }
+}
+
+static void LZ4_putPosition(const BYTE* p, void* tableBase, tableType_t tableType, const BYTE* srcBase)
+{
+    U32 const h = LZ4_hashPosition(p, tableType);
+    LZ4_putPositionOnHash(p, h, tableBase, tableType, srcBase);
+}
+
+static const BYTE* LZ4_getPositionOnHash(U32 h, void* tableBase, tableType_t tableType, const BYTE* srcBase)
+{
+    if (tableType == byPtr) { const BYTE** hashTable = (const BYTE**) tableBase; return hashTable[h]; }
+    if (tableType == byU32) { const U32* const hashTable = (U32*) tableBase; return hashTable[h] + srcBase; }
+    { const U16* const hashTable = (U16*) tableBase; return hashTable[h] + srcBase; }   /* default, to ensure a return */
+}
+
+static const BYTE* LZ4_getPosition(const BYTE* p, void* tableBase, tableType_t tableType, const BYTE* srcBase)
+{
+    U32 const h = LZ4_hashPosition(p, tableType);
+    return LZ4_getPositionOnHash(h, tableBase, tableType, srcBase);
+}
+
+
+/** LZ4_compress_generic() :
+    inlined, to ensure branches are decided at compilation time */
+static inline int LZ4_compress_generic(
+                 void* const ctx,
+                 const char* const source,
+                 char* const dest,
+                 const int inputSize,
+                 const int maxOutputSize,
+                 const limitedOutput_directive outputLimited,
+                 const tableType_t tableType,
+                 const dict_directive dict,
+                 const dictIssue_directive dictIssue,
+                 const U32 acceleration)
 {
-	HTYPE *hashtable = (HTYPE *)ctx;
-	const u8 *ip = (u8 *)source;
-#if LZ4_ARCH64
-	const BYTE * const base = ip;
-#else
-	const int base = 0;
-#endif
-	const u8 *anchor = ip;
-	const u8 *const iend = ip + isize;
-	const u8 *const mflimit = iend - MFLIMIT;
-	#define MATCHLIMIT (iend - LASTLITERALS)
-
-	u8 *op = (u8 *) dest;
-	u8 *const oend = op + maxoutputsize;
-	int length;
-	const int skipstrength = SKIPSTRENGTH;
-	u32 forwardh;
-	int lastrun;
-
-	/* Init */
-	if (isize < MINLENGTH)
-		goto _last_literals;
-
-	memset((void *)hashtable, 0, LZ4_MEM_COMPRESS);
-
-	/* First Byte */
-	hashtable[LZ4_HASH_VALUE(ip)] = ip - base;
-	ip++;
-	forwardh = LZ4_HASH_VALUE(ip);
-
-	/* Main Loop */
-	for (;;) {
-		int findmatchattempts = (1U << skipstrength) + 3;
-		const u8 *forwardip = ip;
-		const u8 *ref;
-		u8 *token;
-
-		/* Find a match */
-		do {
-			u32 h = forwardh;
-			int step = findmatchattempts++ >> skipstrength;
-			ip = forwardip;
-			forwardip = ip + step;
-
-			if (unlikely(forwardip > mflimit))
-				goto _last_literals;
-
-			forwardh = LZ4_HASH_VALUE(forwardip);
-			ref = base + hashtable[h];
-			hashtable[h] = ip - base;
-		} while ((ref < ip - MAX_DISTANCE) || (A32(ref) != A32(ip)));
-
-		/* Catch up */
-		while ((ip > anchor) && (ref > (u8 *)source) &&
-			unlikely(ip[-1] == ref[-1])) {
-			ip--;
-			ref--;
-		}
-
-		/* Encode Literal length */
-		length = (int)(ip - anchor);
-		token = op++;
-		/* check output limit */
-		if (unlikely(op + length + (2 + 1 + LASTLITERALS) +
-			(length >> 8) > oend))
-			return 0;
-
-		if (length >= (int)RUN_MASK) {
-			int len;
-			*token = (RUN_MASK << ML_BITS);
-			len = length - RUN_MASK;
-			for (; len > 254 ; len -= 255)
-				*op++ = 255;
-			*op++ = (u8)len;
-		} else
-			*token = (length << ML_BITS);
-
-		/* Copy Literals */
-		LZ4_BLINDCOPY(anchor, op, length);
+    LZ4_stream_t_internal* const dictPtr = (LZ4_stream_t_internal*)ctx;
+
+    const BYTE* ip = (const BYTE*) source;
+    const BYTE* base;
+    const BYTE* lowLimit;
+    const BYTE* const lowRefLimit = ip - dictPtr->dictSize;
+    const BYTE* const dictionary = dictPtr->dictionary;
+    const BYTE* const dictEnd = dictionary + dictPtr->dictSize;
+    const size_t dictDelta = dictEnd - (const BYTE*)source;
+    const BYTE* anchor = (const BYTE*) source;
+    const BYTE* const iend = ip + inputSize;
+    const BYTE* const mflimit = iend - MFLIMIT;
+    const BYTE* const matchlimit = iend - LASTLITERALS;
+
+    BYTE* op = (BYTE*) dest;
+    BYTE* const olimit = op + maxOutputSize;
+
+    U32 forwardH;
+    size_t refDelta=0;
+
+    /* Init conditions */
+    if ((U32)inputSize > (U32)LZ4_MAX_INPUT_SIZE) return 0;   /* Unsupported inputSize, too large (or negative) */
+    switch(dict)
+    {
+    case noDict:
+    default:
+        base = (const BYTE*)source;
+        lowLimit = (const BYTE*)source;
+        break;
+    case withPrefix64k:
+        base = (const BYTE*)source - dictPtr->currentOffset;
+        lowLimit = (const BYTE*)source - dictPtr->dictSize;
+        break;
+    case usingExtDict:
+        base = (const BYTE*)source - dictPtr->currentOffset;
+        lowLimit = (const BYTE*)source;
+        break;
+    }
+    if ((tableType == byU16) && (inputSize>=LZ4_64Klimit)) return 0;   /* Size too large (not within 64K limit) */
+    if (inputSize<LZ4_minLength) goto _last_literals;                  /* Input too small, no compression (all literals) */
+
+    /* First Byte */
+    LZ4_putPosition(ip, ctx, tableType, base);
+    ip++; forwardH = LZ4_hashPosition(ip, tableType);
+
+    /* Main Loop */
+    for ( ; ; ) {
+        const BYTE* match;
+        BYTE* token;
+
+        /* Find a match */
+        {   const BYTE* forwardIp = ip;
+            unsigned step = 1;
+            unsigned searchMatchNb = acceleration << LZ4_skipTrigger;
+            do {
+                U32 const h = forwardH;
+                ip = forwardIp;
+                forwardIp += step;
+                step = (searchMatchNb++ >> LZ4_skipTrigger);
+
+                if (unlikely(forwardIp > mflimit)) goto _last_literals;
+
+                match = LZ4_getPositionOnHash(h, ctx, tableType, base);
+                if (dict==usingExtDict) {
+                    if (match < (const BYTE*)source) {
+                        refDelta = dictDelta;
+                        lowLimit = dictionary;
+                    } else {
+                        refDelta = 0;
+                        lowLimit = (const BYTE*)source;
+                }   }
+                forwardH = LZ4_hashPosition(forwardIp, tableType);
+                LZ4_putPositionOnHash(ip, h, ctx, tableType, base);
+
+            } while ( ((dictIssue==dictSmall) ? (match < lowRefLimit) : 0)
+                || ((tableType==byU16) ? 0 : (match + MAX_DISTANCE < ip))
+                || (LZ4_read32(match+refDelta) != LZ4_read32(ip)) );
+        }
+
+        /* Catch up */
+        while (((ip>anchor) & (match+refDelta > lowLimit)) && (unlikely(ip[-1]==match[refDelta-1]))) { ip--; match--; }
+
+        /* Encode Literals */
+        {   unsigned const litLength = (unsigned)(ip - anchor);
+            token = op++;
+            if ((outputLimited) &&  /* Check output buffer overflow */
+                (unlikely(op + litLength + (2 + 1 + LASTLITERALS) + (litLength/255) > olimit)))
+                return 0;
+            if (litLength >= RUN_MASK) {
+                int len = (int)litLength-RUN_MASK;
+                *token = (RUN_MASK<<ML_BITS);
+                for(; len >= 255 ; len-=255) *op++ = 255;
+                *op++ = (BYTE)len;
+            }
+            else *token = (BYTE)(litLength<<ML_BITS);
+
+            /* Copy Literals */
+            LZ4_wildCopy(op, anchor, op+litLength);
+            op+=litLength;
+        }
+
 _next_match:
-		/* Encode Offset */
-		LZ4_WRITE_LITTLEENDIAN_16(op, (u16)(ip - ref));
-
-		/* Start Counting */
-		ip += MINMATCH;
-		/* MinMatch verified */
-		ref += MINMATCH;
-		anchor = ip;
-		while (likely(ip < MATCHLIMIT - (STEPSIZE - 1))) {
-			#if LZ4_ARCH64
-			u64 diff = A64(ref) ^ A64(ip);
-			#else
-			u32 diff = A32(ref) ^ A32(ip);
-			#endif
-			if (!diff) {
-				ip += STEPSIZE;
-				ref += STEPSIZE;
-				continue;
-			}
-			ip += LZ4_NBCOMMONBYTES(diff);
-			goto _endcount;
-		}
-		#if LZ4_ARCH64
-		if ((ip < (MATCHLIMIT - 3)) && (A32(ref) == A32(ip))) {
-			ip += 4;
-			ref += 4;
-		}
-		#endif
-		if ((ip < (MATCHLIMIT - 1)) && (A16(ref) == A16(ip))) {
-			ip += 2;
-			ref += 2;
-		}
-		if ((ip < MATCHLIMIT) && (*ref == *ip))
-			ip++;
-_endcount:
-		/* Encode MatchLength */
-		length = (int)(ip - anchor);
-		/* Check output limit */
-		if (unlikely(op + (1 + LASTLITERALS) + (length >> 8) > oend))
-			return 0;
-		if (length >= (int)ML_MASK) {
-			*token += ML_MASK;
-			length -= ML_MASK;
-			for (; length > 509 ; length -= 510) {
-				*op++ = 255;
-				*op++ = 255;
-			}
-			if (length > 254) {
-				length -= 255;
-				*op++ = 255;
-			}
-			*op++ = (u8)length;
-		} else
-			*token += length;
-
-		/* Test end of chunk */
-		if (ip > mflimit) {
-			anchor = ip;
-			break;
-		}
-
-		/* Fill table */
-		hashtable[LZ4_HASH_VALUE(ip-2)] = ip - 2 - base;
-
-		/* Test next position */
-		ref = base + hashtable[LZ4_HASH_VALUE(ip)];
-		hashtable[LZ4_HASH_VALUE(ip)] = ip - base;
-		if ((ref > ip - (MAX_DISTANCE + 1)) && (A32(ref) == A32(ip))) {
-			token = op++;
-			*token = 0;
-			goto _next_match;
-		}
-
-		/* Prepare next loop */
-		anchor = ip++;
-		forwardh = LZ4_HASH_VALUE(ip);
-	}
+        /* Encode Offset */
+        LZ4_writeLE16(op, (U16)(ip-match)); op+=2;
+
+        /* Encode MatchLength */
+        {   unsigned matchCode;
+
+            if ((dict==usingExtDict) && (lowLimit==dictionary)) {
+                const BYTE* limit;
+                match += refDelta;
+                limit = ip + (dictEnd-match);
+                if (limit > matchlimit) limit = matchlimit;
+                matchCode = LZ4_count(ip+MINMATCH, match+MINMATCH, limit);
+                ip += MINMATCH + matchCode;
+                if (ip==limit) {
+                    unsigned const more = LZ4_count(ip, (const BYTE*)source, matchlimit);
+                    matchCode += more;
+                    ip += more;
+                }
+            } else {
+                matchCode = LZ4_count(ip+MINMATCH, match+MINMATCH, matchlimit);
+                ip += MINMATCH + matchCode;
+            }
+
+            if ( outputLimited &&    /* Check output buffer overflow */
+                (unlikely(op + (1 + LASTLITERALS) + (matchCode>>8) > olimit)) )
+                return 0;
+            if (matchCode >= ML_MASK) {
+                *token += ML_MASK;
+                matchCode -= ML_MASK;
+                LZ4_write32(op, 0xFFFFFFFF);
+                while (matchCode >= 4*255) op+=4, LZ4_write32(op, 0xFFFFFFFF), matchCode -= 4*255;
+                op += matchCode / 255;
+                *op++ = (BYTE)(matchCode % 255);
+            } else
+                *token += (BYTE)(matchCode);
+        }
+
+        anchor = ip;
+
+        /* Test end of chunk */
+        if (ip > mflimit) break;
+
+        /* Fill table */
+        LZ4_putPosition(ip-2, ctx, tableType, base);
+
+        /* Test next position */
+        match = LZ4_getPosition(ip, ctx, tableType, base);
+        if (dict==usingExtDict) {
+            if (match < (const BYTE*)source) {
+                refDelta = dictDelta;
+                lowLimit = dictionary;
+            } else {
+                refDelta = 0;
+                lowLimit = (const BYTE*)source;
+        }   }
+        LZ4_putPosition(ip, ctx, tableType, base);
+        if ( ((dictIssue==dictSmall) ? (match>=lowRefLimit) : 1)
+            && (match+MAX_DISTANCE>=ip)
+            && (LZ4_read32(match+refDelta)==LZ4_read32(ip)) )
+        { token=op++; *token=0; goto _next_match; }
+
+        /* Prepare next loop */
+        forwardH = LZ4_hashPosition(++ip, tableType);
+    }
 
 _last_literals:
-	/* Encode Last Literals */
-	lastrun = (int)(iend - anchor);
-	if (((char *)op - dest) + lastrun + 1
-		+ ((lastrun + 255 - RUN_MASK) / 255) > (u32)maxoutputsize)
-		return 0;
-
-	if (lastrun >= (int)RUN_MASK) {
-		*op++ = (RUN_MASK << ML_BITS);
-		lastrun -= RUN_MASK;
-		for (; lastrun > 254 ; lastrun -= 255)
-			*op++ = 255;
-		*op++ = (u8)lastrun;
-	} else
-		*op++ = (lastrun << ML_BITS);
-	memcpy(op, anchor, iend - anchor);
-	op += iend - anchor;
-
-	/* End */
-	return (int)(((char *)op) - dest);
+    /* Encode Last Literals */
+    {   const size_t lastRun = (size_t)(iend - anchor);
+        if ( (outputLimited) &&  /* Check output buffer overflow */
+            ((op - (BYTE*)dest) + lastRun + 1 + ((lastRun+255-RUN_MASK)/255) > (U32)maxOutputSize) )
+            return 0;
+        if (lastRun >= RUN_MASK) {
+            size_t accumulator = lastRun - RUN_MASK;
+            *op++ = RUN_MASK << ML_BITS;
+            for(; accumulator >= 255 ; accumulator-=255) *op++ = 255;
+            *op++ = (BYTE) accumulator;
+        } else {
+            *op++ = (BYTE)(lastRun<<ML_BITS);
+        }
+        memcpy(op, anchor, lastRun);
+        op += lastRun;
+    }
+
+    /* End */
+    return (int) (((char*)op)-dest);
+}
+
+
+int LZ4_compress_fast_extState(void* state, const char* source, char* dest, int inputSize, int maxOutputSize, int acceleration)
+{
+    LZ4_resetStream((LZ4_stream_t*)state);
+    if (acceleration < 1) acceleration = ACCELERATION_DEFAULT;
+
+    if (maxOutputSize >= LZ4_compressBound(inputSize)) {
+        if (inputSize < LZ4_64Klimit)
+            return LZ4_compress_generic(state, source, dest, inputSize, 0, notLimited, byU16,                        noDict, noDictIssue, acceleration);
+        else
+            return LZ4_compress_generic(state, source, dest, inputSize, 0, notLimited, LZ4_64bits() ? byU32 : byPtr, noDict, noDictIssue, acceleration);
+    } else {
+        if (inputSize < LZ4_64Klimit)
+            return LZ4_compress_generic(state, source, dest, inputSize, maxOutputSize, limitedOutput, byU16,                        noDict, noDictIssue, acceleration);
+        else
+            return LZ4_compress_generic(state, source, dest, inputSize, maxOutputSize, limitedOutput, LZ4_64bits() ? byU32 : byPtr, noDict, noDictIssue, acceleration);
+    }
+}
+
+
+int LZ4_compress_fast(const char* source, char* dest, int inputSize, int maxOutputSize, void* wrkmem, int acceleration)
+{
+    int result = LZ4_compress_fast_extState(wrkmem, source, dest, inputSize, maxOutputSize, acceleration);
+
+    return result;
 }
 
-static inline int lz4_compress64kctx(void *ctx,
-		const char *source,
-		char *dest,
-		int isize,
-		int maxoutputsize)
+
+int LZ4_compress_default(const char* source, char* dest, int inputSize, int maxOutputSize, void* wrkmem)
 {
-	u16 *hashtable = (u16 *)ctx;
-	const u8 *ip = (u8 *) source;
-	const u8 *anchor = ip;
-	const u8 *const base = ip;
-	const u8 *const iend = ip + isize;
-	const u8 *const mflimit = iend - MFLIMIT;
-	#define MATCHLIMIT (iend - LASTLITERALS)
-
-	u8 *op = (u8 *) dest;
-	u8 *const oend = op + maxoutputsize;
-	int len, length;
-	const int skipstrength = SKIPSTRENGTH;
-	u32 forwardh;
-	int lastrun;
-
-	/* Init */
-	if (isize < MINLENGTH)
-		goto _last_literals;
-
-	memset((void *)hashtable, 0, LZ4_MEM_COMPRESS);
-
-	/* First Byte */
-	ip++;
-	forwardh = LZ4_HASH64K_VALUE(ip);
-
-	/* Main Loop */
-	for (;;) {
-		int findmatchattempts = (1U << skipstrength) + 3;
-		const u8 *forwardip = ip;
-		const u8 *ref;
-		u8 *token;
-
-		/* Find a match */
-		do {
-			u32 h = forwardh;
-			int step = findmatchattempts++ >> skipstrength;
-			ip = forwardip;
-			forwardip = ip + step;
-
-			if (forwardip > mflimit)
-				goto _last_literals;
-
-			forwardh = LZ4_HASH64K_VALUE(forwardip);
-			ref = base + hashtable[h];
-			hashtable[h] = (u16)(ip - base);
-		} while (A32(ref) != A32(ip));
-
-		/* Catch up */
-		while ((ip > anchor) && (ref > (u8 *)source)
-			&& (ip[-1] == ref[-1])) {
-			ip--;
-			ref--;
-		}
-
-		/* Encode Literal length */
-		length = (int)(ip - anchor);
-		token = op++;
-		/* Check output limit */
-		if (unlikely(op + length + (2 + 1 + LASTLITERALS)
-			+ (length >> 8) > oend))
-			return 0;
-		if (length >= (int)RUN_MASK) {
-			*token = (RUN_MASK << ML_BITS);
-			len = length - RUN_MASK;
-			for (; len > 254 ; len -= 255)
-				*op++ = 255;
-			*op++ = (u8)len;
-		} else
-			*token = (length << ML_BITS);
-
-		/* Copy Literals */
-		LZ4_BLINDCOPY(anchor, op, length);
+    return LZ4_compress_fast(source, dest, inputSize, maxOutputSize, wrkmem, 1);
+}
+
+/*-******************************
+*  *_destSize() variant
+********************************/
+
+static int LZ4_compress_destSize_generic(
+                       void* const ctx,
+                 const char* const src,
+                       char* const dst,
+                       int*  const srcSizePtr,
+                 const int targetDstSize,
+                 const tableType_t tableType)
+{
+    const BYTE* ip = (const BYTE*) src;
+    const BYTE* base = (const BYTE*) src;
+    const BYTE* lowLimit = (const BYTE*) src;
+    const BYTE* anchor = ip;
+    const BYTE* const iend = ip + *srcSizePtr;
+    const BYTE* const mflimit = iend - MFLIMIT;
+    const BYTE* const matchlimit = iend - LASTLITERALS;
+
+    BYTE* op = (BYTE*) dst;
+    BYTE* const oend = op + targetDstSize;
+    BYTE* const oMaxLit = op + targetDstSize - 2 /* offset */ - 8 /* because 8+MINMATCH==MFLIMIT */ - 1 /* token */;
+    BYTE* const oMaxMatch = op + targetDstSize - (LASTLITERALS + 1 /* token */);
+    BYTE* const oMaxSeq = oMaxLit - 1 /* token */;
+
+    U32 forwardH;
+
+
+    /* Init conditions */
+    if (targetDstSize < 1) return 0;                                     /* Impossible to store anything */
+    if ((U32)*srcSizePtr > (U32)LZ4_MAX_INPUT_SIZE) return 0;            /* Unsupported input size, too large (or negative) */
+    if ((tableType == byU16) && (*srcSizePtr>=LZ4_64Klimit)) return 0;   /* Size too large (not within 64K limit) */
+    if (*srcSizePtr<LZ4_minLength) goto _last_literals;                  /* Input too small, no compression (all literals) */
+
+    /* First Byte */
+    *srcSizePtr = 0;
+    LZ4_putPosition(ip, ctx, tableType, base);
+    ip++; forwardH = LZ4_hashPosition(ip, tableType);
+
+    /* Main Loop */
+    for ( ; ; ) {
+        const BYTE* match;
+        BYTE* token;
+
+        /* Find a match */
+        {   const BYTE* forwardIp = ip;
+            unsigned step = 1;
+            unsigned searchMatchNb = 1 << LZ4_skipTrigger;
+
+            do {
+                U32 h = forwardH;
+                ip = forwardIp;
+                forwardIp += step;
+                step = (searchMatchNb++ >> LZ4_skipTrigger);
+
+                if (unlikely(forwardIp > mflimit)) goto _last_literals;
+
+                match = LZ4_getPositionOnHash(h, ctx, tableType, base);
+                forwardH = LZ4_hashPosition(forwardIp, tableType);
+                LZ4_putPositionOnHash(ip, h, ctx, tableType, base);
+
+            } while ( ((tableType==byU16) ? 0 : (match + MAX_DISTANCE < ip))
+                || (LZ4_read32(match) != LZ4_read32(ip)) );
+        }
+
+        /* Catch up */
+        while ((ip>anchor) && (match > lowLimit) && (unlikely(ip[-1]==match[-1]))) { ip--; match--; }
+
+        /* Encode Literal length */
+        {   unsigned litLength = (unsigned)(ip - anchor);
+            token = op++;
+            if (op + ((litLength+240)/255) + litLength > oMaxLit) {
+                /* Not enough space for a last match */
+                op--;
+                goto _last_literals;
+            }
+            if (litLength>=RUN_MASK) {
+                unsigned len = litLength - RUN_MASK;
+                *token=(RUN_MASK<<ML_BITS);
+                for(; len >= 255 ; len-=255) *op++ = 255;
+                *op++ = (BYTE)len;
+            }
+            else *token = (BYTE)(litLength<<ML_BITS);
+
+            /* Copy Literals */
+            LZ4_wildCopy(op, anchor, op+litLength);
+            op += litLength;
+        }
 
 _next_match:
-		/* Encode Offset */
-		LZ4_WRITE_LITTLEENDIAN_16(op, (u16)(ip - ref));
-
-		/* Start Counting */
-		ip += MINMATCH;
-		/* MinMatch verified */
-		ref += MINMATCH;
-		anchor = ip;
-
-		while (ip < MATCHLIMIT - (STEPSIZE - 1)) {
-			#if LZ4_ARCH64
-			u64 diff = A64(ref) ^ A64(ip);
-			#else
-			u32 diff = A32(ref) ^ A32(ip);
-			#endif
-
-			if (!diff) {
-				ip += STEPSIZE;
-				ref += STEPSIZE;
-				continue;
-			}
-			ip += LZ4_NBCOMMONBYTES(diff);
-			goto _endcount;
-		}
-		#if LZ4_ARCH64
-		if ((ip < (MATCHLIMIT - 3)) && (A32(ref) == A32(ip))) {
-			ip += 4;
-			ref += 4;
-		}
-		#endif
-		if ((ip < (MATCHLIMIT - 1)) && (A16(ref) == A16(ip))) {
-			ip += 2;
-			ref += 2;
-		}
-		if ((ip < MATCHLIMIT) && (*ref == *ip))
-			ip++;
-_endcount:
-
-		/* Encode MatchLength */
-		len = (int)(ip - anchor);
-		/* Check output limit */
-		if (unlikely(op + (1 + LASTLITERALS) + (len >> 8) > oend))
-			return 0;
-		if (len >= (int)ML_MASK) {
-			*token += ML_MASK;
-			len -= ML_MASK;
-			for (; len > 509 ; len -= 510) {
-				*op++ = 255;
-				*op++ = 255;
-			}
-			if (len > 254) {
-				len -= 255;
-				*op++ = 255;
-			}
-			*op++ = (u8)len;
-		} else
-			*token += len;
-
-		/* Test end of chunk */
-		if (ip > mflimit) {
-			anchor = ip;
-			break;
-		}
-
-		/* Fill table */
-		hashtable[LZ4_HASH64K_VALUE(ip-2)] = (u16)(ip - 2 - base);
-
-		/* Test next position */
-		ref = base + hashtable[LZ4_HASH64K_VALUE(ip)];
-		hashtable[LZ4_HASH64K_VALUE(ip)] = (u16)(ip - base);
-		if (A32(ref) == A32(ip)) {
-			token = op++;
-			*token = 0;
-			goto _next_match;
-		}
-
-		/* Prepare next loop */
-		anchor = ip++;
-		forwardh = LZ4_HASH64K_VALUE(ip);
-	}
+        /* Encode Offset */
+        LZ4_writeLE16(op, (U16)(ip-match)); op+=2;
+
+        /* Encode MatchLength */
+        {   size_t matchLength = LZ4_count(ip+MINMATCH, match+MINMATCH, matchlimit);
+
+            if (op + ((matchLength+240)/255) > oMaxMatch) {
+                /* Match description too long : reduce it */
+                matchLength = (15-1) + (oMaxMatch-op) * 255;
+            }
+            ip += MINMATCH + matchLength;
+
+            if (matchLength>=ML_MASK) {
+                *token += ML_MASK;
+                matchLength -= ML_MASK;
+                while (matchLength >= 255) { matchLength-=255; *op++ = 255; }
+                *op++ = (BYTE)matchLength;
+            }
+            else *token += (BYTE)(matchLength);
+        }
+
+        anchor = ip;
+
+        /* Test end of block */
+        if (ip > mflimit) break;
+        if (op > oMaxSeq) break;
+
+        /* Fill table */
+        LZ4_putPosition(ip-2, ctx, tableType, base);
+
+        /* Test next position */
+        match = LZ4_getPosition(ip, ctx, tableType, base);
+        LZ4_putPosition(ip, ctx, tableType, base);
+        if ( (match+MAX_DISTANCE>=ip)
+            && (LZ4_read32(match)==LZ4_read32(ip)) )
+        { token=op++; *token=0; goto _next_match; }
+
+        /* Prepare next loop */
+        forwardH = LZ4_hashPosition(++ip, tableType);
+    }
 
 _last_literals:
-	/* Encode Last Literals */
-	lastrun = (int)(iend - anchor);
-	if (op + lastrun + 1 + (lastrun - RUN_MASK + 255) / 255 > oend)
-		return 0;
-	if (lastrun >= (int)RUN_MASK) {
-		*op++ = (RUN_MASK << ML_BITS);
-		lastrun -= RUN_MASK;
-		for (; lastrun > 254 ; lastrun -= 255)
-			*op++ = 255;
-		*op++ = (u8)lastrun;
-	} else
-		*op++ = (lastrun << ML_BITS);
-	memcpy(op, anchor, iend - anchor);
-	op += iend - anchor;
-	/* End */
-	return (int)(((char *)op) - dest);
+    /* Encode Last Literals */
+    {   size_t lastRunSize = (size_t)(iend - anchor);
+        if (op + 1 /* token */ + ((lastRunSize+240)/255) /* litLength */ + lastRunSize /* literals */ > oend) {
+            /* adapt lastRunSize to fill 'dst' */
+            lastRunSize  = (oend-op) - 1;
+            lastRunSize -= (lastRunSize+240)/255;
+        }
+        ip = anchor + lastRunSize;
+
+        if (lastRunSize >= RUN_MASK) {
+            size_t accumulator = lastRunSize - RUN_MASK;
+            *op++ = RUN_MASK << ML_BITS;
+            for(; accumulator >= 255 ; accumulator-=255) *op++ = 255;
+            *op++ = (BYTE) accumulator;
+        } else {
+            *op++ = (BYTE)(lastRunSize<<ML_BITS);
+        }
+        memcpy(op, anchor, lastRunSize);
+        op += lastRunSize;
+    }
+
+    /* End */
+    *srcSizePtr = (int) (((const char*)ip)-src);
+    return (int) (((char*)op)-dst);
 }
 
-int lz4_compress(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len, void *wrkmem)
+
+static int LZ4_compress_destSize_extState (void* state, const char* src, char* dst, int* srcSizePtr, int targetDstSize)
 {
-	int ret = -1;
-	int out_len = 0;
+    LZ4_resetStream((LZ4_stream_t*)state);
+
+    if (targetDstSize >= LZ4_compressBound(*srcSizePtr)) {  /* compression success is guaranteed */
+        return LZ4_compress_fast_extState(state, src, dst, *srcSizePtr, targetDstSize, 1);
+    } else {
+        if (*srcSizePtr < LZ4_64Klimit)
+            return LZ4_compress_destSize_generic(state, src, dst, srcSizePtr, targetDstSize, byU16);
+        else
+            return LZ4_compress_destSize_generic(state, src, dst, srcSizePtr, targetDstSize, LZ4_64bits() ? byU32 : byPtr);
+    }
+}
 
-	if (src_len < LZ4_64KLIMIT)
-		out_len = lz4_compress64kctx(wrkmem, src, dst, src_len,
-				lz4_compressbound(src_len));
-	else
-		out_len = lz4_compressctx(wrkmem, src, dst, src_len,
-				lz4_compressbound(src_len));
 
-	if (out_len < 0)
-		goto exit;
+int LZ4_compress_destSize(const char* src, char* dst, int* srcSizePtr, int targetDstSize, void* wrkmem)
+{
+    return LZ4_compress_destSize_extState(wrkmem, src, dst, srcSizePtr, targetDstSize);
+}
 
-	*dst_len = out_len;
+static void LZ4_renormDictT(LZ4_stream_t_internal* LZ4_dict, const BYTE* src)
+{
+    if ((LZ4_dict->currentOffset > 0x80000000) ||
+        ((size_t)LZ4_dict->currentOffset > (size_t)src)) {   /* address space overflow */
+        /* rescale hash table */
+        U32 const delta = LZ4_dict->currentOffset - 64 KB;
+        const BYTE* dictEnd = LZ4_dict->dictionary + LZ4_dict->dictSize;
+        int i;
+        for (i=0; i<LZ4_HASH_SIZE_U32; i++) {
+            if (LZ4_dict->hashTable[i] < delta) LZ4_dict->hashTable[i]=0;
+            else LZ4_dict->hashTable[i] -= delta;
+        }
+        LZ4_dict->currentOffset = 64 KB;
+        if (LZ4_dict->dictSize > 64 KB) LZ4_dict->dictSize = 64 KB;
+        LZ4_dict->dictionary = dictEnd - LZ4_dict->dictSize;
+    }
+}
 
-	return 0;
-exit:
-	return ret;
+int LZ4_compress_fast_continue (LZ4_stream_t* LZ4_stream, const char* source, char* dest, int inputSize, int maxOutputSize, int acceleration)
+{
+    LZ4_stream_t_internal* streamPtr = (LZ4_stream_t_internal*)LZ4_stream;
+    const BYTE* const dictEnd = streamPtr->dictionary + streamPtr->dictSize;
+
+    const BYTE* smallest = (const BYTE*) source;
+    if (streamPtr->initCheck) return 0;   /* Uninitialized structure detected */
+    if ((streamPtr->dictSize>0) && (smallest>dictEnd)) smallest = dictEnd;
+    LZ4_renormDictT(streamPtr, smallest);
+    if (acceleration < 1) acceleration = ACCELERATION_DEFAULT;
+
+    /* Check overlapping input/dictionary space */
+    {   const BYTE* sourceEnd = (const BYTE*) source + inputSize;
+        if ((sourceEnd > streamPtr->dictionary) && (sourceEnd < dictEnd)) {
+            streamPtr->dictSize = (U32)(dictEnd - sourceEnd);
+            if (streamPtr->dictSize > 64 KB) streamPtr->dictSize = 64 KB;
+            if (streamPtr->dictSize < 4) streamPtr->dictSize = 0;
+            streamPtr->dictionary = dictEnd - streamPtr->dictSize;
+        }
+    }
+
+    /* prefix mode : source data follows dictionary */
+    if (dictEnd == (const BYTE*)source) {
+        int result;
+        if ((streamPtr->dictSize < 64 KB) && (streamPtr->dictSize < streamPtr->currentOffset))
+            result = LZ4_compress_generic(LZ4_stream, source, dest, inputSize, maxOutputSize, limitedOutput, byU32, withPrefix64k, dictSmall, acceleration);
+        else
+            result = LZ4_compress_generic(LZ4_stream, source, dest, inputSize, maxOutputSize, limitedOutput, byU32, withPrefix64k, noDictIssue, acceleration);
+        streamPtr->dictSize += (U32)inputSize;
+        streamPtr->currentOffset += (U32)inputSize;
+        return result;
+    }
+
+    /* external dictionary mode */
+    {   int result;
+        if ((streamPtr->dictSize < 64 KB) && (streamPtr->dictSize < streamPtr->currentOffset))
+            result = LZ4_compress_generic(LZ4_stream, source, dest, inputSize, maxOutputSize, limitedOutput, byU32, usingExtDict, dictSmall, acceleration);
+        else
+            result = LZ4_compress_generic(LZ4_stream, source, dest, inputSize, maxOutputSize, limitedOutput, byU32, usingExtDict, noDictIssue, acceleration);
+        streamPtr->dictionary = (const BYTE*)source;
+        streamPtr->dictSize = (U32)inputSize;
+        streamPtr->currentOffset += (U32)inputSize;
+        return result;
+    }
+}
+
+/*~
+ * for backwards compatibility
+*/
+int lz4_compress(const unsigned char *src, size_t src_len, unsigned char *dst, size_t *dst_len, void *wrkmem) {
+    *dst_len = LZ4_compress_default(src, dst, (int)src_len, (int)((size_t)dst_len), wrkmem);
+
+    return (int)((size_t)dst_len);
+}
+
+/* Hidden debug function, to force external dictionary mode */
+int LZ4_compress_forceExtDict (LZ4_stream_t* LZ4_dict, const char* source, char* dest, int inputSize)
+{
+    LZ4_stream_t_internal* streamPtr = (LZ4_stream_t_internal*)LZ4_dict;
+    int result;
+    const BYTE* const dictEnd = streamPtr->dictionary + streamPtr->dictSize;
+
+    const BYTE* smallest = dictEnd;
+    if (smallest > (const BYTE*) source) smallest = (const BYTE*) source;
+    LZ4_renormDictT((LZ4_stream_t_internal*)LZ4_dict, smallest);
+
+    result = LZ4_compress_generic(LZ4_dict, source, dest, inputSize, 0, notLimited, byU32, usingExtDict, noDictIssue, 1);
+
+    streamPtr->dictionary = (const BYTE*)source;
+    streamPtr->dictSize = (U32)inputSize;
+    streamPtr->currentOffset += (U32)inputSize;
+
+    return result;
+}
+
+
+int LZ4_saveDict (LZ4_stream_t* LZ4_dict, char* safeBuffer, int dictSize)
+{
+    LZ4_stream_t_internal* dict = (LZ4_stream_t_internal*) LZ4_dict;
+    const BYTE* previousDictEnd = dict->dictionary + dict->dictSize;
+
+    if ((U32)dictSize > 64 KB) dictSize = 64 KB;   /* useless to define a dictionary > 64 KB */
+    if ((U32)dictSize > dict->dictSize) dictSize = dict->dictSize;
+
+    memmove(safeBuffer, previousDictEnd - dictSize, dictSize);
+
+    dict->dictionary = (const BYTE*)safeBuffer;
+    dict->dictSize = (U32)dictSize;
+
+    return dictSize;
 }
-EXPORT_SYMBOL(lz4_compress);
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_DESCRIPTION("LZ4 compressor");
+
+/* Kernel exports */
+EXPORT_SYMBOL(LZ4_compress_default);
+EXPORT_SYMBOL(LZ4_compress_fast);
+EXPORT_SYMBOL(LZ4_compress_destSize);
+EXPORT_SYMBOL(lz4_compress);
diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index 6d940c7..e0fdb9cb 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -1,343 +1,330 @@
 /*
- * LZ4 Decompressor for Linux kernel
- *
- * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
- *
- * Based on LZ4 implementation by Yann Collet.
- *
- * LZ4 - Fast LZ compression algorithm
- * Copyright (C) 2011-2012, Yann Collet.
- * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met:
- *
- *     * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following disclaimer
- * in the documentation and/or other materials provided with the
- * distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- *  You can contact the author at :
- *  - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
- *  - LZ4 source repository : http://code.google.com/p/lz4/
- */
-
-#ifndef STATIC
+   LZ4 - Fast LZ compression algorithm
+   Copyright (C) 2011-2016, Yann Collet.
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+   You can contact the author at :
+    - LZ4 homepage : http://www.lz4.org
+    - LZ4 source repository : https://github.com/lz4/lz4
+
+    Changed for kernel use by:
+    Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
+*/
+
+/*-************************************
+*  Includes
+**************************************/
+#include "lz4defs.h"
+#include <linux/lz4.h>
+#include <linux/init.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
-#endif
-#include <linux/lz4.h>
-
 #include <asm/unaligned.h>
 
-#include "lz4defs.h"
-
-static const int dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0};
-#if LZ4_ARCH64
-static const int dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
-#endif
-
-static int lz4_uncompress(const char *source, char *dest, int osize)
+/*-*****************************
+*  Decompression functions
+*******************************/
+/*! LZ4_decompress_generic() :
+ *  This generic decompression function cover all use cases.
+ *  It shall be instantiated several times, using different sets of directives
+ *  Note that it is important this generic function is really inlined,
+ *  in order to remove useless branches during compilation optimization.
+ */
+static inline int LZ4_decompress_generic(
+                 const char* const source,
+                 char* const dest,
+                 int inputSize,
+                 int outputSize,         /* If endOnInput==endOnInputSize, this value is the max size of Output Buffer. */
+
+                 int endOnInput,         /* endOnOutputSize, endOnInputSize */
+                 int partialDecoding,    /* full, partial */
+                 int targetOutputSize,   /* only used if partialDecoding==partial */
+                 int dict,               /* noDict, withPrefix64k, usingExtDict */
+                 const BYTE* const lowPrefix,  /* == dest when no prefix */
+                 const BYTE* const dictStart,  /* only if dict==usingExtDict */
+                 const size_t dictSize         /* note : = 0 if noDict */
+                 )
 {
-	const BYTE *ip = (const BYTE *) source;
-	const BYTE *ref;
-	BYTE *op = (BYTE *) dest;
-	BYTE * const oend = op + osize;
-	BYTE *cpy;
-	unsigned token;
-	size_t length;
-
-	while (1) {
-
-		/* get runlength */
-		token = *ip++;
-		length = (token >> ML_BITS);
-		if (length == RUN_MASK) {
-			size_t len;
-
-			len = *ip++;
-			for (; len == 255; length += 255)
-				len = *ip++;
-			if (unlikely(length > (size_t)(length + len)))
-				goto _output_error;
-			length += len;
-		}
-
-		/* copy literals */
-		cpy = op + length;
-		if (unlikely(cpy > oend - COPYLENGTH)) {
-			/*
-			 * Error: not enough place for another match
-			 * (min 4) + 5 literals
-			 */
-			if (cpy != oend)
-				goto _output_error;
-
-			memcpy(op, ip, length);
-			ip += length;
-			break; /* EOF */
-		}
-		LZ4_WILDCOPY(ip, op, cpy);
-		ip -= (op - cpy);
-		op = cpy;
-
-		/* get offset */
-		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
-		ip += 2;
-
-		/* Error: offset create reference outside destination buffer */
-		if (unlikely(ref < (BYTE *const) dest))
-			goto _output_error;
-
-		/* get matchlength */
-		length = token & ML_MASK;
-		if (length == ML_MASK) {
-			for (; *ip == 255; length += 255)
-				ip++;
-			if (unlikely(length > (size_t)(length + *ip)))
-				goto _output_error;
-			length += *ip++;
-		}
-
-		/* copy repeated sequence */
-		if (unlikely((op - ref) < STEPSIZE)) {
-#if LZ4_ARCH64
-			int dec64 = dec64table[op - ref];
-#else
-			const int dec64 = 0;
-#endif
-			op[0] = ref[0];
-			op[1] = ref[1];
-			op[2] = ref[2];
-			op[3] = ref[3];
-			op += 4;
-			ref += 4;
-			ref -= dec32table[op-ref];
-			PUT4(ref, op);
-			op += STEPSIZE - 4;
-			ref -= dec64;
-		} else {
-			LZ4_COPYSTEP(ref, op);
-		}
-		cpy = op + length - (STEPSIZE - 4);
-		if (cpy > (oend - COPYLENGTH)) {
-
-			/* Error: request to write beyond destination buffer */
-			if (cpy > oend)
-				goto _output_error;
-#if LZ4_ARCH64
-			if ((ref + COPYLENGTH) > oend)
-#else
-			if ((ref + COPYLENGTH) > oend ||
-					(op + COPYLENGTH) > oend)
-#endif
-				goto _output_error;
-			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
-			while (op < cpy)
-				*op++ = *ref++;
-			op = cpy;
-			/*
-			 * Check EOF (should never happen, since last 5 bytes
-			 * are supposed to be literals)
-			 */
-			if (op == oend)
-				goto _output_error;
-			continue;
-		}
-		LZ4_SECURECOPY(ref, op, cpy);
-		op = cpy; /* correction */
-	}
-	/* end of decoding */
-	return (int) (((char *)ip) - source);
-
-	/* write overflow error detected */
+    /* Local Variables */
+    const BYTE* ip = (const BYTE*) source;
+    const BYTE* const iend = ip + inputSize;
+
+    BYTE* op = (BYTE*) dest;
+    BYTE* const oend = op + outputSize;
+    BYTE* cpy;
+    BYTE* oexit = op + targetOutputSize;
+    const BYTE* const lowLimit = lowPrefix - dictSize;
+
+    const BYTE* const dictEnd = (const BYTE*)dictStart + dictSize;
+    const unsigned dec32table[] = {4, 1, 2, 1, 4, 4, 4, 4};
+    const int dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3};
+
+    const int safeDecode = (endOnInput==endOnInputSize);
+    const int checkOffset = ((safeDecode) && (dictSize < (int)(64 KB)));
+
+
+    /* Special cases */
+    if ((partialDecoding) && (oexit > oend-MFLIMIT)) oexit = oend-MFLIMIT;                        /* targetOutputSize too high => decode everything */
+    if ((endOnInput) && (unlikely(outputSize==0))) return ((inputSize==1) && (*ip==0)) ? 0 : -1;  /* Empty output buffer */
+    if ((!endOnInput) && (unlikely(outputSize==0))) return (*ip==0?1:-1);
+
+    /* Main Loop : decode sequences */
+    while (1) {
+        unsigned token;
+        size_t length;
+        const BYTE* match;
+        size_t offset;
+
+        /* get literal length */
+        token = *ip++;
+        if ((length=(token>>ML_BITS)) == RUN_MASK) {
+            unsigned s;
+            do {
+                s = *ip++;
+                length += s;
+            } while ( likely(endOnInput ? ip<iend-RUN_MASK : 1) & (s==255) );
+            if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)(op))) goto _output_error;   /* overflow detection */
+            if ((safeDecode) && unlikely((size_t)(ip+length)<(size_t)(ip))) goto _output_error;   /* overflow detection */
+        }
+
+        /* copy literals */
+        cpy = op+length;
+        if ( ((endOnInput) && ((cpy>(partialDecoding?oexit:oend-MFLIMIT)) || (ip+length>iend-(2+1+LASTLITERALS))) )
+            || ((!endOnInput) && (cpy>oend-WILDCOPYLENGTH)) )
+        {
+            if (partialDecoding) {
+                if (cpy > oend) goto _output_error;                           /* Error : write attempt beyond end of output buffer */
+                if ((endOnInput) && (ip+length > iend)) goto _output_error;   /* Error : read attempt beyond end of input buffer */
+            } else {
+                if ((!endOnInput) && (cpy != oend)) goto _output_error;       /* Error : block decoding must stop exactly there */
+                if ((endOnInput) && ((ip+length != iend) || (cpy > oend))) goto _output_error;   /* Error : input must be consumed */
+            }
+            memcpy(op, ip, length);
+            ip += length;
+            op += length;
+            break;     /* Necessarily EOF, due to parsing restrictions */
+        }
+        LZ4_wildCopy(op, ip, cpy);
+        ip += length; op = cpy;
+
+        /* get offset */
+        offset = LZ4_readLE16(ip); ip+=2;
+        match = op - offset;
+        if ((checkOffset) && (unlikely(match < lowLimit))) goto _output_error;   /* Error : offset outside buffers */
+
+        /* get matchlength */
+        length = token & ML_MASK;
+        if (length == ML_MASK) {
+            unsigned s;
+            do {
+                s = *ip++;
+                if ((endOnInput) && (ip > iend-LASTLITERALS)) goto _output_error;
+                length += s;
+            } while (s==255);
+            if ((safeDecode) && unlikely((size_t)(op+length)<(size_t)op)) goto _output_error;   /* overflow detection */
+        }
+        length += MINMATCH;
+
+        /* check external dictionary */
+        if ((dict==usingExtDict) && (match < lowPrefix)) {
+            if (unlikely(op+length > oend-LASTLITERALS)) goto _output_error;   /* doesn't respect parsing restriction */
+
+            if (length <= (size_t)(lowPrefix-match)) {
+                /* match can be copied as a single segment from external dictionary */
+                memmove(op, dictEnd - (lowPrefix-match), length);
+                op += length;
+            } else {
+                /* match encompass external dictionary and current block */
+                size_t const copySize = (size_t)(lowPrefix-match);
+                size_t const restSize = length - copySize;
+                memcpy(op, dictEnd - copySize, copySize);
+                op += copySize;
+                if (restSize > (size_t)(op-lowPrefix)) {  /* overlap copy */
+                    BYTE* const endOfMatch = op + restSize;
+                    const BYTE* copyFrom = lowPrefix;
+                    while (op < endOfMatch) *op++ = *copyFrom++;
+                } else {
+                    memcpy(op, lowPrefix, restSize);
+                    op += restSize;
+            }   }
+            continue;
+        }
+
+        /* copy match within block */
+        cpy = op + length;
+        if (unlikely(offset<8)) {
+            const int dec64 = dec64table[offset];
+            op[0] = match[0];
+            op[1] = match[1];
+            op[2] = match[2];
+            op[3] = match[3];
+            match += dec32table[offset];
+            memcpy(op+4, match, 4);
+            match -= dec64;
+        } else { LZ4_copy8(op, match); match+=8; }
+        op += 8;
+
+        if (unlikely(cpy>oend-12)) {
+            BYTE* const oCopyLimit = oend-(WILDCOPYLENGTH-1);
+            if (cpy > oend-LASTLITERALS) goto _output_error;    /* Error : last LASTLITERALS bytes must be literals (uncompressed) */
+            if (op < oCopyLimit) {
+                LZ4_wildCopy(op, match, oCopyLimit);
+                match += oCopyLimit - op;
+                op = oCopyLimit;
+            }
+            while (op<cpy) *op++ = *match++;
+        } else {
+            LZ4_copy8(op, match);
+            if (length>16) LZ4_wildCopy(op+8, match+8, cpy);
+        }
+        op=cpy;   /* correction */
+    }
+
+    /* end of decoding */
+    if (endOnInput)
+       return (int) (((char*)op)-dest);     /* Nb of output bytes decoded */
+    else
+       return (int) (((const char*)ip)-source);   /* Nb of input bytes read */
+
+    /* Overflow error detected */
 _output_error:
-	return -1;
+    return (int) (-(((const char*)ip)-source))-1;
 }
 
-static int lz4_uncompress_unknownoutputsize(const char *source, char *dest,
-				int isize, size_t maxoutputsize)
+int LZ4_decompress_safe(const char* source, char* dest, int compressedSize, int maxDecompressedSize)
 {
-	const BYTE *ip = (const BYTE *) source;
-	const BYTE *const iend = ip + isize;
-	const BYTE *ref;
-
-
-	BYTE *op = (BYTE *) dest;
-	BYTE * const oend = op + maxoutputsize;
-	BYTE *cpy;
-
-	/* Main Loop */
-	while (ip < iend) {
-
-		unsigned token;
-		size_t length;
-
-		/* get runlength */
-		token = *ip++;
-		length = (token >> ML_BITS);
-		if (length == RUN_MASK) {
-			int s = 255;
-			while ((ip < iend) && (s == 255)) {
-				s = *ip++;
-				if (unlikely(length > (size_t)(length + s)))
-					goto _output_error;
-				length += s;
-			}
-		}
-		/* copy literals */
-		cpy = op + length;
-		if ((cpy > oend - COPYLENGTH) ||
-			(ip + length > iend - COPYLENGTH)) {
-
-			if (cpy > oend)
-				goto _output_error;/* writes beyond buffer */
-
-			if (ip + length != iend)
-				goto _output_error;/*
-						    * Error: LZ4 format requires
-						    * to consume all input
-						    * at this stage
-						    */
-			memcpy(op, ip, length);
-			op += length;
-			break;/* Necessarily EOF, due to parsing restrictions */
-		}
-		LZ4_WILDCOPY(ip, op, cpy);
-		ip -= (op - cpy);
-		op = cpy;
-
-		/* get offset */
-		LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip);
-		ip += 2;
-		if (ref < (BYTE * const) dest)
-			goto _output_error;
-			/*
-			 * Error : offset creates reference
-			 * outside of destination buffer
-			 */
-
-		/* get matchlength */
-		length = (token & ML_MASK);
-		if (length == ML_MASK) {
-			while (ip < iend) {
-				int s = *ip++;
-				if (unlikely(length > (size_t)(length + s)))
-					goto _output_error;
-				length += s;
-				if (s == 255)
-					continue;
-				break;
-			}
-		}
-
-		/* copy repeated sequence */
-		if (unlikely((op - ref) < STEPSIZE)) {
-#if LZ4_ARCH64
-			int dec64 = dec64table[op - ref];
-#else
-			const int dec64 = 0;
-#endif
-				op[0] = ref[0];
-				op[1] = ref[1];
-				op[2] = ref[2];
-				op[3] = ref[3];
-				op += 4;
-				ref += 4;
-				ref -= dec32table[op - ref];
-				PUT4(ref, op);
-				op += STEPSIZE - 4;
-				ref -= dec64;
-		} else {
-			LZ4_COPYSTEP(ref, op);
-		}
-		cpy = op + length - (STEPSIZE-4);
-		if (cpy > oend - COPYLENGTH) {
-			if (cpy > oend)
-				goto _output_error; /* write outside of buf */
-#if LZ4_ARCH64
-			if ((ref + COPYLENGTH) > oend)
-#else
-			if ((ref + COPYLENGTH) > oend ||
-					(op + COPYLENGTH) > oend)
-#endif
-				goto _output_error;
-			LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH));
-			while (op < cpy)
-				*op++ = *ref++;
-			op = cpy;
-			/*
-			 * Check EOF (should never happen, since last 5 bytes
-			 * are supposed to be literals)
-			 */
-			if (op == oend)
-				goto _output_error;
-			continue;
-		}
-		LZ4_SECURECOPY(ref, op, cpy);
-		op = cpy; /* correction */
-	}
-	/* end of decoding */
-	return (int) (((char *) op) - dest);
-
-	/* write overflow error detected */
-_output_error:
-	return -1;
+    return LZ4_decompress_generic(source, dest, compressedSize, maxDecompressedSize, endOnInputSize, full, 0, noDict, (BYTE*)dest, NULL, 0);
 }
 
-int lz4_decompress(const unsigned char *src, size_t *src_len,
-		unsigned char *dest, size_t actual_dest_len)
+int LZ4_decompress_safe_partial(const char* source, char* dest, int compressedSize, int targetOutputSize, int maxDecompressedSize)
 {
-	int ret = -1;
-	int input_len = 0;
+    return LZ4_decompress_generic(source, dest, compressedSize, maxDecompressedSize, endOnInputSize, partial, targetOutputSize, noDict, (BYTE*)dest, NULL, 0);
+}
 
-	input_len = lz4_uncompress(src, dest, actual_dest_len);
-	if (input_len < 0)
-		goto exit_0;
-	*src_len = input_len;
+int LZ4_decompress_fast(const char* source, char* dest, int originalSize)
+{
+    return LZ4_decompress_generic(source, dest, 0, originalSize, endOnOutputSize, full, 0, withPrefix64k, (BYTE*)(dest - 64 KB), NULL, 64 KB);
+}
 
-	return 0;
-exit_0:
-	return ret;
+/*!
+ * LZ4_setStreamDecode() :
+ * Use this function to instruct where to find the dictionary.
+ * This function is not necessary if previous data is still available where it was decoded.
+ * Loading a size of 0 is allowed (same effect as no dictionary).
+ * Return : 1 if OK, 0 if error
+ */
+int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize)
+{
+    LZ4_streamDecode_t_internal* lz4sd = (LZ4_streamDecode_t_internal*) LZ4_streamDecode;
+    lz4sd->prefixSize = (size_t) dictSize;
+    lz4sd->prefixEnd = (const BYTE*) dictionary + dictSize;
+    lz4sd->externalDict = NULL;
+    lz4sd->extDictSize  = 0;
+    return 1;
 }
-#ifndef STATIC
-EXPORT_SYMBOL(lz4_decompress);
-#endif
 
-int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len,
-		unsigned char *dest, size_t *dest_len)
+/*
+*_continue() :
+    These decoding functions allow decompression of multiple blocks in "streaming" mode.
+    Previously decoded blocks must still be available at the memory position where they were decoded.
+    If it's not possible, save the relevant part of decoded data into a safe buffer,
+    and indicate where it stands using LZ4_setStreamDecode()
+*/
+int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* source, char* dest, int compressedSize, int maxOutputSize)
 {
-	int ret = -1;
-	int out_len = 0;
-
-	out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len,
-					*dest_len);
-	if (out_len < 0)
-		goto exit_0;
-	*dest_len = out_len;
-
-	return 0;
-exit_0:
-	return ret;
+    LZ4_streamDecode_t_internal* lz4sd = (LZ4_streamDecode_t_internal*) LZ4_streamDecode;
+    int result;
+
+    if (lz4sd->prefixEnd == (BYTE*)dest) {
+        result = LZ4_decompress_generic(source, dest, compressedSize, maxOutputSize,
+                                        endOnInputSize, full, 0,
+                                        usingExtDict, lz4sd->prefixEnd - lz4sd->prefixSize, lz4sd->externalDict, lz4sd->extDictSize);
+        if (result <= 0) return result;
+        lz4sd->prefixSize += result;
+        lz4sd->prefixEnd  += result;
+    } else {
+        lz4sd->extDictSize = lz4sd->prefixSize;
+        lz4sd->externalDict = lz4sd->prefixEnd - lz4sd->extDictSize;
+        result = LZ4_decompress_generic(source, dest, compressedSize, maxOutputSize,
+                                        endOnInputSize, full, 0,
+                                        usingExtDict, (BYTE*)dest, lz4sd->externalDict, lz4sd->extDictSize);
+        if (result <= 0) return result;
+        lz4sd->prefixSize = result;
+        lz4sd->prefixEnd  = (BYTE*)dest + result;
+    }
+
+    return result;
+}
+
+int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* source, char* dest, int originalSize)
+{
+    LZ4_streamDecode_t_internal* lz4sd = (LZ4_streamDecode_t_internal*) LZ4_streamDecode;
+    int result;
+
+    if (lz4sd->prefixEnd == (BYTE*)dest) {
+        result = LZ4_decompress_generic(source, dest, 0, originalSize,
+                                        endOnOutputSize, full, 0,
+                                        usingExtDict, lz4sd->prefixEnd - lz4sd->prefixSize, lz4sd->externalDict, lz4sd->extDictSize);
+        if (result <= 0) return result;
+        lz4sd->prefixSize += originalSize;
+        lz4sd->prefixEnd  += originalSize;
+    } else {
+        lz4sd->extDictSize = lz4sd->prefixSize;
+        lz4sd->externalDict = (BYTE*)dest - lz4sd->extDictSize;
+        result = LZ4_decompress_generic(source, dest, 0, originalSize,
+                                        endOnOutputSize, full, 0,
+                                        usingExtDict, (BYTE*)dest, lz4sd->externalDict, lz4sd->extDictSize);
+        if (result <= 0) return result;
+        lz4sd->prefixSize = originalSize;
+        lz4sd->prefixEnd  = (BYTE*)dest + originalSize;
+    }
+
+    return result;
+}
+
+/*
+ * for backwards compatibility
+ */
+int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len, unsigned char *dest, size_t *dest_len) {
+   *dest_len = LZ4_decompress_safe(src, dest, (int)src_len, (int)((size_t)dest_len));
+
+   return (int)((size_t)(dest_len));
+}
+
+/*
+ * for backwards compatibility
+ */
+int lz4_decompress(const unsigned char *src, size_t *src_len, unsigned char *dest, size_t actual_dest_len) {
+    *src_len = LZ4_decompress_fast(src, dest, (int)actual_dest_len);
+
+    return (int)((size_t)(src_len));
 }
-#ifndef STATIC
-EXPORT_SYMBOL(lz4_decompress_unknownoutputsize);
 
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_DESCRIPTION("LZ4 Decompressor");
-#endif
+MODULE_DESCRIPTION("LZ4 decompressor");
+
+/* Kernel exports */
+EXPORT_SYMBOL(LZ4_decompress_fast);
+EXPORT_SYMBOL(LZ4_decompress_safe);
+EXPORT_SYMBOL(LZ4_decompress_safe_partial);
+EXPORT_SYMBOL(lz4_decompress);
+EXPORT_SYMBOL(lz4_decompress_unknownoutputsize);
diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
index c79d7ea..567998d 100644
--- a/lib/lz4/lz4defs.h
+++ b/lib/lz4/lz4defs.h
@@ -1,157 +1,277 @@
+#ifndef __LZ4DEFS_H__
+#define __LZ4DEFS_H__
+
 /*
- * lz4defs.h -- architecture specific defines
- *
- * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
+   lz4defs.h -- common and architecture specific defines for the kernel usage
+
+   LZ4 - Fast LZ compression algorithm
+   Copyright (C) 2011-2016, Yann Collet.
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+   You can contact the author at :
+    - LZ4 homepage : http://www.lz4.org
+    - LZ4 source repository : https://github.com/lz4/lz4
+
+    Created for kernel usage by:
+    Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
+*/
+
+#include <asm/unaligned.h>
 
 /*
  * Detects 64 bits mode
- */
+*/
 #if defined(CONFIG_64BIT)
 #define LZ4_ARCH64 1
 #else
 #define LZ4_ARCH64 0
 #endif
 
+static inline unsigned LZ4_64bits(void) { return LZ4_ARCH64; }
+
 /*
- * Architecture-specific macros
+ * Little/big endian
  */
-#define BYTE	u8
-typedef struct _U16_S { u16 v; } U16_S;
-typedef struct _U32_S { u32 v; } U32_S;
-typedef struct _U64_S { u64 v; } U64_S;
-#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
-
-#define A16(x) (((U16_S *)(x))->v)
-#define A32(x) (((U32_S *)(x))->v)
-#define A64(x) (((U64_S *)(x))->v)
-
-#define PUT4(s, d) (A32(d) = A32(s))
-#define PUT8(s, d) (A64(d) = A64(s))
-
-#define LZ4_READ_LITTLEENDIAN_16(d, s, p)	\
-	(d = s - A16(p))
-
-#define LZ4_WRITE_LITTLEENDIAN_16(p, v)	\
-	do {	\
-		A16(p) = v; \
-		p += 2; \
-	} while (0)
-#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
-
-#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v))
-#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v))
-#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v))
-
-#define PUT4(s, d) \
-	put_unaligned(get_unaligned((const u32 *) s), (u32 *) d)
-#define PUT8(s, d) \
-	put_unaligned(get_unaligned((const u64 *) s), (u64 *) d)
-
-#define LZ4_READ_LITTLEENDIAN_16(d, s, p)	\
-	(d = s - get_unaligned_le16(p))
-
-#define LZ4_WRITE_LITTLEENDIAN_16(p, v)			\
-	do {						\
-		put_unaligned_le16(v, (u16 *)(p));	\
-		p += 2;					\
-	} while (0)
+#ifdef __LITTLE_ENDIAN
+#define LZ4_isLittleEndian(void) (true)
+#else
+#define LZ4_isLittleEndian(void) (false)
 #endif
 
-#define COPYLENGTH 8
+/*-************************************
+*  Tuning parameter
+**************************************/
+/*! * LZ4_MEMORY_USAGE :
+ * Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
+ * Increasing memory usage improves compression ratio
+ * Reduced memory usage can improve speed, due to cache effect
+ * Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
+ */
+#define LZ4_MEMORY_USAGE 10
+
+/*-************************************
+*  Memory routines
+**************************************/
+#include <linux/slab.h>
+#include <linux/string.h>   /* memset, memcpy */
+#define MEM_INIT       memset
+
+/*-************************************
+*  Basic Types
+**************************************/
+#include <linux/types.h>
+
+typedef  uint8_t BYTE;
+typedef uint16_t U16;
+typedef uint32_t U32;
+typedef  int32_t S32;
+typedef uint64_t U64;
+
+/*-************************************
+*  Common Constants
+**************************************/
+#define MINMATCH 4
+
+#define WILDCOPYLENGTH 8
+#define LASTLITERALS 5
+#define MFLIMIT (WILDCOPYLENGTH+MINMATCH)
+static const int LZ4_minLength = (MFLIMIT+1);
+
+#define KB *(1 <<10)
+#define MB *(1 <<20)
+#define GB *(1U<<30)
+
+#define MAXD_LOG 16
+#define MAX_DISTANCE ((1 << MAXD_LOG) - 1)
+#define STEPSIZE sizeof(size_t)
+
 #define ML_BITS  4
-#define ML_MASK  ((1U << ML_BITS) - 1)
-#define RUN_BITS (8 - ML_BITS)
-#define RUN_MASK ((1U << RUN_BITS) - 1)
-#define MEMORY_USAGE	14
-#define MINMATCH	4
-#define SKIPSTRENGTH	6
-#define LASTLITERALS	5
-#define MFLIMIT		(COPYLENGTH + MINMATCH)
-#define MINLENGTH	(MFLIMIT + 1)
-#define MAXD_LOG	16
-#define MAXD		(1 << MAXD_LOG)
-#define MAXD_MASK	(u32)(MAXD - 1)
-#define MAX_DISTANCE	(MAXD - 1)
-#define HASH_LOG	(MAXD_LOG - 1)
-#define HASHTABLESIZE	(1 << HASH_LOG)
-#define MAX_NB_ATTEMPTS	256
-#define OPTIMAL_ML	(int)((ML_MASK-1)+MINMATCH)
-#define LZ4_64KLIMIT	((1<<16) + (MFLIMIT - 1))
-#define HASHLOG64K	((MEMORY_USAGE - 2) + 1)
-#define HASH64KTABLESIZE	(1U << HASHLOG64K)
-#define LZ4_HASH_VALUE(p)	(((A32(p)) * 2654435761U) >> \
-				((MINMATCH * 8) - (MEMORY_USAGE-2)))
-#define LZ4_HASH64K_VALUE(p)	(((A32(p)) * 2654435761U) >> \
-				((MINMATCH * 8) - HASHLOG64K))
-#define HASH_VALUE(p)		(((A32(p)) * 2654435761U) >> \
-				((MINMATCH * 8) - HASH_LOG))
-
-#if LZ4_ARCH64/* 64-bit */
-#define STEPSIZE 8
-
-#define LZ4_COPYSTEP(s, d)	\
-	do {			\
-		PUT8(s, d);	\
-		d += 8;		\
-		s += 8;		\
-	} while (0)
-
-#define LZ4_COPYPACKET(s, d)	LZ4_COPYSTEP(s, d)
-
-#define LZ4_SECURECOPY(s, d, e)			\
-	do {					\
-		if (d < e) {			\
-			LZ4_WILDCOPY(s, d, e);	\
-		}				\
-	} while (0)
-#define HTYPE u32
+#define ML_MASK  ((1U<<ML_BITS)-1)
+#define RUN_BITS (8-ML_BITS)
+#define RUN_MASK ((1U<<RUN_BITS)-1)
 
-#ifdef __BIG_ENDIAN
-#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
-#else
-#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3)
-#endif
+#define LZ4_HASHLOG   (LZ4_MEMORY_USAGE-2)
+#define LZ4_HASHTABLESIZE (1 << LZ4_MEMORY_USAGE)
+#define LZ4_HASH_SIZE_U32 (1 << LZ4_HASHLOG)       /* required as macro for static inline allocation */
+
+static const int LZ4_64Klimit = ((64 KB) + (MFLIMIT-1));
+static const U32 LZ4_skipTrigger = 6;  /* Increase this value ==> compression run slower on incompressible data */
+
+/*-************************************
+*  Reading and writing into memory
+**************************************/
+
+static inline U16 LZ4_read16(const void* memPtr)
+{
+    U16 val; memcpy(&val, memPtr, sizeof(val)); return val;
+}
+
+static inline U32 LZ4_read32(const void* memPtr)
+{
+    U32 val; memcpy(&val, memPtr, sizeof(val)); return val;
+}
+
+static inline size_t LZ4_read_ARCH(const void* memPtr)
+{
+    size_t val; memcpy(&val, memPtr, sizeof(val)); return val;
+}
 
-#else	/* 32-bit */
-#define STEPSIZE 4
+static inline void LZ4_write16(void* memPtr, U16 value)
+{
+    memcpy(memPtr, &value, sizeof(value));
+}
 
-#define LZ4_COPYSTEP(s, d)	\
-	do {			\
-		PUT4(s, d);	\
-		d += 4;		\
-		s += 4;		\
-	} while (0)
+static inline void LZ4_write32(void* memPtr, U32 value)
+{
+    memcpy(memPtr, &value, sizeof(value));
+}
 
-#define LZ4_COPYPACKET(s, d)		\
-	do {				\
-		LZ4_COPYSTEP(s, d);	\
-		LZ4_COPYSTEP(s, d);	\
-	} while (0)
+static inline U16 LZ4_readLE16(const void* memPtr)
+{
+    if (LZ4_isLittleEndian()) {
+        return LZ4_read16(memPtr);
+    } else {
+        const BYTE* p = (const BYTE*)memPtr;
+        return (U16)((U16)p[0] + (p[1]<<8));
+    }
+}
 
-#define LZ4_SECURECOPY	LZ4_WILDCOPY
-#define HTYPE const u8*
+static inline void LZ4_writeLE16(void* memPtr, U16 value)
+{
+    if (LZ4_isLittleEndian()) {
+        LZ4_write16(memPtr, value);
+    } else {
+        BYTE* p = (BYTE*)memPtr;
+        p[0] = (BYTE) value;
+        p[1] = (BYTE)(value>>8);
+    }
+}
 
+static inline void LZ4_copy8(void* dst, const void* src)
+{
+    memcpy(dst,src,8);
+}
+
+/* customized variant of memcpy, which can overwrite up to 7 bytes beyond dstEnd */
+static inline void LZ4_wildCopy(void* dstPtr, const void* srcPtr, void* dstEnd)
+{
+    BYTE* d = (BYTE*)dstPtr;
+    const BYTE* s = (const BYTE*)srcPtr;
+    BYTE* const e = (BYTE*)dstEnd;
+
+#if 0
+    const size_t l2 = 8 - (((size_t)d) & (sizeof(void*)-1));
+    LZ4_copy8(d,s); if (d>e-9) return;
+    d+=l2; s+=l2;
+#endif /* join to align */
+
+    do { LZ4_copy8(d,s); d+=8; s+=8; } while (d<e);
+}
+
+#if LZ4_ARCH64
+#ifdef __BIG_ENDIAN
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#else
+#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3)
+#endif
+#else
 #ifdef __BIG_ENDIAN
 #define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3)
 #else
 #define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3)
 #endif
-
 #endif
 
-#define LZ4_WILDCOPY(s, d, e)		\
-	do {				\
-		LZ4_COPYPACKET(s, d);	\
-	} while (d < e)
-
-#define LZ4_BLINDCOPY(s, d, l)	\
-	do {	\
-		u8 *e = (d) + l;	\
-		LZ4_WILDCOPY(s, d, e);	\
-		d = e;	\
-	} while (0)
+static inline unsigned LZ4_count(const BYTE* pIn, const BYTE* pMatch, const BYTE* pInLimit)
+{
+    const BYTE* const pStart = pIn;
+
+    while (likely(pIn<pInLimit-(STEPSIZE-1))) {
+        size_t diff = LZ4_read_ARCH(pMatch) ^ LZ4_read_ARCH(pIn);
+        if (!diff) { pIn+=STEPSIZE; pMatch+=STEPSIZE; continue; }
+        pIn += LZ4_NBCOMMONBYTES(diff);
+        return (unsigned)(pIn - pStart);
+    }
+
+    if (LZ4_64bits()) if ((pIn<(pInLimit-3)) && (LZ4_read32(pMatch) == LZ4_read32(pIn))) { pIn+=4; pMatch+=4; }
+    if ((pIn<(pInLimit-1)) && (LZ4_read16(pMatch) == LZ4_read16(pIn))) { pIn+=2; pMatch+=2; }
+    if ((pIn<pInLimit) && (*pMatch == *pIn)) pIn++;
+    return (unsigned)(pIn - pStart);
+}
+
+typedef struct {
+    uint32_t hashTable[LZ4_HASH_SIZE_U32];
+    uint32_t currentOffset;
+    uint32_t initCheck;
+    const uint8_t* dictionary;
+    uint8_t* bufferStart;   /* obsolete, used for slideInputBuffer */
+    uint32_t dictSize;
+} LZ4_stream_t_internal;
+
+typedef struct {
+    const uint8_t* externalDict;
+    size_t extDictSize;
+    const uint8_t* prefixEnd;
+    size_t prefixSize;
+} LZ4_streamDecode_t_internal;
+
+typedef enum { notLimited = 0, limitedOutput = 1 } limitedOutput_directive;
+typedef enum { byPtr, byU32, byU16 } tableType_t;
+
+typedef enum { noDict = 0, withPrefix64k, usingExtDict } dict_directive;
+typedef enum { noDictIssue = 0, dictSmall } dictIssue_directive;
+
+typedef enum { endOnOutputSize = 0, endOnInputSize = 1 } endCondition_directive;
+typedef enum { full = 0, partial = 1 } earlyEnd_directive;
+
+/*-**********************************************
+*  Streaming Decompression
+************************************************/
+#define LZ4_STREAMDECODESIZE_U64  4
+#define LZ4_STREAMDECODESIZE     (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
+typedef struct { unsigned long long table[LZ4_STREAMDECODESIZE_U64]; } LZ4_streamDecode_t;
+
+/*-*********************************************
+*  Streaming Compression
+***********************************************/
+#define LZ4_STREAMSIZE_U64 ((1 << (LZ4_MEMORY_USAGE-3)) + 4)
+#define LZ4_STREAMSIZE     (LZ4_STREAMSIZE_U64 * sizeof(long long))
+
+/*!
+ * LZ4_stream_t :
+ * information structure to track an LZ4 stream.
+ * important : init this structure content before first use !
+ * note : only allocated directly the structure if you are static inlineally linking LZ4
+ *        If you are using liblz4 as a DLL, please use below construction methods instead.
+ */
+typedef struct { long long table[LZ4_STREAMSIZE_U64]; } LZ4_stream_t;
+
+/*-******************************
+*  Streaming functions
+********************************/
+
+static inline void LZ4_resetStream (LZ4_stream_t* LZ4_stream)
+{
+    MEM_INIT(LZ4_stream, 0, sizeof(LZ4_stream_t));
+}
+
+#endif
diff --git a/lib/lz4/lz4hc_compress.c b/lib/lz4/lz4hc_compress.c
index f344f76..e630100 100644
--- a/lib/lz4/lz4hc_compress.c
+++ b/lib/lz4/lz4hc_compress.c
@@ -1,539 +1,576 @@
 /*
- * LZ4 HC - High Compression Mode of LZ4
- * Copyright (C) 2011-2012, Yann Collet.
- * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met:
- *
- *     * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following disclaimer
- * in the documentation and/or other materials provided with the
- * distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * You can contact the author at :
- * - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
- * - LZ4 source repository : http://code.google.com/p/lz4/
- *
- *  Changed for kernel use by:
- *  Chanho Min <chanho.min@lge.com>
- */
-
+    LZ4 HC - High Compression Mode of LZ4
+    Copyright (C) 2011-2015, Yann Collet.
+
+    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions are
+    met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above
+    copyright notice, this list of conditions and the following disclaimer
+    in the documentation and/or other materials provided with the
+    distribution.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+    You can contact the author at :
+       - LZ4 source repository : https://github.com/lz4/lz4
+       - LZ4 public forum : https://groups.google.com/forum/#!forum/lz4c
+
+    Changed for kernel use by:
+    Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
+*/
+
+/*-************************************
+*  Includes
+**************************************/
+#include "lz4defs.h"
+#include <linux/lz4.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/lz4.h>
-#include <asm/unaligned.h>
-#include "lz4defs.h"
 
-struct lz4hc_data {
-	const u8 *base;
-	HTYPE hashtable[HASHTABLESIZE];
-	u16 chaintable[MAXD];
-	const u8 *nexttoupdate;
-} __attribute__((__packed__));
+/* *************************************
+*  Local Constants and types
+***************************************/
+#define LZ4HC_DICTIONARY_LOGSIZE 16
+#define LZ4HC_MAXD (1<<LZ4HC_DICTIONARY_LOGSIZE)
+#define LZ4HC_MAXD_MASK (LZ4HC_MAXD - 1)
+
+#define LZ4HC_HASH_LOG (LZ4HC_DICTIONARY_LOGSIZE-1)
+#define LZ4HC_HASHTABLESIZE (1 << LZ4HC_HASH_LOG)
+#define LZ4HC_HASH_MASK (LZ4HC_HASHTABLESIZE - 1)
 
-static inline int lz4hc_init(struct lz4hc_data *hc4, const u8 *base)
+typedef struct
+{
+    unsigned int   hashTable[LZ4HC_HASHTABLESIZE];
+    unsigned short   chainTable[LZ4HC_MAXD];
+    const unsigned char* end;        /* next block here to continue on current prefix */
+    const unsigned char* base;       /* All index relative to this position */
+    const unsigned char* dictBase;   /* alternate base for extDict */
+    unsigned char* inputBuffer;      /* deprecated */
+    unsigned int   dictLimit;        /* below that point, need extDict */
+    unsigned int   lowLimit;         /* below that point, no more dict */
+    unsigned int   nextToUpdate;     /* index from which to continue dictionary update */
+    unsigned int   compressionLevel;
+} LZ4HC_CCtx_internal;
+
+#define LZ4_STREAMHCSIZE        262192
+#define LZ4_STREAMHCSIZE_SIZET (LZ4_STREAMHCSIZE / sizeof(size_t))
+
+typedef union LZ4_streamHC_u LZ4_streamHC_t;
+
+union LZ4_streamHC_u {
+    size_t table[LZ4_STREAMHCSIZE_SIZET];
+    LZ4HC_CCtx_internal internal_donotuse;
+};   /* previously typedef'd to LZ4_streamHC_t */
+
+#define OPTIMAL_ML (int)((ML_MASK-1)+MINMATCH)
+
+/**************************************
+*  Local Macros
+**************************************/
+#define HASH_FUNCTION(i)       (((i) * 2654435761U) >> ((MINMATCH*8)-LZ4HC_HASH_LOG))
+#define DELTANEXTU16(p)        chainTable[(U16)(p)]   /* faster */
+
+static U32 LZ4HC_hashPtr(const void* ptr) { return HASH_FUNCTION(LZ4_read32(ptr)); }
+
+/**************************************
+*  HC Compression
+**************************************/
+static void LZ4HC_init (LZ4HC_CCtx_internal* hc4, const BYTE* start)
 {
-	memset((void *)hc4->hashtable, 0, sizeof(hc4->hashtable));
-	memset(hc4->chaintable, 0xFF, sizeof(hc4->chaintable));
-
-#if LZ4_ARCH64
-	hc4->nexttoupdate = base + 1;
-#else
-	hc4->nexttoupdate = base;
-#endif
-	hc4->base = base;
-	return 1;
+    MEM_INIT((void*)hc4->hashTable, 0, sizeof(hc4->hashTable));
+    MEM_INIT(hc4->chainTable, 0xFF, sizeof(hc4->chainTable));
+    hc4->nextToUpdate = 64 KB;
+    hc4->base = start - 64 KB;
+    hc4->end = start;
+    hc4->dictBase = start - 64 KB;
+    hc4->dictLimit = 64 KB;
+    hc4->lowLimit = 64 KB;
 }
 
 /* Update chains up to ip (excluded) */
-static inline void lz4hc_insert(struct lz4hc_data *hc4, const u8 *ip)
+static inline void LZ4HC_Insert (LZ4HC_CCtx_internal* hc4, const BYTE* ip)
 {
-	u16 *chaintable = hc4->chaintable;
-	HTYPE *hashtable  = hc4->hashtable;
-#if LZ4_ARCH64
-	const BYTE * const base = hc4->base;
-#else
-	const int base = 0;
-#endif
-
-	while (hc4->nexttoupdate < ip) {
-		const u8 *p = hc4->nexttoupdate;
-		size_t delta = p - (hashtable[HASH_VALUE(p)] + base);
-		if (delta > MAX_DISTANCE)
-			delta = MAX_DISTANCE;
-		chaintable[(size_t)(p) & MAXD_MASK] = (u16)delta;
-		hashtable[HASH_VALUE(p)] = (p) - base;
-		hc4->nexttoupdate++;
-	}
+    U16* const chainTable = hc4->chainTable;
+    U32* const hashTable  = hc4->hashTable;
+    const BYTE* const base = hc4->base;
+    U32 const target = (U32)(ip - base);
+    U32 idx = hc4->nextToUpdate;
+
+    while (idx < target) {
+        U32 const h = LZ4HC_hashPtr(base+idx);
+        size_t delta = idx - hashTable[h];
+        if (delta>MAX_DISTANCE) delta = MAX_DISTANCE;
+        DELTANEXTU16(idx) = (U16)delta;
+        hashTable[h] = idx;
+        idx++;
+    }
+
+    hc4->nextToUpdate = target;
 }
 
-static inline size_t lz4hc_commonlength(const u8 *p1, const u8 *p2,
-		const u8 *const matchlimit)
+static inline int LZ4HC_InsertAndFindBestMatch (LZ4HC_CCtx_internal* hc4,   /* Index table will be updated */
+                                               const BYTE* ip, const BYTE* const iLimit,
+                                               const BYTE** matchpos,
+                                               const int maxNbAttempts)
 {
-	const u8 *p1t = p1;
-
-	while (p1t < matchlimit - (STEPSIZE - 1)) {
-#if LZ4_ARCH64
-		u64 diff = A64(p2) ^ A64(p1t);
-#else
-		u32 diff = A32(p2) ^ A32(p1t);
-#endif
-		if (!diff) {
-			p1t += STEPSIZE;
-			p2 += STEPSIZE;
-			continue;
-		}
-		p1t += LZ4_NBCOMMONBYTES(diff);
-		return p1t - p1;
-	}
-#if LZ4_ARCH64
-	if ((p1t < (matchlimit-3)) && (A32(p2) == A32(p1t))) {
-		p1t += 4;
-		p2 += 4;
-	}
-#endif
-
-	if ((p1t < (matchlimit - 1)) && (A16(p2) == A16(p1t))) {
-		p1t += 2;
-		p2 += 2;
-	}
-	if ((p1t < matchlimit) && (*p2 == *p1t))
-		p1t++;
-	return p1t - p1;
+    U16* const chainTable = hc4->chainTable;
+    U32* const HashTable = hc4->hashTable;
+    const BYTE* const base = hc4->base;
+    const BYTE* const dictBase = hc4->dictBase;
+    const U32 dictLimit = hc4->dictLimit;
+    const U32 lowLimit = (hc4->lowLimit + 64 KB > (U32)(ip-base)) ? hc4->lowLimit : (U32)(ip - base) - (64 KB - 1);
+    U32 matchIndex;
+    int nbAttempts=maxNbAttempts;
+    size_t ml=0;
+
+    /* HC4 match finder */
+    LZ4HC_Insert(hc4, ip);
+    matchIndex = HashTable[LZ4HC_hashPtr(ip)];
+
+    while ((matchIndex>=lowLimit) && (nbAttempts)) {
+        nbAttempts--;
+        if (matchIndex >= dictLimit) {
+            const BYTE* const match = base + matchIndex;
+            if (*(match+ml) == *(ip+ml)
+                && (LZ4_read32(match) == LZ4_read32(ip)))
+            {
+                size_t const mlt = LZ4_count(ip+MINMATCH, match+MINMATCH, iLimit) + MINMATCH;
+                if (mlt > ml) { ml = mlt; *matchpos = match; }
+            }
+        } else {
+            const BYTE* const match = dictBase + matchIndex;
+            if (LZ4_read32(match) == LZ4_read32(ip)) {
+                size_t mlt;
+                const BYTE* vLimit = ip + (dictLimit - matchIndex);
+                if (vLimit > iLimit) vLimit = iLimit;
+                mlt = LZ4_count(ip+MINMATCH, match+MINMATCH, vLimit) + MINMATCH;
+                if ((ip+mlt == vLimit) && (vLimit < iLimit))
+                    mlt += LZ4_count(ip+mlt, base+dictLimit, iLimit);
+                if (mlt > ml) { ml = mlt; *matchpos = base + matchIndex; }   /* virtual matchpos */
+            }
+        }
+        matchIndex -= DELTANEXTU16(matchIndex);
+    }
+
+    return (int)ml;
 }
 
-static inline int lz4hc_insertandfindbestmatch(struct lz4hc_data *hc4,
-		const u8 *ip, const u8 *const matchlimit, const u8 **matchpos)
+
+static inline int LZ4HC_InsertAndGetWiderMatch (
+    LZ4HC_CCtx_internal* hc4,
+    const BYTE* const ip,
+    const BYTE* const iLowLimit,
+    const BYTE* const iHighLimit,
+    int longest,
+    const BYTE** matchpos,
+    const BYTE** startpos,
+    const int maxNbAttempts)
 {
-	u16 *const chaintable = hc4->chaintable;
-	HTYPE *const hashtable = hc4->hashtable;
-	const u8 *ref;
-#if LZ4_ARCH64
-	const BYTE * const base = hc4->base;
-#else
-	const int base = 0;
-#endif
-	int nbattempts = MAX_NB_ATTEMPTS;
-	size_t repl = 0, ml = 0;
-	u16 delta;
-
-	/* HC4 match finder */
-	lz4hc_insert(hc4, ip);
-	ref = hashtable[HASH_VALUE(ip)] + base;
-
-	/* potential repetition */
-	if (ref >= ip-4) {
-		/* confirmed */
-		if (A32(ref) == A32(ip)) {
-			delta = (u16)(ip-ref);
-			repl = ml  = lz4hc_commonlength(ip + MINMATCH,
-					ref + MINMATCH, matchlimit) + MINMATCH;
-			*matchpos = ref;
-		}
-		ref -= (size_t)chaintable[(size_t)(ref) & MAXD_MASK];
-	}
-
-	while ((ref >= ip - MAX_DISTANCE) && nbattempts) {
-		nbattempts--;
-		if (*(ref + ml) == *(ip + ml)) {
-			if (A32(ref) == A32(ip)) {
-				size_t mlt =
-					lz4hc_commonlength(ip + MINMATCH,
-					ref + MINMATCH, matchlimit) + MINMATCH;
-				if (mlt > ml) {
-					ml = mlt;
-					*matchpos = ref;
-				}
-			}
-		}
-		ref -= (size_t)chaintable[(size_t)(ref) & MAXD_MASK];
-	}
-
-	/* Complete table */
-	if (repl) {
-		const BYTE *ptr = ip;
-		const BYTE *end;
-		end = ip + repl - (MINMATCH-1);
-		/* Pre-Load */
-		while (ptr < end - delta) {
-			chaintable[(size_t)(ptr) & MAXD_MASK] = delta;
-			ptr++;
-		}
-		do {
-			chaintable[(size_t)(ptr) & MAXD_MASK] = delta;
-			/* Head of chain */
-			hashtable[HASH_VALUE(ptr)] = (ptr) - base;
-			ptr++;
-		} while (ptr < end);
-		hc4->nexttoupdate = end;
-	}
-
-	return (int)ml;
+    U16* const chainTable = hc4->chainTable;
+    U32* const HashTable = hc4->hashTable;
+    const BYTE* const base = hc4->base;
+    const U32 dictLimit = hc4->dictLimit;
+    const BYTE* const lowPrefixPtr = base + dictLimit;
+    const U32 lowLimit = (hc4->lowLimit + 64 KB > (U32)(ip-base)) ? hc4->lowLimit : (U32)(ip - base) - (64 KB - 1);
+    const BYTE* const dictBase = hc4->dictBase;
+    U32   matchIndex;
+    int nbAttempts = maxNbAttempts;
+    int delta = (int)(ip-iLowLimit);
+
+
+    /* First Match */
+    LZ4HC_Insert(hc4, ip);
+    matchIndex = HashTable[LZ4HC_hashPtr(ip)];
+
+    while ((matchIndex>=lowLimit) && (nbAttempts)) {
+        nbAttempts--;
+        if (matchIndex >= dictLimit) {
+            const BYTE* matchPtr = base + matchIndex;
+            if (*(iLowLimit + longest) == *(matchPtr - delta + longest)) {
+                if (LZ4_read32(matchPtr) == LZ4_read32(ip)) {
+                    int mlt = MINMATCH + LZ4_count(ip+MINMATCH, matchPtr+MINMATCH, iHighLimit);
+                    int back = 0;
+
+                    while ((ip+back > iLowLimit)
+                           && (matchPtr+back > lowPrefixPtr)
+                           && (ip[back-1] == matchPtr[back-1]))
+                            back--;
+
+                    mlt -= back;
+
+                    if (mlt > longest) {
+                        longest = (int)mlt;
+                        *matchpos = matchPtr+back;
+                        *startpos = ip+back;
+                    }
+                }
+            }
+        } else {
+            const BYTE* const matchPtr = dictBase + matchIndex;
+            if (LZ4_read32(matchPtr) == LZ4_read32(ip)) {
+                size_t mlt;
+                int back=0;
+                const BYTE* vLimit = ip + (dictLimit - matchIndex);
+                if (vLimit > iHighLimit) vLimit = iHighLimit;
+                mlt = LZ4_count(ip+MINMATCH, matchPtr+MINMATCH, vLimit) + MINMATCH;
+                if ((ip+mlt == vLimit) && (vLimit < iHighLimit))
+                    mlt += LZ4_count(ip+mlt, base+dictLimit, iHighLimit);
+                while ((ip+back > iLowLimit) && (matchIndex+back > lowLimit) && (ip[back-1] == matchPtr[back-1])) back--;
+                mlt -= back;
+                if ((int)mlt > longest) { longest = (int)mlt; *matchpos = base + matchIndex + back; *startpos = ip+back; }
+            }
+        }
+        matchIndex -= DELTANEXTU16(matchIndex);
+    }
+
+    return longest;
 }
 
-static inline int lz4hc_insertandgetwidermatch(struct lz4hc_data *hc4,
-	const u8 *ip, const u8 *startlimit, const u8 *matchlimit, int longest,
-	const u8 **matchpos, const u8 **startpos)
+static inline int LZ4HC_encodeSequence (
+    const BYTE** ip,
+    BYTE** op,
+    const BYTE** anchor,
+    int matchLength,
+    const BYTE* const match,
+    limitedOutput_directive limitedOutputBuffer,
+    BYTE* oend)
 {
-	u16 *const chaintable = hc4->chaintable;
-	HTYPE *const hashtable = hc4->hashtable;
-#if LZ4_ARCH64
-	const BYTE * const base = hc4->base;
-#else
-	const int base = 0;
-#endif
-	const u8 *ref;
-	int nbattempts = MAX_NB_ATTEMPTS;
-	int delta = (int)(ip - startlimit);
-
-	/* First Match */
-	lz4hc_insert(hc4, ip);
-	ref = hashtable[HASH_VALUE(ip)] + base;
-
-	while ((ref >= ip - MAX_DISTANCE) && (ref >= hc4->base)
-		&& (nbattempts)) {
-		nbattempts--;
-		if (*(startlimit + longest) == *(ref - delta + longest)) {
-			if (A32(ref) == A32(ip)) {
-				const u8 *reft = ref + MINMATCH;
-				const u8 *ipt = ip + MINMATCH;
-				const u8 *startt = ip;
-
-				while (ipt < matchlimit-(STEPSIZE - 1)) {
-					#if LZ4_ARCH64
-					u64 diff = A64(reft) ^ A64(ipt);
-					#else
-					u32 diff = A32(reft) ^ A32(ipt);
-					#endif
-
-					if (!diff) {
-						ipt += STEPSIZE;
-						reft += STEPSIZE;
-						continue;
-					}
-					ipt += LZ4_NBCOMMONBYTES(diff);
-					goto _endcount;
-				}
-				#if LZ4_ARCH64
-				if ((ipt < (matchlimit - 3))
-					&& (A32(reft) == A32(ipt))) {
-					ipt += 4;
-					reft += 4;
-				}
-				ipt += 2;
-				#endif
-				if ((ipt < (matchlimit - 1))
-					&& (A16(reft) == A16(ipt))) {
-					reft += 2;
-				}
-				if ((ipt < matchlimit) && (*reft == *ipt))
-					ipt++;
-_endcount:
-				reft = ref;
-
-				while ((startt > startlimit)
-					&& (reft > hc4->base)
-					&& (startt[-1] == reft[-1])) {
-					startt--;
-					reft--;
-				}
-
-				if ((ipt - startt) > longest) {
-					longest = (int)(ipt - startt);
-					*matchpos = reft;
-					*startpos = startt;
-				}
-			}
-		}
-		ref -= (size_t)chaintable[(size_t)(ref) & MAXD_MASK];
-	}
-	return longest;
+    int length;
+    BYTE* token;
+
+    /* Encode Literal length */
+    length = (int)(*ip - *anchor);
+    token = (*op)++;
+    if ((limitedOutputBuffer) && ((*op + (length>>8) + length + (2 + 1 + LASTLITERALS)) > oend)) return 1;   /* Check output limit */
+    if (length>=(int)RUN_MASK) { int len; *token=(RUN_MASK<<ML_BITS); len = length-RUN_MASK; for(; len > 254 ; len-=255) *(*op)++ = 255;  *(*op)++ = (BYTE)len; }
+    else *token = (BYTE)(length<<ML_BITS);
+
+    /* Copy Literals */
+    LZ4_wildCopy(*op, *anchor, (*op) + length);
+    *op += length;
+
+    /* Encode Offset */
+    LZ4_writeLE16(*op, (U16)(*ip-match)); *op += 2;
+
+    /* Encode MatchLength */
+    length = (int)(matchLength-MINMATCH);
+    if ((limitedOutputBuffer) && (*op + (length>>8) + (1 + LASTLITERALS) > oend)) return 1;   /* Check output limit */
+    if (length>=(int)ML_MASK) {
+        *token += ML_MASK;
+        length -= ML_MASK;
+        for(; length > 509 ; length-=510) { *(*op)++ = 255; *(*op)++ = 255; }
+        if (length > 254) { length-=255; *(*op)++ = 255; }
+        *(*op)++ = (BYTE)length;
+    } else {
+        *token += (BYTE)(length);
+    }
+
+    /* Prepare next loop */
+    *ip += matchLength;
+    *anchor = *ip;
+
+    return 0;
 }
 
-static inline int lz4_encodesequence(const u8 **ip, u8 **op, const u8 **anchor,
-		int ml, const u8 *ref)
+static int LZ4HC_compress_generic (
+    LZ4HC_CCtx_internal* const ctx,
+    const char* const source,
+    char* const dest,
+    int const inputSize,
+    int const maxOutputSize,
+    int compressionLevel,
+    limitedOutput_directive limit
+    )
 {
-	int length, len;
-	u8 *token;
-
-	/* Encode Literal length */
-	length = (int)(*ip - *anchor);
-	token = (*op)++;
-	if (length >= (int)RUN_MASK) {
-		*token = (RUN_MASK << ML_BITS);
-		len = length - RUN_MASK;
-		for (; len > 254 ; len -= 255)
-			*(*op)++ = 255;
-		*(*op)++ = (u8)len;
-	} else
-		*token = (length << ML_BITS);
-
-	/* Copy Literals */
-	LZ4_BLINDCOPY(*anchor, *op, length);
-
-	/* Encode Offset */
-	LZ4_WRITE_LITTLEENDIAN_16(*op, (u16)(*ip - ref));
-
-	/* Encode MatchLength */
-	len = (int)(ml - MINMATCH);
-	if (len >= (int)ML_MASK) {
-		*token += ML_MASK;
-		len -= ML_MASK;
-		for (; len > 509 ; len -= 510) {
-			*(*op)++ = 255;
-			*(*op)++ = 255;
-		}
-		if (len > 254) {
-			len -= 255;
-			*(*op)++ = 255;
-		}
-		*(*op)++ = (u8)len;
-	} else
-		*token += len;
-
-	/* Prepare next loop */
-	*ip += ml;
-	*anchor = *ip;
-
-	return 0;
+    const BYTE* ip = (const BYTE*) source;
+    const BYTE* anchor = ip;
+    const BYTE* const iend = ip + inputSize;
+    const BYTE* const mflimit = iend - MFLIMIT;
+    const BYTE* const matchlimit = (iend - LASTLITERALS);
+
+    BYTE* op = (BYTE*) dest;
+    BYTE* const oend = op + maxOutputSize;
+
+    unsigned maxNbAttempts;
+    int   ml, ml2, ml3, ml0;
+    const BYTE* ref = NULL;
+    const BYTE* start2 = NULL;
+    const BYTE* ref2 = NULL;
+    const BYTE* start3 = NULL;
+    const BYTE* ref3 = NULL;
+    const BYTE* start0;
+    const BYTE* ref0;
+
+    /* init */
+    if (compressionLevel > LZ4HC_MAX_CLEVEL) compressionLevel = LZ4HC_MAX_CLEVEL;
+    if (compressionLevel < 1) compressionLevel = LZ4HC_DEFAULT_CLEVEL;
+    maxNbAttempts = 1 << (compressionLevel-1);
+    ctx->end += inputSize;
+
+    ip++;
+
+    /* Main Loop */
+    while (ip < mflimit) {
+        ml = LZ4HC_InsertAndFindBestMatch (ctx, ip, matchlimit, (&ref), maxNbAttempts);
+        if (!ml) { ip++; continue; }
+
+        /* saved, in case we would skip too much */
+        start0 = ip;
+        ref0 = ref;
+        ml0 = ml;
+
+_Search2:
+        if (ip+ml < mflimit)
+            ml2 = LZ4HC_InsertAndGetWiderMatch(ctx, ip + ml - 2, ip + 0, matchlimit, ml, &ref2, &start2, maxNbAttempts);
+        else ml2 = ml;
+
+        if (ml2 == ml) { /* No better match */
+            if (LZ4HC_encodeSequence(&ip, &op, &anchor, ml, ref, limit, oend)) return 0;
+            continue;
+        }
+
+        if (start0 < ip) {
+            if (start2 < ip + ml0) {  /* empirical */
+                ip = start0;
+                ref = ref0;
+                ml = ml0;
+            }
+        }
+
+        /* Here, start0==ip */
+        if ((start2 - ip) < 3) {  /* First Match too small : removed */
+            ml = ml2;
+            ip = start2;
+            ref =ref2;
+            goto _Search2;
+        }
+
+_Search3:
+        /*
+        * Currently we have :
+        * ml2 > ml1, and
+        * ip1+3 <= ip2 (usually < ip1+ml1)
+        */
+        if ((start2 - ip) < OPTIMAL_ML) {
+            int correction;
+            int new_ml = ml;
+            if (new_ml > OPTIMAL_ML) new_ml = OPTIMAL_ML;
+            if (ip+new_ml > start2 + ml2 - MINMATCH) new_ml = (int)(start2 - ip) + ml2 - MINMATCH;
+            correction = new_ml - (int)(start2 - ip);
+            if (correction > 0) {
+                start2 += correction;
+                ref2 += correction;
+                ml2 -= correction;
+            }
+        }
+        /* Now, we have start2 = ip+new_ml, with new_ml = min(ml, OPTIMAL_ML=18) */
+
+        if (start2 + ml2 < mflimit)
+            ml3 = LZ4HC_InsertAndGetWiderMatch(ctx, start2 + ml2 - 3, start2, matchlimit, ml2, &ref3, &start3, maxNbAttempts);
+        else ml3 = ml2;
+
+        if (ml3 == ml2) {  /* No better match : 2 sequences to encode */
+            /* ip & ref are known; Now for ml */
+            if (start2 < ip+ml)  ml = (int)(start2 - ip);
+            /* Now, encode 2 sequences */
+            if (LZ4HC_encodeSequence(&ip, &op, &anchor, ml, ref, limit, oend)) return 0;
+            ip = start2;
+            if (LZ4HC_encodeSequence(&ip, &op, &anchor, ml2, ref2, limit, oend)) return 0;
+            continue;
+        }
+
+        if (start3 < ip+ml+3) {  /* Not enough space for match 2 : remove it */
+            if (start3 >= (ip+ml)) {  /* can write Seq1 immediately ==> Seq2 is removed, so Seq3 becomes Seq1 */
+                if (start2 < ip+ml) {
+                    int correction = (int)(ip+ml - start2);
+                    start2 += correction;
+                    ref2 += correction;
+                    ml2 -= correction;
+                    if (ml2 < MINMATCH) {
+                        start2 = start3;
+                        ref2 = ref3;
+                        ml2 = ml3;
+                    }
+                }
+
+                if (LZ4HC_encodeSequence(&ip, &op, &anchor, ml, ref, limit, oend)) return 0;
+                ip  = start3;
+                ref = ref3;
+                ml  = ml3;
+
+                start0 = start2;
+                ref0 = ref2;
+                ml0 = ml2;
+                goto _Search2;
+            }
+
+            start2 = start3;
+            ref2 = ref3;
+            ml2 = ml3;
+            goto _Search3;
+        }
+
+        /*
+        * OK, now we have 3 ascending matches; let's write at least the first one
+        * ip & ref are known; Now for ml
+        */
+        if (start2 < ip+ml) {
+            if ((start2 - ip) < (int)ML_MASK) {
+                int correction;
+                if (ml > OPTIMAL_ML) ml = OPTIMAL_ML;
+                if (ip + ml > start2 + ml2 - MINMATCH) ml = (int)(start2 - ip) + ml2 - MINMATCH;
+                correction = ml - (int)(start2 - ip);
+                if (correction > 0) {
+                    start2 += correction;
+                    ref2 += correction;
+                    ml2 -= correction;
+                }
+            } else {
+                ml = (int)(start2 - ip);
+            }
+        }
+        if (LZ4HC_encodeSequence(&ip, &op, &anchor, ml, ref, limit, oend)) return 0;
+
+        ip = start2;
+        ref = ref2;
+        ml = ml2;
+
+        start2 = start3;
+        ref2 = ref3;
+        ml2 = ml3;
+
+        goto _Search3;
+    }
+
+    /* Encode Last Literals */
+    {   int lastRun = (int)(iend - anchor);
+        if ((limit) && (((char*)op - dest) + lastRun + 1 + ((lastRun+255-RUN_MASK)/255) > (U32)maxOutputSize)) return 0;  /* Check output limit */
+        if (lastRun>=(int)RUN_MASK) { *op++=(RUN_MASK<<ML_BITS); lastRun-=RUN_MASK; for(; lastRun > 254 ; lastRun-=255) *op++ = 255; *op++ = (BYTE) lastRun; }
+        else *op++ = (BYTE)(lastRun<<ML_BITS);
+        memcpy(op, anchor, iend - anchor);
+        op += iend-anchor;
+    }
+
+    /* End */
+    return (int) (((char*)op)-dest);
 }
 
-static int lz4_compresshcctx(struct lz4hc_data *ctx,
-		const char *source,
-		char *dest,
-		int isize)
+int LZ4_sizeofStateHC(void) { return sizeof(LZ4_streamHC_t); }
+
+int LZ4_compress_HC_extStateHC (void* state, const char* src, char* dst, int srcSize, int maxDstSize, int compressionLevel)
 {
-	const u8 *ip = (const u8 *)source;
-	const u8 *anchor = ip;
-	const u8 *const iend = ip + isize;
-	const u8 *const mflimit = iend - MFLIMIT;
-	const u8 *const matchlimit = (iend - LASTLITERALS);
-
-	u8 *op = (u8 *)dest;
-
-	int ml, ml2, ml3, ml0;
-	const u8 *ref = NULL;
-	const u8 *start2 = NULL;
-	const u8 *ref2 = NULL;
-	const u8 *start3 = NULL;
-	const u8 *ref3 = NULL;
-	const u8 *start0;
-	const u8 *ref0;
-	int lastrun;
-
-	ip++;
-
-	/* Main Loop */
-	while (ip < mflimit) {
-		ml = lz4hc_insertandfindbestmatch(ctx, ip, matchlimit, (&ref));
-		if (!ml) {
-			ip++;
-			continue;
-		}
-
-		/* saved, in case we would skip too much */
-		start0 = ip;
-		ref0 = ref;
-		ml0 = ml;
-_search2:
-		if (ip+ml < mflimit)
-			ml2 = lz4hc_insertandgetwidermatch(ctx, ip + ml - 2,
-				ip + 1, matchlimit, ml, &ref2, &start2);
-		else
-			ml2 = ml;
-		/* No better match */
-		if (ml2 == ml) {
-			lz4_encodesequence(&ip, &op, &anchor, ml, ref);
-			continue;
-		}
-
-		if (start0 < ip) {
-			/* empirical */
-			if (start2 < ip + ml0) {
-				ip = start0;
-				ref = ref0;
-				ml = ml0;
-			}
-		}
-		/*
-		 * Here, start0==ip
-		 * First Match too small : removed
-		 */
-		if ((start2 - ip) < 3) {
-			ml = ml2;
-			ip = start2;
-			ref = ref2;
-			goto _search2;
-		}
-
-_search3:
-		/*
-		 * Currently we have :
-		 * ml2 > ml1, and
-		 * ip1+3 <= ip2 (usually < ip1+ml1)
-		 */
-		if ((start2 - ip) < OPTIMAL_ML) {
-			int correction;
-			int new_ml = ml;
-			if (new_ml > OPTIMAL_ML)
-				new_ml = OPTIMAL_ML;
-			if (ip + new_ml > start2 + ml2 - MINMATCH)
-				new_ml = (int)(start2 - ip) + ml2 - MINMATCH;
-			correction = new_ml - (int)(start2 - ip);
-			if (correction > 0) {
-				start2 += correction;
-				ref2 += correction;
-				ml2 -= correction;
-			}
-		}
-		/*
-		 * Now, we have start2 = ip+new_ml,
-		 * with new_ml=min(ml, OPTIMAL_ML=18)
-		 */
-		if (start2 + ml2 < mflimit)
-			ml3 = lz4hc_insertandgetwidermatch(ctx,
-				start2 + ml2 - 3, start2, matchlimit,
-				ml2, &ref3, &start3);
-		else
-			ml3 = ml2;
-
-		/* No better match : 2 sequences to encode */
-		if (ml3 == ml2) {
-			/* ip & ref are known; Now for ml */
-			if (start2 < ip+ml)
-				ml = (int)(start2 - ip);
-
-			/* Now, encode 2 sequences */
-			lz4_encodesequence(&ip, &op, &anchor, ml, ref);
-			ip = start2;
-			lz4_encodesequence(&ip, &op, &anchor, ml2, ref2);
-			continue;
-		}
-
-		/* Not enough space for match 2 : remove it */
-		if (start3 < ip + ml + 3) {
-			/*
-			 * can write Seq1 immediately ==> Seq2 is removed,
-			 * so Seq3 becomes Seq1
-			 */
-			if (start3 >= (ip + ml)) {
-				if (start2 < ip + ml) {
-					int correction =
-						(int)(ip + ml - start2);
-					start2 += correction;
-					ref2 += correction;
-					ml2 -= correction;
-					if (ml2 < MINMATCH) {
-						start2 = start3;
-						ref2 = ref3;
-						ml2 = ml3;
-					}
-				}
-
-				lz4_encodesequence(&ip, &op, &anchor, ml, ref);
-				ip  = start3;
-				ref = ref3;
-				ml  = ml3;
-
-				start0 = start2;
-				ref0 = ref2;
-				ml0 = ml2;
-				goto _search2;
-			}
-
-			start2 = start3;
-			ref2 = ref3;
-			ml2 = ml3;
-			goto _search3;
-		}
-
-		/*
-		 * OK, now we have 3 ascending matches; let's write at least
-		 * the first one ip & ref are known; Now for ml
-		 */
-		if (start2 < ip + ml) {
-			if ((start2 - ip) < (int)ML_MASK) {
-				int correction;
-				if (ml > OPTIMAL_ML)
-					ml = OPTIMAL_ML;
-				if (ip + ml > start2 + ml2 - MINMATCH)
-					ml = (int)(start2 - ip) + ml2
-						- MINMATCH;
-				correction = ml - (int)(start2 - ip);
-				if (correction > 0) {
-					start2 += correction;
-					ref2 += correction;
-					ml2 -= correction;
-				}
-			} else
-				ml = (int)(start2 - ip);
-		}
-		lz4_encodesequence(&ip, &op, &anchor, ml, ref);
-
-		ip = start2;
-		ref = ref2;
-		ml = ml2;
-
-		start2 = start3;
-		ref2 = ref3;
-		ml2 = ml3;
-
-		goto _search3;
-	}
-
-	/* Encode Last Literals */
-	lastrun = (int)(iend - anchor);
-	if (lastrun >= (int)RUN_MASK) {
-		*op++ = (RUN_MASK << ML_BITS);
-		lastrun -= RUN_MASK;
-		for (; lastrun > 254 ; lastrun -= 255)
-			*op++ = 255;
-		*op++ = (u8) lastrun;
-	} else
-		*op++ = (lastrun << ML_BITS);
-	memcpy(op, anchor, iend - anchor);
-	op += iend - anchor;
-	/* End */
-	return (int) (((char *)op) - dest);
+    LZ4HC_CCtx_internal* ctx = &((LZ4_streamHC_t*)state)->internal_donotuse;
+    if (((size_t)(state)&(sizeof(void*)-1)) != 0) return 0;   /* Error : state is not aligned for pointers (32 or 64 bits) */
+    LZ4HC_init (ctx, (const BYTE*)src);
+    if (maxDstSize < LZ4_compressBound(srcSize))
+        return LZ4HC_compress_generic (ctx, src, dst, srcSize, maxDstSize, compressionLevel, limitedOutput);
+    else
+        return LZ4HC_compress_generic (ctx, src, dst, srcSize, maxDstSize, compressionLevel, notLimited);
 }
 
-int lz4hc_compress(const unsigned char *src, size_t src_len,
-			unsigned char *dst, size_t *dst_len, void *wrkmem)
+int LZ4_compress_HC(const char* src, char* dst, int srcSize, int maxDstSize, int compressionLevel, void* wrkmem)
 {
-	int ret = -1;
-	int out_len = 0;
+    return LZ4_compress_HC_extStateHC(wrkmem, src, dst, srcSize, maxDstSize, compressionLevel);
+}
 
-	struct lz4hc_data *hc4 = (struct lz4hc_data *)wrkmem;
-	lz4hc_init(hc4, (const u8 *)src);
-	out_len = lz4_compresshcctx((struct lz4hc_data *)hc4, (const u8 *)src,
-		(char *)dst, (int)src_len);
+int lz4hc_compress(const unsigned char *src, size_t src_len, unsigned char *dst, size_t *dst_len, void *wrkmem)
+{
+  *dst_len = LZ4_compress_HC(src, dst, (int)src_len, (int)((size_t)dst_len), LZ4HC_DEFAULT_CLEVEL, wrkmem);
 
-	if (out_len < 0)
-		goto exit;
+  return (int)((size_t)dst_len);
+}
 
-	*dst_len = out_len;
-	return 0;
+/**************************************
+*  Streaming Functions
+**************************************/
 
-exit:
-	return ret;
+int LZ4_loadDictHC (LZ4_streamHC_t* LZ4_streamHCPtr, const char* dictionary, int dictSize)
+{
+    LZ4HC_CCtx_internal* ctxPtr = &LZ4_streamHCPtr->internal_donotuse;
+    if (dictSize > 64 KB) {
+        dictionary += dictSize - 64 KB;
+        dictSize = 64 KB;
+    }
+    LZ4HC_init (ctxPtr, (const BYTE*)dictionary);
+    if (dictSize >= 4) LZ4HC_Insert (ctxPtr, (const BYTE*)dictionary +(dictSize-3));
+    ctxPtr->end = (const BYTE*)dictionary + dictSize;
+    return dictSize;
+}
+
+
+/* compression */
+
+static void LZ4HC_setExternalDict(LZ4HC_CCtx_internal* ctxPtr, const BYTE* newBlock)
+{
+    if (ctxPtr->end >= ctxPtr->base + 4) LZ4HC_Insert (ctxPtr, ctxPtr->end-3);   /* Referencing remaining dictionary content */
+    /* Only one memory segment for extDict, so any previous extDict is lost at this stage */
+    ctxPtr->lowLimit  = ctxPtr->dictLimit;
+    ctxPtr->dictLimit = (U32)(ctxPtr->end - ctxPtr->base);
+    ctxPtr->dictBase  = ctxPtr->base;
+    ctxPtr->base = newBlock - ctxPtr->dictLimit;
+    ctxPtr->end  = newBlock;
+    ctxPtr->nextToUpdate = ctxPtr->dictLimit;   /* match referencing will resume from there */
+}
+
+static int LZ4_compressHC_continue_generic (LZ4_streamHC_t* LZ4_streamHCPtr,
+                                            const char* source, char* dest,
+                                            int inputSize, int maxOutputSize, limitedOutput_directive limit)
+{
+    LZ4HC_CCtx_internal* ctxPtr = &LZ4_streamHCPtr->internal_donotuse;
+    /* auto-init if forgotten */
+    if (ctxPtr->base == NULL) LZ4HC_init (ctxPtr, (const BYTE*) source);
+
+    /* Check overflow */
+    if ((size_t)(ctxPtr->end - ctxPtr->base) > 2 GB) {
+        size_t dictSize = (size_t)(ctxPtr->end - ctxPtr->base) - ctxPtr->dictLimit;
+        if (dictSize > 64 KB) dictSize = 64 KB;
+        LZ4_loadDictHC(LZ4_streamHCPtr, (const char*)(ctxPtr->end) - dictSize, (int)dictSize);
+    }
+
+    /* Check if blocks follow each other */
+    if ((const BYTE*)source != ctxPtr->end) LZ4HC_setExternalDict(ctxPtr, (const BYTE*)source);
+
+    /* Check overlapping input/dictionary space */
+    {   const BYTE* sourceEnd = (const BYTE*) source + inputSize;
+        const BYTE* const dictBegin = ctxPtr->dictBase + ctxPtr->lowLimit;
+        const BYTE* const dictEnd   = ctxPtr->dictBase + ctxPtr->dictLimit;
+        if ((sourceEnd > dictBegin) && ((const BYTE*)source < dictEnd)) {
+            if (sourceEnd > dictEnd) sourceEnd = dictEnd;
+            ctxPtr->lowLimit = (U32)(sourceEnd - ctxPtr->dictBase);
+            if (ctxPtr->dictLimit - ctxPtr->lowLimit < 4) ctxPtr->lowLimit = ctxPtr->dictLimit;
+        }
+    }
+
+    return LZ4HC_compress_generic (ctxPtr, source, dest, inputSize, maxOutputSize, ctxPtr->compressionLevel, limit);
+}
+
+int LZ4_compress_HC_continue (LZ4_streamHC_t* LZ4_streamHCPtr, const char* source, char* dest, int inputSize, int maxOutputSize)
+{
+    if (maxOutputSize < LZ4_compressBound(inputSize))
+        return LZ4_compressHC_continue_generic (LZ4_streamHCPtr, source, dest, inputSize, maxOutputSize, limitedOutput);
+    else
+        return LZ4_compressHC_continue_generic (LZ4_streamHCPtr, source, dest, inputSize, maxOutputSize, notLimited);
 }
-EXPORT_SYMBOL(lz4hc_compress);
 
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_DESCRIPTION("LZ4HC compressor");
+MODULE_DESCRIPTION("LZ4 HC compressor");
+
+/* Kernel exports */
+EXPORT_SYMBOL(LZ4_compress_HC);
+EXPORT_SYMBOL(lz4hc_compress);
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 2/4] lib/decompress_unlz4: Change module to work with new LZ4 module version
From: Sven Schmidt @ 2017-01-07 16:55 UTC (permalink / raw)
  To: akpm
  Cc: bongkyu.kim, rsalvaterra, sergey.senozhatsky, gregkh,
	linux-kernel, herbert, davem, linux-crypto, anton, ccross,
	keescook, tony.luck, phillip, Sven Schmidt
In-Reply-To: <1483808145-18417-1-git-send-email-4sschmid@informatik.uni-hamburg.de>

This patch updates the unlz4 wrapper to work with the new LZ4 kernel module version.

Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
---
 lib/decompress_unlz4.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
index 036fc88..1b0baf3 100644
--- a/lib/decompress_unlz4.c
+++ b/lib/decompress_unlz4.c
@@ -72,7 +72,7 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 		error("NULL input pointer and missing fill function");
 		goto exit_1;
 	} else {
-		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
+		inp = large_malloc(LZ4_compressBound(uncomp_chunksize));
 		if (!inp) {
 			error("Could not allocate input buffer");
 			goto exit_1;
@@ -136,7 +136,7 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 			inp += 4;
 			size -= 4;
 		} else {
-			if (chunksize > lz4_compressbound(uncomp_chunksize)) {
+			if (chunksize > LZ4_compressBound(uncomp_chunksize)) {
 				error("chunk length is longer than allocated");
 				goto exit_2;
 			}
@@ -152,11 +152,14 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
 			out_len -= dest_len;
 		} else
 			dest_len = out_len;
-		ret = lz4_decompress(inp, &chunksize, outp, dest_len);
+
+		ret = LZ4_decompress_fast(inp, outp, dest_len);
+		chunksize = ret;
 #else
 		dest_len = uncomp_chunksize;
-		ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp,
-				&dest_len);
+
+		ret = LZ4_decompress_safe(inp, outp, chunksize, dest_len);
+		dest_len = ret;
 #endif
 		if (ret < 0) {
 			error("Decoding failed");
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 3/4] crypto: Change LZ4 modules to work with new LZ4 module version
From: Sven Schmidt @ 2017-01-07 16:55 UTC (permalink / raw)
  To: akpm
  Cc: bongkyu.kim, rsalvaterra, sergey.senozhatsky, gregkh,
	linux-kernel, herbert, davem, linux-crypto, anton, ccross,
	keescook, tony.luck, phillip, Sven Schmidt
In-Reply-To: <1483808145-18417-1-git-send-email-4sschmid@informatik.uni-hamburg.de>

This patch updates the crypto modules using LZ4 compression to work with the
new LZ4 module version.

Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
---
 crypto/lz4.c   | 27 ++++++++++++++-------------
 crypto/lz4hc.c | 25 +++++++++++++------------
 2 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/crypto/lz4.c b/crypto/lz4.c
index 99c1b2c..b969e5f 100644
--- a/crypto/lz4.c
+++ b/crypto/lz4.c
@@ -66,15 +66,16 @@ static void lz4_exit(struct crypto_tfm *tfm)
 static int __lz4_compress_crypto(const u8 *src, unsigned int slen,
 				 u8 *dst, unsigned int *dlen, void *ctx)
 {
-	size_t tmp_len = *dlen;
-	int err;
+	int out_len;
 
-	err = lz4_compress(src, slen, dst, &tmp_len, ctx);
+	out_len = LZ4_compress_default(src, dst, slen, (int)((size_t)dlen), ctx);
 
-	if (err < 0)
-		return -EINVAL;
+	if (out_len == 0) {
+		// out_len is 0 means an error occured
+			return -EINVAL;
+	}
 
-	*dlen = tmp_len;
+	*dlen = out_len;
 	return 0;
 }
 
@@ -96,16 +97,16 @@ static int lz4_compress_crypto(struct crypto_tfm *tfm, const u8 *src,
 static int __lz4_decompress_crypto(const u8 *src, unsigned int slen,
 				   u8 *dst, unsigned int *dlen, void *ctx)
 {
-	int err;
-	size_t tmp_len = *dlen;
-	size_t __slen = slen;
+	int out_len;
 
-	err = lz4_decompress_unknownoutputsize(src, __slen, dst, &tmp_len);
-	if (err < 0)
+	out_len = LZ4_decompress_safe(src, dst, slen, (int)((size_t)dlen));
+	if (out_len < 0) {
+		// out_len of less than 0 means an error occured
 		return -EINVAL;
+	}
 
-	*dlen = tmp_len;
-	return err;
+	*dlen = out_len;
+	return out_len;
 }
 
 static int lz4_sdecompress(struct crypto_scomp *tfm, const u8 *src,
diff --git a/crypto/lz4hc.c b/crypto/lz4hc.c
index 75ffc4a..bf2ceb7 100644
--- a/crypto/lz4hc.c
+++ b/crypto/lz4hc.c
@@ -65,15 +65,16 @@ static void lz4hc_exit(struct crypto_tfm *tfm)
 static int __lz4hc_compress_crypto(const u8 *src, unsigned int slen,
 				   u8 *dst, unsigned int *dlen, void *ctx)
 {
-	size_t tmp_len = *dlen;
-	int err;
+	int out_len;
 
-	err = lz4hc_compress(src, slen, dst, &tmp_len, ctx);
+	out_len = LZ4_compress_HC(src, dst, slen, (int)((size_t)dlen), LZ4HC_DEFAULT_CLEVEL, ctx);
 
-	if (err < 0)
+	if (out_len == 0) {
+		// out_len of 0 -> error
 		return -EINVAL;
+	}
 
-	*dlen = tmp_len;
+	*dlen = out_len;
 	return 0;
 }
 
@@ -97,16 +98,16 @@ static int lz4hc_compress_crypto(struct crypto_tfm *tfm, const u8 *src,
 static int __lz4hc_decompress_crypto(const u8 *src, unsigned int slen,
 				     u8 *dst, unsigned int *dlen, void *ctx)
 {
-	int err;
-	size_t tmp_len = *dlen;
-	size_t __slen = slen;
+	int out_len;
 
-	err = lz4_decompress_unknownoutputsize(src, __slen, dst, &tmp_len);
-	if (err < 0)
+	out_len = LZ4_decompress_safe(src, dst, slen, (int)((size_t)dlen));
+  if (out_len < 0) {
+		// out_len of less than 0 means an error occured
 		return -EINVAL;
+	}
 
-	*dlen = tmp_len;
-	return err;
+	*dlen = out_len;
+	return out_len;
 }
 
 static int lz4hc_sdecompress(struct crypto_scomp *tfm, const u8 *src,
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 4/4] fs/pstore: fs/squashfs: Change usage of LZ4 to comply with new LZ4 module version
From: Sven Schmidt @ 2017-01-07 16:55 UTC (permalink / raw)
  To: akpm
  Cc: bongkyu.kim, rsalvaterra, sergey.senozhatsky, gregkh,
	linux-kernel, herbert, davem, linux-crypto, anton, ccross,
	keescook, tony.luck, phillip, Sven Schmidt
In-Reply-To: <1483808145-18417-1-git-send-email-4sschmid@informatik.uni-hamburg.de>

This patch updates fs/pstore and fs/squashfs to use the updated functions from
the new LZ4 module.

Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
---
 fs/pstore/platform.c      | 14 ++++++++------
 fs/squashfs/lz4_wrapper.c | 12 ++++++------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 729677e..a0d8ca8 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -342,31 +342,33 @@ static int compress_lz4(const void *in, void *out, size_t inlen, size_t outlen)
 {
 	int ret;
 
-	ret = lz4_compress(in, inlen, out, &outlen, workspace);
+	ret = LZ4_compress_default(in, out, inlen, outlen, workspace);
 	if (ret) {
+		// ret is 0 means an error occured
 		pr_err("lz4_compress error, ret = %d!\n", ret);
 		return -EIO;
 	}
 
-	return outlen;
+	return ret;
 }
 
 static int decompress_lz4(void *in, void *out, size_t inlen, size_t outlen)
 {
 	int ret;
 
-	ret = lz4_decompress_unknownoutputsize(in, inlen, out, &outlen);
-	if (ret) {
+	ret = LZ4_decompress_safe(in, out, inlen, outlen);
+	if (ret < 0) {
+		// return value is < 0 in case of error
 		pr_err("lz4_decompress error, ret = %d!\n", ret);
 		return -EIO;
 	}
 
-	return outlen;
+	return ret;
 }
 
 static void allocate_lz4(void)
 {
-	big_oops_buf_sz = lz4_compressbound(psinfo->bufsize);
+	big_oops_buf_sz = LZ4_compressBound(psinfo->bufsize);
 	big_oops_buf = kmalloc(big_oops_buf_sz, GFP_KERNEL);
 	if (big_oops_buf) {
 		workspace = kmalloc(LZ4_MEM_COMPRESS, GFP_KERNEL);
diff --git a/fs/squashfs/lz4_wrapper.c b/fs/squashfs/lz4_wrapper.c
index ff4468b..a512399 100644
--- a/fs/squashfs/lz4_wrapper.c
+++ b/fs/squashfs/lz4_wrapper.c
@@ -97,7 +97,6 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
 	struct squashfs_lz4 *stream = strm;
 	void *buff = stream->input, *data;
 	int avail, i, bytes = length, res;
-	size_t dest_len = output->length;
 
 	for (i = 0; i < b; i++) {
 		avail = min(bytes, msblk->devblksize - offset);
@@ -108,12 +107,13 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
 		put_bh(bh[i]);
 	}
 
-	res = lz4_decompress_unknownoutputsize(stream->input, length,
-					stream->output, &dest_len);
-	if (res)
+	res = LZ4_decompress_safe(stream->input, stream->output, length, output->length);
+	if (res < 0) {
+		// res of less than 0 means an error occured
 		return -EIO;
+	}
 
-	bytes = dest_len;
+	bytes = res;
 	data = squashfs_first_page(output);
 	buff = stream->output;
 	while (data) {
@@ -128,7 +128,7 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
 	}
 	squashfs_finish_page(output);
 
-	return dest_len;
+	return res;
 }
 
 const struct squashfs_decompressor squashfs_lz4_comp_ops = {
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH v2 4/4] fs/pstore: fs/squashfs: Change usage of LZ4 to comply with new LZ4 module version
From: Kees Cook @ 2017-01-07 21:33 UTC (permalink / raw)
  To: Sven Schmidt
  Cc: Andrew Morton, bongkyu.kim, rsalvaterra, Sergey Senozhatsky,
	Greg KH, LKML, Herbert Xu, David S. Miller, linux-crypto,
	Anton Vorontsov, Colin Cross, Tony Luck, phillip
In-Reply-To: <1483808145-18417-5-git-send-email-4sschmid@informatik.uni-hamburg.de>

On Sat, Jan 7, 2017 at 8:55 AM, Sven Schmidt
<4sschmid@informatik.uni-hamburg.de> wrote:
> This patch updates fs/pstore and fs/squashfs to use the updated functions from
> the new LZ4 module.
>
> Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
> ---
>  fs/pstore/platform.c      | 14 ++++++++------
>  fs/squashfs/lz4_wrapper.c | 12 ++++++------
>  2 files changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 729677e..a0d8ca8 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -342,31 +342,33 @@ static int compress_lz4(const void *in, void *out, size_t inlen, size_t outlen)
>  {
>         int ret;
>
> -       ret = lz4_compress(in, inlen, out, &outlen, workspace);
> +       ret = LZ4_compress_default(in, out, inlen, outlen, workspace);
>         if (ret) {
> +               // ret is 0 means an error occured

If that's true, then shouldn't the "if" logic be changed? Also, here
and in all following comments are C++ style instead of kernel C-style.
This should be "/* ret == 0 means an error occured */", though really,
that should be obvious from the code and the comment isn't really
needed.

>                 pr_err("lz4_compress error, ret = %d!\n", ret);

If it's always going to be zero here, is there a better place to get
details on why it failed?

>                 return -EIO;
>         }
>
> -       return outlen;
> +       return ret;
>  }
>
>  static int decompress_lz4(void *in, void *out, size_t inlen, size_t outlen)
>  {
>         int ret;
>
> -       ret = lz4_decompress_unknownoutputsize(in, inlen, out, &outlen);
> -       if (ret) {
> +       ret = LZ4_decompress_safe(in, out, inlen, outlen);
> +       if (ret < 0) {
> +               // return value is < 0 in case of error
>                 pr_err("lz4_decompress error, ret = %d!\n", ret);
>                 return -EIO;
>         }
>
> -       return outlen;
> +       return ret;
>  }
>
>  static void allocate_lz4(void)
>  {
> -       big_oops_buf_sz = lz4_compressbound(psinfo->bufsize);
> +       big_oops_buf_sz = LZ4_compressBound(psinfo->bufsize);
>         big_oops_buf = kmalloc(big_oops_buf_sz, GFP_KERNEL);
>         if (big_oops_buf) {
>                 workspace = kmalloc(LZ4_MEM_COMPRESS, GFP_KERNEL);
> diff --git a/fs/squashfs/lz4_wrapper.c b/fs/squashfs/lz4_wrapper.c
> index ff4468b..a512399 100644
> --- a/fs/squashfs/lz4_wrapper.c
> +++ b/fs/squashfs/lz4_wrapper.c
> @@ -97,7 +97,6 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
>         struct squashfs_lz4 *stream = strm;
>         void *buff = stream->input, *data;
>         int avail, i, bytes = length, res;
> -       size_t dest_len = output->length;
>
>         for (i = 0; i < b; i++) {
>                 avail = min(bytes, msblk->devblksize - offset);
> @@ -108,12 +107,13 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
>                 put_bh(bh[i]);
>         }
>
> -       res = lz4_decompress_unknownoutputsize(stream->input, length,
> -                                       stream->output, &dest_len);
> -       if (res)
> +       res = LZ4_decompress_safe(stream->input, stream->output, length, output->length);
> +       if (res < 0) {
> +               // res of less than 0 means an error occured
>                 return -EIO;
> +       }
>
> -       bytes = dest_len;
> +       bytes = res;
>         data = squashfs_first_page(output);
>         buff = stream->output;
>         while (data) {
> @@ -128,7 +128,7 @@ static int lz4_uncompress(struct squashfs_sb_info *msblk, void *strm,
>         }
>         squashfs_finish_page(output);
>
> -       return dest_len;
> +       return res;
>  }
>
>  const struct squashfs_decompressor squashfs_lz4_comp_ops = {
> --
> 2.1.4
>

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply

* Re: [PATCH v2 1/4] lib: Update LZ4 compressor module based on LZ4 v1.7.2.
From: Greg KH @ 2017-01-08 11:22 UTC (permalink / raw)
  To: Sven Schmidt
  Cc: akpm, bongkyu.kim, rsalvaterra, sergey.senozhatsky, linux-kernel,
	herbert, davem, linux-crypto, anton, ccross, keescook, tony.luck,
	phillip
In-Reply-To: <1483808145-18417-2-git-send-email-4sschmid@informatik.uni-hamburg.de>

On Sat, Jan 07, 2017 at 05:55:42PM +0100, Sven Schmidt wrote:
> +/*!LZ4_compressbound() :
> +    Provides the maximum size that LZ4 may output in a "worst case" scenario
> +    (input data not compressible)
> +*/

Odd coding style, please use kerneldoc format if you are going to have
comments like this.

> +static inline int LZ4_compressBound(int isize) {
> +  return LZ4_COMPRESSBOUND(isize);

And follow the proper kernel coding style rules, putting your patches
through scripts/checkpatch.pl should help you out here.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2 2/4] lib/decompress_unlz4: Change module to work with new LZ4 module version
From: Greg KH @ 2017-01-08 11:23 UTC (permalink / raw)
  To: Sven Schmidt
  Cc: akpm, bongkyu.kim, rsalvaterra, sergey.senozhatsky, linux-kernel,
	herbert, davem, linux-crypto, anton, ccross, keescook, tony.luck,
	phillip
In-Reply-To: <1483808145-18417-3-git-send-email-4sschmid@informatik.uni-hamburg.de>

On Sat, Jan 07, 2017 at 05:55:43PM +0100, Sven Schmidt wrote:
> This patch updates the unlz4 wrapper to work with the new LZ4 kernel module version.
> 
> Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
> ---
>  lib/decompress_unlz4.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/decompress_unlz4.c b/lib/decompress_unlz4.c
> index 036fc88..1b0baf3 100644
> --- a/lib/decompress_unlz4.c
> +++ b/lib/decompress_unlz4.c
> @@ -72,7 +72,7 @@ STATIC inline int INIT unlz4(u8 *input, long in_len,
>  		error("NULL input pointer and missing fill function");
>  		goto exit_1;
>  	} else {
> -		inp = large_malloc(lz4_compressbound(uncomp_chunksize));
> +		inp = large_malloc(LZ4_compressBound(uncomp_chunksize));

Having functions differ by different cases of the characters is ripe for
abuse and confusion.  Please never do that, especially as these "new"
functions you created don't follow the correct kernel coding style rules
:(

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2 1/4] lib: Update LZ4 compressor module based on LZ4 v1.7.2.
From: Greg KH @ 2017-01-08 11:25 UTC (permalink / raw)
  To: Sven Schmidt
  Cc: akpm, bongkyu.kim, rsalvaterra, sergey.senozhatsky, linux-kernel,
	herbert, davem, linux-crypto, anton, ccross, keescook, tony.luck,
	phillip
In-Reply-To: <1483808145-18417-2-git-send-email-4sschmid@informatik.uni-hamburg.de>

On Sat, Jan 07, 2017 at 05:55:42PM +0100, Sven Schmidt wrote:
> This patch updates LZ4 kernel module to LZ4 v1.7.2 by Yann Collet.
> The kernel module is inspired by the previous work by Chanho Min.
> The updated LZ4 module will not break existing code since there were alias
> methods added to ensure backwards compatibility.

Meta-comment.  Does this update include all of the security fixes that
we have made over the past few years to the lz4 code?  I don't want to
be adding back insecure functions that will cause us problems.

Specifically look at the changes I made in 2014 in this directory for an
example of what I am talking about here.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2 1/4] lib: Update LZ4 compressor module based on LZ4 v1.7.2.
From: Rui Salvaterra @ 2017-01-08 11:33 UTC (permalink / raw)
  To: Greg KH
  Cc: Sven Schmidt, akpm, bongkyu.kim, sergey.senozhatsky, linux-kernel,
	herbert, davem, linux-crypto, anton, ccross, keescook, tony.luck,
	phillip
In-Reply-To: <20170108112542.GC12798@kroah.com>

On 8 January 2017 at 11:25, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Sat, Jan 07, 2017 at 05:55:42PM +0100, Sven Schmidt wrote:
>> This patch updates LZ4 kernel module to LZ4 v1.7.2 by Yann Collet.
>> The kernel module is inspired by the previous work by Chanho Min.
>> The updated LZ4 module will not break existing code since there were alias
>> methods added to ensure backwards compatibility.
>
> Meta-comment.  Does this update include all of the security fixes that
> we have made over the past few years to the lz4 code?  I don't want to
> be adding back insecure functions that will cause us problems.
>
> Specifically look at the changes I made in 2014 in this directory for an
> example of what I am talking about here.
>
> thanks,
>
> greg k-h

Also, this series must be tested on big endian, to make sure the last
fixes we made don't regress.


Thanks,

Rui

^ permalink raw reply

* Is the asynchronous hash crypto API asynchronous?
From: Gilad Ben-Yossef @ 2017-01-08 13:45 UTC (permalink / raw)
  To: linux-crypto; +Cc: Herbert Xu, David Miller

Hi,

My apologies in advance for the length of this email on what sounds
like a trivial $SUBJECT. I'm really at my wits end.

I'm working on giving dm-verity a make over and ran into something I'm
not clear about with regard to asynchronous Hash API.

Documentation/crypto/architecture.rst says:

"Asynchronous operation is provided by the kernel crypto API which
implies that the invocation of a cipher operation will complete almost
instantly. That invocation triggers the cipher operation but it does not
signal its completion. Before invoking a cipher operation, the caller
must provide a callback function the kernel crypto API can invoke to
signal the completion of the cipher operation. Furthermore, the caller
must ensure it can handle such asynchronous events by applying
appropriate locking around its data. The kernel crypto API does not
perform any special serialization operation to protect the caller's data
integrity."

Well, that sounds sane and what I would expect from an asynchronous  API.

api-intro.rst in same directory though includes this example however:

        #include <crypto/hash.h>
        #include <linux/err.h>
        #include <linux/scatterlist.h>

        struct scatterlist sg[2];
        char result[128];
        struct crypto_ahash *tfm;
        struct ahash_request *req;

        tfm = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
        if (IS_ERR(tfm))
                fail();

        /* ... set up the scatterlists ... */

        req = ahash_request_alloc(tfm, GFP_ATOMIC);
        if (!req)
                fail();

        ahash_request_set_callback(req, 0, NULL, NULL);
        ahash_request_set_crypt(req, sg, result, 2);

        if (crypto_ahash_digest(req))
                fail();

        ahash_request_free(req);
        crypto_free_ahash(tfm);

Note the NULL call back function parameter and distinct lack of any
synchronization operations on completion after crypto_ahash_digest()
is done.
Also, the code checks the return value of crypto_ahash_digest() in a
way that seems to imply the return value is the one of performing the
digest, NOT queuing an asynchronous request which later figure out if
it is indeed successful.
Well, that does not look like an invocation of an asynchronous API at all.

hmm.... documentation have been known to go out of sync (pun not
intended) with code in past. What does the API doc for
crypto_ahash_digest() say?

/**
 * crypto_ahash_digest() - calculate message digest for a buffer
 * @req: reference to the ahash_request handle that holds all information
 *       needed to perform the cipher operation
 *
 * This function is a "short-hand" for the function calls of crypto_ahash_init,
 * crypto_ahash_update and crypto_ahash_final. The parameters have the same
 * meaning as discussed for those separate three functions.
 *
 * Return: 0 if the message digest creation was successful; < 0 if an error
 *         occurred
 */
int crypto_ahash_digest(struct ahash_request *req);

Note remark says the return value indicates the *digest creation* was
successful, not submitting an asynchronous request!
This is indeed in-line in with the code example above but isn't really
asynchronous.

OK, I'm officially confused. What does the code do?

ahash_request_set_callback() doesn't seem to do anything special with
a NULL call back function:

 static inline void ahash_request_set_callback(struct ahash_request *req,
                                              u32 flags,
                                              crypto_completion_t compl,
                                              void *data)
{
        req->base.complete = compl;
        req->base.data = data;
        req->base.flags = flags;
}

Neither does the completion call site treats it in a special way
(ignoring for now the mad house which is the handling of the unaligned
requests case):

static void ahash_def_finup_done2(struct crypto_async_request *req, int err)
{
        struct ahash_request *areq = req->data;

        ahash_def_finup_finish2(areq, err);

        areq->base.complete(&areq->base, err);
}

hmmm... perhaps the specific algorithm module handle this?

rockchip/rk3288_crypto_ahash.c indeed seems to handle a NULL call back
in a sense:

static void rk_ahash_crypto_complete(struct rk_crypto_info *dev, int err)
{
        if (dev->ahash_req->base.complete)
                dev->ahash_req->base.complete(&dev->ahash_req->base, err);
}

And since it's busy waiting for completion, the example code might even work:

static int rk_ahash_digest(struct ahash_request *req)
{
        struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
        struct rk_ahash_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
        struct rk_crypto_info *dev = NULL;
        unsigned long flags;
        int ret;
...
        tasklet_schedule(&dev->crypto_tasklet);

        /*
         * it will take some time to process date after last dma transmission.
         *
         * waiting time is relative with the last date len,
         * so cannot set a fixed time here.
         * 10-50 makes system not call here frequently wasting
         * efficiency, and make it response quickly when dma
         * complete.
         */
        while (!CRYPTO_READ(dev, RK_CRYPTO_HASH_STS))
                usleep_range(10, 50);

        memcpy_fromio(req->result, dev->reg + RK_CRYPTO_HASH_DOUT_0,
                      crypto_ahash_digestsize(tfm));

        return 0;
}

Not exactly my cup of tea and I wouldn't call this asynchronous but I
guess you can say it gets the job done.

caam/caamhash.c however tells a completely different story:

static void ahash_done(struct device *jrdev, u32 *desc, u32 err,
                       void *context)
{
        struct ahash_request *req = context;
        struct ahash_edesc *edesc;
...
        req->base.complete(&req->base, err);
}

Note distinct lack of checking for the possibility the registered
callback is NULL, which in my eyes is perfectly sane choice for an
implementation of an asynchronous callback invocation assuming the
wrapping layer would have caught that, which in this case it doesn't.

Also:

static int ahash_digest(struct ahash_request *req)
{
        struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
        struct caam_hash_ctx *ctx = crypto_ahash_ctx(ahash);
        struct device *jrdev = ctx->jrdev;
        gfp_t flags = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
                       CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
...
        ret = caam_jr_enqueue(jrdev, desc, ahash_done, req);
        if (!ret) {
                ret = -EINPROGRESS;
        } else {
                ahash_unmap(jrdev, edesc, req, digestsize);
                kfree(edesc);
        }

        return ret;
}

Which means the normal return value for ahash_digest would be
-EINPROGRESS, which is actually as expected for an asynchronous except
that will not work with the code example in api-intro.rst.

So... I am totally confused. The documentation claims this is an
asynchronous interface, but then its own code examples beg to differ
and actual implementations varies.

Would anyone be kind enough to enlighten me?

Many thanks,
Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply

* Re: [PATCH 0/6] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11
From: Ard Biesheuvel @ 2017-01-09  9:21 UTC (permalink / raw)
  To: linux-crypto@vger.kernel.org
  Cc: linux-arm-kernel@lists.infradead.org, Herbert Xu, Ard Biesheuvel
In-Reply-To: <CAKv+Gu8wWW-c6dYcsRtD8LK4nrP=6pQwb63OkJTyGTYhK3Ryyw@mail.gmail.com>

On 3 January 2017 at 20:01, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 2 January 2017 at 18:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> This series adds SIMD implementations for arm64 and ARM of ChaCha20 (*),
>> and a port of the ARM bit-sliced AES algorithm to arm64, and
>>
>> Patch #1 is a prerequisite for the AES-XTS implementation in #6, which needs
>> a secondary AES transform to generate the initial tweak.
>>
>
> Herbert,
>
> I actually have a scalar AES implementation for arm64 which I could
> use instead, making this patch unnecessary.
>
> I could respin the entire series, or you could simply disregard #1 and
> #6 for now, whichever you prefer.
>

I ended up doing some more work on the scalar and bit sliced AES
implementations for both ARM and arm64, so everything in this series
except the chacha20 patches (#3, #4) is now superseded.

^ permalink raw reply

* Re: Is the asynchronous hash crypto API asynchronous?
From: Stephan Müller @ 2017-01-09 10:23 UTC (permalink / raw)
  To: Gilad Ben-Yossef; +Cc: linux-crypto, Herbert Xu, David Miller
In-Reply-To: <CAOtvUMd=EHaFq=AtMePzN9fsoMLszZ0ruAu8KaaP=W=8J7nGUg@mail.gmail.com>

Am Sonntag, 8. Januar 2017, 15:45:37 CET schrieb Gilad Ben-Yossef:

Hi Gilad,

>         ahash_request_set_callback(req, 0, NULL, NULL);

> 
> Would anyone be kind enough to enlighten me?

The documentation got out of sync with the real world. I will file a patch for 
that shortly.

The ahash API works identically to the async skcipher API for which you find 
an example in the api-samples.rst. There you see that with the set_callback, a 
function is registered that is triggered upon completion of the operation.

Thus, use the callback example you find for skcipher for your ahash operation 
and you get an async operation.

Ciao
Stephan

^ permalink raw reply

* Outlook Security Team Micorosof
From: Devaraj Veerasamy, Dr @ 2017-01-09 11:21 UTC (permalink / raw)
  To: "\"NO-REPLY@WEBMAIL.NET""

MICROSOFT OUTLOOK anmälan
Din e-rutan konto behöver vara verifiera nu för oegentligheter finns i din e-box-konto eller kommer att blockera. Klicka här<https://mrswangjuan17.wixsite.com/webaccess2017> för att verifiera din e-postkonto och fil i ditt korrekta användarnamn och lösenord omedelbart
Outlook Security Team Micorosof
Tack.

Copyright © 2017 MIcrosoft OUtlook . Inc . All rights reserved.

^ permalink raw reply

* Re: Is the asynchronous hash crypto API asynchronous?
From: Gilad Ben-Yossef @ 2017-01-09 12:08 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto, Herbert Xu, David Miller
In-Reply-To: <3639236.s9QSuQoh7d@positron.chronox.de>

Hello Stephen,

Before getting to business I wish to offer my thanks for hosting the
kernel crypto documentation on your web site at chronox.de. It has
proven very useful to me :-)

On Mon, Jan 9, 2017 at 12:23 PM, Stephan Müller <smueller@chronox.de> wrote:
> Am Sonntag, 8. Januar 2017, 15:45:37 CET schrieb Gilad Ben-Yossef:
>
> Hi Gilad,
>
>>         ahash_request_set_callback(req, 0, NULL, NULL);
>
>>
>> Would anyone be kind enough to enlighten me?
>
> The documentation got out of sync with the real world. I will file a patch for
> that shortly.
>
> The ahash API works identically to the async skcipher API for which you find
> an example in the api-samples.rst. There you see that with the set_callback, a
> function is registered that is triggered upon completion of the operation.
>
> Thus, use the callback example you find for skcipher for your ahash operation
> and you get an async operation.

Thank you very much for pointing out the skcipher example. It is very helpful.

However, I suspect there is something a miss here beyond documentation:

There is (quite a lot of) kernel code calling
ahash_request_set_callback() with a NULL callback and consequently not
performing any synchronization in the completion of the hashing
operations.

See for example crypt_iv_essiv_init() in drivers/md/dm-crypt.c
Other similar call sites can be found at block/drbd/drbd_worker.c,
net/ppp/ppp_mppe.c, net/wireless/intersil/orinoco/mic.c,
scsi/iscsi_tcp.c, target/iscsi/iscsi_target_login.c

As far as I could tell the code in crypto/ahash.c does not take any
special consideration of the case where a NULL call back function has
been set and at least one of the underlying ahash algorithm provider
will crash if used like this.

This seems broken to me. I would be very happy to offer to fix the
broken call sites, if you can only confirm my understanding that
indeed cases which register a NULL callback are broken and it is not
just a misunderstanding on my part.

Many thanks,
Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker

"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
 -- Jean-Baptiste Queru

^ permalink raw reply

* Re: Is the asynchronous hash crypto API asynchronous?
From: Stephan Müller @ 2017-01-09 15:25 UTC (permalink / raw)
  To: Gilad Ben-Yossef; +Cc: linux-crypto, Herbert Xu, David Miller
In-Reply-To: <CAOtvUMfheVHLH6y96+YjScZOYvBJegF6GwuBynR1LEgFt-O4=w@mail.gmail.com>

Am Montag, 9. Januar 2017, 14:08:23 CET schrieb Gilad Ben-Yossef:

Hi Gilad,

> Hello Stephen,
> 
> Before getting to business I wish to offer my thanks for hosting the
> kernel crypto documentation on your web site at chronox.de. It has
> proven very useful to me :-)

I am glad that it is helpful.
> 
> On Mon, Jan 9, 2017 at 12:23 PM, Stephan Müller <smueller@chronox.de> wrote:
> > Am Sonntag, 8. Januar 2017, 15:45:37 CET schrieb Gilad Ben-Yossef:
> > 
> > Hi Gilad,
> > 
> >>         ahash_request_set_callback(req, 0, NULL, NULL);
> >> 
> >> Would anyone be kind enough to enlighten me?
> > 
> > The documentation got out of sync with the real world. I will file a patch
> > for that shortly.
> > 
> > The ahash API works identically to the async skcipher API for which you
> > find an example in the api-samples.rst. There you see that with the
> > set_callback, a function is registered that is triggered upon completion
> > of the operation.
> > 
> > Thus, use the callback example you find for skcipher for your ahash
> > operation and you get an async operation.
> 
> Thank you very much for pointing out the skcipher example. It is very
> helpful.
> 
> However, I suspect there is something a miss here beyond documentation:
> 
> There is (quite a lot of) kernel code calling
> ahash_request_set_callback() with a NULL callback and consequently not
> performing any synchronization in the completion of the hashing
> operations.

I think this is legacy, although I cannot say for sure. As of now, there is 
hardly any real ahash implementation out there. The only one I am aware of is 
the sha1-mb/sha256-mb/sha512-mb for x86. There, the kernel would even crash if 
there is a NULL as callback, if I read the code correctly.

For all other implementations, they are synchronous even though you use the 
ahash API. I.e. all of those implementations would not trigger the callback 
function.
> 
> See for example crypt_iv_essiv_init() in drivers/md/dm-crypt.c
> Other similar call sites can be found at block/drbd/drbd_worker.c,
> net/ppp/ppp_mppe.c, net/wireless/intersil/orinoco/mic.c,
> scsi/iscsi_tcp.c, target/iscsi/iscsi_target_login.c
> 
> As far as I could tell the code in crypto/ahash.c does not take any
> special consideration of the case where a NULL call back function has
> been set and at least one of the underlying ahash algorithm provider
> will crash if used like this.

I do not think that the ahash.c code crashes, but look into the sha1-mb 
implementation:

...
                        req = cast_mcryptd_ctx_to_req(req_ctx);
                        if (irqs_disabled())
                                req_ctx->complete(&req->base, ret);
                        else {
                                local_bh_disable();
                                req_ctx->complete(&req->base, ret);
                                local_bh_enable();

...

Here you see the invocation of complete without a check.

> 
> This seems broken to me. I would be very happy to offer to fix the
> broken call sites, if you can only confirm my understanding that
> indeed cases which register a NULL callback are broken and it is not
> just a misunderstanding on my part.


IMHO, the use of a NULL callback works, but should definitely converted to a 
real callback function.
> 
> Many thanks,
> Gilad


Ciao
Stephan

^ permalink raw reply

* Re: [PATCH v2 11/12] crypto: atmel-authenc: add support to authenc(hmac(shaX),Y(aes)) modes
From: Cyrille Pitchen @ 2017-01-09 18:24 UTC (permalink / raw)
  To: Stephan Müller
  Cc: herbert, davem, nicolas.ferre, linux-crypto, linux-kernel,
	linux-arm-kernel
In-Reply-To: <1548507.knvAQkH9bK@tauon.atsec.com>

Hi Stephan,

Le 23/12/2016 à 12:34, Stephan Müller a écrit :
> Am Donnerstag, 22. Dezember 2016, 17:38:00 CET schrieb Cyrille Pitchen:
> 
> Hi Cyrille,
> 
>> This patchs allows to combine the AES and SHA hardware accelerators on
>> some Atmel SoCs. Doing so, AES blocks are only written to/read from the
>> AES hardware. Those blocks are also transferred from the AES to the SHA
>> accelerator internally, without additionnal accesses to the system busses.
>>
>> Hence, the AES and SHA accelerators work in parallel to process all the
>> data blocks, instead of serializing the process by (de)crypting those
>> blocks first then authenticating them after like the generic
>> crypto/authenc.c driver does.
>>
>> Of course, both the AES and SHA hardware accelerators need to be available
>> before we can start to process the data blocks. Hence we use their crypto
>> request queue to synchronize both drivers.
>>
>> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
>> ---
>>  drivers/crypto/Kconfig          |  12 +
>>  drivers/crypto/atmel-aes-regs.h |  16 ++
>>  drivers/crypto/atmel-aes.c      | 471
>> +++++++++++++++++++++++++++++++++++++++- drivers/crypto/atmel-authenc.h  | 
>> 64 ++++++
>>  drivers/crypto/atmel-sha-regs.h |  14 ++
>>  drivers/crypto/atmel-sha.c      | 344 +++++++++++++++++++++++++++--
>>  6 files changed, 906 insertions(+), 15 deletions(-)
>>  create mode 100644 drivers/crypto/atmel-authenc.h
>>
>> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
>> index 79564785ae30..719a868d8ea1 100644
>> --- a/drivers/crypto/Kconfig
>> +++ b/drivers/crypto/Kconfig
>> @@ -415,6 +415,18 @@ config CRYPTO_DEV_BFIN_CRC
>>  	  Newer Blackfin processors have CRC hardware. Select this if you
>>  	  want to use the Blackfin CRC module.
>>
>> +config CRYPTO_DEV_ATMEL_AUTHENC
>> +	tristate "Support for Atmel IPSEC/SSL hw accelerator"
>> +	depends on (ARCH_AT91 && HAS_DMA) || COMPILE_TEST
>> +	select CRYPTO_AUTHENC
>> +	select CRYPTO_DEV_ATMEL_AES
>> +	select CRYPTO_DEV_ATMEL_SHA
>> +	help
>> +	  Some Atmel processors can combine the AES and SHA hw accelerators
>> +	  to enhance support of IPSEC/SSL.
>> +	  Select this if you want to use the Atmel modules for
>> +	  authenc(hmac(shaX),Y(cbc)) algorithms.
>> +
>>  config CRYPTO_DEV_ATMEL_AES
>>  	tristate "Support for Atmel AES hw accelerator"
>>  	depends on HAS_DMA
>> diff --git a/drivers/crypto/atmel-aes-regs.h
>> b/drivers/crypto/atmel-aes-regs.h index 0ec04407b533..7694679802b3 100644
>> --- a/drivers/crypto/atmel-aes-regs.h
>> +++ b/drivers/crypto/atmel-aes-regs.h
>> @@ -68,6 +68,22 @@
>>  #define AES_CTRR	0x98
>>  #define AES_GCMHR(x)	(0x9c + ((x) * 0x04))
>>
>> +#define AES_EMR		0xb0
>> +#define AES_EMR_APEN		BIT(0)	/* Auto Padding Enable */
>> +#define AES_EMR_APM		BIT(1)	/* Auto Padding Mode */
>> +#define AES_EMR_APM_IPSEC	0x0
>> +#define AES_EMR_APM_SSL		BIT(1)
>> +#define AES_EMR_PLIPEN		BIT(4)	/* PLIP Enable */
>> +#define AES_EMR_PLIPD		BIT(5)	/* PLIP Decipher */
>> +#define AES_EMR_PADLEN_MASK	(0xFu << 8)
>> +#define AES_EMR_PADLEN_OFFSET	8
>> +#define AES_EMR_PADLEN(padlen)	(((padlen) << AES_EMR_PADLEN_OFFSET) &\
>> +				 AES_EMR_PADLEN_MASK)
>> +#define AES_EMR_NHEAD_MASK	(0xFu << 16)
>> +#define AES_EMR_NHEAD_OFFSET	16
>> +#define AES_EMR_NHEAD(nhead)	(((nhead) << AES_EMR_NHEAD_OFFSET) &\
>> +				 AES_EMR_NHEAD_MASK)
>> +
>>  #define AES_TWR(x)	(0xc0 + ((x) * 0x04))
>>  #define AES_ALPHAR(x)	(0xd0 + ((x) * 0x04))
>>
>> diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
>> index 9fd2f63b8bc0..3c651e0c3113 100644
>> --- a/drivers/crypto/atmel-aes.c
>> +++ b/drivers/crypto/atmel-aes.c
>> @@ -41,6 +41,7 @@
>>  #include <linux/platform_data/crypto-atmel.h>
>>  #include <dt-bindings/dma/at91.h>
>>  #include "atmel-aes-regs.h"
>> +#include "atmel-authenc.h"
>>
>>  #define ATMEL_AES_PRIORITY	300
>>
>> @@ -78,6 +79,7 @@
>>  #define AES_FLAGS_INIT		BIT(2)
>>  #define AES_FLAGS_BUSY		BIT(3)
>>  #define AES_FLAGS_DUMP_REG	BIT(4)
>> +#define AES_FLAGS_OWN_SHA	BIT(5)
>>
>>  #define AES_FLAGS_PERSISTENT	(AES_FLAGS_INIT | AES_FLAGS_BUSY)
>>
>> @@ -92,6 +94,7 @@ struct atmel_aes_caps {
>>  	bool			has_ctr32;
>>  	bool			has_gcm;
>>  	bool			has_xts;
>> +	bool			has_authenc;
>>  	u32			max_burst_size;
>>  };
>>
>> @@ -144,10 +147,31 @@ struct atmel_aes_xts_ctx {
>>  	u32			key2[AES_KEYSIZE_256 / sizeof(u32)];
>>  };
>>
>> +#ifdef CONFIG_CRYPTO_DEV_ATMEL_AUTHENC
>> +struct atmel_aes_authenc_ctx {
>> +	struct atmel_aes_base_ctx	base;
>> +	struct atmel_sha_authenc_ctx	*auth;
>> +};
>> +#endif
>> +
>>  struct atmel_aes_reqctx {
>>  	unsigned long		mode;
>>  };
>>
>> +#ifdef CONFIG_CRYPTO_DEV_ATMEL_AUTHENC
>> +struct atmel_aes_authenc_reqctx {
>> +	struct atmel_aes_reqctx	base;
>> +
>> +	struct scatterlist	src[2];
>> +	struct scatterlist	dst[2];
>> +	size_t			textlen;
>> +	u32			digest[SHA512_DIGEST_SIZE / sizeof(u32)];
>> +
>> +	/* auth_req MUST be place last. */
>> +	struct ahash_request	auth_req;
>> +};
>> +#endif
>> +
>>  struct atmel_aes_dma {
>>  	struct dma_chan		*chan;
>>  	struct scatterlist	*sg;
>> @@ -291,6 +315,9 @@ static const char *atmel_aes_reg_name(u32 offset, char
>> *tmp, size_t sz) snprintf(tmp, sz, "GCMHR[%u]", (offset - AES_GCMHR(0)) >>
>> 2);
>>  		break;
>>
>> +	case AES_EMR:
>> +		return "EMR";
>> +
>>  	case AES_TWR(0):
>>  	case AES_TWR(1):
>>  	case AES_TWR(2):
>> @@ -463,8 +490,16 @@ static inline bool atmel_aes_is_encrypt(const struct
>> atmel_aes_dev *dd) return (dd->flags & AES_FLAGS_ENCRYPT);
>>  }
>>
>> +#ifdef CONFIG_CRYPTO_DEV_ATMEL_AUTHENC
>> +static void atmel_aes_authenc_complete(struct atmel_aes_dev *dd, int err);
>> +#endif
>> +
>>  static inline int atmel_aes_complete(struct atmel_aes_dev *dd, int err)
>>  {
>> +#ifdef CONFIG_CRYPTO_DEV_ATMEL_AUTHENC
>> +	atmel_aes_authenc_complete(dd, err);
>> +#endif
>> +
>>  	clk_disable(dd->iclk);
>>  	dd->flags &= ~AES_FLAGS_BUSY;
>>
>> @@ -1931,6 +1966,407 @@ static struct crypto_alg aes_xts_alg = {
>>  	}
>>  };
>>
>> +#ifdef CONFIG_CRYPTO_DEV_ATMEL_AUTHENC
>> +/* authenc aead functions */
>> +
>> +static int atmel_aes_authenc_start(struct atmel_aes_dev *dd);
>> +static int atmel_aes_authenc_init(struct atmel_aes_dev *dd, int err,
>> +				  bool is_async);
>> +static int atmel_aes_authenc_transfer(struct atmel_aes_dev *dd, int err,
>> +				      bool is_async);
>> +static int atmel_aes_authenc_digest(struct atmel_aes_dev *dd);
>> +static int atmel_aes_authenc_final(struct atmel_aes_dev *dd, int err,
>> +				   bool is_async);
>> +
>> +static void atmel_aes_authenc_complete(struct atmel_aes_dev *dd, int err)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +
>> +	if (err && (dd->flags & AES_FLAGS_OWN_SHA))
>> +		atmel_sha_authenc_abort(&rctx->auth_req);
>> +	dd->flags &= ~AES_FLAGS_OWN_SHA;
>> +}
>> +
>> +static int atmel_aes_authenc_copy_assoc(struct aead_request *req)
>> +{
>> +	size_t buflen, assoclen = req->assoclen;
>> +	off_t skip = 0;
>> +	u8 buf[256];
>> +
>> +	while (assoclen) {
>> +		buflen = min_t(size_t, assoclen, sizeof(buf));
>> +
>> +		if (sg_pcopy_to_buffer(req->src, sg_nents(req->src),
>> +				       buf, buflen, skip) != buflen)
>> +			return -EINVAL;
>> +
>> +		if (sg_pcopy_from_buffer(req->dst, sg_nents(req->dst),
>> +					 buf, buflen, skip) != buflen)
>> +			return -EINVAL;
>> +
>> +		skip += buflen;
>> +		assoclen -= buflen;
>> +	}
> 
> This seems to be a very expansive operation. Wouldn't it be easier, leaner and 
> with one less memcpy to use the approach of crypto_authenc_copy_assoc?
>
> Instead of copying crypto_authenc_copy_assoc, what about carving the logic in 
> crypto/authenc.c out into a generic aead helper code as we need to add that to 
> other AEAD implementations?


Before writing this function, I checked how the crypto/authenc.c driver
handles the copy of the associated data, hence crypto_authenc_copy_assoc().

I have to admit I didn't perform any benchmark to compare the two
implementation but I just tried to understand how
crypto_authenc_copy_assoc() works. At the first look, this function seems
very simple but I guess all the black magic is hidden by the call of
crypto_skcipher_encrypt() on the default null transform, which is
implemented using the ecb(cipher_null) algorithm.

When I wrote my function I thought that this ecb(cipher_null) algorithm was
implemented by combining crypto_ecb_crypt() from crypto/ecb.c with
null_crypt() from crypto/crypto_null.c. Hence I thought there would be much
function call overhead to copy only few bytes but now checking again I
realize that the ecb(cipher_null) algorithm is directly implemented by
skcipher_null_crypt() still from crypto/crypto_null.c. So yes, maybe you're
right: it could be better to reuse what was done in
crypto_authenc_copy_assoc() from crypto/authenc.c.

This way we could need twice less memcpy() hence I agree with you.


>> +
>> +	return 0;
>> +}
>> +
>> +static int atmel_aes_authenc_start(struct atmel_aes_dev *dd)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
>> +	struct atmel_aes_authenc_ctx *ctx = crypto_aead_ctx(tfm);
>> +	int err;
>> +
>> +	atmel_aes_set_mode(dd, &rctx->base);
>> +
>> +	err = atmel_aes_hw_init(dd);
>> +	if (err)
>> +		return atmel_aes_complete(dd, err);
>> +
>> +	return atmel_sha_authenc_schedule(&rctx->auth_req, ctx->auth,
>> +					  atmel_aes_authenc_init, dd);
>> +}
>> +
>> +static int atmel_aes_authenc_init(struct atmel_aes_dev *dd, int err,
>> +				  bool is_async)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +
>> +	if (is_async)
>> +		dd->is_async = true;
>> +	if (err)
>> +		return atmel_aes_complete(dd, err);
>> +
>> +	/* If here, we've got the ownership of the SHA device. */
>> +	dd->flags |= AES_FLAGS_OWN_SHA;
>> +
>> +	/* Configure the SHA device. */
>> +	return atmel_sha_authenc_init(&rctx->auth_req,
>> +				      req->src, req->assoclen,
>> +				      rctx->textlen,
>> +				      atmel_aes_authenc_transfer, dd);
>> +}
>> +
>> +static int atmel_aes_authenc_transfer(struct atmel_aes_dev *dd, int err,
>> +				      bool is_async)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +	bool enc = atmel_aes_is_encrypt(dd);
>> +	struct scatterlist *src, *dst;
>> +	u32 iv[AES_BLOCK_SIZE / sizeof(u32)];
>> +	u32 emr;
>> +
>> +	if (is_async)
>> +		dd->is_async = true;
>> +	if (err)
>> +		return atmel_aes_complete(dd, err);
>> +
>> +	/* Prepare src and dst scatter-lists to transfer cipher/plain texts. */
>> +	src = scatterwalk_ffwd(rctx->src, req->src, req->assoclen);
>> +	dst = src;
>> +
>> +	if (req->src != req->dst) {
>> +		err = atmel_aes_authenc_copy_assoc(req);
>> +		if (err)
>> +			return atmel_aes_complete(dd, err);
>> +
>> +		dst = scatterwalk_ffwd(rctx->dst, req->dst, req->assoclen);
>> +	}
>> +
>> +	/* Configure the AES device. */
>> +	memcpy(iv, req->iv, sizeof(iv));
>> +
>> +	/*
>> +	 * Here we always set the 2nd parameter of atmel_aes_write_ctrl() to
>> +	 * 'true' even if the data transfer is actually performed by the CPU (so
>> +	 * not by the DMA) because we must force the AES_MR_SMOD bitfield to the
>> +	 * value AES_MR_SMOD_IDATAR0. Indeed, both AES_MR_SMOD and SHA_MR_SMOD
>> +	 * must be set to *_MR_SMOD_IDATAR0.
>> +	 */
>> +	atmel_aes_write_ctrl(dd, true, iv);
>> +	emr = AES_EMR_PLIPEN;
>> +	if (!enc)
>> +		emr |= AES_EMR_PLIPD;
>> +	atmel_aes_write(dd, AES_EMR, emr);
>> +
>> +	/* Transfer data. */
>> +	return atmel_aes_dma_start(dd, src, dst, rctx->textlen,
>> +				   atmel_aes_authenc_digest);
>> +}
>> +
>> +static int atmel_aes_authenc_digest(struct atmel_aes_dev *dd)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +
>> +	/* atmel_sha_authenc_final() releases the SHA device. */
>> +	dd->flags &= ~AES_FLAGS_OWN_SHA;
>> +	return atmel_sha_authenc_final(&rctx->auth_req,
>> +				       rctx->digest, sizeof(rctx->digest),
>> +				       atmel_aes_authenc_final, dd);
>> +}
>> +
>> +static int atmel_aes_authenc_final(struct atmel_aes_dev *dd, int err,
>> +				   bool is_async)
>> +{
>> +	struct aead_request *req = aead_request_cast(dd->areq);
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
>> +	bool enc = atmel_aes_is_encrypt(dd);
>> +	u32 idigest[SHA512_DIGEST_SIZE / sizeof(u32)], *odigest = rctx->digest;
>> +	u32 offs, authsize;
>> +
>> +	if (is_async)
>> +		dd->is_async = true;
>> +	if (err)
>> +		goto complete;
>> +
>> +	offs = req->assoclen + rctx->textlen;
>> +	authsize = crypto_aead_authsize(tfm);
>> +	if (enc) {
>> +		scatterwalk_map_and_copy(odigest, req->dst, offs, authsize, 1);
>> +	} else {
>> +		scatterwalk_map_and_copy(idigest, req->src, offs, authsize, 0);
>> +		if (crypto_memneq(idigest, odigest, authsize))
>> +			err = -EBADMSG;
>> +	}
>> +
>> +complete:
>> +	return atmel_aes_complete(dd, err);
>> +}
>> +
>> +static int atmel_aes_authenc_setkey(struct crypto_aead *tfm, const u8 *key,
>> +				    unsigned int keylen)
>> +{
>> +	struct atmel_aes_authenc_ctx *ctx = crypto_aead_ctx(tfm);
>> +	struct crypto_authenc_keys keys;
>> +	u32 flags;
>> +	int err;
>> +
>> +	if (crypto_authenc_extractkeys(&keys, key, keylen) != 0)
>> +		goto badkey;
>> +
>> +	if (keys.enckeylen > sizeof(ctx->base.key))
>> +		goto badkey;
>> +
>> +	/* Save auth key. */
>> +	flags = crypto_aead_get_flags(tfm);
>> +	err = atmel_sha_authenc_setkey(ctx->auth,
>> +				       keys.authkey, keys.authkeylen,
>> +				       &flags);
>> +	crypto_aead_set_flags(tfm, flags & CRYPTO_TFM_RES_MASK);
>> +	if (err)
>> +		return err;
>> +
>> +	/* Save enc key. */
>> +	ctx->base.keylen = keys.enckeylen;
>> +	memcpy(ctx->base.key, keys.enckey, keys.enckeylen);
> 
> memzero_explicit(keys) please

good point :)

>> +
>> +	return 0;
>> +
>> +badkey:
>> +	crypto_aead_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
>> +	return -EINVAL;
>> +}
>> +
>> +static int atmel_aes_authenc_init_tfm(struct crypto_aead *tfm,
>> +				      unsigned long auth_mode)
>> +{
>> +	struct atmel_aes_authenc_ctx *ctx = crypto_aead_ctx(tfm);
>> +	unsigned int auth_reqsize = atmel_sha_authenc_get_reqsize();
>> +
>> +	ctx->auth = atmel_sha_authenc_spawn(auth_mode);
>> +	if (IS_ERR(ctx->auth))
>> +		return PTR_ERR(ctx->auth);
>> +
>> +	crypto_aead_set_reqsize(tfm, (sizeof(struct atmel_aes_authenc_reqctx) +
>> +				      auth_reqsize));
>> +	ctx->base.start = atmel_aes_authenc_start;
>> +
>> +	return 0;
>> +}
>> +
>> +static int atmel_aes_authenc_hmac_sha1_init_tfm(struct crypto_aead *tfm)
>> +{
>> +	return atmel_aes_authenc_init_tfm(tfm, SHA_FLAGS_HMAC_SHA1);
>> +}
>> +
>> +static int atmel_aes_authenc_hmac_sha224_init_tfm(struct crypto_aead *tfm)
>> +{
>> +	return atmel_aes_authenc_init_tfm(tfm, SHA_FLAGS_HMAC_SHA224);
>> +}
>> +
>> +static int atmel_aes_authenc_hmac_sha256_init_tfm(struct crypto_aead *tfm)
>> +{
>> +	return atmel_aes_authenc_init_tfm(tfm, SHA_FLAGS_HMAC_SHA256);
>> +}
>> +
>> +static int atmel_aes_authenc_hmac_sha384_init_tfm(struct crypto_aead *tfm)
>> +{
>> +	return atmel_aes_authenc_init_tfm(tfm, SHA_FLAGS_HMAC_SHA384);
>> +}
>> +
>> +static int atmel_aes_authenc_hmac_sha512_init_tfm(struct crypto_aead *tfm)
>> +{
>> +	return atmel_aes_authenc_init_tfm(tfm, SHA_FLAGS_HMAC_SHA512);
>> +}
>> +
>> +static void atmel_aes_authenc_exit_tfm(struct crypto_aead *tfm)
>> +{
>> +	struct atmel_aes_authenc_ctx *ctx = crypto_aead_ctx(tfm);
>> +
>> +	atmel_sha_authenc_free(ctx->auth);
>> +}
>> +
>> +static int atmel_aes_authenc_crypt(struct aead_request *req,
>> +				   unsigned long mode)
>> +{
>> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
>> +	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
>> +	struct atmel_aes_base_ctx *ctx = crypto_aead_ctx(tfm);
>> +	u32 authsize = crypto_aead_authsize(tfm);
>> +	bool enc = (mode & AES_FLAGS_ENCRYPT);
>> +	struct atmel_aes_dev *dd;
>> +
>> +	/* Compute text length. */
>> +	rctx->textlen = req->cryptlen - (enc ? 0 : authsize);
> 
> Is there somewhere a check that authsize is always < req->cryptlen (at least 
> it escaped me)? Note, this logic will be exposed to user space which may do 
> funky things.

I thought those 2 sizes were always set by the kernel only but I admit I
didn't check my assumption. If you tell me they could be set directly from
the userspace, yes I agree with you, I need to add a test.


>> +
>> +	/*
>> +	 * Currently, empty messages are not supported yet:
>> +	 * the SHA auto-padding can be used only on non-empty messages.
>> +	 * Hence a special case needs to be implemented for empty message.
>> +	 */
>> +	if (!rctx->textlen && !req->assoclen)
>> +		return -EINVAL;
>> +
>> +	rctx->base.mode = mode;
>> +	ctx->block_size = AES_BLOCK_SIZE;
>> +
>> +	dd = atmel_aes_find_dev(ctx);
>> +	if (!dd)
>> +		return -ENODEV;
>> +
>> +	return atmel_aes_handle_queue(dd, &req->base);
> 
> Ciao
> Stephan
> 

thanks for your review! :)

Best regards,

Cyrille

^ permalink raw reply

* Re: [PATCH v2 11/12] crypto: atmel-authenc: add support to authenc(hmac(shaX),Y(aes)) modes
From: Stephan Müller @ 2017-01-09 18:34 UTC (permalink / raw)
  To: Cyrille Pitchen
  Cc: herbert, davem, nicolas.ferre, linux-crypto, linux-kernel,
	linux-arm-kernel
In-Reply-To: <6301d79c-f1c5-d86c-823c-dfdbb5100e74@atmel.com>

Am Montag, 9. Januar 2017, 19:24:12 CET schrieb Cyrille Pitchen:

Hi Cyrille,

> >> +static int atmel_aes_authenc_copy_assoc(struct aead_request *req)
> >> +{
> >> +	size_t buflen, assoclen = req->assoclen;
> >> +	off_t skip = 0;
> >> +	u8 buf[256];
> >> +
> >> +	while (assoclen) {
> >> +		buflen = min_t(size_t, assoclen, sizeof(buf));
> >> +
> >> +		if (sg_pcopy_to_buffer(req->src, sg_nents(req->src),
> >> +				       buf, buflen, skip) != buflen)
> >> +			return -EINVAL;
> >> +
> >> +		if (sg_pcopy_from_buffer(req->dst, sg_nents(req->dst),
> >> +					 buf, buflen, skip) != buflen)
> >> +			return -EINVAL;
> >> +
> >> +		skip += buflen;
> >> +		assoclen -= buflen;
> >> +	}
> > 
> > This seems to be a very expansive operation. Wouldn't it be easier, leaner
> > and with one less memcpy to use the approach of
> > crypto_authenc_copy_assoc?
> > 
> > Instead of copying crypto_authenc_copy_assoc, what about carving the logic
> > in crypto/authenc.c out into a generic aead helper code as we need to add
> > that to other AEAD implementations?
> 
> Before writing this function, I checked how the crypto/authenc.c driver
> handles the copy of the associated data, hence crypto_authenc_copy_assoc().
> 
> I have to admit I didn't perform any benchmark to compare the two
> implementation but I just tried to understand how
> crypto_authenc_copy_assoc() works. At the first look, this function seems
> very simple but I guess all the black magic is hidden by the call of
> crypto_skcipher_encrypt() on the default null transform, which is
> implemented using the ecb(cipher_null) algorithm.

The magic in the null cipher is that it not only performs a memcpy, but 
iterates through the SGL and performs a memcpy on each part of the source/
destination SGL.

I will release a patch set later today -- the coding is completed, but testing 
is yet under way. That patch now allows you to make only one function call 
without special init/deinit code.
> 
> When I wrote my function I thought that this ecb(cipher_null) algorithm was
> implemented by combining crypto_ecb_crypt() from crypto/ecb.c with
> null_crypt() from crypto/crypto_null.c. Hence I thought there would be much
> function call overhead to copy only few bytes but now checking again I
> realize that the ecb(cipher_null) algorithm is directly implemented by
> skcipher_null_crypt() still from crypto/crypto_null.c. So yes, maybe you're
> right: it could be better to reuse what was done in
> crypto_authenc_copy_assoc() from crypto/authenc.c.
> 
> This way we could need twice less memcpy() hence I agree with you.

In addition to the additional memcpy, the patch I want to air shortly (and 
which I hope is going to be accepted) should reduce the complexity of your 
code in this corner.

...

> >> +static int atmel_aes_authenc_crypt(struct aead_request *req,
> >> +				   unsigned long mode)
> >> +{
> >> +	struct atmel_aes_authenc_reqctx *rctx = aead_request_ctx(req);
> >> +	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
> >> +	struct atmel_aes_base_ctx *ctx = crypto_aead_ctx(tfm);
> >> +	u32 authsize = crypto_aead_authsize(tfm);
> >> +	bool enc = (mode & AES_FLAGS_ENCRYPT);
> >> +	struct atmel_aes_dev *dd;
> >> +
> >> +	/* Compute text length. */
> >> +	rctx->textlen = req->cryptlen - (enc ? 0 : authsize);
> > 
> > Is there somewhere a check that authsize is always < req->cryptlen (at
> > least it escaped me)? Note, this logic will be exposed to user space
> > which may do funky things.
> 
> I thought those 2 sizes were always set by the kernel only but I admit I
> didn't check my assumption. If you tell me they could be set directly from
> the userspace, yes I agree with you, I need to add a test.

Then I would like to ask you adding that check -- as this check is cheap, it 
should not affect performance.
Ciao
Stephan

^ permalink raw reply

* [PATCH 13/13] crypto: qat - copy AAD during encryption
From: Stephan Müller @ 2017-01-10  1:41 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto
In-Reply-To: <10526995.lyZ7Je1KMx@positron.chronox.de>

Invoke the crypto_aead_copy_ad function during the encryption code path
to copy the AAD from the source to the destination buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 drivers/crypto/qat/qat_common/qat_algs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c
index 20f35df..6576328 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -865,6 +865,10 @@ static int qat_alg_aead_enc(struct aead_request *areq)
 	uint8_t *iv = areq->iv;
 	int ret, ctr = 0;
 
+	ret = crypto_aead_copy_ad(areq);
+	if (ret)
+		return ret;
+
 	ret = qat_alg_sgl_to_bufl(ctx->inst, areq->src, areq->dst, qat_req);
 	if (unlikely(ret))
 		return ret;
-- 
2.9.3

^ permalink raw reply related

* [PATCH 12/13] crypto: nx - copy AAD during encryption
From: Stephan Müller @ 2017-01-10  1:40 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto
In-Reply-To: <10526995.lyZ7Je1KMx@positron.chronox.de>

Invoke the crypto_aead_copy_ad function during the encryption code path
to copy the AAD from the source to the destination buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 drivers/crypto/nx/nx-aes-ccm.c |  4 ++++
 drivers/crypto/nx/nx-aes-gcm.c | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/crypto/nx/nx-aes-ccm.c b/drivers/crypto/nx/nx-aes-ccm.c
index 7038f36..ee570bf 100644
--- a/drivers/crypto/nx/nx-aes-ccm.c
+++ b/drivers/crypto/nx/nx-aes-ccm.c
@@ -428,6 +428,10 @@ static int ccm_nx_encrypt(struct aead_request   *req,
 	unsigned int processed = 0, to_process;
 	int rc = -1;
 
+	rc = crypto_aead_copy_ad(req);
+	if (rc)
+		return rc;
+
 	spin_lock_irqsave(&nx_ctx->lock, irq_flags);
 
 	rc = generate_pat(desc->info, req, nx_ctx, authsize, nbytes, assoclen,
diff --git a/drivers/crypto/nx/nx-aes-gcm.c b/drivers/crypto/nx/nx-aes-gcm.c
index abd465f..0cc0533 100644
--- a/drivers/crypto/nx/nx-aes-gcm.c
+++ b/drivers/crypto/nx/nx-aes-gcm.c
@@ -432,9 +432,14 @@ static int gcm_aes_nx_encrypt(struct aead_request *req)
 {
 	struct nx_gcm_rctx *rctx = aead_request_ctx(req);
 	char *iv = rctx->iv;
+	int err;
 
 	memcpy(iv, req->iv, 12);
 
+	err = crypto_aead_copy_ad(req);
+	if (err)
+		return err;
+
 	return gcm_aes_nx_crypt(req, 1, req->assoclen);
 }
 
@@ -455,6 +460,7 @@ static int gcm4106_aes_nx_encrypt(struct aead_request *req)
 	struct nx_gcm_rctx *rctx = aead_request_ctx(req);
 	char *iv = rctx->iv;
 	char *nonce = nx_ctx->priv.gcm.nonce;
+	int err;
 
 	memcpy(iv, nonce, NX_GCM4106_NONCE_LEN);
 	memcpy(iv + NX_GCM4106_NONCE_LEN, req->iv, 8);
@@ -462,6 +468,10 @@ static int gcm4106_aes_nx_encrypt(struct aead_request *req)
 	if (req->assoclen < 8)
 		return -EINVAL;
 
+	err = crypto_aead_copy_ad(req);
+	if (err)
+		return err;
+
 	return gcm_aes_nx_crypt(req, 1, req->assoclen - 8);
 }
 
-- 
2.9.3

^ permalink raw reply related

* [PATCH 11/13] crypto: chelsio - copy AAD during encryption
From: Stephan Müller @ 2017-01-10  1:40 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto
In-Reply-To: <10526995.lyZ7Je1KMx@positron.chronox.de>

Invoke the crypto_aead_copy_ad function during the encryption code path
to copy the AAD from the source to the destination buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 drivers/crypto/chelsio/chcr_algo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 2ed1e24..b3283c0 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -2398,6 +2398,11 @@ static int chcr_aead_encrypt(struct aead_request *req)
 {
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
 	struct chcr_aead_reqctx *reqctx = aead_request_ctx(req);
+	int err;
+
+	err = crypto_aead_copy_ad(req);
+	if (err)
+		return err;
 
 	reqctx->verify = VERIFY_HW;
 
-- 
2.9.3

^ permalink raw reply related

* [PATCH 10/13] crypto: caam - copy AAD during encryption
From: Stephan Müller @ 2017-01-10  1:39 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto
In-Reply-To: <10526995.lyZ7Je1KMx@positron.chronox.de>

Invoke the crypto_aead_copy_ad function during the encryption code path
to copy the AAD from the source to the destination buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 drivers/crypto/caam/caamalg.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 662fe94..30ad943 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -1433,6 +1433,10 @@ static int gcm_encrypt(struct aead_request *req)
 	u32 *desc;
 	int ret = 0;
 
+	ret = crypto_aead_copy_ad(req);
+	if (ret)
+		return ret;
+
 	/* allocate extended descriptor */
 	edesc = aead_edesc_alloc(req, GCM_DESC_JOB_IO_LEN, &all_contig, true);
 	if (IS_ERR(edesc))
@@ -1476,6 +1480,10 @@ static int aead_encrypt(struct aead_request *req)
 	u32 *desc;
 	int ret = 0;
 
+	ret = crypto_aead_copy_ad(req);
+	if (ret)
+		return ret;
+
 	/* allocate extended descriptor */
 	edesc = aead_edesc_alloc(req, AUTHENC_DESC_JOB_IO_LEN,
 				 &all_contig, true);
-- 
2.9.3

^ permalink raw reply related

* [PATCH 09/13] crypto: atmel - copy AAD during encryption
From: Stephan Müller @ 2017-01-10  1:39 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto
In-Reply-To: <10526995.lyZ7Je1KMx@positron.chronox.de>

Invoke the crypto_aead_copy_ad function during the encryption code path
to copy the AAD from the source to the destination buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 drivers/crypto/atmel-aes.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
index 0e3d0d6..48ecf72 100644
--- a/drivers/crypto/atmel-aes.c
+++ b/drivers/crypto/atmel-aes.c
@@ -1752,6 +1752,12 @@ static int atmel_aes_gcm_setauthsize(struct crypto_aead *tfm,
 
 static int atmel_aes_gcm_encrypt(struct aead_request *req)
 {
+	int err;
+
+	err = crypto_aead_copy_ad(req);
+	if (err)
+		return err;
+
 	return atmel_aes_gcm_crypt(req, AES_FLAGS_ENCRYPT);
 }
 
-- 
2.9.3

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox