Linux cryptographic layer development
* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Eric Biggers @ 2016-12-12  5:42 UTC
  To: Jason A. Donenfeld
  Cc: kernel-hardening, LKML, linux-crypto, Linus Torvalds,
	George Spelvin, Scott Bauer, ak, Andy Lutomirski, Greg KH,
	Jean-Philippe Aumasson, Daniel J . Bernstein
In-Reply-To: <20161212034817.1773-1-Jason@zx2c4.com>

On Mon, Dec 12, 2016 at 04:48:17AM +0100, Jason A. Donenfeld wrote:
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 50144a3aeebd..71d398b04a74 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
>  	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
>  	 flex_proportions.o ratelimit.o show_mem.o \
>  	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
> -	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
> +	 earlycpio.o seq_buf.o siphash.o \
> +	 nmi_backtrace.o nodemask.o win_minmax.o
>  
>  lib-$(CONFIG_MMU) += ioremap.o
>  lib-$(CONFIG_SMP) += cpumask.o
> @@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
>  obj-y += kstrtox.o
>  obj-$(CONFIG_TEST_BPF) += test_bpf.o
>  obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
> -obj-$(CONFIG_TEST_HASH) += test_hash.o
> +obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o

Maybe add to the help text for CONFIG_TEST_HASH that it now tests siphash too?

> +static inline u64 le64_to_cpuvp(const void *p)
> +{
> +	return le64_to_cpup(p);
> +}

This assumes the key and message buffers are aligned to __alignof__(u64).
Unless that's going to be a clearly documented requirement for callers, you
should use get_unaligned_le64() instead.  And you can pass a 'u8 *' directly to
get_unaligned_le64(), no need for a helper function.
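
To illustrate the alignment point above, here is a minimal userspace sketch of
a byte-wise little-endian load. The name load_le64_unaligned() is hypothetical
and only stands in for the kernel's get_unaligned_le64(); it is not the kernel
implementation, just the same idea expressed portably.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for get_unaligned_le64().
 * The value is assembled one byte at a time, so no aligned 64-bit
 * access is ever issued and the result does not depend on host
 * endianness.
 */
static uint64_t load_le64_unaligned(const uint8_t *p)
{
	uint64_t v = 0;
	size_t i;

	assert(p != NULL);
	for (i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}
```

Because the pointer is a plain byte pointer, a caller can pass key or message
buffers at any offset without a wrapper helper.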

> +	b = (v0 ^ v1) ^ (v2 ^ v3);
> +	return (__force u64)cpu_to_le64(b);
> +}

It makes sense for this to return a u64, but that means the cpu_to_le64() is
wrong, since u64 indicates CPU endianness.  It should just return 'b'.

> +++ b/lib/test_siphash.c
> @@ -0,0 +1,116 @@
> +/* Test cases for siphash.c
> + *
> + * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
> + *
> + * This file is provided under a dual BSD/GPLv2 license.
> + *
> + * SipHash: a fast short-input PRF
> + * https://131002.net/siphash/
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/siphash.h>
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +#include <linux/errno.h>
> +#include <linux/module.h>
> +
> +static const u8 test_vectors[64][8] = {
> +	{ 0x31, 0x0e, 0x0e, 0xdd, 0x47, 0xdb, 0x6f, 0x72 },

Can you mention in a comment where the test vectors came from?

> +		if (memcmp(&out, test_vectors[i], 8)) {
> +			pr_info("self-test %u: FAIL\n", i + 1);
> +			ret = -EINVAL;
> +		}

If you make the output really be CPU-endian like I'm suggesting then this will
need to be something like:

	if (out != get_unaligned_le64(test_vectors[i])) {

Or else make the test vectors be an array of u64.
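
A minimal userspace sketch of that comparison, assuming the hash function
returns a CPU-endian 64-bit value and the test vectors stay as byte arrays.
The helper names le64_from_bytes() and vector_matches() are hypothetical; in
the kernel the load would be get_unaligned_le64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret 8 bytes as a little-endian 64-bit integer (hypothetical helper). */
static uint64_t le64_from_bytes(const uint8_t b[8])
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < 8; i++)
		v |= (uint64_t)b[i] << (8 * i);
	return v;
}

/*
 * Compare a CPU-endian hash output against a byte-array test vector.
 * Returns 1 on match, 0 otherwise. Converting the vector (rather than
 * memcmp()ing the integer's bytes) keeps the check correct on both
 * little- and big-endian hosts.
 */
static int vector_matches(uint64_t out, const uint8_t expected[8])
{
	assert(expected != NULL);
	return out == le64_from_bytes(expected);
}
```

With the first vector quoted above, the bytes 31 0e 0e dd 47 db 6f 72
correspond to the integer 0x726fdb47dd0e0e31.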

- Eric


* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-12  5:48 UTC
  To: Linus Torvalds
  Cc: kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <CA+55aFyfijNTvi0AN1kC4oWZqdGyoRD4WUVAf+kjFytVOE3kNw@mail.gmail.com>

Hey Linus,

On Mon, Dec 12, 2016 at 5:01 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> The above is extremely inefficient. Considering that most kernel data
> would be expected to be smallish, that matters (ie the usual benchmark
> would not be about hashing megabytes of data, but instead millions of
> hashes of small data).
>
> I think this could be rewritten (at least for 64-bit architectures) as
>
>     #ifdef CONFIG_DCACHE_WORD_ACCESS
>
>         if (left)
>                 b |= le64_to_cpu(load_unaligned_zeropad(data) &
>                                  bytemask_from_count(left));
>
>     #else
>
>         .. do the duff's device thing with the switch() ..
>
>     #endif
>
> which should give you basically perfect code generation (ie a single
> 64-bit load and a byte mask).

I modified the test to hash data of size 0 through 7 repeatedly
100000000 times, and benchmarked that a few times on a Skylake laptop.
The `load_unaligned_zeropad & bytemask_from_count` version was
consistently 7% slower.

I then modified it again to simply hash a 4 byte constant repeatedly
1000000000 times. The `load_unaligned_zeropad & bytemask_from_count`
version was around 6% faster. I tried again with a 7 byte constant and
got more or less a similar result.

Then I tried with a 1 byte constant, and found that the
`load_unaligned_zeropad & bytemask_from_count` version was slower.

So, it would seem that between the `if (left)` and the `switch
(left)`, there's the same number of branches. But for small values of
`left`, the Duff's device just has simpler arithmetic, whereas for
large values of `left`, the `load_unaligned_zeropad` prevails. If
micro-optimization is really appealing, one could imagine a hybrid of
the two:

    switch (left) {
    case 7:
    case 6:
    case 5:
    case 4:
        b |= le64_to_cpu(load_unaligned_zeropad(data) &
                         bytemask_from_count(left));
        break;
    case 3: b |= ((u64)data[2]) << 16;
    case 2: b |= ((u64)data[1]) <<  8;
    case 1: b |= ((u64)data[0]); break;
    case 0: break;
    }

But I'm not sure this complication is worth it. It seems likely that the
left-over size will be 4 bytes most of the time, so we should just use
your trick on platforms that support it.
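
For reference, the two tail-handling strategies being compared can be sketched
in portable userspace C as follows. This is only an illustration under stated
assumptions: tail_switch() and tail_masked() are hypothetical names, and
tail_masked() requires the caller to guarantee 8 readable bytes, whereas the
kernel's load_unaligned_zeropad() handles a faulting over-read itself.

```c
#include <assert.h>
#include <stdint.h>

/* Variant A: fall-through switch that touches only the `left` valid bytes. */
static uint64_t tail_switch(const uint8_t *data, unsigned int left)
{
	uint64_t b = 0;

	assert(left < 8);
	switch (left) {
	case 7: b |= (uint64_t)data[6] << 48; /* fall through */
	case 6: b |= (uint64_t)data[5] << 40; /* fall through */
	case 5: b |= (uint64_t)data[4] << 32; /* fall through */
	case 4: b |= (uint64_t)data[3] << 24; /* fall through */
	case 3: b |= (uint64_t)data[2] << 16; /* fall through */
	case 2: b |= (uint64_t)data[1] << 8;  /* fall through */
	case 1: b |= (uint64_t)data[0];
		break;
	case 0:
		break;
	}
	return b;
}

/*
 * Variant B: read a whole little-endian word, then mask off the bytes
 * past `left`. Assumption: 8 bytes at `data` are readable.
 */
static uint64_t tail_masked(const uint8_t *data, unsigned int left)
{
	uint64_t word = 0;
	unsigned int i;

	assert(left < 8);
	for (i = 0; i < 8; i++)
		word |= (uint64_t)data[i] << (8 * i);
	/* left < 8, so the shift count stays below 64; left == 0 gives mask 0. */
	return word & ((1ULL << (8 * left)) - 1);
}
```

Both variants produce the same value for every `left` from 0 to 7, so any
difference between them is purely one of code generation and speed.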

Jason


* RE: [PATCH v6 2/2] crypto: add virtio-crypto driver
From: Gonglei (Arei) @ 2016-12-12  6:25 UTC
  To: Gonglei (Arei), linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org
  Cc: Luonengjun, mst@redhat.com, stefanha@redhat.com, Huangweidong (C),
	Wubin (H), xin.zeng@intel.com, Claudio Fontana,
	herbert@gondor.apana.org.au, pasic@linux.vnet.ibm.com,
	davem@davemloft.net, Zhoujian (jay, Euler), Hanweidong (Randy),
	arei.gonglei@hotmail.com, cornelia.huck@de.ibm.com,
	Xuquan (Quan Xu), longpeng, Wanzongshun (Vincent)
In-Reply-To: <1481171829-116496-3-git-send-email-arei.gonglei@huawei.com>

Hi Michael and Herbert,

Since the virtio-crypto device emulation is already in QEMU 2.8,
would you please merge the virtio-crypto driver for 4.10 if there are no
other comments? If so, Michael, please ack and/or review the patch, and
then Herbert will take it (I asked him last week). Thank you!

PS: Linus's note on the 4.10 merge window timing:
 https://lkml.org/lkml/2016/12/7/506

Dec 23rd is the deadline for the 4.10 merge window.

Regards,
-Gonglei


> -----Original Message-----
> From: Gonglei (Arei)
> Sent: Thursday, December 08, 2016 12:37 PM
> Subject: [PATCH v6 2/2] crypto: add virtio-crypto driver
> 
> This patch introduces the virtio-crypto driver for the Linux kernel.
> 
> The virtio crypto device is a virtual cryptography device
> as well as a kind of virtual hardware accelerator for
> virtual machines. The encryption and decryption requests
> are placed in the data queue and are ultimately handled by
> the backend crypto accelerators. The second queue is the
> control queue used to create or destroy sessions for
> symmetric algorithms and will control some advanced features
> in the future. The virtio crypto device provides the following
> crypto services: CIPHER, MAC, HASH, and AEAD.
> 
> For more information about virtio-crypto device, please see:
>   http://qemu-project.org/Features/VirtioCrypto
> 
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Cornelia Huck <cornelia.huck@de.ibm.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Herbert Xu <herbert@gondor.apana.org.au>
> CC: Halil Pasic <pasic@linux.vnet.ibm.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Zeng Xin <xin.zeng@intel.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  MAINTAINERS                                  |   9 +
>  drivers/crypto/Kconfig                       |   2 +
>  drivers/crypto/Makefile                      |   1 +
>  drivers/crypto/virtio/Kconfig                |  10 +
>  drivers/crypto/virtio/Makefile               |   5 +
>  drivers/crypto/virtio/virtio_crypto_algs.c   | 541 +++++++++++++++++++++++++++
>  drivers/crypto/virtio/virtio_crypto_common.h | 122 ++++++
>  drivers/crypto/virtio/virtio_crypto_core.c   | 464 +++++++++++++++++++++++
>  drivers/crypto/virtio/virtio_crypto_mgr.c    | 264 +++++++++++++
>  include/uapi/linux/Kbuild                    |   1 +
>  include/uapi/linux/virtio_crypto.h           | 450 ++++++++++++++++++++++
>  include/uapi/linux/virtio_ids.h              |   1 +
>  12 files changed, 1870 insertions(+)
>  create mode 100644 drivers/crypto/virtio/Kconfig
>  create mode 100644 drivers/crypto/virtio/Makefile
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_algs.c
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_common.h
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_core.c
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_mgr.c
>  create mode 100644 include/uapi/linux/virtio_crypto.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ad9b965..cccaaf0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12810,6 +12810,7 @@ F:	drivers/net/virtio_net.c
>  F:	drivers/block/virtio_blk.c
>  F:	include/linux/virtio_*.h
>  F:	include/uapi/linux/virtio_*.h
> +F:	drivers/crypto/virtio/
> 
>  VIRTIO DRIVERS FOR S390
>  M:	Christian Borntraeger <borntraeger@de.ibm.com>
> @@ -12846,6 +12847,14 @@ S:	Maintained
>  F:	drivers/virtio/virtio_input.c
>  F:	include/uapi/linux/virtio_input.h
> 
> +VIRTIO CRYPTO DRIVER
> +M:  Gonglei <arei.gonglei@huawei.com>
> +L:  virtualization@lists.linux-foundation.org
> +L:  linux-crypto@vger.kernel.org
> +S:  Maintained
> +F:  drivers/crypto/virtio/
> +F:  include/uapi/linux/virtio_crypto.h
> +
>  VIA RHINE NETWORK DRIVER
>  S:	Orphan
>  F:	drivers/net/ethernet/via/via-rhine.c
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 4d2b81f..7956478 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -555,4 +555,6 @@ config CRYPTO_DEV_ROCKCHIP
> 
>  source "drivers/crypto/chelsio/Kconfig"
> 
> +source "drivers/crypto/virtio/Kconfig"
> +
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index ad7250f..bc53cb8 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -32,3 +32,4 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
>  obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
>  obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
>  obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
> +obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/
> diff --git a/drivers/crypto/virtio/Kconfig b/drivers/crypto/virtio/Kconfig
> new file mode 100644
> index 0000000..d80f733
> --- /dev/null
> +++ b/drivers/crypto/virtio/Kconfig
> @@ -0,0 +1,10 @@
> +config CRYPTO_DEV_VIRTIO
> +	tristate "VirtIO crypto driver"
> +	depends on VIRTIO
> +	select CRYPTO_AEAD
> +	select CRYPTO_AUTHENC
> +	select CRYPTO_BLKCIPHER
> +	default m
> +	help
> +	  This driver provides support for virtio crypto device. If you
> +	  choose 'M' here, this module will be called virtio_crypto.
> diff --git a/drivers/crypto/virtio/Makefile b/drivers/crypto/virtio/Makefile
> new file mode 100644
> index 0000000..dd342c9
> --- /dev/null
> +++ b/drivers/crypto/virtio/Makefile
> @@ -0,0 +1,5 @@
> +obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio_crypto.o
> +virtio_crypto-objs := \
> +	virtio_crypto_algs.o \
> +	virtio_crypto_mgr.o \
> +	virtio_crypto_core.o
> diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
> new file mode 100644
> index 0000000..7130dc9
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_algs.c
> @@ -0,0 +1,541 @@
> + /* Algorithms supported by virtio crypto device
> +  *
> +  * Authors: Gonglei <arei.gonglei@huawei.com>
> +  *
> +  * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +  *
> +  * This program is free software; you can redistribute it and/or modify
> +  * it under the terms of the GNU General Public License as published by
> +  * the Free Software Foundation; either version 2 of the License, or
> +  * (at your option) any later version.
> +  *
> +  * This program is distributed in the hope that it will be useful,
> +  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> +  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +  * GNU General Public License for more details.
> +  *
> +  * You should have received a copy of the GNU General Public License
> +  * along with this program; if not, see <http://www.gnu.org/licenses/>.
> +  */
> +
> +#include <linux/scatterlist.h>
> +#include <crypto/algapi.h>
> +#include <linux/err.h>
> +#include <crypto/scatterwalk.h>
> +#include <linux/atomic.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +/*
> + * The algs_lock protects the below global virtio_crypto_active_devs
> + * and crypto algorithms registration.
> + */
> +static DEFINE_MUTEX(algs_lock);
> +static unsigned int virtio_crypto_active_devs;
> +
> +static u64 virtio_crypto_alg_sg_nents_length(struct scatterlist *sg)
> +{
> +	u64 total = 0;
> +
> +	for (total = 0; sg; sg = sg_next(sg))
> +		total += sg->length;
> +
> +	return total;
> +}
> +
> +static int
> +virtio_crypto_alg_validate_key(int key_len, uint32_t *alg)
> +{
> +	switch (key_len) {
> +	case AES_KEYSIZE_128:
> +	case AES_KEYSIZE_192:
> +	case AES_KEYSIZE_256:
> +		*alg = VIRTIO_CRYPTO_CIPHER_AES_CBC;
> +		break;
> +	default:
> +		pr_err("virtio_crypto: Unsupported key length: %d\n",
> +			key_len);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_init_session(
> +		struct virtio_crypto_ablkcipher_ctx *ctx,
> +		uint32_t alg, const uint8_t *key,
> +		unsigned int keylen,
> +		int encrypt)
> +{
> +	struct scatterlist outhdr, key_sg, inhdr, *sgs[3];
> +	unsigned int tmp;
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +	int op = encrypt ? VIRTIO_CRYPTO_OP_ENCRYPT : VIRTIO_CRYPTO_OP_DECRYPT;
> +	int err;
> +	unsigned int num_out = 0, num_in = 0;
> +
> +	/*
> +	 * Avoid doing DMA from the stack; use a dynamically
> +	 * allocated buffer for the key instead.
> +	 */
> +	uint8_t *cipher_key = kmalloc(keylen, GFP_ATOMIC);
> +
> +	if (!cipher_key)
> +		return -ENOMEM;
> +
> +	memcpy(cipher_key, key, keylen);
> +
> +	spin_lock(&vcrypto->ctrl_lock);
> +	/* Pad ctrl header */
> +	vcrypto->ctrl.header.opcode =
> +		cpu_to_le32(VIRTIO_CRYPTO_CIPHER_CREATE_SESSION);
> +	vcrypto->ctrl.header.algo = cpu_to_le32(alg);
> +	/* Set the default dataqueue id to 0 */
> +	vcrypto->ctrl.header.queue_id = 0;
> +
> +	vcrypto->input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
> +	/* Pad cipher's parameters */
> +	vcrypto->ctrl.u.sym_create_session.op_type =
> +		cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
> +	vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo =
> +		vcrypto->ctrl.header.algo;
> +	vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen =
> +		cpu_to_le32(keylen);
> +	vcrypto->ctrl.u.sym_create_session.u.cipher.para.op =
> +		cpu_to_le32(op);
> +
> +	sg_init_one(&outhdr, &vcrypto->ctrl, sizeof(vcrypto->ctrl));
> +	sgs[num_out++] = &outhdr;
> +
> +	/* Set key */
> +	sg_init_one(&key_sg, cipher_key, keylen);
> +	sgs[num_out++] = &key_sg;
> +
> +	/* Return status and session id back */
> +	sg_init_one(&inhdr, &vcrypto->input, sizeof(vcrypto->input));
> +	sgs[num_out + num_in++] = &inhdr;
> +
> +	err = virtqueue_add_sgs(vcrypto->ctrl_vq, sgs, num_out,
> +				num_in, vcrypto, GFP_ATOMIC);
> +	if (err < 0) {
> +		spin_unlock(&vcrypto->ctrl_lock);
> +		kzfree(cipher_key);
> +		return err;
> +	}
> +	virtqueue_kick(vcrypto->ctrl_vq);
> +
> +	/*
> +	 * Trapping into the hypervisor, so the request should be
> +	 * handled immediately.
> +	 */
> +	while (!virtqueue_get_buf(vcrypto->ctrl_vq, &tmp) &&
> +	       !virtqueue_is_broken(vcrypto->ctrl_vq))
> +		cpu_relax();
> +
> +	if (le32_to_cpu(vcrypto->input.status) != VIRTIO_CRYPTO_OK) {
> +		spin_unlock(&vcrypto->ctrl_lock);
> +		pr_err("virtio_crypto: Create session failed status: %u\n",
> +			le32_to_cpu(vcrypto->input.status));
> +		kzfree(cipher_key);
> +		return -EINVAL;
> +	}
> +
> +	if (encrypt)
> +		ctx->enc_sess_info.session_id =
> +			le64_to_cpu(vcrypto->input.session_id);
> +	else
> +		ctx->dec_sess_info.session_id =
> +			le64_to_cpu(vcrypto->input.session_id);
> +
> +	spin_unlock(&vcrypto->ctrl_lock);
> +
> +	kzfree(cipher_key);
> +	return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_close_session(
> +		struct virtio_crypto_ablkcipher_ctx *ctx,
> +		int encrypt)
> +{
> +	struct scatterlist outhdr, status_sg, *sgs[2];
> +	unsigned int tmp;
> +	struct virtio_crypto_destroy_session_req *destroy_session;
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +	int err;
> +	unsigned int num_out = 0, num_in = 0;
> +
> +	spin_lock(&vcrypto->ctrl_lock);
> +	vcrypto->ctrl_status.status = VIRTIO_CRYPTO_ERR;
> +	/* Pad ctrl header */
> +	vcrypto->ctrl.header.opcode =
> +		cpu_to_le32(VIRTIO_CRYPTO_CIPHER_DESTROY_SESSION);
> +	/* Set the default virtqueue id to 0 */
> +	vcrypto->ctrl.header.queue_id = 0;
> +
> +	destroy_session = &vcrypto->ctrl.u.destroy_session;
> +
> +	if (encrypt)
> +		destroy_session->session_id =
> +			cpu_to_le64(ctx->enc_sess_info.session_id);
> +	else
> +		destroy_session->session_id =
> +			cpu_to_le64(ctx->dec_sess_info.session_id);
> +
> +	sg_init_one(&outhdr, &vcrypto->ctrl, sizeof(vcrypto->ctrl));
> +	sgs[num_out++] = &outhdr;
> +
> +	/* Return status and session id back */
> +	sg_init_one(&status_sg, &vcrypto->ctrl_status.status,
> +		sizeof(vcrypto->ctrl_status.status));
> +	sgs[num_out + num_in++] = &status_sg;
> +
> +	err = virtqueue_add_sgs(vcrypto->ctrl_vq, sgs, num_out,
> +			num_in, vcrypto, GFP_ATOMIC);
> +	if (err < 0) {
> +		spin_unlock(&vcrypto->ctrl_lock);
> +		return err;
> +	}
> +	virtqueue_kick(vcrypto->ctrl_vq);
> +
> +	while (!virtqueue_get_buf(vcrypto->ctrl_vq, &tmp) &&
> +	       !virtqueue_is_broken(vcrypto->ctrl_vq))
> +		cpu_relax();
> +
> +	if (vcrypto->ctrl_status.status != VIRTIO_CRYPTO_OK) {
> +		spin_unlock(&vcrypto->ctrl_lock);
> +		pr_err("virtio_crypto: Close session failed status: %u, session_id: 0x%llx\n",
> +			vcrypto->ctrl_status.status,
> +			destroy_session->session_id);
> +
> +		return -EINVAL;
> +	}
> +	spin_unlock(&vcrypto->ctrl_lock);
> +
> +	return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_init_sessions(
> +		struct virtio_crypto_ablkcipher_ctx *ctx,
> +		const uint8_t *key, unsigned int keylen)
> +{
> +	uint32_t alg;
> +	int ret;
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +
> +	if (keylen > vcrypto->max_cipher_key_len) {
> +		pr_err("virtio_crypto: the key is too long\n");
> +		goto bad_key;
> +	}
> +
> +	if (virtio_crypto_alg_validate_key(keylen, &alg))
> +		goto bad_key;
> +
> +	/* Create encryption session */
> +	ret = virtio_crypto_alg_ablkcipher_init_session(ctx,
> +			alg, key, keylen, 1);
> +	if (ret)
> +		return ret;
> +	/* Create decryption session */
> +	ret = virtio_crypto_alg_ablkcipher_init_session(ctx,
> +			alg, key, keylen, 0);
> +	if (ret) {
> +		virtio_crypto_alg_ablkcipher_close_session(ctx, 1);
> +		return ret;
> +	}
> +	return 0;
> +
> +bad_key:
> +	crypto_tfm_set_flags(ctx->tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +	return -EINVAL;
> +}
> +
> +/* Note: kernel crypto API realization */
> +static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm,
> +					 const uint8_t *key,
> +					 unsigned int keylen)
> +{
> +	struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
> +	int ret;
> +
> +	if (!ctx->vcrypto) {
> +		/* New key */
> +		int node = virtio_crypto_get_current_node();
> +		struct virtio_crypto *vcrypto =
> +				      virtcrypto_get_dev_node(node);
> +		if (!vcrypto) {
> +			pr_err("virtio_crypto: Could not find a virtio device in the system");
> +			return -ENODEV;
> +		}
> +
> +		ctx->vcrypto = vcrypto;
> +	} else {
> +		/* Rekeying: close the previously created sessions first */
> +		virtio_crypto_alg_ablkcipher_close_session(ctx, 1);
> +		virtio_crypto_alg_ablkcipher_close_session(ctx, 0);
> +	}
> +
> +	ret = virtio_crypto_alg_ablkcipher_init_sessions(ctx, key, keylen);
> +	if (ret) {
> +		virtcrypto_dev_put(ctx->vcrypto);
> +		ctx->vcrypto = NULL;
> +
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +__virtio_crypto_ablkcipher_do_req(struct virtio_crypto_request *vc_req,
> +		struct ablkcipher_request *req,
> +		struct data_queue *data_vq,
> +		__u8 op)
> +{
> +	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
> +	unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
> +	struct virtio_crypto_ablkcipher_ctx *ctx = vc_req->ablkcipher_ctx;
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +	struct virtio_crypto_op_data_req *req_data;
> +	int src_nents, dst_nents;
> +	int err;
> +	unsigned long flags;
> +	struct scatterlist outhdr, iv_sg, status_sg, **sgs;
> +	int i;
> +	u64 dst_len;
> +	unsigned int num_out = 0, num_in = 0;
> +	int sg_total;
> +	uint8_t *iv;
> +
> +	src_nents = sg_nents_for_len(req->src, req->nbytes);
> +	dst_nents = sg_nents(req->dst);
> +
> +	pr_debug("virtio_crypto: Number of sgs (src_nents: %d, dst_nents: %d)\n",
> +			src_nents, dst_nents);
> +
> +	/* Why 3?  outhdr + iv + inhdr */
> +	sg_total = src_nents + dst_nents + 3;
> +	sgs = kzalloc_node(sg_total * sizeof(*sgs), GFP_ATOMIC,
> +				dev_to_node(&vcrypto->vdev->dev));
> +	if (!sgs)
> +		return -ENOMEM;
> +
> +	req_data = kzalloc_node(sizeof(*req_data), GFP_ATOMIC,
> +				dev_to_node(&vcrypto->vdev->dev));
> +	if (!req_data) {
> +		kfree(sgs);
> +		return -ENOMEM;
> +	}
> +
> +	vc_req->req_data = req_data;
> +	vc_req->type = VIRTIO_CRYPTO_SYM_OP_CIPHER;
> +	/* Head of operation */
> +	if (op) {
> +		req_data->header.session_id =
> +			cpu_to_le64(ctx->enc_sess_info.session_id);
> +		req_data->header.opcode =
> +			cpu_to_le32(VIRTIO_CRYPTO_CIPHER_ENCRYPT);
> +	} else {
> +		req_data->header.session_id =
> +			cpu_to_le64(ctx->dec_sess_info.session_id);
> +		req_data->header.opcode =
> +			cpu_to_le32(VIRTIO_CRYPTO_CIPHER_DECRYPT);
> +	}
> +	req_data->u.sym_req.op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
> +	req_data->u.sym_req.u.cipher.para.iv_len = cpu_to_le32(ivsize);
> +	req_data->u.sym_req.u.cipher.para.src_data_len =
> +			cpu_to_le32(req->nbytes);
> +
> +	dst_len = virtio_crypto_alg_sg_nents_length(req->dst);
> +	if (unlikely(dst_len > U32_MAX)) {
> +		pr_err("virtio_crypto: The dst_len is beyond U32_MAX\n");
> +		err = -EINVAL;
> +		goto free;
> +	}
> +
> +	pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
> +			req->nbytes, dst_len);
> +
> +	if (unlikely(req->nbytes + dst_len + ivsize +
> +		sizeof(vc_req->status) > vcrypto->max_size)) {
> +		pr_err("virtio_crypto: The length is too big\n");
> +		err = -EINVAL;
> +		goto free;
> +	}
> +
> +	req_data->u.sym_req.u.cipher.para.dst_data_len =
> +			cpu_to_le32((uint32_t)dst_len);
> +
> +	/* Outhdr */
> +	sg_init_one(&outhdr, req_data, sizeof(*req_data));
> +	sgs[num_out++] = &outhdr;
> +
> +	/* IV */
> +
> +	/*
> +	 * Avoid doing DMA from the stack; use a dynamically
> +	 * allocated buffer for the IV instead.
> +	 */
> +	iv = kzalloc_node(ivsize, GFP_ATOMIC,
> +				dev_to_node(&vcrypto->vdev->dev));
> +	if (!iv) {
> +		err = -ENOMEM;
> +		goto free;
> +	}
> +	memcpy(iv, req->info, ivsize);
> +	sg_init_one(&iv_sg, iv, ivsize);
> +	sgs[num_out++] = &iv_sg;
> +	vc_req->iv = iv;
> +
> +	/* Source data */
> +	for (i = 0; i < src_nents; i++)
> +		sgs[num_out++] = &req->src[i];
> +
> +	/* Destination data */
> +	for (i = 0; i < dst_nents; i++)
> +		sgs[num_out + num_in++] = &req->dst[i];
> +
> +	/* Status */
> +	sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
> +	sgs[num_out + num_in++] = &status_sg;
> +
> +	vc_req->sgs = sgs;
> +
> +	spin_lock_irqsave(&vcrypto->lock, flags);
> +	err = virtqueue_add_sgs(data_vq->vq, sgs, num_out,
> +				num_in, vc_req, GFP_ATOMIC);
> +	spin_unlock_irqrestore(&vcrypto->lock, flags);
> +	if (unlikely(err < 0))
> +		goto free_iv;
> +
> +	return 0;
> +
> +free_iv:
> +	kzfree(iv);
> +free:
> +	kzfree(req_data);
> +	kfree(sgs);
> +	return err;
> +}
> +
> +static int virtio_crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
> +{
> +	struct crypto_ablkcipher *atfm = crypto_ablkcipher_reqtfm(req);
> +	struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(atfm);
> +	struct virtio_crypto_request *vc_req = ablkcipher_request_ctx(req);
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +	int ret;
> +	/* Use the first data virtqueue as default */
> +	struct data_queue *data_vq = &vcrypto->data_vq[0];
> +
> +	vc_req->ablkcipher_ctx = ctx;
> +	vc_req->ablkcipher_req = req;
> +	ret = __virtio_crypto_ablkcipher_do_req(vc_req, req, data_vq, 1);
> +	if (ret < 0) {
> +		pr_err("virtio_crypto: Encryption failed!\n");
> +		return ret;
> +	}
> +	virtqueue_kick(data_vq->vq);
> +
> +	return -EINPROGRESS;
> +}
> +
> +static int virtio_crypto_ablkcipher_decrypt(struct ablkcipher_request *req)
> +{
> +	struct crypto_ablkcipher *atfm = crypto_ablkcipher_reqtfm(req);
> +	struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(atfm);
> +	struct virtio_crypto_request *vc_req = ablkcipher_request_ctx(req);
> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
> +	int ret;
> +	/* Use the first data virtqueue as default */
> +	struct data_queue *data_vq = &vcrypto->data_vq[0];
> +
> +	vc_req->ablkcipher_ctx = ctx;
> +	vc_req->ablkcipher_req = req;
> +
> +	ret = __virtio_crypto_ablkcipher_do_req(vc_req, req, data_vq, 0);
> +	if (ret < 0) {
> +		pr_err("virtio_crypto: Decryption failed!\n");
> +		return ret;
> +	}
> +	virtqueue_kick(data_vq->vq);
> +
> +	return -EINPROGRESS;
> +}
> +
> +static int virtio_crypto_ablkcipher_init(struct crypto_tfm *tfm)
> +{
> +	struct virtio_crypto_ablkcipher_ctx *ctx = crypto_tfm_ctx(tfm);
> +
> +	tfm->crt_ablkcipher.reqsize = sizeof(struct virtio_crypto_request);
> +	ctx->tfm = tfm;
> +
> +	return 0;
> +}
> +
> +static void virtio_crypto_ablkcipher_exit(struct crypto_tfm *tfm)
> +{
> +	struct virtio_crypto_ablkcipher_ctx *ctx = crypto_tfm_ctx(tfm);
> +
> +	if (!ctx->vcrypto)
> +		return;
> +
> +	virtio_crypto_alg_ablkcipher_close_session(ctx, 1);
> +	virtio_crypto_alg_ablkcipher_close_session(ctx, 0);
> +	virtcrypto_dev_put(ctx->vcrypto);
> +	ctx->vcrypto = NULL;
> +}
> +
> +static struct crypto_alg virtio_crypto_algs[] = { {
> +	.cra_name = "cbc(aes)",
> +	.cra_driver_name = "virtio_crypto_aes_cbc",
> +	.cra_priority = 501,
> +	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
> +	.cra_blocksize = AES_BLOCK_SIZE,
> +	.cra_ctxsize  = sizeof(struct virtio_crypto_ablkcipher_ctx),
> +	.cra_alignmask = 0,
> +	.cra_module = THIS_MODULE,
> +	.cra_type = &crypto_ablkcipher_type,
> +	.cra_init = virtio_crypto_ablkcipher_init,
> +	.cra_exit = virtio_crypto_ablkcipher_exit,
> +	.cra_u = {
> +	   .ablkcipher = {
> +			.setkey = virtio_crypto_ablkcipher_setkey,
> +			.decrypt = virtio_crypto_ablkcipher_decrypt,
> +			.encrypt = virtio_crypto_ablkcipher_encrypt,
> +			.min_keysize = AES_MIN_KEY_SIZE,
> +			.max_keysize = AES_MAX_KEY_SIZE,
> +			.ivsize = AES_BLOCK_SIZE,
> +		},
> +	},
> +} };
> +
> +int virtio_crypto_algs_register(void)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&algs_lock);
> +	if (++virtio_crypto_active_devs != 1)
> +		goto unlock;
> +
> +	ret = crypto_register_algs(virtio_crypto_algs,
> +			ARRAY_SIZE(virtio_crypto_algs));
> +	if (ret)
> +		virtio_crypto_active_devs--;
> +
> +unlock:
> +	mutex_unlock(&algs_lock);
> +	return ret;
> +}
> +
> +void virtio_crypto_algs_unregister(void)
> +{
> +	mutex_lock(&algs_lock);
> +	if (--virtio_crypto_active_devs != 0)
> +		goto unlock;
> +
> +	crypto_unregister_algs(virtio_crypto_algs,
> +			ARRAY_SIZE(virtio_crypto_algs));
> +
> +unlock:
> +	mutex_unlock(&algs_lock);
> +}
> diff --git a/drivers/crypto/virtio/virtio_crypto_common.h b/drivers/crypto/virtio/virtio_crypto_common.h
> new file mode 100644
> index 0000000..975404b
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_common.h
> @@ -0,0 +1,122 @@
> +/* Common header for Virtio crypto device.
> + *
> + * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _VIRTIO_CRYPTO_COMMON_H
> +#define _VIRTIO_CRYPTO_COMMON_H
> +
> +#include <linux/virtio.h>
> +#include <linux/crypto.h>
> +#include <linux/spinlock.h>
> +#include <crypto/aead.h>
> +#include <crypto/aes.h>
> +#include <crypto/authenc.h>
> +
> +
> +/* Internal representation of a data virtqueue */
> +struct data_queue {
> +	/* Virtqueue associated with this data queue */
> +	struct virtqueue *vq;
> +
> +	/* Name of the tx queue: dataq.$index */
> +	char name[32];
> +};
> +
> +struct virtio_crypto {
> +	struct virtio_device *vdev;
> +	struct virtqueue *ctrl_vq;
> +	struct data_queue *data_vq;
> +
> +	/* To protect the vq operations for the dataq */
> +	spinlock_t lock;
> +
> +	/* To protect the vq operations for the controlq */
> +	spinlock_t ctrl_lock;
> +
> +	/* Maximum number of data queues supported by the device */
> +	u32 max_data_queues;
> +
> +	/* Number of queues currently used by the driver */
> +	u32 curr_queue;
> +
> +	/* Maximum length of cipher key */
> +	u32 max_cipher_key_len;
> +	/* Maximum length of authenticated key */
> +	u32 max_auth_key_len;
> +	/* Maximum size of each request */
> +	u64 max_size;
> +
> +	/* Control VQ buffers: protected by the ctrl_lock */
> +	struct virtio_crypto_op_ctrl_req ctrl;
> +	struct virtio_crypto_session_input input;
> +	struct virtio_crypto_inhdr ctrl_status;
> +
> +	unsigned long status;
> +	atomic_t ref_count;
> +	struct list_head list;
> +	struct module *owner;
> +	uint8_t dev_id;
> +
> +	/* Is the affinity hint set for the virtqueues? */
> +	bool affinity_hint_set;
> +};
> +
> +struct virtio_crypto_sym_session_info {
> +	/* Backend session id, which comes from the host side */
> +	__u64 session_id;
> +};
> +
> +struct virtio_crypto_ablkcipher_ctx {
> +	struct virtio_crypto *vcrypto;
> +	struct crypto_tfm *tfm;
> +
> +	struct virtio_crypto_sym_session_info enc_sess_info;
> +	struct virtio_crypto_sym_session_info dec_sess_info;
> +};
> +
> +struct virtio_crypto_request {
> +	/* Cipher or aead */
> +	uint32_t type;
> +	uint8_t status;
> +	struct virtio_crypto_ablkcipher_ctx *ablkcipher_ctx;
> +	struct ablkcipher_request *ablkcipher_req;
> +	struct virtio_crypto_op_data_req *req_data;
> +	struct scatterlist **sgs;
> +	uint8_t *iv;
> +};
> +
> +int virtcrypto_devmgr_add_dev(struct virtio_crypto *vcrypto_dev);
> +struct list_head *virtcrypto_devmgr_get_head(void);
> +void virtcrypto_devmgr_rm_dev(struct virtio_crypto *vcrypto_dev);
> +struct virtio_crypto *virtcrypto_devmgr_get_first(void);
> +int virtcrypto_dev_in_use(struct virtio_crypto *vcrypto_dev);
> +int virtcrypto_dev_get(struct virtio_crypto *vcrypto_dev);
> +void virtcrypto_dev_put(struct virtio_crypto *vcrypto_dev);
> +int virtcrypto_dev_started(struct virtio_crypto *vcrypto_dev);
> +struct virtio_crypto *virtcrypto_get_dev_node(int node);
> +int virtcrypto_dev_start(struct virtio_crypto *vcrypto);
> +void virtcrypto_dev_stop(struct virtio_crypto *vcrypto);
> +
> +static inline int virtio_crypto_get_current_node(void)
> +{
> +	return topology_physical_package_id(smp_processor_id());
> +}
> +
> +int virtio_crypto_algs_register(void);
> +void virtio_crypto_algs_unregister(void);
> +
> +#endif /* _VIRTIO_CRYPTO_COMMON_H */
> diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c
> new file mode 100644
> index 0000000..286d829
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> @@ -0,0 +1,464 @@
> + /* Driver for Virtio crypto device.
> +  *
> +  * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +  *
> +  * This program is free software; you can redistribute it and/or modify
> +  * it under the terms of the GNU General Public License as published by
> +  * the Free Software Foundation; either version 2 of the License, or
> +  * (at your option) any later version.
> +  *
> +  * This program is distributed in the hope that it will be useful,
> +  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> +  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +  * GNU General Public License for more details.
> +  *
> +  * You should have received a copy of the GNU General Public License
> +  * along with this program; if not, see <http://www.gnu.org/licenses/>.
> +  */
> +
> +#include <linux/err.h>
> +#include <linux/module.h>
> +#include <linux/virtio_config.h>
> +#include <linux/cpu.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +
> +static void virtcrypto_dataq_callback(struct virtqueue *vq)
> +{
> +	struct virtio_crypto *vcrypto = vq->vdev->priv;
> +	struct virtio_crypto_request *vc_req;
> +	unsigned long flags;
> +	unsigned int len;
> +	struct ablkcipher_request *ablk_req;
> +	int error;
> +
> +	spin_lock_irqsave(&vcrypto->lock, flags);
> +	do {
> +		virtqueue_disable_cb(vq);
> +		while ((vc_req = virtqueue_get_buf(vq, &len)) != NULL) {
> +			if (vc_req->type == VIRTIO_CRYPTO_SYM_OP_CIPHER) {
> +				switch (vc_req->status) {
> +				case VIRTIO_CRYPTO_OK:
> +					error = 0;
> +					break;
> +				case VIRTIO_CRYPTO_INVSESS:
> +				case VIRTIO_CRYPTO_ERR:
> +					error = -EINVAL;
> +					break;
> +				case VIRTIO_CRYPTO_BADMSG:
> +					error = -EBADMSG;
> +					break;
> +				default:
> +					error = -EIO;
> +					break;
> +				}
> +				ablk_req = vc_req->ablkcipher_req;
> +				/* Finish the encrypt or decrypt process */
> +				ablk_req->base.complete(&ablk_req->base, error);
> +			}
> +
> +			kzfree(vc_req->iv);
> +			kzfree(vc_req->req_data);
> +			kfree(vc_req->sgs);
> +		}
> +	} while (!virtqueue_enable_cb(vq));
> +	spin_unlock_irqrestore(&vcrypto->lock, flags);
> +}
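
The status switch could live in a small helper so the callback body stays
short. A minimal userspace sketch, not part of the patch:
virtcrypto_status_to_errno() is a hypothetical name, and the numeric status
values are my assumption from the virtio crypto spec, since they are not
visible in this hunk.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed status values; check them against the uapi header. */
enum {
	VIRTIO_CRYPTO_OK      = 0,
	VIRTIO_CRYPTO_ERR     = 1,
	VIRTIO_CRYPTO_BADMSG  = 2,
	VIRTIO_CRYPTO_NOTSUPP = 3,
	VIRTIO_CRYPTO_INVSESS = 4,
};

/* Hypothetical helper: translate a device status byte into an errno. */
static int virtcrypto_status_to_errno(uint8_t status)
{
	switch (status) {
	case VIRTIO_CRYPTO_OK:
		return 0;
	case VIRTIO_CRYPTO_INVSESS:
	case VIRTIO_CRYPTO_ERR:
		return -EINVAL;
	case VIRTIO_CRYPTO_BADMSG:
		return -EBADMSG;
	default:
		/* Anything else, including NOTSUPP, maps to -EIO as in the patch. */
		return -EIO;
	}
}
```

The mapping is the same as in the quoted switch; only the placement differs.
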
> +
> +static int virtcrypto_find_vqs(struct virtio_crypto *vi)
> +{
> +	vq_callback_t **callbacks;
> +	struct virtqueue **vqs;
> +	int ret = -ENOMEM;
> +	int i, total_vqs;
> +	const char **names;
> +
> +	/*
> +	 * We expect 1 data virtqueue, followed by
> +	 * possible N-1 data queues used in multiqueue mode,
> +	 * followed by control vq.
> +	 */
> +	total_vqs = vi->max_data_queues + 1;
> +
> +	/* Allocate space for find_vqs parameters */
> +	vqs = kcalloc(total_vqs, sizeof(*vqs), GFP_KERNEL);
> +	if (!vqs)
> +		goto err_vq;
> +	callbacks = kcalloc(total_vqs, sizeof(*callbacks), GFP_KERNEL);
> +	if (!callbacks)
> +		goto err_callback;
> +	names = kcalloc(total_vqs, sizeof(*names), GFP_KERNEL);
> +	if (!names)
> +		goto err_names;
> +
> +	/* Parameters for control virtqueue */
> +	callbacks[total_vqs - 1] = NULL;
> +	names[total_vqs - 1] = "controlq";
> +
> +	/* Allocate/initialize parameters for data virtqueues */
> +	for (i = 0; i < vi->max_data_queues; i++) {
> +		callbacks[i] = virtcrypto_dataq_callback;
> +		snprintf(vi->data_vq[i].name, sizeof(vi->data_vq[i].name),
> +				"dataq.%d", i);
> +		names[i] = vi->data_vq[i].name;
> +	}
> +
> +	ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> +					 names);
> +	if (ret)
> +		goto err_find;
> +
> +	vi->ctrl_vq = vqs[total_vqs - 1];
> +
> +	for (i = 0; i < vi->max_data_queues; i++)
> +		vi->data_vq[i].vq = vqs[i];
> +
> +	kfree(names);
> +	kfree(callbacks);
> +	kfree(vqs);
> +
> +	return 0;
> +
> +err_find:
> +	kfree(names);
> +err_names:
> +	kfree(callbacks);
> +err_callback:
> +	kfree(vqs);
> +err_vq:
> +	return ret;
> +}
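
The four-label unwind works, but since kfree(NULL) is a no-op the three
temporary arrays could share one exit path. A rough userspace sketch of that
shape, with calloc()/free() standing in for kcalloc()/kfree();
alloc_find_vqs_params() is a made-up name and the find_vqs() call itself is
elided.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace stand-in for the allocation pattern in virtcrypto_find_vqs().
 * Like kfree(), free(NULL) does nothing, so a single label can release
 * whichever arrays were actually allocated.
 */
static int alloc_find_vqs_params(size_t total_vqs)
{
	void **vqs = NULL, **callbacks = NULL;
	const char **names = NULL;
	int ret = -ENOMEM;

	vqs = calloc(total_vqs, sizeof(*vqs));
	callbacks = calloc(total_vqs, sizeof(*callbacks));
	names = calloc(total_vqs, sizeof(*names));
	if (!vqs || !callbacks || !names)
		goto out;

	/* ... fill in names/callbacks and call find_vqs() here ... */
	ret = 0;
out:
	free(names);
	free(callbacks);
	free(vqs);
	return ret;
}
```

Whether this reads better than the explicit ladder is a matter of taste; the
behaviour should be the same either way.
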
> +
> +static int virtcrypto_alloc_queues(struct virtio_crypto *vi)
> +{
> +	vi->data_vq = kcalloc(vi->max_data_queues, sizeof(*vi->data_vq),
> +				GFP_KERNEL);
> +	if (!vi->data_vq)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static void virtcrypto_clean_affinity(struct virtio_crypto *vi, long hcpu)
> +{
> +	int i;
> +
> +	if (vi->affinity_hint_set) {
> +		for (i = 0; i < vi->max_data_queues; i++)
> +			virtqueue_set_affinity(vi->data_vq[i].vq, -1);
> +
> +		vi->affinity_hint_set = false;
> +	}
> +}
> +
> +static void virtcrypto_set_affinity(struct virtio_crypto *vcrypto)
> +{
> +	int i = 0;
> +	int cpu;
> +
> +	/*
> +	 * In single queue mode, we don't set the cpu affinity.
> +	 */
> +	if (vcrypto->curr_queue == 1 || vcrypto->max_data_queues == 1) {
> +		virtcrypto_clean_affinity(vcrypto, -1);
> +		return;
> +	}
> +
> +	/*
> +	 * In multiqueue mode, we make each queue private to one cpu
> +	 * by setting the affinity hint, which eliminates contention.
> +	 *
> +	 * TODO: add cpu hotplug support by registering a cpu notifier.
> +	 */
> +	for_each_online_cpu(cpu) {
> +		virtqueue_set_affinity(vcrypto->data_vq[i].vq, cpu);
> +		if (++i >= vcrypto->max_data_queues)
> +			break;
> +	}
> +
> +	vcrypto->affinity_hint_set = true;
> +}
> +
> +static void virtcrypto_free_queues(struct virtio_crypto *vi)
> +{
> +	kfree(vi->data_vq);
> +}
> +
> +static int virtcrypto_init_vqs(struct virtio_crypto *vi)
> +{
> +	int ret;
> +
> +	/* Allocate the data queues */
> +	ret = virtcrypto_alloc_queues(vi);
> +	if (ret)
> +		goto err;
> +
> +	ret = virtcrypto_find_vqs(vi);
> +	if (ret)
> +		goto err_free;
> +
> +	get_online_cpus();
> +	virtcrypto_set_affinity(vi);
> +	put_online_cpus();
> +
> +	return 0;
> +
> +err_free:
> +	virtcrypto_free_queues(vi);
> +err:
> +	return ret;
> +}
> +
> +static int virtcrypto_update_status(struct virtio_crypto *vcrypto)
> +{
> +	u32 status;
> +	int err;
> +	unsigned long flags;
> +
> +	virtio_cread(vcrypto->vdev,
> +	    struct virtio_crypto_config, status, &status);
> +
> +	/*
> +	 * Unknown status bits would be a host error and the driver
> +	 * should consider the device to be broken.
> +	 */
> +	if (status & (~VIRTIO_CRYPTO_S_HW_READY)) {
> +		dev_warn(&vcrypto->vdev->dev,
> +				"Unknown status bits: 0x%x\n", status);
> +
> +		spin_lock_irqsave(&vcrypto->lock, flags);
> +		virtio_break_device(vcrypto->vdev);
> +		spin_unlock_irqrestore(&vcrypto->lock, flags);
> +		return -EPERM;
> +	}
> +
> +	if (vcrypto->status == status)
> +		return 0;
> +
> +	vcrypto->status = status;
> +
> +	if (vcrypto->status & VIRTIO_CRYPTO_S_HW_READY) {
> +		err = virtcrypto_dev_start(vcrypto);
> +		if (err) {
> +			dev_err(&vcrypto->vdev->dev,
> +				"Failed to start virtio crypto device.\n");
> +
> +			return -EPERM;
> +		}
> +		dev_info(&vcrypto->vdev->dev, "Accelerator is ready\n");
> +	} else {
> +		virtcrypto_dev_stop(vcrypto);
> +		dev_info(&vcrypto->vdev->dev, "Accelerator is not ready\n");
> +	}
> +
> +	return 0;
> +}
> +
> +static void virtcrypto_del_vqs(struct virtio_crypto *vcrypto)
> +{
> +	struct virtio_device *vdev = vcrypto->vdev;
> +
> +	virtcrypto_clean_affinity(vcrypto, -1);
> +
> +	vdev->config->del_vqs(vdev);
> +
> +	virtcrypto_free_queues(vcrypto);
> +}
> +
> +static int virtcrypto_probe(struct virtio_device *vdev)
> +{
> +	int err = -EFAULT;
> +	struct virtio_crypto *vcrypto;
> +	u32 max_data_queues = 0, max_cipher_key_len = 0;
> +	u32 max_auth_key_len = 0;
> +	u64 max_size = 0;
> +
> +	if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
> +		return -ENODEV;
> +
> +	if (!vdev->config->get) {
> +		dev_err(&vdev->dev, "%s failure: config access disabled\n",
> +			__func__);
> +		return -EINVAL;
> +	}
> +
> +	if (num_possible_nodes() > 1 && dev_to_node(&vdev->dev) < 0) {
> +		/*
> +		 * If the accelerator is connected to a node with no memory
> +		 * there is no point in using the accelerator since the remote
> +		 * memory transaction will be very slow.
> +		 */
> +		dev_err(&vdev->dev, "Invalid NUMA configuration.\n");
> +		return -EINVAL;
> +	}
> +
> +	vcrypto = kzalloc_node(sizeof(*vcrypto), GFP_KERNEL,
> +					dev_to_node(&vdev->dev));
> +	if (!vcrypto)
> +		return -ENOMEM;
> +
> +	virtio_cread(vdev, struct virtio_crypto_config,
> +			max_dataqueues, &max_data_queues);
> +	if (max_data_queues < 1)
> +		max_data_queues = 1;
> +
> +	virtio_cread(vdev, struct virtio_crypto_config,
> +		max_cipher_key_len, &max_cipher_key_len);
> +	virtio_cread(vdev, struct virtio_crypto_config,
> +		max_auth_key_len, &max_auth_key_len);
> +	virtio_cread(vdev, struct virtio_crypto_config,
> +		max_size, &max_size);
> +
> +	/* Add virtio crypto device to global table */
> +	err = virtcrypto_devmgr_add_dev(vcrypto);
> +	if (err) {
> +		dev_err(&vdev->dev, "Failed to add new virtio crypto device.\n");
> +		goto free;
> +	}
> +	vcrypto->owner = THIS_MODULE;
> +	vdev->priv = vcrypto;
> +	vcrypto->vdev = vdev;
> +	spin_lock_init(&vcrypto->lock);
> +	spin_lock_init(&vcrypto->ctrl_lock);
> +
> +	/* Use single data queue as default */
> +	vcrypto->curr_queue = 1;
> +	vcrypto->max_data_queues = max_data_queues;
> +	vcrypto->max_cipher_key_len = max_cipher_key_len;
> +	vcrypto->max_auth_key_len = max_auth_key_len;
> +	vcrypto->max_size = max_size;
> +
> +	dev_info(&vdev->dev,
> +		"max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u, max_size 0x%llx\n",
> +		vcrypto->max_data_queues,
> +		vcrypto->max_cipher_key_len,
> +		vcrypto->max_auth_key_len,
> +		vcrypto->max_size);
> +
> +	err = virtcrypto_init_vqs(vcrypto);
> +	if (err) {
> +		dev_err(&vdev->dev, "Failed to initialize vqs.\n");
> +		goto free_dev;
> +	}
> +	virtio_device_ready(vdev);
> +
> +	err = virtcrypto_update_status(vcrypto);
> +	if (err)
> +		goto free_vqs;
> +
> +	return 0;
> +
> +free_vqs:
> +	vcrypto->vdev->config->reset(vdev);
> +	virtcrypto_del_vqs(vcrypto);
> +free_dev:
> +	virtcrypto_devmgr_rm_dev(vcrypto);
> +free:
> +	kfree(vcrypto);
> +	return err;
> +}
> +
> +static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
> +{
> +	struct virtio_crypto_request *vc_req;
> +	int i;
> +	struct virtqueue *vq;
> +
> +	for (i = 0; i < vcrypto->max_data_queues; i++) {
> +		vq = vcrypto->data_vq[i].vq;
> +		while ((vc_req = virtqueue_detach_unused_buf(vq)) != NULL) {
> +			kfree(vc_req->req_data);
> +			kfree(vc_req->sgs);
> +		}
> +	}
> +}
> +
> +static void virtcrypto_remove(struct virtio_device *vdev)
> +{
> +	struct virtio_crypto *vcrypto = vdev->priv;
> +
> +	dev_info(&vdev->dev, "Start virtcrypto_remove.\n");
> +
> +	if (virtcrypto_dev_started(vcrypto))
> +		virtcrypto_dev_stop(vcrypto);
> +	vdev->config->reset(vdev);
> +	virtcrypto_free_unused_reqs(vcrypto);
> +	virtcrypto_del_vqs(vcrypto);
> +	virtcrypto_devmgr_rm_dev(vcrypto);
> +	kfree(vcrypto);
> +}
> +
> +static void virtcrypto_config_changed(struct virtio_device *vdev)
> +{
> +	struct virtio_crypto *vcrypto = vdev->priv;
> +
> +	virtcrypto_update_status(vcrypto);
> +}
> +
> +#ifdef CONFIG_PM_SLEEP
> +static int virtcrypto_freeze(struct virtio_device *vdev)
> +{
> +	struct virtio_crypto *vcrypto = vdev->priv;
> +
> +	vdev->config->reset(vdev);
> +	virtcrypto_free_unused_reqs(vcrypto);
> +	if (virtcrypto_dev_started(vcrypto))
> +		virtcrypto_dev_stop(vcrypto);
> +
> +	virtcrypto_del_vqs(vcrypto);
> +	return 0;
> +}
> +
> +static int virtcrypto_restore(struct virtio_device *vdev)
> +{
> +	struct virtio_crypto *vcrypto = vdev->priv;
> +	int err;
> +
> +	err = virtcrypto_init_vqs(vcrypto);
> +	if (err)
> +		return err;
> +
> +	virtio_device_ready(vdev);
> +	err = virtcrypto_dev_start(vcrypto);
> +	if (err) {
> +		dev_err(&vdev->dev, "Failed to start virtio crypto device.\n");
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +#endif
> +
> +static unsigned int features[] = {
> +	/* none */
> +};
> +
> +static struct virtio_device_id id_table[] = {
> +	{ VIRTIO_ID_CRYPTO, VIRTIO_DEV_ANY_ID },
> +	{ 0 },
> +};
> +
> +static struct virtio_driver virtio_crypto_driver = {
> +	.driver.name         = KBUILD_MODNAME,
> +	.driver.owner        = THIS_MODULE,
> +	.feature_table       = features,
> +	.feature_table_size  = ARRAY_SIZE(features),
> +	.id_table            = id_table,
> +	.probe               = virtcrypto_probe,
> +	.remove              = virtcrypto_remove,
> +	.config_changed = virtcrypto_config_changed,
> +#ifdef CONFIG_PM_SLEEP
> +	.freeze = virtcrypto_freeze,
> +	.restore = virtcrypto_restore,
> +#endif
> +};
> +
> +module_virtio_driver(virtio_crypto_driver);
> +
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +MODULE_DESCRIPTION("virtio crypto device driver");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Gonglei <arei.gonglei@huawei.com>");
> diff --git a/drivers/crypto/virtio/virtio_crypto_mgr.c b/drivers/crypto/virtio/virtio_crypto_mgr.c
> new file mode 100644
> index 0000000..a69ff71
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_mgr.c
> @@ -0,0 +1,264 @@
> + /* Management for virtio crypto devices (refer to adf_dev_mgr.c)
> +  *
> +  * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +  *
> +  * This program is free software; you can redistribute it and/or modify
> +  * it under the terms of the GNU General Public License as published by
> +  * the Free Software Foundation; either version 2 of the License, or
> +  * (at your option) any later version.
> +  *
> +  * This program is distributed in the hope that it will be useful,
> +  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> +  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +  * GNU General Public License for more details.
> +  *
> +  * You should have received a copy of the GNU General Public License
> +  * along with this program; if not, see <http://www.gnu.org/licenses/>.
> +  */
> +
> +#include <linux/mutex.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +static LIST_HEAD(virtio_crypto_table);
> +static uint32_t num_devices;
> +
> +/* The table_lock protects the above global list and num_devices */
> +static DEFINE_MUTEX(table_lock);
> +
> +#define VIRTIO_CRYPTO_MAX_DEVICES 32
> +
> +
> +/*
> + * virtcrypto_devmgr_add_dev() - Add vcrypto_dev to the acceleration
> + * framework.
> + * @vcrypto_dev:  Pointer to virtio crypto device.
> + *
> + * Function adds virtio crypto device to the global list.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 on success, error code otherwise.
> + */
> +int virtcrypto_devmgr_add_dev(struct virtio_crypto *vcrypto_dev)
> +{
> +	struct list_head *itr;
> +
> +	mutex_lock(&table_lock);
> +	if (num_devices == VIRTIO_CRYPTO_MAX_DEVICES) {
> +		pr_info("virtio_crypto: only support up to %d devices\n",
> +			    VIRTIO_CRYPTO_MAX_DEVICES);
> +		mutex_unlock(&table_lock);
> +		return -EFAULT;
> +	}
> +
> +	list_for_each(itr, &virtio_crypto_table) {
> +		struct virtio_crypto *ptr =
> +				list_entry(itr, struct virtio_crypto, list);
> +
> +		if (ptr == vcrypto_dev) {
> +			mutex_unlock(&table_lock);
> +			return -EEXIST;
> +		}
> +	}
> +	atomic_set(&vcrypto_dev->ref_count, 0);
> +	list_add_tail(&vcrypto_dev->list, &virtio_crypto_table);
> +	vcrypto_dev->dev_id = num_devices++;
> +	mutex_unlock(&table_lock);
> +	return 0;
> +}
> +
> +struct list_head *virtcrypto_devmgr_get_head(void)
> +{
> +	return &virtio_crypto_table;
> +}
> +
> +/*
> + * virtcrypto_devmgr_rm_dev() - Remove vcrypto_dev from the acceleration
> + * framework.
> + * @vcrypto_dev:  Pointer to virtio crypto device.
> + *
> + * Function removes virtio crypto device from the acceleration framework.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_devmgr_rm_dev(struct virtio_crypto *vcrypto_dev)
> +{
> +	mutex_lock(&table_lock);
> +	list_del(&vcrypto_dev->list);
> +	num_devices--;
> +	mutex_unlock(&table_lock);
> +}
> +
> +/*
> + * virtcrypto_devmgr_get_first()
> + *
> + * Function returns the first virtio crypto device from the acceleration
> + * framework.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: pointer to vcrypto_dev or NULL if not found.
> + */
> +struct virtio_crypto *virtcrypto_devmgr_get_first(void)
> +{
> +	struct virtio_crypto *dev = NULL;
> +
> +	mutex_lock(&table_lock);
> +	if (!list_empty(&virtio_crypto_table))
> +		dev = list_first_entry(&virtio_crypto_table,
> +					struct virtio_crypto,
> +				    list);
> +	mutex_unlock(&table_lock);
> +	return dev;
> +}
> +
> +/*
> + * virtcrypto_dev_in_use() - Check whether vcrypto_dev is currently in use
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 1 when device is in use, 0 otherwise.
> + */
> +int virtcrypto_dev_in_use(struct virtio_crypto *vcrypto_dev)
> +{
> +	return atomic_read(&vcrypto_dev->ref_count) != 0;
> +}
> +
> +/*
> + * virtcrypto_dev_get() - Increment vcrypto_dev reference count
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Increment the vcrypto_dev refcount. If this is the first reference
> + * taken while the vcrypto_dev is in use, increment the module refcount too.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 on success, -EFAULT on failure to bump the module refcount.
> + */
> +int virtcrypto_dev_get(struct virtio_crypto *vcrypto_dev)
> +{
> +	if (atomic_add_return(1, &vcrypto_dev->ref_count) == 1)
> +		if (!try_module_get(vcrypto_dev->owner))
> +			return -EFAULT;
> +	return 0;
> +}
> +
> +/*
> + * virtcrypto_dev_put() - Decrement vcrypto_dev reference count
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Decrement the vcrypto_dev refcount. If this drops the last reference
> + * held while the vcrypto_dev is in use, decrement the module refcount too.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_dev_put(struct virtio_crypto *vcrypto_dev)
> +{
> +	if (atomic_sub_return(1, &vcrypto_dev->ref_count) == 0)
> +		module_put(vcrypto_dev->owner);
> +}
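
One thing to double check in the get path: when try_module_get() fails,
ref_count stays incremented, and virtcrypto_get_dev_node() ignores the return
value. A userspace model of a get that rolls back, using C11 atomics; struct
dev_model, dev_get(), dev_put() and the module_pinnable flag are all invented
for illustration and only approximate the kernel primitives.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the device refcount. */
struct dev_model {
	atomic_int ref_count;
	bool module_pinnable;	/* stands in for try_module_get() succeeding */
};

static int dev_get(struct dev_model *d)
{
	/*
	 * atomic_fetch_add() returns the old value, so old == 0 means this
	 * is the first reference (atomic_add_return() == 1 in the kernel).
	 */
	if (atomic_fetch_add(&d->ref_count, 1) == 0 && !d->module_pinnable) {
		atomic_fetch_sub(&d->ref_count, 1);	/* roll back on failure */
		return -EFAULT;
	}
	return 0;
}

static void dev_put(struct dev_model *d)
{
	atomic_fetch_sub(&d->ref_count, 1);
}
```

This only models the counter bookkeeping. It does not try to close the window
in which a second caller sees a non-zero count before the rollback happens.
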
> +
> +/*
> + * virtcrypto_dev_started() - Check whether device has started
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 1 when the device has started, 0 otherwise
> + */
> +int virtcrypto_dev_started(struct virtio_crypto *vcrypto_dev)
> +{
> +	return (vcrypto_dev->status & VIRTIO_CRYPTO_S_HW_READY);
> +}
> +
> +/*
> + * virtcrypto_get_dev_node() - Get vcrypto_dev on the node.
> + * @node:  Node id the driver works on.
> + *
> + * Function returns the least-used virtio crypto device on the node.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: pointer to vcrypto_dev or NULL if not found.
> + */
> +struct virtio_crypto *virtcrypto_get_dev_node(int node)
> +{
> +	struct virtio_crypto *vcrypto_dev = NULL, *tmp_dev;
> +	unsigned long best = ~0;
> +	unsigned long ctr;
> +
> +	mutex_lock(&table_lock);
> +	list_for_each_entry(tmp_dev, virtcrypto_devmgr_get_head(), list) {
> +
> +		if ((node == dev_to_node(&tmp_dev->vdev->dev) ||
> +		     dev_to_node(&tmp_dev->vdev->dev) < 0) &&
> +		    virtcrypto_dev_started(tmp_dev)) {
> +			ctr = atomic_read(&tmp_dev->ref_count);
> +			if (best > ctr) {
> +				vcrypto_dev = tmp_dev;
> +				best = ctr;
> +			}
> +		}
> +	}
> +
> +	if (!vcrypto_dev) {
> +		pr_info("virtio_crypto: Could not find a device on node %d\n",
> +				node);
> +		/* Get any started device */
> +		list_for_each_entry(tmp_dev,
> +				virtcrypto_devmgr_get_head(), list) {
> +			if (virtcrypto_dev_started(tmp_dev)) {
> +				vcrypto_dev = tmp_dev;
> +				break;
> +			}
> +		}
> +	}
> +	mutex_unlock(&table_lock);
> +	if (!vcrypto_dev)
> +		return NULL;
> +
> +	virtcrypto_dev_get(vcrypto_dev);
> +	return vcrypto_dev;
> +}
> +
> +/*
> + * virtcrypto_dev_start() - Start virtio crypto device
> + * @vcrypto:    Pointer to virtio crypto device.
> + *
> + * Function notifies all the registered services that the virtio crypto device
> + * is ready to be used.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 on success, -EFAULT on failure to register algorithms.
> + */
> +int virtcrypto_dev_start(struct virtio_crypto *vcrypto)
> +{
> +	if (virtio_crypto_algs_register()) {
> +		pr_err("virtio_crypto: Failed to register crypto algs\n");
> +		return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * virtcrypto_dev_stop() - Stop virtio crypto device
> + * @vcrypto:    Pointer to virtio crypto device.
> + *
> + * Function notifies all the registered services that the virtio crypto device
> + * shall no longer be used.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_dev_stop(struct virtio_crypto *vcrypto)
> +{
> +	virtio_crypto_algs_unregister();
> +}
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index cd2be1c..4bdb84c 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -460,6 +460,7 @@ header-y += virtio_rng.h
>  header-y += virtio_scsi.h
>  header-y += virtio_types.h
>  header-y += virtio_vsock.h
> +header-y += virtio_crypto.h
>  header-y += vm_sockets.h
>  header-y += vt.h
>  header-y += vtpm_proxy.h
> diff --git a/include/uapi/linux/virtio_crypto.h b/include/uapi/linux/virtio_crypto.h
> new file mode 100644
> index 0000000..50cdc8a
> --- /dev/null
> +++ b/include/uapi/linux/virtio_crypto.h
> @@ -0,0 +1,450 @@
> +#ifndef _VIRTIO_CRYPTO_H
> +#define _VIRTIO_CRYPTO_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in the
> + *    documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + *    may be used to endorse or promote products derived from this software
> + *    without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
> + * FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR
> + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
> + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include <linux/types.h>
> +#include <linux/virtio_types.h>
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +
> +
> +#define VIRTIO_CRYPTO_SERVICE_CIPHER 0
> +#define VIRTIO_CRYPTO_SERVICE_HASH   1
> +#define VIRTIO_CRYPTO_SERVICE_MAC    2
> +#define VIRTIO_CRYPTO_SERVICE_AEAD   3
> +
> +#define VIRTIO_CRYPTO_OPCODE(service, op)   (((service) << 8) | (op))
> +
> +struct virtio_crypto_ctrl_header {
> +#define VIRTIO_CRYPTO_CIPHER_CREATE_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x02)
> +#define VIRTIO_CRYPTO_CIPHER_DESTROY_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x03)
> +#define VIRTIO_CRYPTO_HASH_CREATE_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x02)
> +#define VIRTIO_CRYPTO_HASH_DESTROY_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x03)
> +#define VIRTIO_CRYPTO_MAC_CREATE_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x02)
> +#define VIRTIO_CRYPTO_MAC_DESTROY_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x03)
> +#define VIRTIO_CRYPTO_AEAD_CREATE_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x02)
> +#define VIRTIO_CRYPTO_AEAD_DESTROY_SESSION \
> +	   VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x03)
> +	__le32 opcode;
> +	__le32 algo;
> +	__le32 flag;
> +	/* data virtqueue id */
> +	__le32 queue_id;
> +};
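
For anyone reading along, the opcode encoding puts the service in bits 8 and
up and the operation in the low byte. A small standalone illustration; the
macro and service numbers are copied from the hunk above, while
opcode_service() and opcode_op() are hypothetical decode helpers that the
patch does not define.

```c
#include <stdint.h>

#define VIRTIO_CRYPTO_SERVICE_CIPHER 0
#define VIRTIO_CRYPTO_SERVICE_HASH   1
#define VIRTIO_CRYPTO_SERVICE_MAC    2
#define VIRTIO_CRYPTO_SERVICE_AEAD   3

#define VIRTIO_CRYPTO_OPCODE(service, op)   (((service) << 8) | (op))

/* Hypothetical helpers: split an opcode back into its two fields. */
static inline uint32_t opcode_service(uint32_t opcode)
{
	return opcode >> 8;
}

static inline uint32_t opcode_op(uint32_t opcode)
{
	return opcode & 0xff;
}
```

With these definitions VIRTIO_CRYPTO_CIPHER_CREATE_SESSION works out to
0x0002, VIRTIO_CRYPTO_HASH_CREATE_SESSION to 0x0102 and
VIRTIO_CRYPTO_AEAD_DESTROY_SESSION to 0x0303.
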
> +
> +struct virtio_crypto_cipher_session_para {
> +#define VIRTIO_CRYPTO_NO_CIPHER                 0
> +#define VIRTIO_CRYPTO_CIPHER_ARC4               1
> +#define VIRTIO_CRYPTO_CIPHER_AES_ECB            2
> +#define VIRTIO_CRYPTO_CIPHER_AES_CBC            3
> +#define VIRTIO_CRYPTO_CIPHER_AES_CTR            4
> +#define VIRTIO_CRYPTO_CIPHER_DES_ECB            5
> +#define VIRTIO_CRYPTO_CIPHER_DES_CBC            6
> +#define VIRTIO_CRYPTO_CIPHER_3DES_ECB           7
> +#define VIRTIO_CRYPTO_CIPHER_3DES_CBC           8
> +#define VIRTIO_CRYPTO_CIPHER_3DES_CTR           9
> +#define VIRTIO_CRYPTO_CIPHER_KASUMI_F8          10
> +#define VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2        11
> +#define VIRTIO_CRYPTO_CIPHER_AES_F8             12
> +#define VIRTIO_CRYPTO_CIPHER_AES_XTS            13
> +#define VIRTIO_CRYPTO_CIPHER_ZUC_EEA3           14
> +	__le32 algo;
> +	/* length of key */
> +	__le32 keylen;
> +
> +#define VIRTIO_CRYPTO_OP_ENCRYPT  1
> +#define VIRTIO_CRYPTO_OP_DECRYPT  2
> +	/* encrypt or decrypt */
> +	__le32 op;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_session_input {
> +	/* Device-writable part */
> +	__le64 session_id;
> +	__le32 status;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_cipher_session_req {
> +	struct virtio_crypto_cipher_session_para para;
> +	__u8 padding[32];
> +};
> +
> +struct virtio_crypto_hash_session_para {
> +#define VIRTIO_CRYPTO_NO_HASH            0
> +#define VIRTIO_CRYPTO_HASH_MD5           1
> +#define VIRTIO_CRYPTO_HASH_SHA1          2
> +#define VIRTIO_CRYPTO_HASH_SHA_224       3
> +#define VIRTIO_CRYPTO_HASH_SHA_256       4
> +#define VIRTIO_CRYPTO_HASH_SHA_384       5
> +#define VIRTIO_CRYPTO_HASH_SHA_512       6
> +#define VIRTIO_CRYPTO_HASH_SHA3_224      7
> +#define VIRTIO_CRYPTO_HASH_SHA3_256      8
> +#define VIRTIO_CRYPTO_HASH_SHA3_384      9
> +#define VIRTIO_CRYPTO_HASH_SHA3_512      10
> +#define VIRTIO_CRYPTO_HASH_SHA3_SHAKE128      11
> +#define VIRTIO_CRYPTO_HASH_SHA3_SHAKE256      12
> +	__le32 algo;
> +	/* hash result length */
> +	__le32 hash_result_len;
> +	__u8 padding[8];
> +};
> +
> +struct virtio_crypto_hash_create_session_req {
> +	struct virtio_crypto_hash_session_para para;
> +	__u8 padding[40];
> +};
> +
> +struct virtio_crypto_mac_session_para {
> +#define VIRTIO_CRYPTO_NO_MAC                       0
> +#define VIRTIO_CRYPTO_MAC_HMAC_MD5                 1
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA1                2
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_224             3
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_256             4
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_384             5
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_512             6
> +#define VIRTIO_CRYPTO_MAC_CMAC_3DES                25
> +#define VIRTIO_CRYPTO_MAC_CMAC_AES                 26
> +#define VIRTIO_CRYPTO_MAC_KASUMI_F9                27
> +#define VIRTIO_CRYPTO_MAC_SNOW3G_UIA2              28
> +#define VIRTIO_CRYPTO_MAC_GMAC_AES                 41
> +#define VIRTIO_CRYPTO_MAC_GMAC_TWOFISH             42
> +#define VIRTIO_CRYPTO_MAC_CBCMAC_AES               49
> +#define VIRTIO_CRYPTO_MAC_CBCMAC_KASUMI_F9         50
> +#define VIRTIO_CRYPTO_MAC_XCBC_AES                 53
> +	__le32 algo;
> +	/* hash result length */
> +	__le32 hash_result_len;
> +	/* length of authenticated key */
> +	__le32 auth_key_len;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_mac_create_session_req {
> +	struct virtio_crypto_mac_session_para para;
> +	__u8 padding[40];
> +};
> +
> +struct virtio_crypto_aead_session_para {
> +#define VIRTIO_CRYPTO_NO_AEAD     0
> +#define VIRTIO_CRYPTO_AEAD_GCM    1
> +#define VIRTIO_CRYPTO_AEAD_CCM    2
> +#define VIRTIO_CRYPTO_AEAD_CHACHA20_POLY1305  3
> +	__le32 algo;
> +	/* length of key */
> +	__le32 key_len;
> +	/* hash result length */
> +	__le32 hash_result_len;
> +	/* length of the additional authenticated data (AAD) in bytes */
> +	__le32 aad_len;
> +	/* encrypt or decrypt, See above VIRTIO_CRYPTO_OP_* */
> +	__le32 op;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_aead_create_session_req {
> +	struct virtio_crypto_aead_session_para para;
> +	__u8 padding[32];
> +};
> +
> +struct virtio_crypto_alg_chain_session_para {
> +#define VIRTIO_CRYPTO_SYM_ALG_CHAIN_ORDER_HASH_THEN_CIPHER  1
> +#define VIRTIO_CRYPTO_SYM_ALG_CHAIN_ORDER_CIPHER_THEN_HASH  2
> +	__le32 alg_chain_order;
> +/* Plain hash */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_PLAIN    1
> +/* Authenticated hash (mac) */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_AUTH     2
> +/* Nested hash */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_NESTED   3
> +	__le32 hash_mode;
> +	struct virtio_crypto_cipher_session_para cipher_param;
> +	union {
> +		struct virtio_crypto_hash_session_para hash_param;
> +		struct virtio_crypto_mac_session_para mac_param;
> +		__u8 padding[16];
> +	} u;
> +	/* length of the additional authenticated data (AAD) in bytes */
> +	__le32 aad_len;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_alg_chain_session_req {
> +	struct virtio_crypto_alg_chain_session_para para;
> +};
> +
> +struct virtio_crypto_sym_create_session_req {
> +	union {
> +		struct virtio_crypto_cipher_session_req cipher;
> +		struct virtio_crypto_alg_chain_session_req chain;
> +		__u8 padding[48];
> +	} u;
> +
> +	/* Device-readable part */
> +
> +/* No operation */
> +#define VIRTIO_CRYPTO_SYM_OP_NONE  0
> +/* Cipher only operation on the data */
> +#define VIRTIO_CRYPTO_SYM_OP_CIPHER  1
> +/*
> + * Chain any cipher with any hash or mac operation. The order
> + * depends on the value of alg_chain_order param
> + */
> +#define VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING  2
> +	__le32 op_type;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_destroy_session_req {
> +	/* Device-readable part */
> +	__le64  session_id;
> +	__u8 padding[48];
> +};
> +
> +/* The request of the control virtqueue's packet */
> +struct virtio_crypto_op_ctrl_req {
> +	struct virtio_crypto_ctrl_header header;
> +
> +	union {
> +		struct virtio_crypto_sym_create_session_req
> +			sym_create_session;
> +		struct virtio_crypto_hash_create_session_req
> +			hash_create_session;
> +		struct virtio_crypto_mac_create_session_req
> +			mac_create_session;
> +		struct virtio_crypto_aead_create_session_req
> +			aead_create_session;
> +		struct virtio_crypto_destroy_session_req
> +			destroy_session;
> +		__u8 padding[56];
> +	} u;
> +};
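
Since this is guest/host ABI, it may be worth pinning the struct sizes at
build time. A userspace sketch of the idea, mirroring three of the structs
with uint*_t in place of the __le* and __u8 types; the short struct names are
mine, and in the driver itself BUILD_BUG_ON() would be the usual tool.

```c
#include <stdint.h>

/* Mirror of virtio_crypto_ctrl_header: four 32-bit fields. */
struct ctrl_header {
	uint32_t opcode;
	uint32_t algo;
	uint32_t flag;
	uint32_t queue_id;
};

/* Mirror of virtio_crypto_destroy_session_req. */
struct destroy_session_req {
	uint64_t session_id;
	uint8_t padding[48];
};

/* Reduced mirror of virtio_crypto_op_ctrl_req: header plus 56-byte union. */
struct op_ctrl_req {
	struct ctrl_header header;
	union {
		struct destroy_session_req destroy_session;
		uint8_t padding[56];
	} u;
};

/* Compile-time layout checks: any accidental size change breaks the build. */
_Static_assert(sizeof(struct ctrl_header) == 16, "ctrl header is 16 bytes");
_Static_assert(sizeof(struct destroy_session_req) == 56, "8 + 48 bytes");
_Static_assert(sizeof(struct op_ctrl_req) == 72, "16-byte header + 56-byte union");
```

The same kind of check would apply to the other union members, each of which
looks like it is padded out to 56 bytes.
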
> +
> +struct virtio_crypto_op_header {
> +#define VIRTIO_CRYPTO_CIPHER_ENCRYPT \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x00)
> +#define VIRTIO_CRYPTO_CIPHER_DECRYPT \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x01)
> +#define VIRTIO_CRYPTO_HASH \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x00)
> +#define VIRTIO_CRYPTO_MAC \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x00)
> +#define VIRTIO_CRYPTO_AEAD_ENCRYPT \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x00)
> +#define VIRTIO_CRYPTO_AEAD_DECRYPT \
> +	VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x01)
> +	__le32 opcode;
> +	/* algo should be service-specific algorithms */
> +	__le32 algo;
> +	/* session_id should be service-specific algorithms */
> +	__le64 session_id;
> +	/* control flag to control the request */
> +	__le32 flag;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_cipher_para {
> +	/*
> +	 * Byte Length of valid IV/Counter
> +	 *
> +	 * For block ciphers in CBC or F8 mode, or for Kasumi in F8 mode, or for
> +	 *   SNOW3G in UEA2 mode, this is the length of the IV (which
> +	 *   must be the same as the block length of the cipher).
> +	 * For block ciphers in CTR mode, this is the length of the counter
> +	 *   (which must be the same as the block length of the cipher).
> +	 * For AES-XTS, this is the 128bit tweak, i, from IEEE Std 1619-2007.
> +	 *
> +	 * The IV/Counter will be updated after every partial cryptographic
> +	 * operation.
> +	 */
> +	__le32 iv_len;
> +	/* length of source data */
> +	__le32 src_data_len;
> +	/* length of dst data */
> +	__le32 dst_data_len;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_hash_para {
> +	/* length of source data */
> +	__le32 src_data_len;
> +	/* hash result length */
> +	__le32 hash_result_len;
> +};
> +
> +struct virtio_crypto_mac_para {
> +	struct virtio_crypto_hash_para hash;
> +};
> +
> +struct virtio_crypto_aead_para {
> +	/*
> +	 * Byte Length of valid IV data pointed to by the below iv_addr
> +	 * parameter.
> +	 *
> +	 * For GCM mode, this is either 12 (for 96-bit IVs) or 16, in which
> +	 *   case iv_addr points to J0.
> +	 * For CCM mode, this is the length of the nonce, which can be in the
> +	 *   range 7 to 13 inclusive.
> +	 */
> +	__le32 iv_len;
> +	/* length of additional auth data */
> +	__le32 aad_len;
> +	/* length of source data */
> +	__le32 src_data_len;
> +	/* length of dst data */
> +	__le32 dst_data_len;
> +};
> +
> +struct virtio_crypto_cipher_data_req {
> +	/* Device-readable part */
> +	struct virtio_crypto_cipher_para para;
> +	__u8 padding[24];
> +};
> +
> +struct virtio_crypto_hash_data_req {
> +	/* Device-readable part */
> +	struct virtio_crypto_hash_para para;
> +	__u8 padding[40];
> +};
> +
> +struct virtio_crypto_mac_data_req {
> +	/* Device-readable part */
> +	struct virtio_crypto_mac_para para;
> +	__u8 padding[40];
> +};
> +
> +struct virtio_crypto_alg_chain_data_para {
> +	__le32 iv_len;
> +	/* Length of source data */
> +	__le32 src_data_len;
> +	/* Length of destination data */
> +	__le32 dst_data_len;
> +	/* Starting point for cipher processing in source data */
> +	__le32 cipher_start_src_offset;
> +	/* Length of the source data that the cipher will be computed on */
> +	__le32 len_to_cipher;
> +	/* Starting point for hash processing in source data */
> +	__le32 hash_start_src_offset;
> +	/* Length of the source data that the hash will be computed on */
> +	__le32 len_to_hash;
> +	/* Length of the additional auth data */
> +	__le32 aad_len;
> +	/* Length of the hash result */
> +	__le32 hash_result_len;
> +	__le32 reserved;
> +};
> +
> +struct virtio_crypto_alg_chain_data_req {
> +	/* Device-readable part */
> +	struct virtio_crypto_alg_chain_data_para para;
> +};
> +
> +struct virtio_crypto_sym_data_req {
> +	union {
> +		struct virtio_crypto_cipher_data_req cipher;
> +		struct virtio_crypto_alg_chain_data_req chain;
> +		__u8 padding[40];
> +	} u;
> +
> +	/* See above VIRTIO_CRYPTO_SYM_OP_* */
> +	__le32 op_type;
> +	__le32 padding;
> +};
> +
> +struct virtio_crypto_aead_data_req {
> +	/* Device-readable part */
> +	struct virtio_crypto_aead_para para;
> +	__u8 padding[32];
> +};
> +
> +/* The request of the data virtqueue's packet */
> +struct virtio_crypto_op_data_req {
> +	struct virtio_crypto_op_header header;
> +
> +	union {
> +		struct virtio_crypto_sym_data_req  sym_req;
> +		struct virtio_crypto_hash_data_req hash_req;
> +		struct virtio_crypto_mac_data_req mac_req;
> +		struct virtio_crypto_aead_data_req aead_req;
> +		__u8 padding[48];
> +	} u;
> +};
> +
> +#define VIRTIO_CRYPTO_OK        0
> +#define VIRTIO_CRYPTO_ERR       1
> +#define VIRTIO_CRYPTO_BADMSG    2
> +#define VIRTIO_CRYPTO_NOTSUPP   3
> +#define VIRTIO_CRYPTO_INVSESS   4 /* Invalid session id */
> +
> +/* The accelerator hardware is ready */
> +#define VIRTIO_CRYPTO_S_HW_READY  (1 << 0)
> +
> +struct virtio_crypto_config {
> +	/* See VIRTIO_CRYPTO_S_* above */
> +	__u32  status;
> +
> +	/*
> +	 * Maximum number of data queues
> +	 */
> +	__u32  max_dataqueues;
> +
> +	/*
> +	 * Specifies the services mask which the device supports,
> +	 * see VIRTIO_CRYPTO_SERVICE_* above
> +	 */
> +	__u32 crypto_services;
> +
> +	/* Detailed algorithms mask */
> +	__u32 cipher_algo_l;
> +	__u32 cipher_algo_h;
> +	__u32 hash_algo;
> +	__u32 mac_algo_l;
> +	__u32 mac_algo_h;
> +	__u32 aead_algo;
> +	/* Maximum length of cipher key */
> +	__u32 max_cipher_key_len;
> +	/* Maximum length of authenticated key */
> +	__u32 max_auth_key_len;
> +	__u32 reserve;
> +	/* Maximum size of each crypto request's content */
> +	__u64 max_size;
> +};
> +
> +struct virtio_crypto_inhdr {
> +	/* See VIRTIO_CRYPTO_* above */
> +	__u8 status;
> +};
> +#endif
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 3228d58..6d5c3b2 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -42,5 +42,6 @@
>  #define VIRTIO_ID_GPU          16 /* virtio GPU */
>  #define VIRTIO_ID_INPUT        18 /* virtio input */
>  #define VIRTIO_ID_VSOCK        19 /* virtio vsock transport */
> +#define VIRTIO_ID_CRYPTO       20 /* virtio crypto */
> 
>  #endif /* _LINUX_VIRTIO_IDS_H */
> --
> 1.8.3.1
> 

* Re: [PATCH v6 2/2] crypto: add virtio-crypto driver
From: Herbert Xu @ 2016-12-12 10:54 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: virtio-dev@lists.oasis-open.org, Huangweidong (C),
	Claudio Fontana, mst@redhat.com, qemu-devel@nongnu.org,
	Hanweidong (Randy), Luonengjun, linux-kernel@vger.kernel.org,
	Wanzongshun (Vincent), virtualization@lists.linux-foundation.org,
	Xuquan (Quan Xu), linux-crypto@vger.kernel.org,
	stefanha@redhat.com, Zhoujian (jay, Euler), longpeng,
	davem@davemloft.net, Wubin (H), "arei.gonglei@hotmail.co
In-Reply-To: <33183CC9F5247A488A2544077AF19020DA15A07C@DGGEMA505-MBX.china.huawei.com>

On Mon, Dec 12, 2016 at 06:25:12AM +0000, Gonglei (Arei) wrote:
> Hi, Michael & Herbert
> 
> Because the virtio-crypto device emulation had been in QEMU 2.8,
> would you please merge the virtio-crypto driver for 4.10 if no other
> comments? If so, Michael, please ack and/or review the patch, then
> Herbert will take it (I asked him last week). Thank you!
> 
> Ps: Note on 4.10 merge window timing from Linus
>  https://lkml.org/lkml/2016/12/7/506
> 
> Dec 23rd is the deadline for 4.10 merge window.

Sorry but it's too late for 4.10.  It needed to have been in my
tree before the merge window opened to make it for this cycle.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* [RFC PATCH 1/3] crypto: zip - Add ThunderX ZIP driver core
From: Jan Glauber @ 2016-12-12 15:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, linux-kernel, David S . Miller, Mahipal Challa,
	Vishnu Nair, Jan Glauber
In-Reply-To: <20161212150439.18627-1-jglauber@cavium.com>

From: Mahipal Challa <Mahipal.Challa@cavium.com>

Add a driver for the ZIP engine found on Cavium ThunderX SOCs.
The ZIP engine supports hardware accelerated compression and
decompression. It includes 2 independent ZIP cores and supports:

- DEFLATE compression and decompression (RFC 1951)
- LZS compression and decompression (RFC 2395 and ANSI X3.241-1994)
- ADLER32 and CRC32 checksums for ZLIB (RFC 1950) and GZIP (RFC 1952)

The ZIP engine is presented as a PCI device. It supports DMA and
scatter-gather.

Signed-off-by: Mahipal Challa <Mahipal.Challa@cavium.com>
Signed-off-by: Vishnu Nair <Vishnu.Nair@cavium.com>
Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
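For readers following the queue handling in zip_load_instr() in this
patch, the wrap-around rule can be sketched in standalone user-space C.
This is only an illustration under stated assumptions, not the driver
code: QBUF_SIZE, struct cmd_queue, queue_init(), queue_consumed() and
queue_push() are hypothetical names, the buffer holds just four 128-byte
slots, and the next-chunk word stores a virtual address where the driver
stores a physical one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INSTR_WORDS	16	/* 16 64-bit words = 128 bytes per instruction */
#define INSTR_BYTES	(INSTR_WORDS * sizeof(uint64_t))
/* Hypothetical size: four instruction slots plus one next-chunk word */
#define QBUF_SIZE	(4 * INSTR_BYTES + 8)
#define QBUF_WORDS	(QBUF_SIZE / sizeof(uint64_t))

struct cmd_queue {
	uint64_t buf[QBUF_WORDS];
	uint64_t *sw_head;	/* next free instruction slot */
	uint64_t *sw_tail;	/* start of the chunk buffer */
	unsigned int wraps;	/* times the queue wrapped (sketch only) */
};

static void queue_init(struct cmd_queue *q)
{
	memset(q, 0, sizeof(*q));
	q->sw_head = q->buf;
	q->sw_tail = q->buf;
}

/* Bytes between sw_tail and sw_head; the pointers count 64-bit words */
static size_t queue_consumed(const struct cmd_queue *q)
{
	return (size_t)(q->sw_head - q->sw_tail) * sizeof(uint64_t);
}

static void queue_push(struct cmd_queue *q, const uint64_t *instr)
{
	/* True when exactly one instruction slot is left in the buffer */
	int last_slot = (queue_consumed(q) + INSTR_BYTES) == (QBUF_SIZE - 8);

	/* There must be room for the instruction and the next-chunk word */
	assert(q->sw_head + INSTR_WORDS < q->buf + QBUF_WORDS);

	memcpy(q->sw_head, instr, INSTR_BYTES);
	q->sw_head += INSTR_WORDS;

	if (last_slot) {
		/* Point the next-chunk word back at the start and wrap */
		*q->sw_head = (uint64_t)(uintptr_t)q->sw_tail;
		q->sw_head = q->sw_tail;
		q->wraps++;
	}
}
```

Starting from an empty queue, the fourth queue_push() fills the last
slot, writes the next-chunk word and moves sw_head back to sw_tail, so
queue_consumed() returns 0 again.
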
 drivers/crypto/Kconfig                 |    7 +
 drivers/crypto/Makefile                |    1 +
 drivers/crypto/cavium/Makefile         |    4 +
 drivers/crypto/cavium/zip/Makefile     |    8 +
 drivers/crypto/cavium/zip/common.h     |  258 +++++++
 drivers/crypto/cavium/zip/zip_crypto.h |   61 ++
 drivers/crypto/cavium/zip/zip_device.c |  208 +++++
 drivers/crypto/cavium/zip/zip_device.h |  138 ++++
 drivers/crypto/cavium/zip/zip_main.c   |  500 ++++++++++++
 drivers/crypto/cavium/zip/zip_main.h   |  126 +++
 drivers/crypto/cavium/zip/zip_mem.c    |  120 +++
 drivers/crypto/cavium/zip/zip_mem.h    |   78 ++
 drivers/crypto/cavium/zip/zip_regs.h   | 1326 ++++++++++++++++++++++++++++++++
 13 files changed, 2835 insertions(+)
 create mode 100644 drivers/crypto/cavium/Makefile
 create mode 100644 drivers/crypto/cavium/zip/Makefile
 create mode 100644 drivers/crypto/cavium/zip/common.h
 create mode 100644 drivers/crypto/cavium/zip/zip_crypto.h
 create mode 100644 drivers/crypto/cavium/zip/zip_device.c
 create mode 100644 drivers/crypto/cavium/zip/zip_device.h
 create mode 100644 drivers/crypto/cavium/zip/zip_main.c
 create mode 100644 drivers/crypto/cavium/zip/zip_main.h
 create mode 100644 drivers/crypto/cavium/zip/zip_mem.c
 create mode 100644 drivers/crypto/cavium/zip/zip_mem.h
 create mode 100644 drivers/crypto/cavium/zip/zip_regs.h

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 4d2b81f..da48d93 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -485,6 +485,13 @@ config CRYPTO_DEV_MXS_DCP
 
 source "drivers/crypto/qat/Kconfig"
 
+config CRYPTO_DEV_CAVIUM_ZIP
+	tristate "Cavium ZIP driver"
+	depends on PCI && 64BIT && (ARM64 || COMPILE_TEST)
+	---help---
+	  Select this option if you want to enable compression/decompression
+	  acceleration on Cavium's ARM-based SoCs.
+
 config CRYPTO_DEV_QCE
 	tristate "Qualcomm crypto engine accelerator"
 	depends on (ARCH_QCOM || COMPILE_TEST) && HAS_DMA && HAS_IOMEM
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index ad7250f..3d152d4 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_CRYPTO_DEV_MXC_SCC) += mxc-scc.o
 obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
 obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
 obj-$(CONFIG_CRYPTO_DEV_QAT) += qat/
+obj-$(CONFIG_CRYPTO_DEV_CAVIUM_ZIP) += cavium/
 obj-$(CONFIG_CRYPTO_DEV_QCE) += qce/
 obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
 obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
diff --git a/drivers/crypto/cavium/Makefile b/drivers/crypto/cavium/Makefile
new file mode 100644
index 0000000..641268b
--- /dev/null
+++ b/drivers/crypto/cavium/Makefile
@@ -0,0 +1,4 @@
+#
+# Makefile for Cavium crypto device drivers
+#
+obj-$(CONFIG_CRYPTO_DEV_CAVIUM_ZIP) += zip/
diff --git a/drivers/crypto/cavium/zip/Makefile b/drivers/crypto/cavium/zip/Makefile
new file mode 100644
index 0000000..2c07508
--- /dev/null
+++ b/drivers/crypto/cavium/zip/Makefile
@@ -0,0 +1,8 @@
+#
+# Makefile for Cavium's ZIP Driver.
+#
+
+obj-$(CONFIG_CRYPTO_DEV_CAVIUM_ZIP) += thunderx_zip.o
+thunderx_zip-y := zip_main.o    \
+                  zip_device.o  \
+                  zip_mem.o
diff --git a/drivers/crypto/cavium/zip/common.h b/drivers/crypto/cavium/zip/common.h
new file mode 100644
index 0000000..f0694f4
--- /dev/null
+++ b/drivers/crypto/cavium/zip/common.h
@@ -0,0 +1,258 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __COMMON_H__
+#define __COMMON_H__
+
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
+#include <linux/types.h>
+#include <linux/version.h>
+
+/* Device specific zlib function definitions */
+#include "zip_device.h"
+
+/* ZIP device definitions */
+#include "zip_main.h"
+
+/* ZIP memory allocation/deallocation related definitions */
+#include "zip_mem.h"
+
+/* Device specific structure definitions */
+#include "zip_regs.h"
+
+#define ZIP_ERROR    -1
+
+#define ZIP_FLUSH_FINISH  4
+
+#define RAW_FORMAT		0  /* for rawpipe */
+#define ZLIB_FORMAT		1  /* for zpipe */
+#define GZIP_FORMAT		2  /* for gzpipe */
+#define LZS_FORMAT		3  /* for lzspipe */
+
+/* Max number of ZIP devices supported */
+#define MAX_ZIP_DEVICES		2
+
+/* Configures the number of zip queues to be used */
+#define ZIP_NUM_QUEUES		2
+
+#define DYNAMIC_STOP_EXCESS	1024
+
+/* Maximum buffer sizes in direct mode */
+#define MAX_INPUT_BUFFER_SIZE   ((64 * 1024) - 1)
+#define MAX_OUTPUT_BUFFER_SIZE  ((64 * 1024) - 1)
+
+/* ZIP invocation result completion status codes */
+#define ZIP_NOTDONE		0x0
+
+/* Successful completion. */
+#define ZIP_SUCCESS		0x1
+
+/* Output truncated */
+#define ZIP_DTRUNC		0x2
+
+/* Dynamic Stop */
+#define ZIP_DYNAMIC_STOP	0x3
+
+/* Uncompress ran out of input data when IWORD0[EF] was set */
+#define ZIP_ITRUNC		0x4
+
+/* Uncompress found the reserved block type 3 */
+#define ZIP_RBLOCK		0x5
+
+/* Uncompress found LEN != ZIP_NLEN in an uncompressed block in the input */
+#define ZIP_NLEN		0x6
+
+/* Uncompress found a bad code in the main Huffman codes. */
+#define ZIP_BADCODE		0x7
+
+/* Uncompress found a bad code in the 19 Huffman codes encoding lengths. */
+#define ZIP_BADCODE2	        0x8
+
+/* Compress found a zero-length input. */
+#define ZIP_ZERO_LEN	        0x9
+
+/* The compress or decompress encountered an internal parity error. */
+#define ZIP_PARITY		0xA
+
+/*
+ * Uncompress found a string identifier that precedes the uncompressed data and
+ * decompression history.
+ */
+#define ZIP_FATAL		0xB
+
+/**
+ * struct zip_operation - common data structure for comp and decomp operations
+ * @input:               Next input byte is read from here
+ * @output:              Next output byte written here
+ * @ctx_addr:            Inflate context buffer address
+ * @history:             Pointer to the history buffer
+ * @input_len:           Number of bytes available at next_in
+ * @input_total_len:     Total number of input bytes read
+ * @output_len:          Remaining free space at next_out
+ * @output_total_len:    Total number of bytes output so far
+ * @csum:                Checksum value of the uncompressed data
+ * @flush:               Flush flag
+ * @format:              Format (depends on stream's wrap)
+ * @speed:               Speed depends on stream's level
+ * @ccode:               Compression code (stream's strategy)
+ * @lzs_flag:            Flag for LZS support
+ * @begin_file:          Beginning of file indication for inflate
+ * @history_len:         Size of the history data
+ * @end_file:            Ending of the file indication for inflate
+ * @compcode:            Completion status of the ZIP invocation
+ * @bytes_read:          Input bytes read in current instruction
+ * @bits_processed:      Total bits processed for entire file
+ * @sizeofptr:           To distinguish between ILP32 and LP64
+ * @sizeofzops:          Optional just for padding
+ *
+ * This structure is used to maintain the required meta data for the
+ * comp and decomp operations.
+ */
+struct zip_operation {
+	u8    *input;
+	u8    *output;
+	u64   ctx_addr;
+	u64   history;
+
+	u32   input_len;
+	u32   input_total_len;
+
+	u32   output_len;
+	u32   output_total_len;
+
+	u32   csum;
+	u32   flush;
+
+	u32   format;
+	u32   speed;
+	u32   ccode;
+	u32   lzs_flag;
+
+	u32   begin_file;
+	u32   history_len;
+
+	u32   end_file;
+	u32   compcode;
+	u32   bytes_read;
+	u32   bits_processed;
+
+	u32   sizeofptr;
+	u32   sizeofzops;
+};
+
+/* error messages */
+#define zip_err(fmt, args...) pr_err("ZIP ERR:%s():%d: " \
+			      fmt "\n", __func__, __LINE__, ## args)
+
+#ifdef MSG_ENABLE
+/* Enable all messages */
+#define zip_msg(fmt, args...) pr_info("ZIP_MSG:" fmt "\n", ## args)
+#else
+#define zip_msg(fmt, args...)
+#endif
+
+#if defined(ZIP_DEBUG_ENABLE) && defined(MSG_ENABLE)
+
+#ifdef DEBUG_LEVEL
+
+#define FILE_NAME (strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : \
+	strrchr(__FILE__, '\\') ? strrchr(__FILE__, '\\') + 1 : __FILE__)
+
+#if DEBUG_LEVEL >= 4
+
+#define zip_dbg(fmt, args...) pr_info("ZIP DBG: %s: %s() : %d: " \
+			      fmt "\n", FILE_NAME, __func__, __LINE__, ## args)
+
+#define zip_dbg_enter(fmt, args...) pr_info("ZIP_DBG: %s() in %s" \
+				    fmt "\n", __func__, FILE_NAME, ## args)
+
+#define zip_dbg_exit(fmt, args...) pr_info("ZIP_DBG:Exit %s() in %s" \
+				   fmt "\n", __func__, FILE_NAME, ## args)
+
+#elif DEBUG_LEVEL >= 3
+
+#define zip_dbg(fmt, args...) pr_info("ZIP DBG: %s: %s() : %d: " \
+			      fmt "\n", FILE_NAME, __func__, __LINE__, ## args)
+
+#elif DEBUG_LEVEL >= 2
+
+#define zip_dbg(fmt, args...) pr_info("ZIP DBG: %s() : %d: " \
+			      fmt "\n", __func__, __LINE__, ## args)
+
+#else
+
+#define zip_dbg(fmt, args...) pr_info("ZIP DBG:" fmt "\n", ## args)
+
+#endif /* DEBUG LEVEL >= */
+
+#if DEBUG_LEVEL <= 3
+
+#define zip_dbg_enter(fmt, args...)
+#define zip_dbg_exit(fmt, args...)
+
+#endif /* DEBUG_LEVEL <= 3 */
+#else
+
+#define zip_dbg(fmt, args...) pr_info("ZIP DBG:" fmt "\n", ## args)
+
+#define zip_dbg_enter(fmt, args...)
+#define zip_dbg_exit(fmt, args...)
+
+#endif /* DEBUG_LEVEL */
+#else
+
+#define zip_dbg(fmt, args...)
+#define zip_dbg_enter(fmt, args...)
+#define zip_dbg_exit(fmt, args...)
+
+#endif /* ZIP_DEBUG_ENABLE */
+
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_crypto.h b/drivers/crypto/cavium/zip/zip_crypto.h
new file mode 100644
index 0000000..1215049
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_crypto.h
@@ -0,0 +1,61 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_CRYPTO_H__
+#define __ZIP_CRYPTO_H__
+
+#include <linux/crypto.h>
+#include "common.h"
+
+struct zip_kernel_ctx {
+	struct zip_operation zip_comp;
+	struct zip_operation zip_decomp;
+};
+
+int  zip_alloc_zip_ctx(struct crypto_tfm *tfm);
+int  zip_alloc_lzs_ctx(struct crypto_tfm *tfm);
+void zip_free_zip_ctx(struct crypto_tfm *tfm);
+
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_device.c b/drivers/crypto/cavium/zip/zip_device.c
new file mode 100644
index 0000000..ed21c5a
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_device.c
@@ -0,0 +1,208 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include "common.h"
+
+/**
+ * zip_cmd_queue_consumed - Calculates the space consumed in the command queue.
+ *
+ * @zip_dev: Pointer to zip device structure
+ * @queue:   Queue number
+ *
+ * Return: Bytes consumed in the command queue buffer.
+ */
+static inline u32 zip_cmd_queue_consumed(struct zip_device *zip_dev, int queue)
+{
+	return ((zip_dev->iq[queue].sw_head - zip_dev->iq[queue].sw_tail) *
+		sizeof(u64));
+}
+
+/**
+ * zip_load_instr - Submits the instruction into the ZIP command queue
+ * @instr:      Pointer to the instruction to be submitted
+ * @zip_dev:    Pointer to ZIP device structure to which the instruction is to
+ *              be submitted
+ *
+ * This function copies the ZIP instruction to the command queue and rings the
+ * doorbell to notify the engine of the instruction submission. The command
+ * queue is maintained in a circular fashion. When there is space for exactly
+ * one instruction in the queue, next chunk pointer of the queue is made to
+ * point to the head of the queue, thus maintaining a circular queue.
+ *
+ * Return: Queue number to which the instruction was submitted
+ */
+u32 zip_load_instr(union zip_inst_s *instr,
+		   struct zip_device *zip_dev)
+{
+	union zip_quex_doorbell dbell;
+	u32 queue = 0;
+	u32 consumed = 0;
+	u64 *ncb_ptr = NULL;
+	union zip_nptr_s ncp;
+
+	/*
+	 * Distribute the instructions between the enabled queues based on
+	 * the CPU id.
+	 */
+	if (raw_smp_processor_id() % 2 == 0)
+		queue = 0;
+	else
+		queue = 1;
+
+	zip_dbg("CPU Core: %d Queue number:%d", raw_smp_processor_id(), queue);
+
+	/* Take cmd buffer lock */
+	spin_lock(&zip_dev->iq[queue].lock);
+
+	/*
+	 * Command Queue implementation
+	 * 1. If there is place for new instructions, push the cmd at sw_head.
+	 * 2. If there is place for exactly one instruction, push the new cmd
+	 *    at the sw_head. Make sw_head point to the sw_tail to make it
+	 *    circular. Write sw_head's physical address to the "Next-Chunk
+	 *    Buffer Ptr" to make it cmd_hw_tail.
+	 * 3. Ring the door bell.
+	 */
+	zip_dbg("sw_head : %p", zip_dev->iq[queue].sw_head);
+	zip_dbg("sw_tail : %p", zip_dev->iq[queue].sw_tail);
+
+	consumed = zip_cmd_queue_consumed(zip_dev, queue);
+	/* Check if there is space to push just one cmd */
+	if ((consumed + 128) == (ZIP_CMD_QBUF_SIZE - 8)) {
+		zip_dbg("Cmd queue space available for single command");
+		/* Space for one cmd, push it and make it circular queue */
+		memcpy((u8 *)zip_dev->iq[queue].sw_head, (u8 *)instr,
+		       sizeof(union zip_inst_s));
+		zip_dev->iq[queue].sw_head += 16; /* 16 64_bit words = 128B */
+
+		/* Now, point the "Next-Chunk Buffer Ptr" to sw_head */
+		ncb_ptr = zip_dev->iq[queue].sw_head;
+
+		zip_dbg("ncb addr :0x%p sw_head addr :0x%p",
+			ncb_ptr, zip_dev->iq[queue].sw_head - 16);
+
+		/* Using Circular command queue */
+		zip_dev->iq[queue].sw_head = zip_dev->iq[queue].sw_tail;
+		/* Mark this buffer for free */
+		zip_dev->iq[queue].free_flag = 1;
+
+		/* Write new chunk buffer address at "Next-Chunk Buffer Ptr" */
+		ncp.u_reg64 = 0ull;
+		ncp.s.addr = __pa(zip_dev->iq[queue].sw_head);
+		*ncb_ptr = ncp.u_reg64;
+		zip_dbg("*ncb_ptr :0x%lx sw_head[phys] :0x%lx",
+			*ncb_ptr, __pa(zip_dev->iq[queue].sw_head));
+
+		zip_dev->iq[queue].pend_cnt++;
+
+	} else {
+		zip_dbg("Enough space is available for commands");
+		/* Push this cmd to cmd queue buffer */
+		memcpy((u8 *)zip_dev->iq[queue].sw_head, (u8 *)instr,
+		       sizeof(union zip_inst_s));
+		zip_dev->iq[queue].sw_head += 16; /* 16 64_bit words = 128B */
+
+		zip_dev->iq[queue].pend_cnt++;
+	}
+	zip_dbg("sw_head :0x%p sw_tail :0x%p hw_tail :0x%p",
+		zip_dev->iq[queue].sw_head, zip_dev->iq[queue].sw_tail,
+		zip_dev->iq[queue].hw_tail);
+
+	zip_dbg(" Pushed the new cmd : pend_cnt : %d",
+		zip_dev->iq[queue].pend_cnt);
+
+	/* Ring the doorbell */
+	dbell.u_reg64     = 0ull;
+	dbell.s.dbell_cnt = 1;
+	zip_reg_write(dbell.u_reg64,
+		      (zip_dev->reg_base + ZIP_QUEX_DOORBELL(queue)));
+
+	/* Unlock cmd buffer lock */
+	spin_unlock(&zip_dev->iq[queue].lock);
+
+	/* Poll for the IQ cmd completion code */
+	zip_dbg_exit();
+
+	return queue;
+}
+
+/**
+ * zip_update_cmd_bufs - Updates the queue statistics after posting the
+ *                       instruction
+ * @zip_dev: Pointer to zip device structure
+ * @queue:   Queue number
+ */
+void zip_update_cmd_bufs(struct zip_device *zip_dev, u32 queue)
+{
+	zip_dbg_enter();
+
+	/* Take cmd buffer lock */
+	spin_lock(&zip_dev->iq[queue].lock);
+
+	/* Check if the previous buffer can be freed */
+	if (zip_dev->iq[queue].free_flag == 1) {
+		zip_dbg("Free flag. Free cmd buffer, adjust sw head and tail");
+		/* Reset the free flag */
+		zip_dev->iq[queue].free_flag = 0;
+
+		/* Point the hw_tail to start of the new chunk buffer */
+		zip_dev->iq[queue].hw_tail = zip_dev->iq[queue].sw_head;
+	} else {
+		zip_dbg("Free flag not set. increment hw tail");
+		zip_dev->iq[queue].hw_tail += 16; /* 16 64_bit words = 128B */
+	}
+
+	zip_dev->iq[queue].done_cnt++;
+	zip_dev->iq[queue].pend_cnt--;
+
+	zip_dbg("sw_head :0x%p sw_tail :0x%p hw_tail :0x%p",
+		zip_dev->iq[queue].sw_head, zip_dev->iq[queue].sw_tail,
+		zip_dev->iq[queue].hw_tail);
+	zip_dbg(" Got CC : pend_cnt : %d", zip_dev->iq[queue].pend_cnt);
+
+	spin_unlock(&zip_dev->iq[queue].lock);
+
+	zip_dbg_exit();
+}
diff --git a/drivers/crypto/cavium/zip/zip_device.h b/drivers/crypto/cavium/zip/zip_device.h
new file mode 100644
index 0000000..7f864e0
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_device.h
@@ -0,0 +1,138 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_DEVICE_H__
+#define __ZIP_DEVICE_H__
+
+#include <linux/types.h>
+#include "zip_main.h"
+
+struct sg_info {
+	/*
+	 * Pointer to the input data when scatter_gather == 0 and
+	 * pointer to the input gather list buffer when scatter_gather == 1
+	 */
+	union zip_zptr_s *gather;
+
+	/*
+	 * Pointer to the output data when scatter_gather == 0 and
+	 * pointer to the output scatter list buffer when scatter_gather == 1
+	 */
+	union zip_zptr_s *scatter;
+
+	/*
+	 * Holds size of the output buffer pointed by scatter list
+	 * when scatter_gather == 1
+	 */
+	u64 scatter_buf_size;
+
+	/* for gather data */
+	u64 gather_enable;
+
+	/* for scatter data */
+	u64 scatter_enable;
+
+	/* Number of gather list pointers for gather data */
+	u32 gbuf_cnt;
+
+	/* Number of scatter list pointers for scatter data */
+	u32 sbuf_cnt;
+
+	/* Buffers allocation state */
+	u8 alloc_state;
+};
+
+/**
+ * struct zip_state - Structure representing the required information related
+ *                    to a command
+ * @zip_cmd: Pointer to zip instruction structure
+ * @result:  Pointer to zip result structure
+ * @ctx:     Context pointer for inflate
+ * @history: Decompression history pointer
+ * @sginfo:  Scatter-gather info structure
+ */
+struct zip_state {
+	union zip_inst_s zip_cmd;
+	union zip_zres_s result;
+	union zip_zptr_s *ctx;
+	union zip_zptr_s *history;
+	struct sg_info   sginfo;
+};
+
+static inline u64 zip_depth(void)
+{
+	struct zip_device *zip_dev = zip_get_device(zip_get_node_id());
+
+	if (!zip_dev)
+		return -ENODEV;
+
+	return zip_dev->depth;
+}
+
+static inline u64 zip_onfsize(void)
+{
+	struct zip_device *zip_dev = zip_get_device(zip_get_node_id());
+
+	if (!zip_dev)
+		return -ENODEV;
+
+	return zip_dev->onfsize;
+}
+
+static inline u64 zip_ctxsize(void)
+{
+	struct zip_device *zip_dev = zip_get_device(zip_get_node_id());
+
+	if (!zip_dev)
+		return -ENODEV;
+
+	return zip_dev->ctxsize;
+}
+
+#define ZIP_CONTEXT_SIZE          2048
+#define ZIP_INFLATE_HISTORY_SIZE  32768
+#define ZIP_DEFLATE_HISTORY_SIZE  32768
+
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_main.c b/drivers/crypto/cavium/zip/zip_main.c
new file mode 100644
index 0000000..052c42d
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_main.c
@@ -0,0 +1,500 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include "common.h"
+#include "zip_crypto.h"
+
+#define DRV_NAME		"ThunderX-ZIP"
+
+static struct zip_device *zip_dev[MAX_ZIP_DEVICES];
+
+static const struct pci_device_id zip_id_table[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVICE_ID_THUNDERX_ZIP) },
+	{ 0, }
+};
+
+void zip_reg_write(u64 val, u64 __iomem *addr)
+{
+	writeq(val, addr);
+}
+
+u64 zip_reg_read(u64 __iomem *addr)
+{
+	return readq(addr);
+}
+
+/*
+ * Allocates new ZIP device structure
+ * Returns zip_device pointer, or NULL if no slot is free or allocation fails
+ */
+static struct zip_device *zip_alloc_device(struct pci_dev *pdev)
+{
+	struct zip_device *zip = NULL;
+	int idx = 0;
+
+	for (idx = 0; idx < MAX_ZIP_DEVICES; idx++)
+		if (!zip_dev[idx])
+			break;
+	if (idx == MAX_ZIP_DEVICES)
+		return NULL;
+
+	zip = kzalloc(sizeof(*zip), GFP_KERNEL);
+	if (!zip)
+		return NULL;
+
+	zip_dev[idx] = zip;
+	zip->index = idx;
+	return zip;
+}
+
+/**
+ * zip_get_device - Get ZIP device based on node id of cpu
+ *
+ * @node: Node id of the current cpu
+ * Return: Pointer to Zip device structure
+ */
+struct zip_device *zip_get_device(int node)
+{
+	if ((node < MAX_ZIP_DEVICES) && (node >= 0))
+		return zip_dev[node];
+
+	zip_err("ZIP device not found for node id %d\n", node);
+	return NULL;
+}
+
+/**
+ * zip_get_node_id - Get the node id of the current cpu
+ *
+ * Return: Node id of the current cpu
+ */
+int zip_get_node_id(void)
+{
+	return cpu_to_node(raw_smp_processor_id());
+}
+
+/**
+ * zip_get_zipeng_count - Returns No. of ZIP Cores present in CN88XX
+ *
+ * Return: Number of zip engines in the system
+ */
+int zip_get_zipeng_count(void)
+{
+	return ZIP_NUMENG_CN88XX;
+}
+
+/* Initializes the ZIP h/w sub-system */
+static int zip_init_hw(struct zip_device *zip)
+{
+	union zip_cmd_ctl    cmd_ctl;
+	union zip_constants  constants;
+	union zip_que_ena    que_ena;
+	union zip_quex_map   que_map;
+	union zip_que_pri    que_pri;
+
+	union zip_quex_sbuf_addr que_sbuf_addr;
+	union zip_quex_sbuf_ctl  que_sbuf_ctl;
+
+	int q = 0;
+
+	zip_dbg_enter();
+
+	/* ZIP Engine Init / Enable */
+
+	/* Enable the ZIP Engine(Core) Clock */
+	cmd_ctl.u_reg64 = zip_reg_read(zip->reg_base + ZIP_CMD_CTL);
+	cmd_ctl.s.forceclk = 1;
+	zip_reg_write(cmd_ctl.u_reg64 & 0xFF, (zip->reg_base + ZIP_CMD_CTL));
+
+	zip_msg("ZIP_CMD_CTL  : 0x%016llx",
+		zip_reg_read(zip->reg_base + ZIP_CMD_CTL));
+
+	constants.u_reg64 = zip_reg_read(zip->reg_base + ZIP_CONSTANTS);
+	zip->depth    = constants.s.depth;
+	zip->onfsize  = constants.s.onfsize;
+	zip->ctxsize  = constants.s.ctxsize;
+
+	zip_msg("depth: 0x%016llx , onfsize : 0x%016llx , ctxsize : 0x%016llx",
+		zip->depth, zip->onfsize, zip->ctxsize);
+
+	/*
+	 * Program ZIP_QUE(0..7)_SBUF_ADDR and ZIP_QUE(0..7)_SBUF_CTL to
+	 * have the correct buffer pointer and size configured for each
+	 * instruction queue.
+	 */
+	for (q = 0; q < ZIP_NUM_QUEUES; q++) {
+		que_sbuf_ctl.u_reg64 = 0ull;
+		que_sbuf_ctl.s.size = (ZIP_CMD_QBUF_SIZE / sizeof(u64));
+		que_sbuf_ctl.s.inst_be   = 0;
+		que_sbuf_ctl.s.stream_id = 0;
+		zip_reg_write(que_sbuf_ctl.u_reg64,
+			      (zip->reg_base + ZIP_QUEX_SBUF_CTL(q)));
+
+		zip_msg("QUEX_SBUF_CTL[%d]: 0x%016llx", q,
+			zip_reg_read(zip->reg_base + ZIP_QUEX_SBUF_CTL(q)));
+	}
+
+	for (q = 0; q < ZIP_NUM_QUEUES; q++) {
+		memset(&zip->iq[q], 0x0, sizeof(struct zip_iq));
+
+		spin_lock_init(&zip->iq[q].lock);
+
+		if (zip_cmd_qbuf_alloc(zip, q)) {
+			while (q != 0) {
+				q--;
+				zip_cmd_qbuf_free(zip, q);
+			}
+			return -ENOMEM;
+		}
+
+		/* Initialize tail ptr to head */
+		zip->iq[q].sw_tail = zip->iq[q].sw_head;
+		zip->iq[q].hw_tail = zip->iq[q].sw_head;
+
+		/* Write the physical addr to register */
+		que_sbuf_addr.u_reg64   = 0ull;
+		que_sbuf_addr.s.ptr = (__pa(zip->iq[q].sw_head) >>
+				       ZIP_128B_ALIGN);
+
+		zip_msg("QUE[%d]_PTR(PHYS): 0x%016llx", q,
+			(u64)que_sbuf_addr.s.ptr);
+
+		zip_reg_write(que_sbuf_addr.u_reg64,
+			      (zip->reg_base + ZIP_QUEX_SBUF_ADDR(q)));
+
+		zip_msg("QUEX_SBUF_ADDR[%d]: 0x%016llx", q,
+			zip_reg_read(zip->reg_base + ZIP_QUEX_SBUF_ADDR(q)));
+
+		zip_dbg("sw_head :%p sw_tail :%p hw_tail :%p",
+			zip->iq[q].sw_head, zip->iq[q].sw_tail,
+			zip->iq[q].hw_tail);
+		zip_dbg("sw_head phy addr : 0x%llx", (u64)que_sbuf_addr.s.ptr);
+	}
+
+	/*
+	 * Queue-to-ZIP core mapping
+	 * If a queue is not mapped to a particular core, it is equivalent to
+	 * the ZIP core being disabled.
+	 */
+	que_ena.u_reg64 = 0x0ull;
+	/* Enabling queues based on ZIP_NUM_QUEUES */
+	for (q = 0; q < ZIP_NUM_QUEUES; q++)
+		que_ena.s.ena |= (0x1 << q);
+	zip_reg_write(que_ena.u_reg64, (zip->reg_base + ZIP_QUE_ENA));
+
+	zip_msg("QUE_ENA      : 0x%016llx",
+		zip_reg_read(zip->reg_base + ZIP_QUE_ENA));
+
+	for (q = 0; q < ZIP_NUM_QUEUES; q++) {
+		que_map.u_reg64 = 0ull;
+		/* Mapping each queue to two ZIP cores */
+		que_map.s.zce = 0x3;
+		zip_reg_write(que_map.u_reg64,
+			      (zip->reg_base + ZIP_QUEX_MAP(q)));
+
+		zip_msg("QUE_MAP(%d)   : 0x%016llx", q,
+			zip_reg_read(zip->reg_base + ZIP_QUEX_MAP(q)));
+	}
+
+	que_pri.u_reg64 = 0ull;
+	for (q = 0; q < ZIP_NUM_QUEUES; q++)
+		que_pri.s.pri |= (0x1 << q); /* Higher Priority RR */
+	zip_reg_write(que_pri.u_reg64, (zip->reg_base + ZIP_QUE_PRI));
+
+	zip_msg("QUE_PRI %016llx", zip_reg_read(zip->reg_base + ZIP_QUE_PRI));
+
+	zip_dbg_exit();
+	return 0;
+}
+
+static int zip_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	struct zip_device *zip = NULL;
+	int    err;
+
+	zip_dbg_enter();
+
+	zip = zip_alloc_device(pdev);
+
+	if (!zip)
+		return -ENOMEM;
+
+	pr_info("Found ZIP device %d %x:%x on Node %d\n", zip->index,
+		pdev->vendor, pdev->device, dev_to_node(dev));
+
+	zip->pdev = pdev;
+
+	pci_set_drvdata(pdev, zip);
+
+	err = pci_enable_device(pdev);
+	if (err) {
+		zip_err("Failed to enable PCI device");
+		goto err_free_device;
+	}
+
+	err = pci_request_regions(pdev, DRV_NAME);
+	if (err) {
+		zip_err("PCI request regions failed %d", err);
+		goto err_disable_device;
+	}
+
+	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get usable DMA configuration\n");
+		goto err_release_regions;
+	}
+
+	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get 48-bit DMA for allocations\n");
+		goto err_release_regions;
+	}
+
+	/* MAP configuration registers */
+	zip->reg_base = pci_ioremap_bar(pdev, PCI_CFG_ZIP_PF_BAR0);
+	if (!zip->reg_base) {
+		zip_err("ZIP: Cannot map BAR0 CSR memory space, aborting");
+		err = -ENOMEM;
+		goto err_release_regions;
+	}
+
+	/* Initialize ZIP Hardware */
+	err = zip_init_hw(zip);
+	if (err)
+		goto err_release_regions;
+
+	return 0;
+
+err_release_regions:
+	if (zip->reg_base)
+		iounmap(zip->reg_base);
+	pci_release_regions(pdev);
+
+err_disable_device:
+	pci_disable_device(pdev);
+
+err_free_device:
+	pci_set_drvdata(pdev, NULL);
+
+	/* remove zip_dev from zip_device list, free the zip_device memory */
+	zip_dev[zip->index] = NULL;
+	kfree(zip);
+
+	zip_dbg_exit();
+	return err;
+}
+
+static void zip_remove(struct pci_dev *pdev)
+{
+	struct zip_device *zip = pci_get_drvdata(pdev);
+	int q = 0;
+
+	zip_dbg_enter();
+
+	if (!zip)
+		return;
+
+	if (zip->reg_base) {
+		union zip_cmd_ctl cmd_ctl;
+
+		cmd_ctl.u_reg64 = 0x0ull;
+		cmd_ctl.s.reset = 1;  /* Forces ZIP cores to do reset */
+		zip_reg_write(cmd_ctl.u_reg64, (zip->reg_base + ZIP_CMD_CTL));
+		iounmap(zip->reg_base);
+	}
+
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+
+	/*
+	 * Free Command Queue buffers. This free should be called for all
+	 * the enabled Queues.
+	 */
+	for (q = 0; q < ZIP_NUM_QUEUES; q++)
+		zip_cmd_qbuf_free(zip, q);
+
+	pci_set_drvdata(pdev, NULL);
+
+	/* remove zip device from zip device list */
+	zip_dev[zip->index] = NULL;
+	kfree(zip);
+
+	zip_dbg_exit();
+}
+
+/* Dummy Functions */
+int zip_alloc_lzs_ctx(struct crypto_tfm *tfm)
+{
+	return 0;
+}
+
+int zip_alloc_zip_ctx(struct crypto_tfm *tfm)
+{
+	return 0;
+}
+
+void zip_free_zip_ctx(struct crypto_tfm *tfm)
+{
+}
+
+int  zip_deflate_comp(struct crypto_tfm *tfm,
+		      const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen)
+{
+	return 0;
+}
+
+int  zip_inflate_comp(struct crypto_tfm *tfm,
+		      const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen)
+{
+	return 0;
+}
+
+/* PCI Sub-System Interface */
+static struct pci_driver zip_driver = {
+	.name	    =  DRV_NAME,
+	.id_table   =  zip_id_table,
+	.probe	    =  zip_probe,
+	.remove     =  zip_remove,
+};
+
+/* Kernel Crypto Subsystem Interface */
+
+static struct crypto_alg zip_comp_deflate = {
+	.cra_name		= "deflate",
+	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
+	.cra_ctxsize		= sizeof(struct zip_kernel_ctx),
+	.cra_priority           = 300,
+	.cra_module		= THIS_MODULE,
+	.cra_init		= zip_alloc_zip_ctx,
+	.cra_exit		= zip_free_zip_ctx,
+	.cra_u			= { .compress = {
+		.coa_compress	= zip_deflate_comp,
+		.coa_decompress	= zip_inflate_comp
+		 } }
+};
+
+static struct crypto_alg zip_comp_lzs = {
+	.cra_name		= "lzs",
+	.cra_flags		= CRYPTO_ALG_TYPE_COMPRESS,
+	.cra_ctxsize		= sizeof(struct zip_kernel_ctx),
+	.cra_priority           = 300,
+	.cra_module		= THIS_MODULE,
+	.cra_init		= zip_alloc_lzs_ctx,
+	.cra_exit		= zip_free_zip_ctx,
+	.cra_u			= { .compress = {
+		.coa_compress	= zip_deflate_comp,
+		.coa_decompress	= zip_inflate_comp
+		 } }
+};
+
+static int zip_register_compression_device(void)
+{
+	int ret;
+
+	ret = crypto_register_alg(&zip_comp_deflate);
+	if (ret < 0) {
+		zip_err("Deflate algorithm registration failed\n");
+		return ret;
+	}
+
+	ret = crypto_register_alg(&zip_comp_lzs);
+	if (ret < 0) {
+		zip_err("LZS algorithm registration failed\n");
+		crypto_unregister_alg(&zip_comp_deflate);
+	}
+
+	return ret;
+}
+
+static void zip_unregister_compression_device(void)
+{
+	crypto_unregister_alg(&zip_comp_deflate);
+	crypto_unregister_alg(&zip_comp_lzs);
+}
+
+static int __init zip_init_module(void)
+{
+	int ret;
+
+	memset(&zip_dev, 0, sizeof(zip_dev));
+
+	zip_msg("%s\n", DRV_NAME);
+
+	ret = pci_register_driver(&zip_driver);
+	if (ret < 0) {
+		zip_err("ZIP: pci_register_driver() returned %d\n", ret);
+		return ret;
+	}
+
+	/* Register with the Kernel Crypto Interface */
+	ret = zip_register_compression_device();
+	if (ret < 0) {
+		zip_err("ZIP: Kernel Crypto Registration failed\n");
+		pci_unregister_driver(&zip_driver);
+	}
+
+	return ret;
+}
+
+static void __exit zip_cleanup_module(void)
+{
+	/* Unregister from the kernel crypto interface */
+	zip_unregister_compression_device();
+
+	/* Unregister this driver for pci zip devices */
+	pci_unregister_driver(&zip_driver);
+
+	pr_info("ThunderX-ZIP driver is removed successfully\n");
+}
+
+module_init(zip_init_module);
+module_exit(zip_cleanup_module);
+
+MODULE_AUTHOR("Cavium Inc");
+MODULE_DESCRIPTION("Cavium Inc ThunderX ZIP Driver");
+MODULE_LICENSE("GPL v2");
+MODULE_DEVICE_TABLE(pci, zip_id_table);
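
Note on zip_init_hw() above: it performs two small encodings. The queue
buffer's physical address is shifted right by ZIP_128B_ALIGN (7) before
being written to ZIP_QUEX_SBUF_ADDR, and the ZIP_QUE_ENA / ZIP_QUE_PRI
values are built by setting one bit per queue. The following is a
userspace sketch of both, assuming the buffer is 128-byte aligned. The
helper names encode_queue_ptr() and queue_enable_mask() are hypothetical
and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_SHIFT 7 /* mirrors ZIP_128B_ALIGN: 2^7 = 128 bytes */

/* Convert a 128-byte aligned physical address to the register's ptr field. */
static uint64_t encode_queue_ptr(uint64_t phys)
{
	/* The low 7 bits are dropped, so they must be zero to begin with. */
	assert((phys & ((1ull << ALIGN_SHIFT) - 1)) == 0);

	return phys >> ALIGN_SHIFT;
}

/* Build a bitmask with one bit set for each of the first nqueues queues. */
static uint64_t queue_enable_mask(unsigned int nqueues)
{
	uint64_t mask = 0;
	unsigned int q;

	assert(nqueues <= 64);

	for (q = 0; q < nqueues; q++)
		mask |= 1ull << q;

	return mask;
}
```

For example, a buffer at physical address 0x1000 encodes to 0x20, and
eight queues give a mask of 0xff.
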
diff --git a/drivers/crypto/cavium/zip/zip_main.h b/drivers/crypto/cavium/zip/zip_main.h
new file mode 100644
index 0000000..73b9e6d
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_main.h
@@ -0,0 +1,126 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_MAIN_H__
+#define __ZIP_MAIN_H__
+
+#include "zip_device.h"
+#include "zip_regs.h"
+
+/* PCI device IDs */
+#define PCI_DEVICE_ID_THUNDERX_ZIP   0xA01A
+
+/* ZIP device BARs */
+#define PCI_CFG_ZIP_PF_BAR0   0  /* Base addr for normal regs */
+#define PCI_CFG_ZIP_PF_BAR4   4  /* Base addr for MSI-X regs  */
+
+/* Maximum available zip queues */
+#define ZIP_MAX_NUM_QUEUES    8
+#define ZIP_MAXQ_PER_ZIPENG   4
+#define ZIP_NUMENG_CN88XX     2
+
+#define ZIP_128B_ALIGN        7
+
+/* Buffer size and alignment */
+#define ZIP_CMD_QBUF_SIZE     (8064 + 8)
+#define ZIP_CMD_QBUF_ALIGN    128
+#define ZIP_DATA_BUF_ALIGN    8
+
+/*
+ * There will be max of 2^20 zip cmds in the zip instruction queue.
+ * So no of zip Chunk buffers = ((2^20) / ((2*1024)/64))
+ */
+#define ZIP_MAX_CMD           (1024 * 1024)
+#define ZIP_CMD_PER_BUF       (ZIP_CMD_QBUF_SIZE / 64)
+#define ZIP_CMD_QBUF_MAX_CNT  (1 * 1024)
+
+/* Data buffer size 64K for time being */
+#define ZIP_DATA_BUF_SIZE     (64 * 1024)
+
+/* Number of data buffers */
+#define ZIP_DATA_BUF_CNT      (32 * 1024)
+
+struct zip_registers {
+	char  *reg_name;
+	u64   reg_offset;
+};
+
+/* ZIP Instruction Queue */
+struct zip_iq {
+	u64        *sw_head;
+	u64        *sw_tail;
+	u64        *hw_tail;
+	u64        done_cnt;
+	u64        pend_cnt;
+	u64        free_flag;
+
+	/* ZIP IQ lock */
+	spinlock_t  lock;
+};
+
+/* ZIP Device */
+struct zip_device {
+	u32               index;
+	void __iomem      *reg_base;
+	struct pci_dev    *pdev;
+
+	/* Different ZIP Constants */
+	u64               depth;
+	u64               onfsize;
+	u64               ctxsize;
+
+	struct zip_iq     iq[ZIP_MAX_NUM_QUEUES];
+};
+
+/* Prototypes */
+struct zip_device *zip_get_device(int node_id);
+int zip_get_node_id(void);
+int zip_get_zipeng_count(void);
+void zip_reg_write(u64 val, u64 __iomem *addr);
+u64 zip_reg_read(u64 __iomem *addr);
+void zip_update_cmd_bufs(struct zip_device *zip_dev, u32 queue);
+u32 zip_load_instr(union zip_inst_s *instr, struct zip_device *zip_dev);
+
+#endif /* __ZIP_MAIN_H__ */
diff --git a/drivers/crypto/cavium/zip/zip_mem.c b/drivers/crypto/cavium/zip/zip_mem.c
new file mode 100644
index 0000000..cf1800f5
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_mem.c
@@ -0,0 +1,120 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+
+#include "common.h"
+
+/**
+ * zip_cmd_qbuf_alloc - Allocates a cmd buffer for ZIP Instruction Queue
+ * @zip: Pointer to zip device structure
+ * @q:   Queue number to allocate buffer to
+ * Return: 0 if successful, -ENOMEM otherwise
+ */
+int zip_cmd_qbuf_alloc(struct zip_device *zip, int q)
+{
+	zip_dbg_enter();
+
+	zip->iq[q].sw_head = (u64 *)__get_free_pages((GFP_KERNEL | GFP_DMA),
+						get_order(ZIP_CMD_QBUF_SIZE));
+
+	if (!zip->iq[q].sw_head)
+		return -ENOMEM;
+
+	memset(zip->iq[q].sw_head, 0, ZIP_CMD_QBUF_SIZE);
+
+	zip_dbg("cmd_qbuf_alloc[%d] Success : %p\n", q, zip->iq[q].sw_head);
+	zip_dbg_exit();
+	return 0;
+}
+
+/**
+ * zip_cmd_qbuf_free - Frees the cmd Queue buffer
+ * @zip: Pointer to zip device structure
+ * @q:   Queue number to free buffer of
+ */
+void zip_cmd_qbuf_free(struct zip_device *zip, int q)
+{
+	zip_dbg("Freeing cmd_qbuf %p\n", zip->iq[q].sw_tail);
+
+	free_pages((u64)zip->iq[q].sw_tail, get_order(ZIP_CMD_QBUF_SIZE));
+}
+
+/**
+ * zip_data_buf_alloc - Allocates memory for a data buffer
+ * @size:   Size of the buffer to allocate
+ * Returns: Pointer to the buffer allocated
+ */
+u8 *zip_data_buf_alloc(u64 size)
+{
+	u8 *ptr;
+
+	zip_dbg_enter();
+
+	ptr = (u8 *)__get_free_pages((GFP_ATOMIC | GFP_DMA),
+					get_order(size));
+
+	if (!ptr)
+		return NULL;
+
+	memset(ptr, 0, size);
+
+	zip_dbg("Data buffer allocation success\n");
+	zip_dbg_exit();
+	return ptr;
+}
+
+/**
+ * zip_data_buf_free - Frees the memory of a data buffer
+ * @ptr:  Pointer to the buffer
+ * @size: Buffer size
+ */
+void zip_data_buf_free(u8 *ptr, u64 size)
+{
+	zip_dbg("Freeing data buffer %p\n", ptr);
+
+	free_pages((u64)ptr, get_order(size));
+}
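
Note on the allocators above: __get_free_pages() takes an order, so the
request is rounded up to a power-of-two number of pages. With
ZIP_CMD_QBUF_SIZE = 8064 + 8 = 8072 bytes, the result depends on the
page size. Below is a userspace sketch of that rounding. The function
order_for_size() is a hypothetical stand-in for the kernel's get_order(),
and the page size is passed as a parameter because it depends on the
kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * Hypothetical userspace stand-in for get_order(); assumes size is small
 * enough that the shift below cannot overflow.
 */
static unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(page_size != 0);

	while (span < size) {
		span <<= 1;
		order++;
	}

	return order;
}
```

With 4 KiB pages, 8072 bytes needs order 1 (two pages, 8192 bytes); with
64 KiB pages it fits in order 0.
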
diff --git a/drivers/crypto/cavium/zip/zip_mem.h b/drivers/crypto/cavium/zip/zip_mem.h
new file mode 100644
index 0000000..23591d8
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_mem.h
@@ -0,0 +1,78 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_MEM_H__
+#define __ZIP_MEM_H__
+
+/**
+ * zip_cmd_qbuf_free - Frees the cmd Queue buffer
+ * @zip: Pointer to zip device structure
+ * @q:   Queue number to free buffer of
+ */
+void zip_cmd_qbuf_free(struct zip_device *zip, int q);
+
+/**
+ * zip_cmd_qbuf_alloc - Allocates a Chunk/cmd buffer for ZIP Inst(cmd) Queue
+ * @zip: Pointer to zip device structure
+ * @q:   Queue number to allocate buffer to
+ * Return: 0 if successful, -ENOMEM otherwise
+ */
+int zip_cmd_qbuf_alloc(struct zip_device *zip, int q);
+
+/**
+ * zip_data_buf_alloc - Allocates memory for a data buffer
+ * @size:   Size of the buffer to allocate
+ * Returns: Pointer to the buffer allocated
+ */
+u8 *zip_data_buf_alloc(u64 size);
+
+/**
+ * zip_data_buf_free - Frees the memory of a data buffer
+ * @ptr:  Pointer to the buffer
+ * @size: Buffer size
+ */
+void zip_data_buf_free(u8 *ptr, u64 size);
+
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_regs.h b/drivers/crypto/cavium/zip/zip_regs.h
new file mode 100644
index 0000000..ec913ad
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_regs.h
@@ -0,0 +1,1326 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_REGS_H__
+#define __ZIP_REGS_H__
+
+/*
+ * Configuration and status register (CSR) address and type definitions for
+ * Cavium ZIP.
+ */
+
+#include <linux/kern_levels.h>
+
+/**
+ * enum zip_comp_e - ZIP Completion Enumeration, enumerates the values of
+ * ZIP_ZRES_S[COMPCODE].
+ */
+enum zip_comp_e {
+	ZIP_COMP_E_BADCODE = 0x7,
+	ZIP_COMP_E_BADCODE2 = 0x8,
+	ZIP_COMP_E_DTRUNC = 0x2,
+	ZIP_COMP_E_FATAL = 0xb,
+	ZIP_COMP_E_ITRUNC = 0x4,
+	ZIP_COMP_E_NLEN = 0x6,
+	ZIP_COMP_E_NOTDONE = 0x0,
+	ZIP_COMP_E_PARITY = 0xa,
+	ZIP_COMP_E_RBLOCK = 0x5,
+	ZIP_COMP_E_STOP = 0x3,
+	ZIP_COMP_E_SUCCESS = 0x1,
+	ZIP_COMP_E_ZERO_LEN = 0x9,
+	ZIP_COMP_E_ENUM_LAST = 0xc,
+};
+
+/**
+ * enum zip_int_vec_e - ZIP MSI-X Vector Enumeration, enumerates the MSI-X
+ * interrupt vectors.
+ */
+enum zip_int_vec_e {
+	ZIP_INT_VEC_E_ECCE = 0x10,
+	ZIP_INT_VEC_E_FIFE = 0x11,
+	ZIP_INT_VEC_E_QUE0_DONE = 0x0,
+	ZIP_INT_VEC_E_QUE0_ERR = 0x8,
+	ZIP_INT_VEC_E_QUE1_DONE = 0x1,
+	ZIP_INT_VEC_E_QUE1_ERR = 0x9,
+	ZIP_INT_VEC_E_QUE2_DONE = 0x2,
+	ZIP_INT_VEC_E_QUE2_ERR = 0xa,
+	ZIP_INT_VEC_E_QUE3_DONE = 0x3,
+	ZIP_INT_VEC_E_QUE3_ERR = 0xb,
+	ZIP_INT_VEC_E_QUE4_DONE = 0x4,
+	ZIP_INT_VEC_E_QUE4_ERR = 0xc,
+	ZIP_INT_VEC_E_QUE5_DONE = 0x5,
+	ZIP_INT_VEC_E_QUE5_ERR = 0xd,
+	ZIP_INT_VEC_E_QUE6_DONE = 0x6,
+	ZIP_INT_VEC_E_QUE6_ERR = 0xe,
+	ZIP_INT_VEC_E_QUE7_DONE = 0x7,
+	ZIP_INT_VEC_E_QUE7_ERR = 0xf,
+	ZIP_INT_VEC_E_ENUM_LAST = 0x12,
+};
+
+/**
+ * union zip_zptr_addr_s - ZIP Generic Pointer Structure for ADDR.
+ *
+ * It is the generic format of pointers in ZIP_INST_S.
+ */
+union zip_zptr_addr_s {
+	u64 u_reg64;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_49_63              : 15;
+		u64 addr                        : 49;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 addr                        : 49;
+		u64 reserved_49_63              : 15;
+#endif
+	} s;
+
+};
+
+/**
+ * union zip_zptr_ctl_s - ZIP Generic Pointer Structure for CTL.
+ *
+ * It is the generic format of pointers in ZIP_INST_S.
+ */
+union zip_zptr_ctl_s {
+	u64 u_reg64;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_112_127            : 16;
+		u64 length                      : 16;
+		u64 reserved_67_95              : 29;
+		u64 fw                          : 1;
+		u64 nc                          : 1;
+		u64 data_be                     : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 data_be                     : 1;
+		u64 nc                          : 1;
+		u64 fw                          : 1;
+		u64 reserved_67_95              : 29;
+		u64 length                      : 16;
+		u64 reserved_112_127            : 16;
+#endif
+	} s;
+};
+
+/**
+ * union zip_inst_s - ZIP Instruction Structure.
+ * Each ZIP instruction has 16 words (they are called IWORD0 to IWORD15 within
+ * the structure).
+ */
+union zip_inst_s {
+	u64 u_reg64[16];
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 doneint                     : 1;
+		u64 reserved_56_62              : 7;
+		u64 totaloutputlength           : 24;
+		u64 reserved_27_31              : 5;
+		u64 exn                         : 3;
+		u64 reserved_23_23              : 1;
+		u64 exbits                      : 7;
+		u64 reserved_12_15              : 4;
+		u64 sf                          : 1;
+		u64 ss                          : 2;
+		u64 cc                          : 2;
+		u64 ef                          : 1;
+		u64 bf                          : 1;
+		u64 ce                          : 1;
+		u64 reserved_3_3                : 1;
+		u64 ds                          : 1;
+		u64 dg                          : 1;
+		u64 hg                          : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 hg                          : 1;
+		u64 dg                          : 1;
+		u64 ds                          : 1;
+		u64 reserved_3_3                : 1;
+		u64 ce                          : 1;
+		u64 bf                          : 1;
+		u64 ef                          : 1;
+		u64 cc                          : 2;
+		u64 ss                          : 2;
+		u64 sf                          : 1;
+		u64 reserved_12_15              : 4;
+		u64 exbits                      : 7;
+		u64 reserved_23_23              : 1;
+		u64 exn                         : 3;
+		u64 reserved_27_31              : 5;
+		u64 totaloutputlength           : 24;
+		u64 reserved_56_62              : 7;
+		u64 doneint                     : 1;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 historylength               : 16;
+		u64 reserved_96_111             : 16;
+		u64 adlercrc32                  : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 adlercrc32                  : 32;
+		u64 reserved_96_111             : 16;
+		u64 historylength               : 16;
+#endif
+		union zip_zptr_addr_s ctx_ptr_addr;
+		union zip_zptr_ctl_s ctx_ptr_ctl;
+		union zip_zptr_addr_s his_ptr_addr;
+		union zip_zptr_ctl_s his_ptr_ctl;
+		union zip_zptr_addr_s inp_ptr_addr;
+		union zip_zptr_ctl_s inp_ptr_ctl;
+		union zip_zptr_addr_s out_ptr_addr;
+		union zip_zptr_ctl_s out_ptr_ctl;
+		union zip_zptr_addr_s res_ptr_addr;
+		union zip_zptr_ctl_s res_ptr_ctl;
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_817_831            : 15;
+		u64 wq_ptr                      : 49;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 wq_ptr                      : 49;
+		u64 reserved_817_831            : 15;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_882_895            : 14;
+		u64 tt                          : 2;
+		u64 reserved_874_879            : 6;
+		u64 grp                         : 10;
+		u64 tag                         : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 tag                         : 32;
+		u64 grp                         : 10;
+		u64 reserved_874_879            : 6;
+		u64 tt                          : 2;
+		u64 reserved_882_895            : 14;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_896_959            : 64;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 reserved_896_959            : 64;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_960_1023           : 64;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 reserved_960_1023           : 64;
+#endif
+	} s;
+};
+
+/**
+ * union zip_nptr_s - ZIP Instruction Next-Chunk-Buffer Pointer (NPTR)
+ * Structure
+ *
+ * ZIP_NPTR structure is used to chain all the zip instruction buffers
+ * together. ZIP instruction buffers are managed (allocated and released) by
+ * the software.
+ */
+union zip_nptr_s {
+	u64 u_reg64;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_49_63              : 15;
+		u64 addr                        : 49;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 addr                        : 49;
+		u64 reserved_49_63              : 15;
+#endif
+	} s;
+};
+
+/**
+ * union zip_zptr_s - ZIP Generic Pointer Structure.
+ *
+ * It is the generic format of pointers in ZIP_INST_S.
+ */
+union zip_zptr_s {
+	u64 u_reg64[2];
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_49_63              : 15;
+		u64 addr                        : 49;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 addr                        : 49;
+		u64 reserved_49_63              : 15;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_112_127            : 16;
+		u64 length                      : 16;
+		u64 reserved_67_95              : 29;
+		u64 fw                          : 1;
+		u64 nc                          : 1;
+		u64 data_be                     : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 data_be                     : 1;
+		u64 nc                          : 1;
+		u64 fw                          : 1;
+		u64 reserved_67_95              : 29;
+		u64 length                      : 16;
+		u64 reserved_112_127            : 16;
+#endif
+	} s;
+};
+
+/**
+ * union zip_zres_s - ZIP Result Structure
+ *
+ * The ZIP coprocessor writes the result structure after it completes the
+ * invocation. The result structure is exactly 24 bytes, and each invocation of
+ * the ZIP coprocessor produces exactly one result structure.
+ */
+union zip_zres_s {
+	u64 u_reg64[3];
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 crc32                       : 32;
+		u64 adler32                     : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 adler32                     : 32;
+		u64 crc32                       : 32;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 totalbyteswritten           : 32;
+		u64 totalbytesread              : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 totalbytesread              : 32;
+		u64 totalbyteswritten           : 32;
+#endif
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 totalbitsprocessed          : 32;
+		u64 doneint                     : 1;
+		u64 reserved_155_158            : 4;
+		u64 exn                         : 3;
+		u64 reserved_151_151            : 1;
+		u64 exbits                      : 7;
+		u64 reserved_137_143            : 7;
+		u64 ef                          : 1;
+
+		volatile u64 compcode           : 8;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+
+		volatile u64 compcode           : 8;
+		u64 ef                          : 1;
+		u64 reserved_137_143            : 7;
+		u64 exbits                      : 7;
+		u64 reserved_151_151            : 1;
+		u64 exn                         : 3;
+		u64 reserved_155_158            : 4;
+		u64 doneint                     : 1;
+		u64 totalbitsprocessed          : 32;
+#endif
+	} s;
+};
+
+/**
+ * union zip_cmd_ctl - Structure representing the register that controls
+ * clock and reset.
+ */
+union zip_cmd_ctl {
+	u64 u_reg64;
+	struct zip_cmd_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_2_63               : 62;
+		u64 forceclk                    : 1;
+		u64 reset                       : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 reset                       : 1;
+		u64 forceclk                    : 1;
+		u64 reserved_2_63               : 62;
+#endif
+	} s;
+};
+
+#define ZIP_CMD_CTL 0x0ull
+
+/**
+ * union zip_constants - Data structure representing the register that contains
+ * all of the current implementation-related parameters of the zip core in this
+ * chip.
+ */
+union zip_constants {
+	u64 u_reg64;
+	struct zip_constants_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 nexec                       : 8;
+		u64 reserved_49_55              : 7;
+		u64 syncflush_capable           : 1;
+		u64 depth                       : 16;
+		u64 onfsize                     : 12;
+		u64 ctxsize                     : 12;
+		u64 reserved_1_7                : 7;
+		u64 disabled                    : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 disabled                    : 1;
+		u64 reserved_1_7                : 7;
+		u64 ctxsize                     : 12;
+		u64 onfsize                     : 12;
+		u64 depth                       : 16;
+		u64 syncflush_capable           : 1;
+		u64 reserved_49_55              : 7;
+		u64 nexec                       : 8;
+#endif
+	} s;
+};
+
+#define ZIP_CONSTANTS 0x00A0ull
+
+/**
+ * union zip_corex_bist_status - Represents registers which have the BIST
+ * status of memories in zip cores.
+ *
+ * Each bit is the BIST result of an individual memory
+ * (per bit, 0 = pass and 1 = fail).
+ */
+union zip_corex_bist_status {
+	u64 u_reg64;
+	struct zip_corex_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_53_63              : 11;
+		u64 bstatus                     : 53;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 bstatus                     : 53;
+		u64 reserved_53_63              : 11;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_COREX_BIST_STATUS(u64 param1)
+{
+	if (((param1 <= 1)))
+		return 0x0520ull + (param1 & 1) * 0x8ull;
+	pr_err("ZIP_COREX_BIST_STATUS: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_ctl_bist_status - Represents register that has the BIST status of
+ * memories in ZIP_CTL (instruction buffer, G/S pointer FIFO, input data
+ * buffer, output data buffers).
+ *
+ * Each bit is the BIST result of an individual memory
+ * (per bit, 0 = pass and 1 = fail).
+ */
+union zip_ctl_bist_status {
+	u64 u_reg64;
+	struct zip_ctl_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_9_63               : 55;
+		u64 bstatus                     : 9;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 bstatus                     : 9;
+		u64 reserved_9_63               : 55;
+#endif
+	} s;
+};
+
+#define ZIP_CTL_BIST_STATUS 0x0510ull
+
+/**
+ * union zip_ctl_cfg - Represents the register that controls the behavior of
+ * the ZIP DMA engines.
+ *
+ * It is recommended to keep default values for normal operation. Changing the
+ * values of the fields may be useful for diagnostics.
+ */
+union zip_ctl_cfg {
+	u64 u_reg64;
+	struct zip_ctl_cfg_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_52_63              : 12;
+		u64 ildf                        : 4;
+		u64 reserved_36_47              : 12;
+		u64 drtf                        : 4;
+		u64 reserved_27_31              : 5;
+		u64 stcf                        : 3;
+		u64 reserved_19_23              : 5;
+		u64 ldf                         : 3;
+		u64 reserved_2_15               : 14;
+		u64 busy                        : 1;
+		u64 reserved_0_0                : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 reserved_0_0                : 1;
+		u64 busy                        : 1;
+		u64 reserved_2_15               : 14;
+		u64 ldf                         : 3;
+		u64 reserved_19_23              : 5;
+		u64 stcf                        : 3;
+		u64 reserved_27_31              : 5;
+		u64 drtf                        : 4;
+		u64 reserved_36_47              : 12;
+		u64 ildf                        : 4;
+		u64 reserved_52_63              : 12;
+#endif
+	} s;
+};
+
+#define ZIP_CTL_CFG 0x0560ull
+
+/**
+ * union zip_dbg_corex_inst - Represents the registers that reflect the status
+ * of the current instruction that the ZIP core is executing or has executed.
+ *
+ * These registers are only for debug use.
+ */
+union zip_dbg_corex_inst {
+	u64 u_reg64;
+	struct zip_dbg_corex_inst_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 busy                        : 1;
+		u64 reserved_35_62              : 28;
+		u64 qid                         : 3;
+		u64 iid                         : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 iid                         : 32;
+		u64 qid                         : 3;
+		u64 reserved_35_62              : 28;
+		u64 busy                        : 1;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_DBG_COREX_INST(u64 param1)
+{
+	if (((param1 <= 1)))
+		return 0x0640ull + (param1 & 1) * 0x8ull;
+	pr_err("ZIP_DBG_COREX_INST: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_dbg_corex_sta - Represents registers that reflect the status of
+ * the zip cores.
+ *
+ * They are for debug use only.
+ */
+union zip_dbg_corex_sta {
+	u64 u_reg64;
+	struct zip_dbg_corex_sta_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 busy                        : 1;
+		u64 reserved_37_62              : 26;
+		u64 ist                         : 5;
+		u64 nie                         : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 nie                         : 32;
+		u64 ist                         : 5;
+		u64 reserved_37_62              : 26;
+		u64 busy                        : 1;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_DBG_COREX_STA(u64 param1)
+{
+	if (((param1 <= 1)))
+		return 0x0680ull + (param1 & 1) * 0x8ull;
+	pr_err("ZIP_DBG_COREX_STA: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_dbg_quex_sta - Represents registers that reflect status of the zip
+ * instruction queues.
+ *
+ * They are for debug use only.
+ */
+union zip_dbg_quex_sta {
+	u64 u_reg64;
+	struct zip_dbg_quex_sta_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 busy                        : 1;
+		u64 reserved_56_62              : 7;
+		u64 rqwc                        : 24;
+		u64 nii                         : 32;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 nii                         : 32;
+		u64 rqwc                        : 24;
+		u64 reserved_56_62              : 7;
+		u64 busy                        : 1;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_DBG_QUEX_STA(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x1800ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_DBG_QUEX_STA: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_ecc_ctl - Represents the register that enables ECC for each
+ * individual internal memory that requires ECC.
+ *
+ * For debug purposes, it can also flip one or two bits in the ECC data.
+ */
+union zip_ecc_ctl {
+	u64 u_reg64;
+	struct zip_ecc_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_19_63              : 45;
+		u64 vmem_cdis                   : 1;
+		u64 vmem_fs                     : 2;
+		u64 reserved_15_15              : 1;
+		u64 idf1_cdis                   : 1;
+		u64 idf1_fs                     : 2;
+		u64 reserved_11_11              : 1;
+		u64 idf0_cdis                   : 1;
+		u64 idf0_fs                     : 2;
+		u64 reserved_7_7                : 1;
+		u64 gspf_cdis                   : 1;
+		u64 gspf_fs                     : 2;
+		u64 reserved_3_3                : 1;
+		u64 iqf_cdis                    : 1;
+		u64 iqf_fs                      : 2;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 iqf_fs                      : 2;
+		u64 iqf_cdis                    : 1;
+		u64 reserved_3_3                : 1;
+		u64 gspf_fs                     : 2;
+		u64 gspf_cdis                   : 1;
+		u64 reserved_7_7                : 1;
+		u64 idf0_fs                     : 2;
+		u64 idf0_cdis                   : 1;
+		u64 reserved_11_11              : 1;
+		u64 idf1_fs                     : 2;
+		u64 idf1_cdis                   : 1;
+		u64 reserved_15_15              : 1;
+		u64 vmem_fs                     : 2;
+		u64 vmem_cdis                   : 1;
+		u64 reserved_19_63              : 45;
+#endif
+	} s;
+};
+
+#define ZIP_ECC_CTL 0x0568ull
+
+/* NCB - zip_ecce_ena_w1c */
+union zip_ecce_ena_w1c {
+	u64 u_reg64;
+	struct zip_ecce_ena_w1c_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_37_63              : 27;
+		u64 dbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 sbe                         : 5;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 sbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 dbe                         : 5;
+		u64 reserved_37_63              : 27;
+#endif
+	} s;
+};
+
+#define ZIP_ECCE_ENA_W1C 0x0598ull
+
+/* NCB - zip_ecce_ena_w1s */
+union zip_ecce_ena_w1s {
+	u64 u_reg64;
+	struct zip_ecce_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_37_63              : 27;
+		u64 dbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 sbe                         : 5;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 sbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 dbe                         : 5;
+		u64 reserved_37_63              : 27;
+#endif
+	} s;
+};
+
+#define ZIP_ECCE_ENA_W1S 0x0590ull
+
+/**
+ * union zip_ecce_int - Represents the register that contains the status of the
+ * ECC interrupt sources.
+ */
+union zip_ecce_int {
+	u64 u_reg64;
+	struct zip_ecce_int_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_37_63              : 27;
+		u64 dbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 sbe                         : 5;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 sbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 dbe                         : 5;
+		u64 reserved_37_63              : 27;
+#endif
+	} s;
+};
+
+#define ZIP_ECCE_INT 0x0580ull
+
+/* NCB - zip_ecce_int_w1s */
+union zip_ecce_int_w1s {
+	u64 u_reg64;
+	struct zip_ecce_int_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_37_63              : 27;
+		u64 dbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 sbe                         : 5;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 sbe                         : 5;
+		u64 reserved_5_31               : 27;
+		u64 dbe                         : 5;
+		u64 reserved_37_63              : 27;
+#endif
+	} s;
+};
+
+#define ZIP_ECCE_INT_W1S 0x0588ull
+
+/* NCB - zip_fife_ena_w1c */
+union zip_fife_ena_w1c {
+	u64 u_reg64;
+	struct zip_fife_ena_w1c_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_42_63              : 22;
+		u64 asserts                     : 42;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 asserts                     : 42;
+		u64 reserved_42_63              : 22;
+#endif
+	} s;
+};
+
+#define ZIP_FIFE_ENA_W1C 0x0090ull
+
+/* NCB - zip_fife_ena_w1s */
+union zip_fife_ena_w1s {
+	u64 u_reg64;
+	struct zip_fife_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_42_63              : 22;
+		u64 asserts                     : 42;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 asserts                     : 42;
+		u64 reserved_42_63              : 22;
+#endif
+	} s;
+};
+
+#define ZIP_FIFE_ENA_W1S 0x0088ull
+
+/* NCB - zip_fife_int */
+union zip_fife_int {
+	u64 u_reg64;
+	struct zip_fife_int_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_42_63              : 22;
+		u64 asserts                     : 42;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 asserts                     : 42;
+		u64 reserved_42_63              : 22;
+#endif
+	} s;
+};
+
+#define ZIP_FIFE_INT 0x0078ull
+
+/* NCB - zip_fife_int_w1s */
+union zip_fife_int_w1s {
+	u64 u_reg64;
+	struct zip_fife_int_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_42_63              : 22;
+		u64 asserts                     : 42;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 asserts                     : 42;
+		u64 reserved_42_63              : 22;
+#endif
+	} s;
+};
+
+#define ZIP_FIFE_INT_W1S 0x0080ull
+
+/**
+ * union zip_msix_pbax - Represents the register that is the MSI-X PBA table
+ *
+ * The bit number is indexed by the ZIP_INT_VEC_E enumeration.
+ */
+union zip_msix_pbax {
+	u64 u_reg64;
+	struct zip_msix_pbax_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 pend                        : 64;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 pend                        : 64;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_MSIX_PBAX(u64 param1)
+{
+	if (((param1 == 0)))
+		return 0x0000838000FF0000ull;
+	pr_err("ZIP_MSIX_PBAX: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_msix_vecx_addr - Represents the register that is the MSI-X vector
+ * table, indexed by the ZIP_INT_VEC_E enumeration.
+ */
+union zip_msix_vecx_addr {
+	u64 u_reg64;
+	struct zip_msix_vecx_addr_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_49_63              : 15;
+		u64 addr                        : 47;
+		u64 reserved_1_1                : 1;
+		u64 secvec                      : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 secvec                      : 1;
+		u64 reserved_1_1                : 1;
+		u64 addr                        : 47;
+		u64 reserved_49_63              : 15;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_MSIX_VECX_ADDR(u64 param1)
+{
+	if (((param1 <= 17)))
+		return 0x0000838000F00000ull + (param1 & 31) * 0x10ull;
+	pr_err("ZIP_MSIX_VECX_ADDR: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_msix_vecx_ctl - Represents the register that is the MSI-X vector
+ * table, indexed by the ZIP_INT_VEC_E enumeration.
+ */
+union zip_msix_vecx_ctl {
+	u64 u_reg64;
+	struct zip_msix_vecx_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_33_63              : 31;
+		u64 mask                        : 1;
+		u64 reserved_20_31              : 12;
+		u64 data                        : 20;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 data                        : 20;
+		u64 reserved_20_31              : 12;
+		u64 mask                        : 1;
+		u64 reserved_33_63              : 31;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_MSIX_VECX_CTL(u64 param1)
+{
+	if (((param1 <= 17)))
+		return 0x0000838000F00008ull + (param1 & 31) * 0x10ull;
+	pr_err("ZIP_MSIX_VECX_CTL: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_done - Represents the registers that contain the per-queue
+ * instruction done count.
+ */
+union zip_quex_done {
+	u64 u_reg64;
+	struct zip_quex_done_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_20_63              : 44;
+		u64 done                        : 20;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 done                        : 20;
+		u64 reserved_20_63              : 44;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DONE(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x2000ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DONE: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_done_ack - Represents the registers that, when written,
+ * decrement the per-queue instruction done count.
+ */
+union zip_quex_done_ack {
+	u64 u_reg64;
+	struct zip_quex_done_ack_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_20_63              : 44;
+		u64 done_ack                    : 20;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 done_ack                    : 20;
+		u64 reserved_20_63              : 44;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DONE_ACK(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x2200ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DONE_ACK: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_done_ena_w1c - Represents the register that, when written
+ * with 1, disables the DONEINT interrupt for the queue.
+ */
+union zip_quex_done_ena_w1c {
+	u64 u_reg64;
+	struct zip_quex_done_ena_w1c_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_1_63               : 63;
+		u64 done_ena                    : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 done_ena                    : 1;
+		u64 reserved_1_63               : 63;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DONE_ENA_W1C(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x2600ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DONE_ENA_W1C: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_done_ena_w1s - Represents the register that, when written
+ * with 1, enables the DONEINT interrupt for the queue.
+ */
+union zip_quex_done_ena_w1s {
+	u64 u_reg64;
+	struct zip_quex_done_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_1_63               : 63;
+		u64 done_ena                    : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 done_ena                    : 1;
+		u64 reserved_1_63               : 63;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DONE_ENA_W1S(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x2400ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DONE_ENA_W1S: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_done_wait - Represents the register that specifies the
+ * per-queue interrupt coalescing settings.
+ */
+union zip_quex_done_wait {
+	u64 u_reg64;
+	struct zip_quex_done_wait_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_48_63              : 16;
+		u64 time_wait                   : 16;
+		u64 reserved_20_31              : 12;
+		u64 num_wait                    : 20;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 num_wait                    : 20;
+		u64 reserved_20_31              : 12;
+		u64 time_wait                   : 16;
+		u64 reserved_48_63              : 16;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DONE_WAIT(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x2800ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DONE_WAIT: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_doorbell - Represents doorbell registers for the ZIP
+ * instruction queues.
+ */
+union zip_quex_doorbell {
+	u64 u_reg64;
+	struct zip_quex_doorbell_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_20_63              : 44;
+		u64 dbell_cnt                   : 20;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 dbell_cnt                   : 20;
+		u64 reserved_20_63              : 44;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_DOORBELL(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x4000ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_DOORBELL: %llu\n", param1);
+	return 0;
+}
+
+union zip_quex_err_ena_w1c {
+	u64 u_reg64;
+	struct zip_quex_err_ena_w1c_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_5_63               : 59;
+		u64 mdbe                        : 1;
+		u64 nwrp                        : 1;
+		u64 nrrp                        : 1;
+		u64 irde                        : 1;
+		u64 dovf                        : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 dovf                        : 1;
+		u64 irde                        : 1;
+		u64 nrrp                        : 1;
+		u64 nwrp                        : 1;
+		u64 mdbe                        : 1;
+		u64 reserved_5_63               : 59;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_ERR_ENA_W1C(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x3600ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_ERR_ENA_W1C: %llu\n", param1);
+	return 0;
+}
+
+union zip_quex_err_ena_w1s {
+	u64 u_reg64;
+	struct zip_quex_err_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_5_63               : 59;
+		u64 mdbe                        : 1;
+		u64 nwrp                        : 1;
+		u64 nrrp                        : 1;
+		u64 irde                        : 1;
+		u64 dovf                        : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 dovf                        : 1;
+		u64 irde                        : 1;
+		u64 nrrp                        : 1;
+		u64 nwrp                        : 1;
+		u64 mdbe                        : 1;
+		u64 reserved_5_63               : 59;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_ERR_ENA_W1S(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x3400ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_ERR_ENA_W1S: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_err_int - Represents registers that contain the per-queue
+ * error interrupts.
+ */
+union zip_quex_err_int {
+	u64 u_reg64;
+	struct zip_quex_err_int_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_5_63               : 59;
+		u64 mdbe                        : 1;
+		u64 nwrp                        : 1;
+		u64 nrrp                        : 1;
+		u64 irde                        : 1;
+		u64 dovf                        : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 dovf                        : 1;
+		u64 irde                        : 1;
+		u64 nrrp                        : 1;
+		u64 nwrp                        : 1;
+		u64 mdbe                        : 1;
+		u64 reserved_5_63               : 59;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_ERR_INT(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x3000ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_ERR_INT: %llu\n", param1);
+	return 0;
+}
+
+/* NCB - zip_que#_err_int_w1s */
+union zip_quex_err_int_w1s {
+	u64 u_reg64;
+	struct zip_quex_err_int_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_5_63               : 59;
+		u64 mdbe                        : 1;
+		u64 nwrp                        : 1;
+		u64 nrrp                        : 1;
+		u64 irde                        : 1;
+		u64 dovf                        : 1;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 dovf                        : 1;
+		u64 irde                        : 1;
+		u64 nrrp                        : 1;
+		u64 nwrp                        : 1;
+		u64 mdbe                        : 1;
+		u64 reserved_5_63               : 59;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_ERR_INT_W1S(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x3200ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_ERR_INT_W1S: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_gcfg - Represents the registers that reflect status of the
+ * zip instruction queues; debug use only.
+ */
+union zip_quex_gcfg {
+	u64 u_reg64;
+	struct zip_quex_gcfg_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_4_63               : 60;
+		u64 iqb_ldwb                    : 1;
+		u64 cbw_sty                     : 1;
+		u64 l2ld_cmd                    : 2;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 l2ld_cmd                    : 2;
+		u64 cbw_sty                     : 1;
+		u64 iqb_ldwb                    : 1;
+		u64 reserved_4_63               : 60;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_GCFG(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x1A00ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_GCFG: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_map - Represents the registers that control how each
+ * instruction queue maps to zip cores.
+ */
+union zip_quex_map {
+	u64 u_reg64;
+	struct zip_quex_map_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_2_63               : 62;
+		u64 zce                         : 2;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 zce                         : 2;
+		u64 reserved_2_63               : 62;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_MAP(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x1400ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_MAP: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_sbuf_addr - Represents the registers that set the buffer
+ * parameters for the instruction queues.
+ *
+ * When quiescent (i.e. outstanding doorbell count is 0), it is safe to rewrite
+ * this register to effectively reset the command buffer state machine.
+ * These registers must be programmed after SW programs the corresponding
+ * ZIP_QUE(0..7)_SBUF_CTL.
+ */
+union zip_quex_sbuf_addr {
+	u64 u_reg64;
+	struct zip_quex_sbuf_addr_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_49_63              : 15;
+		u64 ptr                         : 42;
+		u64 off                         : 7;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 off                         : 7;
+		u64 ptr                         : 42;
+		u64 reserved_49_63              : 15;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_SBUF_ADDR(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x1000ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_SBUF_ADDR: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_quex_sbuf_ctl - Represents the registers that set the buffer
+ * parameters for the instruction queues.
+ *
+ * When quiescent (i.e. outstanding doorbell count is 0), it is safe to rewrite
+ * this register to effectively reset the command buffer state machine.
+ * These registers must be programmed before SW programs the corresponding
+ * ZIP_QUE(0..7)_SBUF_ADDR.
+ */
+union zip_quex_sbuf_ctl {
+	u64 u_reg64;
+	struct zip_quex_sbuf_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_45_63              : 19;
+		u64 size                        : 13;
+		u64 inst_be                     : 1;
+		u64 reserved_24_30              : 7;
+		u64 stream_id                   : 8;
+		u64 reserved_12_15              : 4;
+		u64 aura                        : 12;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 aura                        : 12;
+		u64 reserved_12_15              : 4;
+		u64 stream_id                   : 8;
+		u64 reserved_24_30              : 7;
+		u64 inst_be                     : 1;
+		u64 size                        : 13;
+		u64 reserved_45_63              : 19;
+#endif
+	} s;
+};
+
+static inline u64 ZIP_QUEX_SBUF_CTL(u64 param1)
+{
+	if (((param1 <= 7)))
+		return 0x1200ull + (param1 & 7) * 0x8ull;
+	pr_err("ZIP_QUEX_SBUF_CTL: %llu\n", param1);
+	return 0;
+}
+
+/**
+ * union zip_que_ena - Represents queue enable register
+ *
+ * If a queue is disabled, ZIP_CTL stops fetching instructions from the queue.
+ */
+union zip_que_ena {
+	u64 u_reg64;
+	struct zip_que_ena_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_8_63               : 56;
+		u64 ena                         : 8;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 ena                         : 8;
+		u64 reserved_8_63               : 56;
+#endif
+	} s;
+};
+
+#define ZIP_QUE_ENA 0x0500ull
+
+/**
+ * union zip_que_pri - Represents the register that defines the priority
+ * between instruction queues.
+ */
+union zip_que_pri {
+	u64 u_reg64;
+	struct zip_que_pri_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_8_63               : 56;
+		u64 pri                         : 8;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 pri                         : 8;
+		u64 reserved_8_63               : 56;
+#endif
+	} s;
+};
+
+#define ZIP_QUE_PRI 0x0508ull
+
+/**
+ * union zip_throttle - Represents the register that controls the maximum
+ * number of in-flight X2I data fetch transactions.
+ *
+ * Writing 0 to this register causes the ZIP module to temporarily suspend NCB
+ * accesses; it is not recommended for normal operation, but may be useful for
+ * diagnostics.
+ */
+union zip_throttle {
+	u64 u_reg64;
+	struct zip_throttle_s {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		u64 reserved_6_63               : 58;
+		u64 ld_infl                     : 6;
+#elif defined(__LITTLE_ENDIAN_BITFIELD)
+		u64 ld_infl                     : 6;
+		u64 reserved_6_63               : 58;
+#endif
+	} s;
+};
+
+#define ZIP_THROTTLE 0x0010ull
+
+#endif /* _CSRS_ZIP__ */
-- 
2.9.0.rc0.21.g7777322
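
Note on the ZIP_QUEX_* helpers at the end of the patch above: each one
returns a per-queue CSR offset of the form base + queue * 8 for queues 0..7
and returns 0 for anything else. Since the range check already guarantees
queue <= 7, the extra "& 7" mask does not change the result. A minimal
userspace sketch of that arithmetic follows; zip_quex_offset(),
ZIP_NUM_QUEUES and the self-test are hypothetical names used only for
illustration, and only the base values are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical constant: the helpers in the patch accept queues 0..7. */
#define ZIP_NUM_QUEUES 8u

/*
 * Hypothetical stand-in for the ZIP_QUEX_* helpers: per-queue registers
 * are laid out 8 bytes apart, starting at 'base'.
 * Returns 0 for an out-of-range queue, as the helpers in the patch do.
 */
static uint64_t zip_quex_offset(uint64_t base, uint64_t queue)
{
	if (queue < ZIP_NUM_QUEUES)
		return base + queue * 0x8ull;

	fprintf(stderr, "invalid queue: %llu\n", (unsigned long long)queue);
	return 0;
}

/* Self-check using base offsets that appear in the patch. */
static void zip_quex_offset_selftest(void)
{
	assert(zip_quex_offset(0x1400ull, 0) == 0x1400ull); /* ZIP_QUEX_MAP(0) */
	assert(zip_quex_offset(0x1400ull, 3) == 0x1418ull); /* ZIP_QUEX_MAP(3) */
	assert(zip_quex_offset(0x1200ull, 7) == 0x1238ull); /* ZIP_QUEX_SBUF_CTL(7) */
	assert(zip_quex_offset(0x1000ull, 8) == 0);         /* out of range */
}
```

A single helper taking the base as a parameter would also avoid repeating
the same body for every register, at the cost of one more argument per call.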

* [RFC PATCH 2/3] crypto: zip - Wire-up Compression / decompression HW offload
From: Jan Glauber @ 2016-12-12 15:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, linux-kernel, David S . Miller, Mahipal Challa,
	Vishnu Nair, Jan Glauber
In-Reply-To: <20161212150439.18627-1-jglauber@cavium.com>

From: Mahipal Challa <Mahipal.Challa@cavium.com>

Add compression/decompression hardware offload support for both DEFLATE
and LZS.

Signed-off-by: Mahipal Challa <Mahipal.Challa@cavium.com>
Signed-off-by: Vishnu Nair <Vishnu.Nair@cavium.com>
Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
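Note: zip_deflate_comp() and zip_inflate_comp() below copy the caller's data
into a preallocated bounce buffer before handing it to the engine, and the
inflate path appends one extra zero byte for non-LZS input. As far as the
hunks in this patch show, slen is not compared against the buffer size
first. A possible hardening, sketched in plain userspace C under that
assumption, is shown here; demo_op, DEMO_MAX_INPUT and demo_stage_input()
are hypothetical names and not part of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical size; stands in for MAX_INPUT_BUFFER_SIZE in the driver. */
#define DEMO_MAX_INPUT 64u

/* Hypothetical, reduced stand-in for struct zip_operation. */
struct demo_op {
	unsigned char input[DEMO_MAX_INPUT];
	unsigned int  input_len;
};

/*
 * Copy 'slen' bytes from 'src' into the bounce buffer and append 'pad'
 * zero bytes (pad = 1 models the extra byte added on the inflate path).
 * Rejects requests that would not fit instead of overrunning the buffer.
 */
static int demo_stage_input(struct demo_op *op, const unsigned char *src,
			    unsigned int slen, unsigned int pad)
{
	if (!op || !src)
		return -EINVAL;

	/* Written this way so that slen + pad cannot overflow. */
	if (pad > DEMO_MAX_INPUT || slen > DEMO_MAX_INPUT - pad)
		return -EINVAL;

	memcpy(op->input, src, slen);
	memset(op->input + slen, 0, pad);
	op->input_len = slen + pad;

	return 0;
}
```

With DEMO_MAX_INPUT at 64, a 63-byte input with pad = 1 fits exactly, while
a 64-byte input with pad = 1 is rejected with -EINVAL.
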
 drivers/crypto/cavium/zip/Makefile      |   5 +-
 drivers/crypto/cavium/zip/zip_crypto.c  | 243 ++++++++++++++++++++++++++++++++
 drivers/crypto/cavium/zip/zip_crypto.h  |   6 +
 drivers/crypto/cavium/zip/zip_deflate.c | 190 +++++++++++++++++++++++++
 drivers/crypto/cavium/zip/zip_deflate.h |  62 ++++++++
 drivers/crypto/cavium/zip/zip_device.c  |   1 +
 drivers/crypto/cavium/zip/zip_inflate.c | 211 +++++++++++++++++++++++++++
 drivers/crypto/cavium/zip/zip_inflate.h |  62 ++++++++
 drivers/crypto/cavium/zip/zip_main.c    |  29 ----
 9 files changed, 779 insertions(+), 30 deletions(-)
 create mode 100644 drivers/crypto/cavium/zip/zip_crypto.c
 create mode 100644 drivers/crypto/cavium/zip/zip_deflate.c
 create mode 100644 drivers/crypto/cavium/zip/zip_deflate.h
 create mode 100644 drivers/crypto/cavium/zip/zip_inflate.c
 create mode 100644 drivers/crypto/cavium/zip/zip_inflate.h

diff --git a/drivers/crypto/cavium/zip/Makefile b/drivers/crypto/cavium/zip/Makefile
index 2c07508..b2f3baaf 100644
--- a/drivers/crypto/cavium/zip/Makefile
+++ b/drivers/crypto/cavium/zip/Makefile
@@ -5,4 +5,7 @@
 obj-$(CONFIG_CRYPTO_DEV_CAVIUM_ZIP) += thunderx_zip.o
 thunderx_zip-y := zip_main.o    \
                   zip_device.o  \
-                  zip_mem.o
+                  zip_crypto.o  \
+                  zip_mem.o     \
+                  zip_deflate.o \
+                  zip_inflate.o
diff --git a/drivers/crypto/cavium/zip/zip_crypto.c b/drivers/crypto/cavium/zip/zip_crypto.c
new file mode 100644
index 0000000..888e18b
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_crypto.c
@@ -0,0 +1,243 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include "zip_crypto.h"
+
+static void zip_static_init_zip_ops(struct zip_operation *zip_ops,
+				    int lzs_flag)
+{
+	zip_ops->flush        = ZIP_FLUSH_FINISH;
+
+	/* equivalent to level 6 of opensource zlib */
+	zip_ops->speed          = 1;
+
+	if (!lzs_flag) {
+		zip_ops->ccode		= 0; /* Auto Huffman */
+		zip_ops->lzs_flag	= 0;
+		zip_ops->format		= ZLIB_FORMAT;
+	} else {
+		zip_ops->ccode		= 3; /* LZS Encoding */
+		zip_ops->lzs_flag	= 1;
+		zip_ops->format		= LZS_FORMAT;
+	}
+	zip_ops->begin_file   = 1;
+	zip_ops->history_len  = 0;
+	zip_ops->end_file     = 1;
+	zip_ops->compcode     = 0;
+	zip_ops->csum	      = 1; /* Adler checksum desired */
+}
+
+/* Legacy Compress framework start */
+
+int zip_alloc_zip_ctx(struct crypto_tfm *tfm)
+{
+	struct zip_kernel_ctx *zip_ctx    = crypto_tfm_ctx(tfm);
+	struct zip_operation  *comp_ctx   = &zip_ctx->zip_comp;
+	struct zip_operation  *decomp_ctx = &zip_ctx->zip_decomp;
+
+	zip_static_init_zip_ops(comp_ctx, 0);
+	zip_static_init_zip_ops(decomp_ctx, 0);
+
+	comp_ctx->input  = zip_data_buf_alloc(MAX_INPUT_BUFFER_SIZE);
+	if (!comp_ctx->input)
+		return -ENOMEM;
+
+	comp_ctx->output = zip_data_buf_alloc(MAX_OUTPUT_BUFFER_SIZE);
+	if (!comp_ctx->output)
+		goto err_comp_input;
+
+	decomp_ctx->input  = zip_data_buf_alloc(MAX_INPUT_BUFFER_SIZE);
+	if (!decomp_ctx->input)
+		goto err_comp_output;
+
+	decomp_ctx->output = zip_data_buf_alloc(MAX_OUTPUT_BUFFER_SIZE);
+	if (!decomp_ctx->output)
+		goto err_decomp_input;
+
+	return 0;
+
+err_decomp_input:
+	zip_data_buf_free(decomp_ctx->input, MAX_INPUT_BUFFER_SIZE);
+
+err_comp_output:
+	zip_data_buf_free(comp_ctx->output, MAX_OUTPUT_BUFFER_SIZE);
+
+err_comp_input:
+	zip_data_buf_free(comp_ctx->input, MAX_INPUT_BUFFER_SIZE);
+
+	return -ENOMEM;
+}
+
+int zip_alloc_lzs_ctx(struct crypto_tfm *tfm)
+{
+	struct zip_kernel_ctx *zip_ctx    = crypto_tfm_ctx(tfm);
+	struct zip_operation  *comp_ctx   = &zip_ctx->zip_comp;
+	struct zip_operation  *decomp_ctx = &zip_ctx->zip_decomp;
+
+	zip_static_init_zip_ops(comp_ctx, 1);
+	zip_static_init_zip_ops(decomp_ctx, 1);
+
+	comp_ctx->input  = zip_data_buf_alloc(MAX_INPUT_BUFFER_SIZE);
+	if (!comp_ctx->input)
+		return -ENOMEM;
+
+	comp_ctx->output = zip_data_buf_alloc(MAX_OUTPUT_BUFFER_SIZE);
+	if (!comp_ctx->output)
+		goto err_comp_input;
+
+	decomp_ctx->input  = zip_data_buf_alloc(MAX_INPUT_BUFFER_SIZE);
+	if (!decomp_ctx->input)
+		goto err_comp_output;
+
+	decomp_ctx->output = zip_data_buf_alloc(MAX_OUTPUT_BUFFER_SIZE);
+	if (!decomp_ctx->output)
+		goto err_decomp_input;
+
+	return 0;
+
+err_decomp_input:
+	zip_data_buf_free(decomp_ctx->input, MAX_INPUT_BUFFER_SIZE);
+
+err_comp_output:
+	zip_data_buf_free(comp_ctx->output, MAX_OUTPUT_BUFFER_SIZE);
+
+err_comp_input:
+	zip_data_buf_free(comp_ctx->input, MAX_INPUT_BUFFER_SIZE);
+
+	return -ENOMEM;
+}
+
+void zip_free_zip_ctx(struct crypto_tfm *tfm)
+{
+	struct zip_kernel_ctx *zip_ctx    = crypto_tfm_ctx(tfm);
+	struct zip_operation  *comp_ctx   = &zip_ctx->zip_comp;
+	struct zip_operation  *dec_ctx = &zip_ctx->zip_decomp;
+
+	zip_data_buf_free(comp_ctx->input, MAX_INPUT_BUFFER_SIZE);
+	zip_data_buf_free(comp_ctx->output, MAX_OUTPUT_BUFFER_SIZE);
+
+	zip_data_buf_free(dec_ctx->input, MAX_INPUT_BUFFER_SIZE);
+	zip_data_buf_free(dec_ctx->output, MAX_OUTPUT_BUFFER_SIZE);
+}
+
+int  zip_deflate_comp(struct crypto_tfm *tfm,
+		      const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen)
+{
+	struct zip_kernel_ctx *zip_ctx  = NULL;
+	struct zip_operation  *zip_ops   = NULL;
+	struct zip_state      zip_state;
+	struct zip_device     *zip = NULL;
+	int ret;
+
+	if (!tfm || !src || !dst || !dlen)
+		return -EINVAL;
+
+	zip = zip_get_device(zip_get_node_id());
+	if (!zip)
+		return -ENODEV;
+
+	memset(&zip_state, 0, sizeof(struct zip_state));
+
+	zip_ctx = crypto_tfm_ctx(tfm);
+	zip_ops = &zip_ctx->zip_comp;
+
+	zip_ops->input_len  = slen;
+	zip_ops->output_len = *dlen;
+
+	memcpy(zip_ops->input, src, slen);
+
+	ret = zip_deflate(zip_ops, &zip_state, zip);
+
+	if (!ret) {
+		*dlen = zip_ops->output_len;
+		memcpy(dst, zip_ops->output, *dlen);
+	}
+
+	return ret;
+}
+
+int  zip_inflate_comp(struct crypto_tfm *tfm,
+		      const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen)
+{
+	struct zip_kernel_ctx *zip_ctx  = NULL;
+	struct zip_operation  *zip_ops   = NULL;
+	struct zip_state      zip_state;
+	struct zip_device     *zip = NULL;
+	int ret;
+
+	if (!tfm || !src || !dst || !dlen)
+		return -EINVAL;
+
+	zip = zip_get_device(zip_get_node_id());
+	if (!zip)
+		return -ENODEV;
+
+	memset(&zip_state, 0, sizeof(struct zip_state));
+
+	zip_ctx = crypto_tfm_ctx(tfm);
+	zip_ops = &zip_ctx->zip_decomp;
+
+	memcpy(zip_ops->input, src, slen);
+
+	/* Workaround for a bug in zlib, which sometimes needs an extra byte */
+	if (zip_ops->ccode != 3) /* Not LZS Encoding */
+		zip_ops->input[slen++] = 0;
+
+	zip_ops->input_len  = slen;
+	zip_ops->output_len = *dlen;
+
+	ret = zip_inflate(zip_ops, &zip_state, zip);
+
+	if (!ret) {
+		*dlen = zip_ops->output_len;
+		memcpy(dst, zip_ops->output, *dlen);
+	}
+
+	return ret;
+}
+
+/* Legacy compress framework end */
diff --git a/drivers/crypto/cavium/zip/zip_crypto.h b/drivers/crypto/cavium/zip/zip_crypto.h
index 1215049..26792e9 100644
--- a/drivers/crypto/cavium/zip/zip_crypto.h
+++ b/drivers/crypto/cavium/zip/zip_crypto.h
@@ -48,6 +48,8 @@
 
 #include <linux/crypto.h>
 #include "common.h"
+#include "zip_deflate.h"
+#include "zip_inflate.h"
 
 struct zip_kernel_ctx {
 	struct zip_operation zip_comp;
@@ -57,5 +59,9 @@ struct zip_kernel_ctx {
 int  zip_alloc_zip_ctx(struct crypto_tfm *tfm);
 int  zip_alloc_lzs_ctx(struct crypto_tfm *tfm);
 void zip_free_zip_ctx(struct crypto_tfm *tfm);
+int  zip_deflate_comp(struct crypto_tfm *tfm, const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen);
+int  zip_inflate_comp(struct crypto_tfm *tfm, const u8 *src, unsigned int slen,
+		      u8 *dst, unsigned int *dlen);
 
 #endif
diff --git a/drivers/crypto/cavium/zip/zip_deflate.c b/drivers/crypto/cavium/zip/zip_deflate.c
new file mode 100644
index 0000000..913cc25
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_deflate.c
@@ -0,0 +1,190 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include <linux/delay.h>
+#include <linux/sched.h>
+
+#include "common.h"
+#include "zip_deflate.h"
+
+/* Prepares the deflate zip command */
+static int prepare_zip_command(struct zip_operation *zip_ops,
+			       struct zip_state *s, union zip_inst_s *zip_cmd)
+{
+	union zip_zres_s *result_ptr = &s->result;
+
+	memset(zip_cmd, 0, sizeof(s->zip_cmd));
+	memset(result_ptr, 0, sizeof(s->result));
+
+	/* IWORD #0 */
+	/* History gather */
+	zip_cmd->s.hg = 0;
+	/* compression enable = 1 for deflate */
+	zip_cmd->s.ce = 1;
+	/* sf (sync flush) */
+	zip_cmd->s.sf = 1;
+	/* ef (end of file) */
+	if (zip_ops->flush == ZIP_FLUSH_FINISH) {
+		zip_cmd->s.ef = 1;
+		zip_cmd->s.sf = 0;
+	}
+
+	zip_cmd->s.cc = zip_ops->ccode;
+	/* ss (compression speed/storage) */
+	zip_cmd->s.ss = zip_ops->speed;
+
+	/* IWORD #1 */
+	/* adler checksum */
+	zip_cmd->s.adlercrc32 = zip_ops->csum;
+	zip_cmd->s.historylength = zip_ops->history_len;
+	zip_cmd->s.dg = 0;
+
+	/* IWORD # 6 and 7 - compression input/history pointer */
+	zip_cmd->s.inp_ptr_addr.s.addr  = __pa(zip_ops->input);
+	zip_cmd->s.inp_ptr_ctl.s.length = (zip_ops->input_len +
+					   zip_ops->history_len);
+	zip_cmd->s.ds = 0;
+
+	/* IWORD # 8 and 9 - Output pointer */
+	zip_cmd->s.out_ptr_addr.s.addr  = __pa(zip_ops->output);
+	zip_cmd->s.out_ptr_ctl.s.length = zip_ops->output_len;
+	/* maximum number of output-stream bytes that can be written */
+	zip_cmd->s.totaloutputlength    = zip_ops->output_len;
+
+	/* IWORD # 10 and 11 - Result pointer */
+	zip_cmd->s.res_ptr_addr.s.addr = __pa(result_ptr);
+	/* Clearing completion code */
+	result_ptr->s.compcode = 0;
+
+	return 0;
+}
+
+/**
+ * zip_deflate - API to offload deflate operation to hardware
+ * @zip_ops: Pointer to zip operation structure
+ * @s:       Pointer to the structure representing zip state
+ * @zip_dev: Pointer to zip device structure
+ *
+ * This function prepares the zip deflate command and submits it to the zip
+ * engine for processing.
+ *
+ * Return: 0 if successful or error code
+ */
+int zip_deflate(struct zip_operation *zip_ops, struct zip_state *s,
+		struct zip_device *zip_dev)
+{
+	union zip_inst_s *zip_cmd = &s->zip_cmd;
+	union zip_zres_s *result_ptr = &s->result;
+	u32 queue;
+
+	/* Prepares zip command based on the input parameters */
+	prepare_zip_command(zip_ops, s, zip_cmd);
+
+	/* Loads zip command into command queues and rings door bell */
+	queue = zip_load_instr(zip_cmd, zip_dev);
+
+	while (!result_ptr->s.compcode)
+		continue;
+
+	zip_ops->compcode = result_ptr->s.compcode;
+	switch (zip_ops->compcode) {
+	case ZIP_NOTDONE:
+		zip_dbg("Zip instruction not yet completed");
+		return ZIP_ERROR;
+
+	case ZIP_SUCCESS:
+		zip_dbg("Zip instruction completed successfully");
+		zip_update_cmd_bufs(zip_dev, queue);
+		break;
+
+	case ZIP_DTRUNC:
+		zip_dbg("Output Truncate error");
+		/* Returning ZIP_ERROR to avoid copy to user */
+		return ZIP_ERROR;
+
+	default:
+		zip_err("Zip instruction failed. Code:%d", zip_ops->compcode);
+		return ZIP_ERROR;
+	}
+
+	/* Update the CRC depending on the format */
+	switch (zip_ops->format) {
+	case RAW_FORMAT:
+		zip_dbg("RAW Format: %d ", zip_ops->format);
+		/* Get checksum from engine, need to feed it again */
+		zip_ops->csum = result_ptr->s.adler32;
+		break;
+
+	case ZLIB_FORMAT:
+		zip_dbg("ZLIB Format: %d ", zip_ops->format);
+		zip_ops->csum = result_ptr->s.adler32;
+		break;
+
+	case GZIP_FORMAT:
+		zip_dbg("GZIP Format: %d ", zip_ops->format);
+		zip_ops->csum = result_ptr->s.crc32;
+		break;
+
+	case LZS_FORMAT:
+		zip_dbg("LZS Format: %d ", zip_ops->format);
+		break;
+
+	default:
+		zip_err("Unknown Format:%d\n", zip_ops->format);
+	}
+
+	/* Update output_len */
+	if (zip_ops->output_len < result_ptr->s.totalbyteswritten) {
+		/* Dynamic stop && strm->output_len < zipconstants[onfsize] */
+		zip_err("output_len (%d) < total bytes written(%d)\n",
+			zip_ops->output_len, result_ptr->s.totalbyteswritten);
+		zip_ops->output_len = 0;
+
+	} else {
+		zip_ops->output_len = result_ptr->s.totalbyteswritten;
+	}
+
+	return 0;
+}
diff --git a/drivers/crypto/cavium/zip/zip_deflate.h b/drivers/crypto/cavium/zip/zip_deflate.h
new file mode 100644
index 0000000..bdb5207
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_deflate.h
@@ -0,0 +1,62 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_DEFLATE_H__
+#define __ZIP_DEFLATE_H__
+
+/**
+ * zip_deflate - API to offload deflate operation to hardware
+ * @zip_ops: Pointer to zip operation structure
+ * @s:       Pointer to the structure representing zip state
+ * @zip_dev: Pointer to the structure representing zip device
+ *
+ * This function prepares the zip deflate command and submits it to the zip
+ * engine by ringing the doorbell.
+ *
+ * Return: 0 if successful or error code
+ */
+int zip_deflate(struct zip_operation *zip_ops, struct zip_state *s,
+		struct zip_device *zip_dev);
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_device.c b/drivers/crypto/cavium/zip/zip_device.c
index ed21c5a..a72cdcf0 100644
--- a/drivers/crypto/cavium/zip/zip_device.c
+++ b/drivers/crypto/cavium/zip/zip_device.c
@@ -44,6 +44,7 @@
  ***********************license end**************************************/
 
 #include "common.h"
+#include "zip_deflate.h"
 
 /**
  * zip_cmd_queue_consumed - Calculates the space consumed in the command queue.
diff --git a/drivers/crypto/cavium/zip/zip_inflate.c b/drivers/crypto/cavium/zip/zip_inflate.c
new file mode 100644
index 0000000..849c4c85
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_inflate.c
@@ -0,0 +1,211 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#include <linux/delay.h>
+#include <linux/sched.h>
+
+#include "common.h"
+#include "zip_inflate.h"
+
+static int prepare_inflate_zcmd(struct zip_operation *zip_ops,
+				struct zip_state *s, union zip_inst_s *zip_cmd)
+{
+	union zip_zres_s *result_ptr = &s->result;
+
+	memset(zip_cmd, 0, sizeof(s->zip_cmd));
+	memset(result_ptr, 0, sizeof(s->result));
+
+	/* IWORD#0 */
+
+	/* Decompression History Gather list - no gather list */
+	zip_cmd->s.hg = 0;
+	/* For decompression, CE must be 0x0. */
+	zip_cmd->s.ce = 0;
+	/* For decompression, SS must be 0x0. */
+	zip_cmd->s.ss = 0;
+	/* For decompression, SF should always be set. */
+	zip_cmd->s.sf = 1;
+
+	/* Begin File */
+	if (zip_ops->begin_file == 0)
+		zip_cmd->s.bf = 0;
+	else
+		zip_cmd->s.bf = 1;
+
+	zip_cmd->s.ef = 1;
+	/* 0: for Deflate decompression, 3: for LZS decompression */
+	zip_cmd->s.cc = zip_ops->ccode;
+
+	/* IWORD #1 */
+
+	/* adler checksum */
+	zip_cmd->s.adlercrc32 = zip_ops->csum;
+
+	/*
+	 * HISTORYLENGTH must be 0x0 for any ZIP decompress operation.
+	 * History data is added to a decompression operation via IWORD3.
+	 */
+	zip_cmd->s.historylength = 0;
+	zip_cmd->s.ds = 0;
+
+	/* IWORD # 8 and 9 - Output pointer */
+	zip_cmd->s.out_ptr_addr.s.addr  = __pa(zip_ops->output);
+	zip_cmd->s.out_ptr_ctl.s.length = zip_ops->output_len;
+
+	/* Maximum number of output-stream bytes that can be written */
+	zip_cmd->s.totaloutputlength    = zip_ops->output_len;
+
+	zip_dbg("Data Direct Input case ");
+
+	/* IWORD # 6 and 7 - input pointer */
+	zip_cmd->s.dg = 0;
+	zip_cmd->s.inp_ptr_addr.s.addr  = __pa((u8 *)zip_ops->input);
+	zip_cmd->s.inp_ptr_ctl.s.length = zip_ops->input_len;
+
+	/* IWORD # 10 and 11 - Result pointer */
+	zip_cmd->s.res_ptr_addr.s.addr = __pa(result_ptr);
+
+	/* Clearing completion code */
+	result_ptr->s.compcode = 0;
+
+	/* Returning 0 for the time being. */
+	return 0;
+}
+
+/**
+ * zip_inflate - API to offload inflate operation to hardware
+ * @zip_ops: Pointer to zip operation structure
+ * @s:       Pointer to the structure representing zip state
+ * @zip_dev: Pointer to zip device structure
+ *
+ * This function prepares the zip inflate command and submits it to the zip
+ * engine for processing.
+ *
+ * Return: 0 if successful or error code
+ */
+int zip_inflate(struct zip_operation *zip_ops, struct zip_state *s,
+		struct zip_device *zip_dev)
+{
+	union zip_inst_s *zip_cmd    = &s->zip_cmd;
+	union zip_zres_s  *result_ptr = &s->result;
+	u32 queue;
+
+	/* Prepare inflate zip command */
+	prepare_inflate_zcmd(zip_ops, s, zip_cmd);
+
+	/* Load inflate command to zip queue and ring the doorbell */
+	queue = zip_load_instr(zip_cmd, zip_dev);
+
+	while (!result_ptr->s.compcode)
+		continue;
+
+	zip_ops->compcode = result_ptr->s.compcode;
+	switch (zip_ops->compcode) {
+	case ZIP_NOTDONE:
+		zip_dbg("Zip Instruction not yet completed\n");
+		return ZIP_ERROR;
+
+	case ZIP_SUCCESS:
+		zip_dbg("Zip Instruction completed successfully\n");
+		break;
+
+	case ZIP_DYNAMIC_STOP:
+		zip_dbg("Dynamic stop initiated\n");
+		break;
+
+	default:
+		zip_dbg("Instruction failed. Code = %d\n", zip_ops->compcode);
+		zip_update_cmd_bufs(zip_dev, queue);
+		return ZIP_ERROR;
+	}
+
+	zip_update_cmd_bufs(zip_dev, queue);
+
+	if ((zip_ops->ccode == 3) && (zip_ops->flush == 4) &&
+	    (zip_ops->compcode != ZIP_DYNAMIC_STOP))
+		result_ptr->s.ef = 1;
+
+	zip_ops->csum = result_ptr->s.adler32;
+
+	if (zip_ops->output_len < result_ptr->s.totalbyteswritten) {
+		zip_err("output_len (%d) < total bytes written (%d)\n",
+			zip_ops->output_len, result_ptr->s.totalbyteswritten);
+		zip_ops->output_len = 0;
+	} else {
+		zip_ops->output_len = result_ptr->s.totalbyteswritten;
+	}
+
+	zip_ops->bytes_read = result_ptr->s.totalbytesread;
+	zip_ops->bits_processed = result_ptr->s.totalbitsprocessed;
+	zip_ops->end_file = result_ptr->s.ef;
+	if (zip_ops->end_file) {
+		switch (zip_ops->format) {
+		case RAW_FORMAT:
+			zip_dbg("RAW Format: %d ", zip_ops->format);
+			/* Get checksum from engine */
+			zip_ops->csum = result_ptr->s.adler32;
+			break;
+
+		case ZLIB_FORMAT:
+			zip_dbg("ZLIB Format: %d ", zip_ops->format);
+			zip_ops->csum = result_ptr->s.adler32;
+			break;
+
+		case GZIP_FORMAT:
+			zip_dbg("GZIP Format: %d ", zip_ops->format);
+			zip_ops->csum = result_ptr->s.crc32;
+			break;
+
+		case LZS_FORMAT:
+			zip_dbg("LZS Format: %d ", zip_ops->format);
+			break;
+
+		default:
+			zip_err("Format error:%d\n", zip_ops->format);
+		}
+	}
+
+	return 0;
+}
diff --git a/drivers/crypto/cavium/zip/zip_inflate.h b/drivers/crypto/cavium/zip/zip_inflate.h
new file mode 100644
index 0000000..4cee4c9
--- /dev/null
+++ b/drivers/crypto/cavium/zip/zip_inflate.h
@@ -0,0 +1,62 @@
+/***********************license start************************************
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ * All rights reserved.
+ *
+ * License: one of 'Cavium License' or 'GNU General Public License Version 2'
+ *
+ * This file is provided under the terms of the Cavium License (see below)
+ * or under the terms of GNU General Public License, Version 2, as
+ * published by the Free Software Foundation. When using or redistributing
+ * this file, you may do so under either license.
+ *
+ * Cavium License:  Redistribution and use in source and binary forms, with
+ * or without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *
+ *  * Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials provided
+ *    with the distribution.
+ *
+ *  * Neither the name of Cavium Inc. nor the names of its contributors may be
+ *    used to endorse or promote products derived from this software without
+ *    specific prior written permission.
+ *
+ * This Software, including technical data, may be subject to U.S. export
+ * control laws, including the U.S. Export Administration Act and its
+ * associated regulations, and may be subject to export or import
+ * regulations in other countries.
+ *
+ * TO THE MAXIMUM EXTENT PERMITTED BY LAW, THE SOFTWARE IS PROVIDED "AS IS"
+ * AND WITH ALL FAULTS AND CAVIUM INC. MAKES NO PROMISES, REPRESENTATIONS
+ * OR WARRANTIES, EITHER EXPRESS, IMPLIED, STATUTORY, OR OTHERWISE, WITH
+ * RESPECT TO THE SOFTWARE, INCLUDING ITS CONDITION, ITS CONFORMITY TO ANY
+ * REPRESENTATION OR DESCRIPTION, OR THE EXISTENCE OF ANY LATENT OR PATENT
+ * DEFECTS, AND CAVIUM SPECIFICALLY DISCLAIMS ALL IMPLIED (IF ANY)
+ * WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR A
+ * PARTICULAR PURPOSE, LACK OF VIRUSES, ACCURACY OR COMPLETENESS, QUIET
+ * ENJOYMENT, QUIET POSSESSION OR CORRESPONDENCE TO DESCRIPTION. THE
+ * ENTIRE  RISK ARISING OUT OF USE OR PERFORMANCE OF THE SOFTWARE LIES
+ * WITH YOU.
+ ***********************license end**************************************/
+
+#ifndef __ZIP_INFLATE_H__
+#define __ZIP_INFLATE_H__
+
+/**
+ * zip_inflate - API to offload inflate operation to hardware
+ * @zip_ops: Pointer to zip operation structure
+ * @s:       Pointer to the structure representing zip state
+ * @zip_dev: Pointer to the structure representing zip device
+ *
+ * This function prepares the zip inflate command and submits it to the zip
+ * engine for processing.
+ *
+ * Return: 0 if successful or error code
+ */
+int zip_inflate(struct zip_operation *zip_ops, struct zip_state *s,
+		struct zip_device *zip_dev);
+#endif
diff --git a/drivers/crypto/cavium/zip/zip_main.c b/drivers/crypto/cavium/zip/zip_main.c
index 052c42d..ae3395f 100644
--- a/drivers/crypto/cavium/zip/zip_main.c
+++ b/drivers/crypto/cavium/zip/zip_main.c
@@ -364,35 +364,6 @@ static void zip_remove(struct pci_dev *pdev)
 	zip_dbg_exit();
 }
 
-/* Dummy Functions */
-int zip_alloc_lzs_ctx(struct crypto_tfm *tfm)
-{
-	return 0;
-}
-
-int zip_alloc_zip_ctx(struct crypto_tfm *tfm)
-{
-	return 0;
-}
-
-void zip_free_zip_ctx(struct crypto_tfm *tfm)
-{
-}
-
-int  zip_deflate_comp(struct crypto_tfm *tfm,
-		      const u8 *src, unsigned int slen,
-		      u8 *dst, unsigned int *dlen)
-{
-	return 0;
-}
-
-int  zip_inflate_comp(struct crypto_tfm *tfm,
-		      const u8 *src, unsigned int slen,
-		      u8 *dst, unsigned int *dlen)
-{
-	return 0;
-}
-
 /* PCI Sub-System Interface */
 static struct pci_driver zip_driver = {
 	.name	    =  DRV_NAME,
-- 
2.9.0.rc0.21.g7777322
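
Note on the completion wait in zip_inflate() above: the loop spins on
result_ptr->s.compcode with no upper bound, so a request that never
completes would hang the caller. The following is a minimal sketch of a
bounded poll, written as plain user-space C so it can be compiled on its
own. The helper name poll_compcode(), the spin budget, and the use of 0 as
the "not done" value are illustrative assumptions and are not part of this
patch; a real driver would more likely bound the wait by time than by
iteration count.

```c
#include <stdint.h>

/*
 * Poll a completion code that another agent (here: the hardware) writes.
 * Returns the first nonzero code observed, or 0 if the budget of
 * max_spins reads is exhausted first. Hypothetical helper, not driver code.
 */
static uint32_t poll_compcode(const volatile uint32_t *compcode,
			      unsigned long max_spins)
{
	unsigned long i;

	for (i = 0; i < max_spins; i++) {
		uint32_t code = *compcode;

		if (code)
			return code;
	}

	return 0;	/* budget exhausted: caller should report an error */
}
```

With such a helper the caller can tell "engine finished" apart from "engine
never answered" and release the command buffer in both cases.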

* [RFC PATCH 3/3] crypto: zip - Add Compression/decompression statistics
From: Jan Glauber @ 2016-12-12 15:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, linux-kernel, David S . Miller, Mahipal Challa,
	Vishnu Nair, Jan Glauber
In-Reply-To: <20161212150439.18627-1-jglauber@cavium.com>

From: Mahipal Challa <Mahipal.Challa@cavium.com>

Add statistics for compression/decompression hardware offload
under debugfs.

Signed-off-by: Mahipal Challa <Mahipal.Challa@cavium.com>
Signed-off-by: Vishnu Nair <Vishnu.Nair@cavium.com>
Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/crypto/cavium/zip/zip_deflate.c |  10 ++
 drivers/crypto/cavium/zip/zip_inflate.c |  12 ++
 drivers/crypto/cavium/zip/zip_main.c    | 227 ++++++++++++++++++++++++++++++++
 drivers/crypto/cavium/zip/zip_main.h    |  15 +++
 4 files changed, 264 insertions(+)

diff --git a/drivers/crypto/cavium/zip/zip_deflate.c b/drivers/crypto/cavium/zip/zip_deflate.c
index 913cc25..11052d8 100644
--- a/drivers/crypto/cavium/zip/zip_deflate.c
+++ b/drivers/crypto/cavium/zip/zip_deflate.c
@@ -122,12 +122,19 @@ int zip_deflate(struct zip_operation *zip_ops, struct zip_state *s,
 	/* Prepares zip command based on the input parameters */
 	prepare_zip_command(zip_ops, s, zip_cmd);
 
+	atomic64_add(zip_ops->input_len, &zip_dev->stats.comp_in_bytes);
 	/* Loads zip command into command queues and rings door bell */
 	queue = zip_load_instr(zip_cmd, zip_dev);
 
+	/* Stats update for compression requests submitted */
+	atomic64_inc(&zip_dev->stats.comp_req_submit);
+
 	while (!result_ptr->s.compcode)
 		continue;
 
+	/* Stats update for compression requests completed */
+	atomic64_inc(&zip_dev->stats.comp_req_complete);
+
 	zip_ops->compcode = result_ptr->s.compcode;
 	switch (zip_ops->compcode) {
 	case ZIP_NOTDONE:
@@ -175,6 +182,9 @@ int zip_deflate(struct zip_operation *zip_ops, struct zip_state *s,
 		zip_err("Unknown Format:%d\n", zip_ops->format);
 	}
 
+	atomic64_add(result_ptr->s.totalbyteswritten,
+		     &zip_dev->stats.comp_out_bytes);
+
 	/* Update output_len */
 	if (zip_ops->output_len < result_ptr->s.totalbyteswritten) {
 		/* Dynamic stop && strm->output_len < zipconstants[onfsize] */
diff --git a/drivers/crypto/cavium/zip/zip_inflate.c b/drivers/crypto/cavium/zip/zip_inflate.c
index 849c4c85..44503d8 100644
--- a/drivers/crypto/cavium/zip/zip_inflate.c
+++ b/drivers/crypto/cavium/zip/zip_inflate.c
@@ -135,12 +135,20 @@ int zip_inflate(struct zip_operation *zip_ops, struct zip_state *s,
 	/* Prepare inflate zip command */
 	prepare_inflate_zcmd(zip_ops, s, zip_cmd);
 
+	atomic64_add(zip_ops->input_len, &zip_dev->stats.decomp_in_bytes);
+
 	/* Load inflate command to zip queue and ring the doorbell */
 	queue = zip_load_instr(zip_cmd, zip_dev);
 
+	/* Decompression requests submitted stats update */
+	atomic64_inc(&zip_dev->stats.decomp_req_submit);
+
 	while (!result_ptr->s.compcode)
 		continue;
 
+	/* Decompression requests completed stats update */
+	atomic64_inc(&zip_dev->stats.decomp_req_complete);
+
 	zip_ops->compcode = result_ptr->s.compcode;
 	switch (zip_ops->compcode) {
 	case ZIP_NOTDONE:
@@ -157,6 +165,7 @@ int zip_inflate(struct zip_operation *zip_ops, struct zip_state *s,
 
 	default:
 		zip_dbg("Instruction failed. Code = %d\n", zip_ops->compcode);
+		atomic64_inc(&zip_dev->stats.decomp_bad_reqs);
 		zip_update_cmd_bufs(zip_dev, queue);
 		return ZIP_ERROR;
 	}
@@ -169,6 +178,9 @@ int zip_inflate(struct zip_operation *zip_ops, struct zip_state *s,
 
 	zip_ops->csum = result_ptr->s.adler32;
 
+	atomic64_add(result_ptr->s.totalbyteswritten,
+		     &zip_dev->stats.decomp_out_bytes);
+
 	if (zip_ops->output_len < result_ptr->s.totalbyteswritten) {
 		zip_err("output_len (%d) < total bytes written (%d)\n",
 			zip_ops->output_len, result_ptr->s.totalbyteswritten);
diff --git a/drivers/crypto/cavium/zip/zip_main.c b/drivers/crypto/cavium/zip/zip_main.c
index ae3395f..56631bf 100644
--- a/drivers/crypto/cavium/zip/zip_main.c
+++ b/drivers/crypto/cavium/zip/zip_main.c
@@ -427,6 +427,228 @@ static void zip_unregister_compression_device(void)
 	crypto_unregister_alg(&zip_comp_lzs);
 }
 
+/*
+ * debugfs functions
+ */
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+
+/* Displays ZIP device statistics */
+static int zip_show_stats(struct seq_file *s, void *unused)
+{
+	u64 val = 0ull;
+	u64 avg_chunk = 0ull, avg_cr = 0ull;
+	u32 q = 0;
+
+	int index  = 0;
+	struct zip_device *zip;
+	struct zip_stats  *st;
+
+	for (index = 0; index < MAX_ZIP_DEVICES; index++) {
+		if (zip_dev[index]) {
+			zip = zip_dev[index];
+			st  = &zip->stats;
+
+			/* Get all the pending requests */
+			for (q = 0; q < ZIP_NUM_QUEUES; q++) {
+				val = zip_reg_read((zip->reg_base +
+						    ZIP_DBG_COREX_STA(q)));
+				val = (val >> 32);
+				val = val & 0xffffff;
+				atomic64_add(val, &st->pending_req);
+			}
+
+			avg_chunk = (atomic64_read(&st->comp_in_bytes) /
+				     atomic64_read(&st->comp_req_complete));
+			avg_cr = (atomic64_read(&st->comp_in_bytes) /
+				  atomic64_read(&st->comp_out_bytes));
+			seq_printf(s, "        ZIP Device %d Stats\n"
+				      "-----------------------------------\n"
+				      "Comp Req Submitted        : \t%ld\n"
+				      "Comp Req Completed        : \t%ld\n"
+				      "Compress In Bytes         : \t%ld\n"
+				      "Compressed Out Bytes      : \t%ld\n"
+				      "Average Chunk size        : \t%llu\n"
+				      "Average Compression ratio : \t%llu\n"
+				      "Decomp Req Submitted      : \t%ld\n"
+				      "Decomp Req Completed      : \t%ld\n"
+				      "Decompress In Bytes       : \t%ld\n"
+				      "Decompressed Out Bytes    : \t%ld\n"
+				      "Decompress Bad requests   : \t%ld\n"
+				      "Pending Req               : \t%ld\n"
+					"---------------------------------\n",
+				       index,
+				       atomic64_read(&st->comp_req_submit),
+				       atomic64_read(&st->comp_req_complete),
+				       atomic64_read(&st->comp_in_bytes),
+				       atomic64_read(&st->comp_out_bytes),
+				       avg_chunk,
+				       avg_cr,
+				       atomic64_read(&st->decomp_req_submit),
+				       atomic64_read(&st->decomp_req_complete),
+				       atomic64_read(&st->decomp_in_bytes),
+				       atomic64_read(&st->decomp_out_bytes),
+				       atomic64_read(&st->decomp_bad_reqs),
+				       atomic64_read(&st->pending_req));
+
+			/* Reset pending requests  count */
+			atomic64_set(&st->pending_req, 0);
+		}
+	}
+	return 0;
+}
+
+/* Clears stats data */
+static int zip_clear_stats(struct seq_file *s, void *unused)
+{
+	int index = 0;
+
+	for (index = 0; index < MAX_ZIP_DEVICES; index++) {
+		if (zip_dev[index]) {
+			memset(&zip_dev[index]->stats, 0,
+			       sizeof(struct zip_stats));
+			seq_printf(s, "Cleared stats for zip %d\n", index);
+		}
+	}
+
+	return 0;
+}
+
+static struct zip_registers zipregs[64] = {
+	{"ZIP_CMD_CTL        ",  0x0000ull},
+	{"ZIP_THROTTLE       ",  0x0010ull},
+	{"ZIP_CONSTANTS      ",  0x00A0ull},
+	{"ZIP_QUE0_MAP       ",  0x1400ull},
+	{"ZIP_QUE1_MAP       ",  0x1408ull},
+	{"ZIP_QUE_ENA        ",  0x0500ull},
+	{"ZIP_QUE_PRI        ",  0x0508ull},
+	{"ZIP_QUE0_DONE      ",  0x2000ull},
+	{"ZIP_QUE1_DONE      ",  0x2008ull},
+	{"ZIP_QUE0_DOORBELL  ",  0x4000ull},
+	{"ZIP_QUE1_DOORBELL  ",  0x4008ull},
+	{"ZIP_QUE0_SBUF_ADDR ",  0x1000ull},
+	{"ZIP_QUE1_SBUF_ADDR ",  0x1008ull},
+	{"ZIP_QUE0_SBUF_CTL  ",  0x1200ull},
+	{"ZIP_QUE1_SBUF_CTL  ",  0x1208ull},
+	{ NULL, 0}
+};
+
+/* Prints registers' contents */
+static int zip_print_regs(struct seq_file *s, void *unused)
+{
+	u64 val = 0;
+	int i = 0, index = 0;
+
+	for (index = 0; index < MAX_ZIP_DEVICES; index++) {
+		if (zip_dev[index]) {
+			seq_printf(s, "--------------------------------\n"
+				      "     ZIP Device %d Registers\n"
+				      "--------------------------------\n",
+				      index);
+
+			i = 0;
+
+			while (zipregs[i].reg_name) {
+				val = zip_reg_read((zip_dev[index]->reg_base +
+						    zipregs[i].reg_offset));
+				seq_printf(s, "%s: 0x%016llx\n",
+					   zipregs[i].reg_name, val);
+				i++;
+			}
+		}
+	}
+	return 0;
+}
+
+static int zip_stats_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, zip_show_stats, NULL);
+}
+
+static const struct file_operations zip_stats_fops = {
+	.owner = THIS_MODULE,
+	.open  = zip_stats_open,
+	.read  = seq_read,
+};
+
+static int zip_clear_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, zip_clear_stats, NULL);
+}
+
+static const struct file_operations zip_clear_fops = {
+	.owner = THIS_MODULE,
+	.open  = zip_clear_open,
+	.read  = seq_read,
+};
+
+static int zip_regs_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, zip_print_regs, NULL);
+}
+
+static const struct file_operations zip_regs_fops = {
+	.owner = THIS_MODULE,
+	.open  = zip_regs_open,
+	.read  = seq_read,
+};
+
+/* Root directory for thunderx_zip debugfs entry */
+static struct dentry *zip_debugfs_root;
+
+static int __init zip_debugfs_init(void)
+{
+	struct dentry *zip_stats, *zip_clear, *zip_regs;
+
+	if (!debugfs_initialized())
+		return -ENODEV;
+
+	zip_debugfs_root = debugfs_create_dir("thunderx_zip", NULL);
+	if (!zip_debugfs_root)
+		return -ENOMEM;
+
+	/* Creating files for entries inside thunderx_zip directory */
+	zip_stats = debugfs_create_file("zip_stats", S_IRUGO,
+					zip_debugfs_root,
+					NULL, &zip_stats_fops);
+	if (!zip_stats)
+		goto failed_to_create;
+
+	zip_clear = debugfs_create_file("zip_clear", S_IRUGO,
+					zip_debugfs_root,
+					NULL, &zip_clear_fops);
+	if (!zip_clear)
+		goto failed_to_create;
+
+	zip_regs = debugfs_create_file("zip_regs", S_IRUGO,
+				       zip_debugfs_root,
+				       NULL, &zip_regs_fops);
+	if (!zip_regs)
+		goto failed_to_create;
+
+	return 0;
+
+failed_to_create:
+	debugfs_remove_recursive(zip_debugfs_root);
+	return -ENOENT;
+}
+
+static void __exit zip_debugfs_exit(void)
+{
+	debugfs_remove_recursive(zip_debugfs_root);
+}
+
+#else
+static int __init zip_debugfs_init(void)
+{
+	return 0;
+}
+
+static void __exit zip_debugfs_exit(void) { }
+
+#endif
+/* debugfs - end */
+
 static int __init zip_init_module(void)
 {
 	int ret;
@@ -448,11 +670,16 @@ static int __init zip_init_module(void)
 		return 1;
 	}
 
+	if (zip_debugfs_init())
+		zip_msg("debugfs initialization failed\n");
+
 	return ret;
 }
 
 static void __exit zip_cleanup_module(void)
 {
+	zip_debugfs_exit();
+
 	/* Unregister this driver for pci zip devices */
 	pci_unregister_driver(&zip_driver);
 
diff --git a/drivers/crypto/cavium/zip/zip_main.h b/drivers/crypto/cavium/zip/zip_main.h
index 73b9e6d..cd7963e 100644
--- a/drivers/crypto/cavium/zip/zip_main.h
+++ b/drivers/crypto/cavium/zip/zip_main.h
@@ -87,6 +87,20 @@ struct zip_registers {
 	u64   reg_offset;
 };
 
+/* ZIP Compression - Decompression stats */
+struct zip_stats {
+	atomic64_t    comp_req_submit;
+	atomic64_t    comp_req_complete;
+	atomic64_t    decomp_req_submit;
+	atomic64_t    decomp_req_complete;
+	atomic64_t    pending_req;
+	atomic64_t    comp_in_bytes;
+	atomic64_t    comp_out_bytes;
+	atomic64_t    decomp_in_bytes;
+	atomic64_t    decomp_out_bytes;
+	atomic64_t    decomp_bad_reqs;
+};
+
 /* ZIP Instruction Queue */
 struct zip_iq {
 	u64        *sw_head;
@@ -112,6 +126,7 @@ struct zip_device {
 	u64               ctxsize;
 
 	struct zip_iq     iq[ZIP_MAX_NUM_QUEUES];
+	struct zip_stats  stats;
 };
 
 /* Prototypes */
-- 
2.9.0.rc0.21.g7777322
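
Note on zip_show_stats() above: avg_chunk and avg_cr are computed by
dividing by comp_req_complete and comp_out_bytes, and both counters are
zero until the first compression request has completed, so reading the
stats file on an idle device would divide by zero. The following is a
minimal sketch of a guarded computation in user-space C. Plain uint64_t
fields stand in for the atomic64_t counters, and the names ratio_or_zero(),
struct comp_counters, avg_chunk_size() and avg_compression_ratio() are
hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Integer quotient that yields 0 instead of trapping when den is 0. */
static uint64_t ratio_or_zero(uint64_t num, uint64_t den)
{
	return den ? num / den : 0;
}

/* Plain integers standing in for the counters in struct zip_stats. */
struct comp_counters {
	uint64_t comp_req_complete;
	uint64_t comp_in_bytes;
	uint64_t comp_out_bytes;
};

/* Average number of input bytes per completed compression request. */
static uint64_t avg_chunk_size(const struct comp_counters *c)
{
	return ratio_or_zero(c->comp_in_bytes, c->comp_req_complete);
}

/* Input bytes divided by output bytes, truncated to an integer. */
static uint64_t avg_compression_ratio(const struct comp_counters *c)
{
	return ratio_or_zero(c->comp_in_bytes, c->comp_out_bytes);
}
```

Because the ratio is an integer division, anything between 1:1 and 2:1 is
reported as 1; a fixed-point value would be needed for finer resolution.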

* [RFC PATCH 0/3] Cavium ThunderX ZIP driver
From: Jan Glauber @ 2016-12-12 15:04 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, linux-kernel, David S . Miller, Mahipal Challa,
	Vishnu Nair, Jan Glauber

Hi Herbert,

this series adds support for hardware-accelerated compression and
decompression as found on ThunderX (arm64) SoCs. I've been reviewing this
driver internally for some time and would like feedback on this RFC, both
on whether it is heading in the right direction and on any concerns.

We've discussed switching to the new acomp API, but for the time being
decided against it because our test cases are not yet supported with acomp.

To test the ZIP driver we've used ZSWAP and IPComp.

Performance numbers from ZSWAP look promising.
The average time for compressing and decompressing a 4KB page:

Compression Software      : 128 usec
Compression HW deflate    :  16 usec
Compression HW LZS        :  10 usec

Decompression Software    :  20 usec
Decompression HW deflate  :   7 usec
Decompression HW LZS      :   5 usec

Patches are on top of 4.9.

Feedback welcome!
Jan 

---------------------

Mahipal Challa (3):
  crypto: zip - Add ThunderX ZIP driver core
  crypto: zip - Wire-up Compression / decompression HW offload
  crypto: zip - Add Compression/decompression statistics

 drivers/crypto/Kconfig                  |    7 +
 drivers/crypto/Makefile                 |    1 +
 drivers/crypto/cavium/Makefile          |    4 +
 drivers/crypto/cavium/zip/Makefile      |   11 +
 drivers/crypto/cavium/zip/common.h      |  258 ++++++
 drivers/crypto/cavium/zip/zip_crypto.c  |  243 ++++++
 drivers/crypto/cavium/zip/zip_crypto.h  |   67 ++
 drivers/crypto/cavium/zip/zip_deflate.c |  200 +++++
 drivers/crypto/cavium/zip/zip_deflate.h |   62 ++
 drivers/crypto/cavium/zip/zip_device.c  |  209 +++++
 drivers/crypto/cavium/zip/zip_device.h  |  138 ++++
 drivers/crypto/cavium/zip/zip_inflate.c |  223 ++++++
 drivers/crypto/cavium/zip/zip_inflate.h |   62 ++
 drivers/crypto/cavium/zip/zip_main.c    |  698 ++++++++++++++++
 drivers/crypto/cavium/zip/zip_main.h    |  141 ++++
 drivers/crypto/cavium/zip/zip_mem.c     |  120 +++
 drivers/crypto/cavium/zip/zip_mem.h     |   78 ++
 drivers/crypto/cavium/zip/zip_regs.h    | 1326 +++++++++++++++++++++++++++++++
 18 files changed, 3848 insertions(+)
 create mode 100644 drivers/crypto/cavium/Makefile
 create mode 100644 drivers/crypto/cavium/zip/Makefile
 create mode 100644 drivers/crypto/cavium/zip/common.h
 create mode 100644 drivers/crypto/cavium/zip/zip_crypto.c
 create mode 100644 drivers/crypto/cavium/zip/zip_crypto.h
 create mode 100644 drivers/crypto/cavium/zip/zip_deflate.c
 create mode 100644 drivers/crypto/cavium/zip/zip_deflate.h
 create mode 100644 drivers/crypto/cavium/zip/zip_device.c
 create mode 100644 drivers/crypto/cavium/zip/zip_device.h
 create mode 100644 drivers/crypto/cavium/zip/zip_inflate.c
 create mode 100644 drivers/crypto/cavium/zip/zip_inflate.h
 create mode 100644 drivers/crypto/cavium/zip/zip_main.c
 create mode 100644 drivers/crypto/cavium/zip/zip_main.h
 create mode 100644 drivers/crypto/cavium/zip/zip_mem.c
 create mode 100644 drivers/crypto/cavium/zip/zip_mem.h
 create mode 100644 drivers/crypto/cavium/zip/zip_regs.h

-- 
2.9.0.rc0.21.g7777322

* Re: [PATCH 1/1] crypto: asymmetric_keys: set error code on failure
From: David Howells @ 2016-12-12 16:10 UTC (permalink / raw)
  To: Pan Bian
  Cc: dhowells, Herbert Xu, David S. Miller, keyrings, linux-crypto,
	linux-kernel, Pan Bian
In-Reply-To: <1480777024-7410-1-git-send-email-bianpan201602@163.com>

Pan Bian <bianpan201602@163.com> wrote:

>  	outlen = crypto_akcipher_maxsize(tfm);
>  	output = kmalloc(outlen, GFP_KERNEL);
> -	if (!output)
> +	if (!output) {
> +		ret = -ENOMEM;
>  		goto error_free_req;
> +	}

This is preferred:

+	ret = -ENOMEM;
 	outlen = crypto_akcipher_maxsize(tfm);
 	output = kmalloc(outlen, GFP_KERNEL);
 	if (!output)
 		goto error_free_req;

I'll alter your patch.
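
For illustration, the same "preset the error code, then test" idiom as a
standalone function. This is a minimal sketch under stated assumptions:
user-space malloc() stands in for kmalloc(), and the function name
make_buffer() and its parameters are hypothetical and unrelated to the
asymmetric_keys code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate and zero a buffer of len bytes.
 * The error code is preset before each step that can fail, so every
 * failure branch is a bare "goto" and cannot forget to set ret.
 */
static int make_buffer(size_t len, unsigned char **out)
{
	unsigned char *buf;
	int ret;

	ret = -EINVAL;
	if (!out || len == 0)
		goto error;

	ret = -ENOMEM;
	buf = malloc(len);
	if (!buf)
		goto error;

	memset(buf, 0, len);
	*out = buf;
	ret = 0;
error:
	return ret;
}
```

The failure branches stay one line each, which is why this form needs no
braces around the goto.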

David

* [PATCH] crypto: arm64/aes: reimplement bit-sliced ARM/NEON implementation for arm64
From: Ard Biesheuvel @ 2016-12-12 17:45 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, nico, will.deacon, Ard Biesheuvel

This is a reimplementation of the NEON version of the bit-sliced AES
algorithm. This code is heavily based on Andy Polyakov's OpenSSL version
for ARM, which is also available in the kernel. This is an alternative to
the existing NEON implementation for arm64 authored by me, which suffers
from poor performance due to its reliance on the pathologically slow
four-register variant of the tbl/tbx NEON instruction.

This version is about 30% (*) faster than the generic C code, but only in
cases where the input can be 8x interleaved (this is a fundamental property
of bit slicing). For this reason, only the chaining modes ECB, XTS and CTR
are implemented. (The significance of ECB is that it could potentially be
used by other chaining modes.)

* Measured on Cortex-A57. Note that this is still an order of magnitude
  slower than the implementations that use the dedicated AES instructions
  introduced in ARMv8, but those are part of an optional extension, and so
  it is good to have a fallback.
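
To illustrate why bit slicing wants its input in groups of eight: the data
is transposed so that each machine word holds one bit position taken from
eight independent inputs, after which a single bitwise instruction acts on
all eight at once. The following is a toy sketch on 8 bytes rather than
8 AES blocks. The name bitslice8() is hypothetical, and this is not the
NEON code in this patch, which performs the transposition on 128-bit
registers with shift/mask/xor steps.

```c
#include <stdint.h>

/*
 * Transpose 8 input bytes into 8 bit planes:
 * bit j of planes[i] is bit i of in[j].
 * The transposition is its own inverse, so calling it a second time on
 * the planes restores the original byte layout.
 */
static void bitslice8(const uint8_t in[8], uint8_t planes[8])
{
	int i, j;

	for (i = 0; i < 8; i++) {
		planes[i] = 0;
		for (j = 0; j < 8; j++)
			planes[i] |= (uint8_t)(((in[j] >> i) & 1) << j);
	}
}
```

Once the data is in this form, "planes[0] ^= 0xFF" flips bit 0 of all eight
input bytes with one operation. With fewer than eight inputs the unused
lanes are wasted work, which is why modes that cannot supply eight blocks
at a time gain nothing from this approach.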

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig           |   6 +
 arch/arm64/crypto/Makefile          |   3 +
 arch/arm64/crypto/aes-neonbs-core.S | 905 ++++++++++++++++++++++++++++++++++++
 arch/arm64/crypto/aes-neonbs-glue.c | 300 ++++++++++++
 4 files changed, 1214 insertions(+)
 create mode 100644 arch/arm64/crypto/aes-neonbs-core.S
 create mode 100644 arch/arm64/crypto/aes-neonbs-glue.c

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 450a85df041a..cd0e7a6146b7 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -72,4 +72,10 @@ config CRYPTO_CRC32_ARM64
 	depends on ARM64
 	select CRYPTO_HASH
 
+config CRYPTO_AES_NEON_BS
+	tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_AES
+
 endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index aa8888d7b744..11d20714ec48 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -41,6 +41,9 @@ sha256-arm64-y := sha256-glue.o sha256-core.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
 sha512-arm64-y := sha512-glue.o sha512-core.o
 
+obj-$(CONFIG_CRYPTO_AES_NEON_BS) += aes-neon-bs.o
+aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
+
 AFLAGS_aes-ce.o		:= -DINTERLEAVE=4
 AFLAGS_aes-neon.o	:= -DINTERLEAVE=4
 
diff --git a/arch/arm64/crypto/aes-neonbs-core.S b/arch/arm64/crypto/aes-neonbs-core.S
new file mode 100644
index 000000000000..d027c276cc75
--- /dev/null
+++ b/arch/arm64/crypto/aes-neonbs-core.S
@@ -0,0 +1,905 @@
+/*
+ * Bit sliced AES using NEON instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/*
+ * The algorithm implemented here is described in detail by the paper
+ * 'Faster and Timing-Attack Resistant AES-GCM' by Emilia Kaesper and
+ * Peter Schwabe (https://eprint.iacr.org/2009/129.pdf)
+ *
+ * This implementation is based primarily on the OpenSSL implementation
+ * for 32-bit ARM written by Andy Polyakov <appro@openssl.org>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.text
+
+	rounds		.req	x11
+	bskey		.req	x12
+
+	.macro		in_bs_ch, b0, b1, b2, b3, b4, b5, b6, b7
+	eor		\b2, \b2, \b1
+	eor		\b5, \b5, \b6
+	eor		\b3, \b3, \b0
+	eor		\b6, \b6, \b2
+	eor		\b5, \b5, \b0
+	eor		\b6, \b6, \b3
+	eor		\b3, \b3, \b7
+	eor		\b7, \b7, \b5
+	eor		\b3, \b3, \b4
+	eor		\b4, \b4, \b5
+	eor		\b2, \b2, \b7
+	eor		\b3, \b3, \b1
+	eor		\b1, \b1, \b5
+	.endm
+
+	.macro		out_bs_ch, b0, b1, b2, b3, b4, b5, b6, b7
+	eor		\b0, \b0, \b6
+	eor		\b1, \b1, \b4
+	eor		\b4, \b4, \b6
+	eor		\b2, \b2, \b0
+	eor		\b6, \b6, \b1
+	eor		\b1, \b1, \b5
+	eor		\b5, \b5, \b3
+	eor		\b3, \b3, \b7
+	eor		\b7, \b7, \b5
+	eor		\b2, \b2, \b5
+	eor		\b4, \b4, \b7
+	.endm
+
+	.macro		inv_in_bs_ch, b6, b1, b2, b4, b7, b0, b3, b5
+	eor		\b1, \b1, \b7
+	eor		\b4, \b4, \b7
+	eor		\b7, \b7, \b5
+	eor		\b1, \b1, \b3
+	eor		\b2, \b2, \b5
+	eor		\b3, \b3, \b7
+	eor		\b6, \b6, \b1
+	eor		\b2, \b2, \b0
+	eor		\b5, \b5, \b3
+	eor		\b4, \b4, \b6
+	eor		\b0, \b0, \b6
+	eor		\b1, \b1, \b4
+	.endm
+
+	.macro		inv_out_bs_ch, b6, b5, b0, b3, b7, b1, b4, b2
+	eor		\b1, \b1, \b5
+	eor		\b2, \b2, \b7
+	eor		\b3, \b3, \b1
+	eor		\b4, \b4, \b5
+	eor		\b7, \b7, \b5
+	eor		\b3, \b3, \b4
+	eor		\b5, \b5, \b0
+	eor		\b3, \b3, \b7
+	eor		\b6, \b6, \b2
+	eor		\b2, \b2, \b1
+	eor		\b6, \b6, \b3
+	eor		\b3, \b3, \b0
+	eor		\b5, \b5, \b6
+	.endm
+
+	.macro		mul_gf4, x0, x1, y0, y1, t0, t1
+	eor		\t0, \y0, \y1
+	and		\t0, \t0, \x0
+	eor		\x0, \x0, \x1
+	and		\t1, \x1, \y0
+	and		\x0, \x0, \y1
+	eor		\x1, \t1, \t0
+	eor		\x0, \x0, \t1
+	.endm
+
+	.macro		mul_gf4_n, x0, x1, y0, y1, t0
+	eor		\t0, \y0, \y1
+	and		\t0, \t0, \x0
+	eor		\x0, \x0, \x1
+	and		\x1, \x1, \y0
+	and		\x0, \x0, \y1
+	eor		\x1, \x1, \x0
+	eor		\x0, \x0, \t0
+	.endm
+
+	.macro		mul_gf4_n_gf4, x0, x1, y0, y1, t0, x2, x3, y2, y3, t1
+	eor		\t0, \y0, \y1
+	eor		\t1, \y2, \y3
+	and		\t0, \t0, \x0
+	and		\t1, \t1, \x2
+	eor		\x0, \x0, \x1
+	eor		\x2, \x2, \x3
+	and		\x1, \x1, \y0
+	and		\x3, \x3, \y2
+	and		\x0, \x0, \y1
+	and		\x2, \x2, \y3
+	eor		\x1, \x1, \x0
+	eor		\x2, \x2, \x3
+	eor		\x0, \x0, \t0
+	eor		\x3, \x3, \t1
+	.endm
+
+	.macro		mul_gf16_2, x0, x1, x2, x3, x4, x5, x6, x7, \
+				    y0, y1, y2, y3, t0, t1, t2, t3
+	eor		\t0, \x0, \x2
+	eor		\t1, \x1, \x3
+	mul_gf4		\x0, \x1, \y0, \y1, \t2, \t3
+	eor		\y0, \y0, \y2
+	eor		\y1, \y1, \y3
+	mul_gf4_n_gf4	\t0, \t1, \y0, \y1, \t3, \x2, \x3, \y2, \y3, \t2
+	eor		\x0, \x0, \t0
+	eor		\x2, \x2, \t0
+	eor		\x1, \x1, \t1
+	eor		\x3, \x3, \t1
+	eor		\t0, \x4, \x6
+	eor		\t1, \x5, \x7
+	mul_gf4_n_gf4	\t0, \t1, \y0, \y1, \t3, \x6, \x7, \y2, \y3, \t2
+	eor		\y0, \y0, \y2
+	eor		\y1, \y1, \y3
+	mul_gf4		\x4, \x5, \y0, \y1, \t2, \t3
+	eor		\x4, \x4, \t0
+	eor		\x6, \x6, \t0
+	eor		\x5, \x5, \t1
+	eor		\x7, \x7, \t1
+	.endm
+
+	.macro		inv_gf256, x0, x1, x2, x3, x4, x5, x6, x7, \
+				   t0, t1, t2, t3, s0, s1, s2, s3
+	eor		\t3, \x4, \x6
+	eor		\t2, \x5, \x7
+	eor		\t1, \x1, \x3
+	eor		\s1, \x7, \x6
+	mov		\t0, \t2
+	eor		\s0, \x0, \x2
+	orr		\t2, \t2, \t1
+	eor		\s3, \t3, \t0
+	and		\s2, \t3, \s0
+	orr		\t3, \t3, \s0
+	eor		\s0, \s0, \t1
+	and		\t0, \t0, \t1
+	eor		\t1, \x3, \x2
+	and		\s3, \s3, \s0
+	and		\s1, \s1, \t1
+	eor		\t1, \x4, \x5
+	eor		\s0, \x1, \x0
+	eor		\t3, \t3, \s1
+	eor		\t2, \t2, \s1
+	and		\s1, \t1, \s0
+	orr		\t1, \t1, \s0
+	eor		\t3, \t3, \s3
+	eor		\t0, \t0, \s1
+	eor		\t2, \t2, \s2
+	eor		\t1, \t1, \s3
+	eor		\t0, \t0, \s2
+	and		\s0, \x7, \x3
+	eor		\t1, \t1, \s2
+	and		\s1, \x6, \x2
+	and		\s2, \x5, \x1
+	orr		\s3, \x4, \x0
+	eor		\t3, \t3, \s0
+	eor		\t1, \t1, \s2
+	eor		\t0, \t0, \s3
+	eor		\t2, \t2, \s1
+	and		\s2, \t3, \t1
+	mov		\s0, \t0
+	eor		\s1, \t2, \s2
+	eor		\s3, \t0, \s2
+	eor		\s2, \t0, \s2
+	bsl		\s1, \t1, \t0
+	bsl		\s3, \t3, \t2
+	eor		\t3, \t3, \t2
+	bsl		\s0, \s1, \s2
+	bsl		\t0, \s2, \s1
+	and		\s2, \s0, \s3
+	eor		\t1, \t1, \t0
+	eor		\s2, \s2, \t3
+	mul_gf16_2	\x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \
+			\s3, \s2, \s1, \t1, \s0, \t0, \t2, \t3
+	.endm
+
+	.macro		sbox, b0, b1, b2, b3, b4, b5, b6, b7, \
+			      t0, t1, t2, t3, s0, s1, s2, s3
+	in_bs_ch	\b0\().16b, \b1\().16b, \b2\().16b, \b3\().16b, \
+			\b4\().16b, \b5\().16b, \b6\().16b, \b7\().16b
+	inv_gf256	\b6\().16b, \b5\().16b, \b0\().16b, \b3\().16b, \
+			\b7\().16b, \b1\().16b, \b4\().16b, \b2\().16b, \
+			\t0\().16b, \t1\().16b, \t2\().16b, \t3\().16b, \
+			\s0\().16b, \s1\().16b, \s2\().16b, \s3\().16b
+	out_bs_ch	\b7\().16b, \b1\().16b, \b4\().16b, \b2\().16b, \
+			\b6\().16b, \b5\().16b, \b0\().16b, \b3\().16b
+	.endm
+
+	.macro		inv_sbox, b0, b1, b2, b3, b4, b5, b6, b7, \
+				  t0, t1, t2, t3, s0, s1, s2, s3
+	inv_in_bs_ch	\b0\().16b, \b1\().16b, \b2\().16b, \b3\().16b, \
+			\b4\().16b, \b5\().16b, \b6\().16b, \b7\().16b
+	inv_gf256	\b5\().16b, \b1\().16b, \b2\().16b, \b6\().16b, \
+			\b3\().16b, \b7\().16b, \b0\().16b, \b4\().16b, \
+			\t0\().16b, \t1\().16b, \t2\().16b, \t3\().16b, \
+			\s0\().16b, \s1\().16b, \s2\().16b, \s3\().16b
+	inv_out_bs_ch	\b3\().16b, \b7\().16b, \b0\().16b, \b4\().16b, \
+			\b5\().16b, \b1\().16b, \b2\().16b, \b6\().16b
+	.endm
+
+	.macro		enc_next_rk
+	ldp		q16, q17, [bskey], #32
+	ldp		q18, q19, [bskey], #32
+	ldp		q20, q21, [bskey], #32
+	ldp		q22, q23, [bskey], #32
+	.endm
+
+	.macro		dec_next_rk
+	ldp		q16, q17, [bskey, #-128]!
+	ldp		q18, q19, [bskey, #32]
+	ldp		q20, q21, [bskey, #64]
+	ldp		q22, q23, [bskey, #96]
+	.endm
+
+	.macro		add_round_key, x0, x1, x2, x3, x4, x5, x6, x7
+	eor		\x0\().16b, \x0\().16b, v16.16b
+	eor		\x1\().16b, \x1\().16b, v17.16b
+	eor		\x2\().16b, \x2\().16b, v18.16b
+	eor		\x3\().16b, \x3\().16b, v19.16b
+	eor		\x4\().16b, \x4\().16b, v20.16b
+	eor		\x5\().16b, \x5\().16b, v21.16b
+	eor		\x6\().16b, \x6\().16b, v22.16b
+	eor		\x7\().16b, \x7\().16b, v23.16b
+	.endm
+
+	.macro		shift_rows, x0, x1, x2, x3, x4, x5, x6, x7, mask
+	tbl		\x0\().16b, {\x0\().16b}, \mask\().16b
+	tbl		\x1\().16b, {\x1\().16b}, \mask\().16b
+	tbl		\x2\().16b, {\x2\().16b}, \mask\().16b
+	tbl		\x3\().16b, {\x3\().16b}, \mask\().16b
+	tbl		\x4\().16b, {\x4\().16b}, \mask\().16b
+	tbl		\x5\().16b, {\x5\().16b}, \mask\().16b
+	tbl		\x6\().16b, {\x6\().16b}, \mask\().16b
+	tbl		\x7\().16b, {\x7\().16b}, \mask\().16b
+	.endm
+
+	.macro		mix_cols, x0, x1, x2, x3, x4, x5, x6, x7, \
+				  t0, t1, t2, t3, t4, t5, t6, t7, inv
+	ext		\t0\().16b, \x0\().16b, \x0\().16b, #12
+	ext		\t1\().16b, \x1\().16b, \x1\().16b, #12
+	eor		\x0\().16b, \x0\().16b, \t0\().16b
+	ext		\t2\().16b, \x2\().16b, \x2\().16b, #12
+	eor		\x1\().16b, \x1\().16b, \t1\().16b
+	ext		\t3\().16b, \x3\().16b, \x3\().16b, #12
+	eor		\x2\().16b, \x2\().16b, \t2\().16b
+	ext		\t4\().16b, \x4\().16b, \x4\().16b, #12
+	eor		\x3\().16b, \x3\().16b, \t3\().16b
+	ext		\t5\().16b, \x5\().16b, \x5\().16b, #12
+	eor		\x4\().16b, \x4\().16b, \t4\().16b
+	ext		\t6\().16b, \x6\().16b, \x6\().16b, #12
+	eor		\x5\().16b, \x5\().16b, \t5\().16b
+	ext		\t7\().16b, \x7\().16b, \x7\().16b, #12
+	eor		\x6\().16b, \x6\().16b, \t6\().16b
+	eor		\t1\().16b, \t1\().16b, \x0\().16b
+	eor		\x7\().16b, \x7\().16b, \t7\().16b
+	ext		\x0\().16b, \x0\().16b, \x0\().16b, #8
+	eor		\t2\().16b, \t2\().16b, \x1\().16b
+	eor		\t0\().16b, \t0\().16b, \x7\().16b
+	eor		\t1\().16b, \t1\().16b, \x7\().16b
+	ext		\x1\().16b, \x1\().16b, \x1\().16b, #8
+	eor		\t5\().16b, \t5\().16b, \x4\().16b
+	eor		\x0\().16b, \x0\().16b, \t0\().16b
+	eor		\t6\().16b, \t6\().16b, \x5\().16b
+	eor		\x1\().16b, \x1\().16b, \t1\().16b
+	ext		\t0\().16b, \x4\().16b, \x4\().16b, #8
+	eor		\t4\().16b, \t4\().16b, \x3\().16b
+	ext		\t1\().16b, \x5\().16b, \x5\().16b, #8
+	eor		\t7\().16b, \t7\().16b, \x6\().16b
+	ext		\x4\().16b, \x3\().16b, \x3\().16b, #8
+	eor		\t3\().16b, \t3\().16b, \x2\().16b
+	ext		\x5\().16b, \x7\().16b, \x7\().16b, #8
+	eor		\t4\().16b, \t4\().16b, \x7\().16b
+	ext		\x3\().16b, \x6\().16b, \x6\().16b, #8
+	eor		\t3\().16b, \t3\().16b, \x7\().16b
+	ext		\x6\().16b, \x2\().16b, \x2\().16b, #8
+	eor		\x7\().16b, \t1\().16b, \t5\().16b
+	.ifb		\inv
+	eor		\x2\().16b, \t0\().16b, \t4\().16b
+	eor		\x4\().16b, \x4\().16b, \t3\().16b
+	eor		\x5\().16b, \x5\().16b, \t7\().16b
+	eor		\x3\().16b, \x3\().16b, \t6\().16b
+	eor		\x6\().16b, \x6\().16b, \t2\().16b
+	.else
+	eor		\t3\().16b, \t3\().16b, \x4\().16b
+	eor		\x5\().16b, \x5\().16b, \t7\().16b
+	eor		\x2\().16b, \x3\().16b, \t6\().16b
+	eor		\x3\().16b, \t0\().16b, \t4\().16b
+	eor		\x4\().16b, \x6\().16b, \t2\().16b
+	mov		\x6\().16b, \t3\().16b
+	.endif
+	.endm
+
+	.macro		inv_mix_cols, x0, x1, x2, x3, x4, x5, x6, x7, \
+				      t0, t1, t2, t3, t4, t5, t6, t7
+	ext		\t0\().16b, \x0\().16b, \x0\().16b, #8
+	ext		\t6\().16b, \x6\().16b, \x6\().16b, #8
+	ext		\t7\().16b, \x7\().16b, \x7\().16b, #8
+	eor		\t0\().16b, \t0\().16b, \x0\().16b
+	ext		\t1\().16b, \x1\().16b, \x1\().16b, #8
+	eor		\t6\().16b, \t6\().16b, \x6\().16b
+	ext		\t2\().16b, \x2\().16b, \x2\().16b, #8
+	eor		\t7\().16b, \t7\().16b, \x7\().16b
+	ext		\t3\().16b, \x3\().16b, \x3\().16b, #8
+	eor		\t1\().16b, \t1\().16b, \x1\().16b
+	ext		\t4\().16b, \x4\().16b, \x4\().16b, #8
+	eor		\t2\().16b, \t2\().16b, \x2\().16b
+	ext		\t5\().16b, \x5\().16b, \x5\().16b, #8
+	eor		\t3\().16b, \t3\().16b, \x3\().16b
+	eor		\t4\().16b, \t4\().16b, \x4\().16b
+	eor		\t5\().16b, \t5\().16b, \x5\().16b
+	eor		\x0\().16b, \x0\().16b, \t6\().16b
+	eor		\x1\().16b, \x1\().16b, \t6\().16b
+	eor		\x2\().16b, \x2\().16b, \t0\().16b
+	eor		\x4\().16b, \x4\().16b, \t2\().16b
+	eor		\x3\().16b, \x3\().16b, \t1\().16b
+	eor		\x1\().16b, \x1\().16b, \t7\().16b
+	eor		\x2\().16b, \x2\().16b, \t7\().16b
+	eor		\x4\().16b, \x4\().16b, \t6\().16b
+	eor		\x5\().16b, \x5\().16b, \t3\().16b
+	eor		\x3\().16b, \x3\().16b, \t6\().16b
+	eor		\x6\().16b, \x6\().16b, \t4\().16b
+	eor		\x4\().16b, \x4\().16b, \t7\().16b
+	eor		\x5\().16b, \x5\().16b, \t7\().16b
+	eor		\x7\().16b, \x7\().16b, \t5\().16b
+	mix_cols	\x0, \x1, \x2, \x3, \x4, \x5, \x6, \x7, \
+			\t0, \t1, \t2, \t3, \t4, \t5, \t6, \t7, 1
+	.endm
+
+	.macro		swapmove_2x, a0, b0, a1, b1, n, mask, t0, t1
+	ushr		\t0\().2d, \b0\().2d, #\n
+	ushr		\t1\().2d, \b1\().2d, #\n
+	eor		\t0\().16b, \t0\().16b, \a0\().16b
+	eor		\t1\().16b, \t1\().16b, \a1\().16b
+	and		\t0\().16b, \t0\().16b, \mask\().16b
+	and		\t1\().16b, \t1\().16b, \mask\().16b
+	eor		\a0\().16b, \a0\().16b, \t0\().16b
+	shl		\t0\().2d, \t0\().2d, #\n
+	eor		\a1\().16b, \a1\().16b, \t1\().16b
+	shl		\t1\().2d, \t1\().2d, #\n
+	eor		\b0\().16b, \b0\().16b, \t0\().16b
+	eor		\b1\().16b, \b1\().16b, \t1\().16b
+	.endm
+
+	.macro		bitslice, x7, x6, x5, x4, x3, x2, x1, x0, t0, t1, t2, t3
+	movi		\t0\().16b, #0x55
+	movi		\t1\().16b, #0x33
+	swapmove_2x	\x0, \x1, \x2, \x3, 1, \t0, \t2, \t3
+	swapmove_2x	\x4, \x5, \x6, \x7, 1, \t0, \t2, \t3
+	movi		\t0\().16b, #0x0f
+	swapmove_2x	\x0, \x2, \x1, \x3, 2, \t1, \t2, \t3
+	swapmove_2x	\x4, \x6, \x5, \x7, 2, \t1, \t2, \t3
+	swapmove_2x	\x0, \x4, \x1, \x5, 4, \t0, \t2, \t3
+	swapmove_2x	\x2, \x6, \x3, \x7, 4, \t0, \t2, \t3
+	.endm
+
+
+	.align		6
+M0:	.octa		0x0004080c0105090d02060a0e03070b0f
+
+M0SR:	.octa		0x0004080c05090d010a0e02060f03070b
+SR:	.octa		0x0f0e0d0c0a09080b0504070600030201
+SRM0:	.octa		0x01060b0c0207080d0304090e00050a0f
+
+M0ISR:	.octa		0x0004080c0d0105090a0e0206070b0f03
+ISR:	.octa		0x0f0e0d0c080b0a090504070602010003
+ISRM0:	.octa		0x0306090c00070a0d01040b0e0205080f
+
+	/*
+	 * void aesbs_convert_key(u8 out[], u32 const rk[], int rounds)
+	 */
+ENTRY(aesbs_convert_key)
+	ld1		{v7.4s}, [x1], #16		// load round 0 key
+	ld1		{v17.4s}, [x1], #16		// load round 1 key
+
+	movi		v8.16b,  #0x01			// bit masks
+	movi		v9.16b,  #0x02
+	movi		v10.16b, #0x04
+	movi		v11.16b, #0x08
+	movi		v12.16b, #0x10
+	movi		v13.16b, #0x20
+	movi		v14.16b, #0x40
+	movi		v15.16b, #0x80
+	ldr		q16, M0
+
+	sub		x2, x2, #1
+	str		q7, [x0], #16		// save round 0 key
+
+.Lkey_loop:
+	tbl		v7.16b, {v17.16b}, v16.16b
+	ld1		{v17.4s}, [x1], #16		// load next round key
+
+	cmtst		v0.16b, v7.16b, v8.16b
+	cmtst		v1.16b, v7.16b, v9.16b
+	cmtst		v2.16b, v7.16b, v10.16b
+	cmtst		v3.16b, v7.16b, v11.16b
+	cmtst		v4.16b, v7.16b, v12.16b
+	cmtst		v5.16b, v7.16b, v13.16b
+	cmtst		v6.16b, v7.16b, v14.16b
+	cmtst		v7.16b, v7.16b, v15.16b
+	not		v0.16b, v0.16b
+	not		v1.16b, v1.16b
+	not		v5.16b, v5.16b
+	not		v6.16b, v6.16b
+
+	subs		x2, x2, #1
+	stp		q2, q3, [x0, #32]
+	stp		q4, q5, [x0, #64]
+	stp		q6, q7, [x0, #96]
+	stp		q0, q1, [x0], #128
+	b.ne		.Lkey_loop
+
+	movi		v7.16b, #0x63			// compose .L63
+	eor		v17.16b, v17.16b, v7.16b
+	str		q17, [x0]
+	ret
+ENDPROC(aesbs_convert_key)
+
+	.align		4
+aesbs_encrypt8:
+	ldr		q9, [bskey], #16		// round 0 key
+	ldr		q8, M0SR
+	ldr		q24, SR
+
+	eor		v10.16b, v0.16b, v9.16b		// xor with round0 key
+	eor		v11.16b, v1.16b, v9.16b
+	tbl		v0.16b, {v10.16b}, v8.16b
+	eor		v12.16b, v2.16b, v9.16b
+	tbl		v1.16b, {v11.16b}, v8.16b
+	eor		v13.16b, v3.16b, v9.16b
+	tbl		v2.16b, {v12.16b}, v8.16b
+	eor		v14.16b, v4.16b, v9.16b
+	tbl		v3.16b, {v13.16b}, v8.16b
+	eor		v15.16b, v5.16b, v9.16b
+	tbl		v4.16b, {v14.16b}, v8.16b
+	eor		v10.16b, v6.16b, v9.16b
+	tbl		v5.16b, {v15.16b}, v8.16b
+	eor		v11.16b, v7.16b, v9.16b
+	tbl		v6.16b, {v10.16b}, v8.16b
+	tbl		v7.16b, {v11.16b}, v8.16b
+
+	bitslice	v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11
+
+	sub		rounds, rounds, #1
+	b		.Lenc_sbox
+
+.Lenc_loop:
+	shift_rows	v0, v1, v2, v3, v4, v5, v6, v7, v24
+.Lenc_sbox:
+	sbox		v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, \
+								v13, v14, v15
+	subs		rounds, rounds, #1
+	b.cc		.Lenc_done
+
+	enc_next_rk
+
+	mix_cols	v0, v1, v4, v6, v3, v7, v2, v5, v8, v9, v10, v11, v12, \
+								v13, v14, v15
+
+	add_round_key	v0, v1, v2, v3, v4, v5, v6, v7
+
+	b.ne		.Lenc_loop
+	ldr		q24, SRM0
+	b		.Lenc_loop
+
+.Lenc_done:
+	ldr		q12, [bskey]			// last round key
+
+	bitslice	v0, v1, v4, v6, v3, v7, v2, v5, v8, v9, v10, v11
+
+	eor		v0.16b, v0.16b, v12.16b
+	eor		v1.16b, v1.16b, v12.16b
+	eor		v4.16b, v4.16b, v12.16b
+	eor		v6.16b, v6.16b, v12.16b
+	eor		v3.16b, v3.16b, v12.16b
+	eor		v7.16b, v7.16b, v12.16b
+	eor		v2.16b, v2.16b, v12.16b
+	eor		v5.16b, v5.16b, v12.16b
+	ret
+ENDPROC(aesbs_encrypt8)
+
+	.align		4
+aesbs_decrypt8:
+	lsl		x9, rounds, #7
+	add		bskey, bskey, x9
+
+	ldr		q9, [bskey, #-112]!		// round 0 key
+	ldr		q8, M0ISR
+	ldr		q24, ISR
+
+	eor		v10.16b, v0.16b, v9.16b		// xor with round0 key
+	eor		v11.16b, v1.16b, v9.16b
+	tbl		v0.16b, {v10.16b}, v8.16b
+	eor		v12.16b, v2.16b, v9.16b
+	tbl		v1.16b, {v11.16b}, v8.16b
+	eor		v13.16b, v3.16b, v9.16b
+	tbl		v2.16b, {v12.16b}, v8.16b
+	eor		v14.16b, v4.16b, v9.16b
+	tbl		v3.16b, {v13.16b}, v8.16b
+	eor		v15.16b, v5.16b, v9.16b
+	tbl		v4.16b, {v14.16b}, v8.16b
+	eor		v10.16b, v6.16b, v9.16b
+	tbl		v5.16b, {v15.16b}, v8.16b
+	eor		v11.16b, v7.16b, v9.16b
+	tbl		v6.16b, {v10.16b}, v8.16b
+	tbl		v7.16b, {v11.16b}, v8.16b
+
+	bitslice	v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11
+
+	sub		rounds, rounds, #1
+	b		.Ldec_sbox
+
+.Ldec_loop:
+	shift_rows	v0, v1, v2, v3, v4, v5, v6, v7, v24
+.Ldec_sbox:
+	inv_sbox	v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, \
+								v13, v14, v15
+	subs		rounds, rounds, #1
+	b.cc		.Ldec_done
+
+	dec_next_rk
+
+	add_round_key	v0, v1, v6, v4, v2, v7, v3, v5
+
+	inv_mix_cols	v0, v1, v6, v4, v2, v7, v3, v5, v8, v9, v10, v11, v12, \
+								v13, v14, v15
+
+	b.ne		.Ldec_loop
+	ldr		q24, ISRM0
+	b		.Ldec_loop
+.Ldec_done:
+	ldr		q12, [bskey, #-16]		// last round key
+
+	bitslice	v0, v1, v6, v4, v2, v7, v3, v5, v8, v9, v10, v11
+
+	eor		v0.16b, v0.16b, v12.16b
+	eor		v1.16b, v1.16b, v12.16b
+	eor		v6.16b, v6.16b, v12.16b
+	eor		v4.16b, v4.16b, v12.16b
+	eor		v2.16b, v2.16b, v12.16b
+	eor		v7.16b, v7.16b, v12.16b
+	eor		v3.16b, v3.16b, v12.16b
+	eor		v5.16b, v5.16b, v12.16b
+	ret
+ENDPROC(aesbs_decrypt8)
+
+	/*
+	 * aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+	 *		     int blocks)
+	 * aesbs_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+	 *		     int blocks)
+	 */
+	.macro		__ecb_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
+	stp		x29, x30, [sp, #-16]!
+	mov		x29, sp
+
+99:	mov		x5, #1
+	lsl		x5, x5, x4
+	subs		w4, w4, #8
+	csel		x4, x4, xzr, pl
+	csel		x5, x5, xzr, mi
+
+	ld1		{v0.16b}, [x1], #16
+	tbnz		x5, #1, 0f
+	ld1		{v1.16b}, [x1], #16
+	tbnz		x5, #2, 0f
+	ld1		{v2.16b}, [x1], #16
+	tbnz		x5, #3, 0f
+	ld1		{v3.16b}, [x1], #16
+	tbnz		x5, #4, 0f
+	ld1		{v4.16b}, [x1], #16
+	tbnz		x5, #5, 0f
+	ld1		{v5.16b}, [x1], #16
+	tbnz		x5, #6, 0f
+	ld1		{v6.16b}, [x1], #16
+	tbnz		x5, #7, 0f
+	ld1		{v7.16b}, [x1], #16
+
+0:	mov		bskey, x2
+	mov		rounds, x3
+	bl		\do8
+
+	st1		{\o0\().16b}, [x0], #16
+	tbnz		x5, #1, 1f
+	st1		{\o1\().16b}, [x0], #16
+	tbnz		x5, #2, 1f
+	st1		{\o2\().16b}, [x0], #16
+	tbnz		x5, #3, 1f
+	st1		{\o3\().16b}, [x0], #16
+	tbnz		x5, #4, 1f
+	st1		{\o4\().16b}, [x0], #16
+	tbnz		x5, #5, 1f
+	st1		{\o5\().16b}, [x0], #16
+	tbnz		x5, #6, 1f
+	st1		{\o6\().16b}, [x0], #16
+	tbnz		x5, #7, 1f
+	st1		{\o7\().16b}, [x0], #16
+
+	cbnz		x4, 99b
+
+1:	ldp		x29, x30, [sp], #16
+	ret
+	.endm
+
+	.align		4
+ENTRY(aesbs_ecb_encrypt)
+	__ecb_crypt	aesbs_encrypt8, v0, v1, v4, v6, v3, v7, v2, v5
+ENDPROC(aesbs_ecb_encrypt)
+
+	.align		4
+ENTRY(aesbs_ecb_decrypt)
+	__ecb_crypt	aesbs_decrypt8, v0, v1, v6, v4, v2, v7, v3, v5
+ENDPROC(aesbs_ecb_decrypt)
+
+	.macro		next_tweak, out, in, const, tmp
+	sshr		\tmp\().2d,  \in\().2d,   #63
+	and		\tmp\().16b, \tmp\().16b, \const\().16b
+	add		\out\().2d,  \in\().2d,   \in\().2d
+	ext		\tmp\().16b, \tmp\().16b, \tmp\().16b, #8
+	eor		\out\().16b, \out\().16b, \tmp\().16b
+	.endm
+
+	.align		4
+.Lxts_mul_x:
+CPU_LE(	.quad		1, 0x87		)
+CPU_BE(	.quad		0x87, 1		)
+
+	/*
+	 * aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+	 *		     int blocks, u8 iv[])
+	 * aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
+	 *		     int blocks, u8 iv[])
+	 */
+__xts_crypt8:
+	mov		x6, #1
+	lsl		x6, x6, x4
+	subs		w4, w4, #8
+	csel		x4, x4, xzr, pl
+	csel		x6, x6, xzr, mi
+
+	ld1		{v0.16b}, [x1], #16
+	next_tweak	v26, v25, v30, v31
+	eor		v0.16b, v0.16b, v25.16b
+	tbnz		x6, #1, 0f
+
+	ld1		{v1.16b}, [x1], #16
+	next_tweak	v27, v26, v30, v31
+	eor		v1.16b, v1.16b, v26.16b
+	tbnz		x6, #2, 0f
+
+	ld1		{v2.16b}, [x1], #16
+	next_tweak	v28, v27, v30, v31
+	eor		v2.16b, v2.16b, v27.16b
+	tbnz		x6, #3, 0f
+
+	ld1		{v3.16b}, [x1], #16
+	next_tweak	v29, v28, v30, v31
+	eor		v3.16b, v3.16b, v28.16b
+	tbnz		x6, #4, 0f
+
+	ld1		{v4.16b}, [x1], #16
+	str		q29, [sp, #16]
+	eor		v4.16b, v4.16b, v29.16b
+	next_tweak	v29, v29, v30, v31
+	tbnz		x6, #5, 0f
+
+	ld1		{v5.16b}, [x1], #16
+	str		q29, [sp, #32]
+	eor		v5.16b, v5.16b, v29.16b
+	next_tweak	v29, v29, v30, v31
+	tbnz		x6, #6, 0f
+
+	ld1		{v6.16b}, [x1], #16
+	str		q29, [sp, #48]
+	eor		v6.16b, v6.16b, v29.16b
+	next_tweak	v29, v29, v30, v31
+	tbnz		x6, #7, 0f
+
+	ld1		{v7.16b}, [x1], #16
+	str		q29, [sp, #64]
+	eor		v7.16b, v7.16b, v29.16b
+	next_tweak	v29, v29, v30, v31
+
+0:	mov		bskey, x2
+	mov		rounds, x3
+	br		x7
+ENDPROC(__xts_crypt8)
+
+	.macro		__xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
+	stp		x29, x30, [sp, #-80]!
+	mov		x29, sp
+
+	ldr		q30, .Lxts_mul_x
+	ld1		{v25.16b}, [x5]
+
+99:	adr		x7, \do8
+	bl		__xts_crypt8
+
+	ldp		q16, q17, [sp, #16]
+	ldp		q18, q19, [sp, #48]
+
+	eor		\o0\().16b, \o0\().16b, v25.16b
+	eor		\o1\().16b, \o1\().16b, v26.16b
+	eor		\o2\().16b, \o2\().16b, v27.16b
+	eor		\o3\().16b, \o3\().16b, v28.16b
+
+	st1		{\o0\().16b}, [x0], #16
+	mov		v25.16b, v26.16b
+	tbnz		x6, #1, 1f
+	st1		{\o1\().16b}, [x0], #16
+	mov		v25.16b, v27.16b
+	tbnz		x6, #2, 1f
+	st1		{\o2\().16b}, [x0], #16
+	mov		v25.16b, v28.16b
+	tbnz		x6, #3, 1f
+	st1		{\o3\().16b}, [x0], #16
+	mov		v25.16b, v29.16b
+	tbnz		x6, #4, 1f
+
+	eor		\o4\().16b, \o4\().16b, v16.16b
+	eor		\o5\().16b, \o5\().16b, v17.16b
+	eor		\o6\().16b, \o6\().16b, v18.16b
+	eor		\o7\().16b, \o7\().16b, v19.16b
+
+	st1		{\o4\().16b}, [x0], #16
+	tbnz		x6, #5, 1f
+	st1		{\o5\().16b}, [x0], #16
+	tbnz		x6, #6, 1f
+	st1		{\o6\().16b}, [x0], #16
+	tbnz		x6, #7, 1f
+	st1		{\o7\().16b}, [x0], #16
+
+	cbnz		x4, 99b
+
+1:	st1		{v25.16b}, [x5]
+	ldp		x29, x30, [sp], #80
+	ret
+	.endm
+
+ENTRY(aesbs_xts_encrypt)
+	__xts_crypt	aesbs_encrypt8, v0, v1, v4, v6, v3, v7, v2, v5
+ENDPROC(aesbs_xts_encrypt)
+
+ENTRY(aesbs_xts_decrypt)
+	__xts_crypt	aesbs_decrypt8, v0, v1, v6, v4, v2, v7, v3, v5
+ENDPROC(aesbs_xts_decrypt)
+
+	.macro		next_ctr, v
+	mov		\v\().d[1], x8
+	mov		\v\().d[0], x7
+	adds		x8, x8, #1
+	adc		x7, x7, xzr
+	rev64		\v\().16b, \v\().16b
+	.endm
+
+	/*
+	 * aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
+	 *		     int rounds, int blocks, u8 iv[], bool final)
+	 */
+ENTRY(aesbs_ctr_encrypt)
+	stp		x29, x30, [sp, #-16]!
+	mov		x29, sp
+
+	add		x4, x4, x6		// do one extra block if final
+
+	ldp		x7, x8, [x5]
+	ld1		{v0.16b}, [x5]
+CPU_LE(	rev		x7, x7		)
+CPU_LE(	rev		x8, x8		)
+	adds		x8, x8, #1
+	adc		x7, x7, xzr
+
+99:	mov		x9, #1
+	lsl		x9, x9, x4
+	subs		w4, w4, #8
+	csel		x4, x4, xzr, pl
+	csel		x9, x9, xzr, le
+
+	tbnz		x9, #1, 0f
+
+	next_ctr	v1
+	tbnz		x9, #2, 0f
+
+	next_ctr	v2
+	tbnz		x9, #3, 0f
+
+	next_ctr	v3
+	tbnz		x9, #4, 0f
+
+	next_ctr	v4
+	tbnz		x9, #5, 0f
+
+	next_ctr	v5
+	tbnz		x9, #6, 0f
+
+	next_ctr	v6
+	tbnz		x9, #7, 0f
+
+	next_ctr	v7
+
+0:	mov		bskey, x2
+	mov		rounds, x3
+	bl		aesbs_encrypt8
+
+	lsr		x9, x9, x6		// disregard the final block
+	tbnz		x9, #0, 0f
+
+	ld1		{v8.16b}, [x1], #16
+	eor		v0.16b, v0.16b, v8.16b
+	st1		{v0.16b}, [x0], #16
+	tbnz		x9, #1, 1f
+
+	ld1		{v9.16b}, [x1], #16
+	eor		v1.16b, v1.16b, v9.16b
+	st1		{v1.16b}, [x0], #16
+	tbnz		x9, #2, 2f
+
+	ld1		{v10.16b}, [x1], #16
+	eor		v4.16b, v4.16b, v10.16b
+	st1		{v4.16b}, [x0], #16
+	tbnz		x9, #3, 3f
+
+	ld1		{v11.16b}, [x1], #16
+	eor		v6.16b, v6.16b, v11.16b
+	st1		{v6.16b}, [x0], #16
+	tbnz		x9, #4, 4f
+
+	ld1		{v12.16b}, [x1], #16
+	eor		v3.16b, v3.16b, v12.16b
+	st1		{v3.16b}, [x0], #16
+	tbnz		x9, #5, 5f
+
+	ld1		{v13.16b}, [x1], #16
+	eor		v7.16b, v7.16b, v13.16b
+	st1		{v7.16b}, [x0], #16
+	tbnz		x9, #6, 6f
+
+	ld1		{v14.16b}, [x1], #16
+	eor		v2.16b, v2.16b, v14.16b
+	st1		{v2.16b}, [x0], #16
+	tbnz		x9, #7, 7f
+
+	ld1		{v15.16b}, [x1], #16
+	eor		v5.16b, v5.16b, v15.16b
+	st1		{v5.16b}, [x0], #16
+
+	next_ctr	v0
+	cbnz		x4, 99b
+
+0:	st1		{v0.16b}, [x5]
+8:	ldp		x29, x30, [sp], #16
+	ret
+
+	/*
+	 * If we are handling the tail of the input (x6 == 1), return the
+	 * final keystream block back to the caller via the IV buffer.
+	 */
+1:	cbz		x6, 8b
+	st1		{v1.16b}, [x5]
+	b		8b
+2:	cbz		x6, 8b
+	st1		{v4.16b}, [x5]
+	b		8b
+3:	cbz		x6, 8b
+	st1		{v6.16b}, [x5]
+	b		8b
+4:	cbz		x6, 8b
+	st1		{v3.16b}, [x5]
+	b		8b
+5:	cbz		x6, 8b
+	st1		{v7.16b}, [x5]
+	b		8b
+6:	cbz		x6, 8b
+	st1		{v2.16b}, [x5]
+	b		8b
+7:	cbz		x6, 8b
+	st1		{v5.16b}, [x5]
+	b		8b
+ENDPROC(aesbs_ctr_encrypt)
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
new file mode 100644
index 000000000000..57982172563c
--- /dev/null
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -0,0 +1,300 @@
+/*
+ * Bit sliced AES using NEON instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/neon.h>
+#include <crypto/aes.h>
+#include <crypto/internal/simd.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/xts.h>
+#include <linux/module.h>
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+
+asmlinkage void aesbs_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks);
+asmlinkage void aesbs_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks);
+
+asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks, u8 iv[]);
+asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks, u8 iv[]);
+
+asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks, u8 iv[], bool final);
+
+asmlinkage void aesbs_convert_key(u8 out[], u32 const rk[], int rounds);
+
+struct aesbs_key {
+	u8			key[13 * (8 * AES_BLOCK_SIZE) + 32];
+};
+
+struct aesbs_ctx {
+	struct aesbs_key	bskey;
+	int			rounds;
+};
+
+struct aesbs_xts_ctx {
+	struct aesbs_key	bskey;
+	struct crypto_cipher	*tweak_tfm;
+	int			rounds;
+};
+
+static int aesbs_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
+			unsigned int key_len)
+{
+	struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct crypto_aes_ctx rk;
+	int err;
+
+	err = crypto_aes_expand_key(&rk, in_key, key_len);
+	if (err)
+		return err;
+
+	ctx->rounds = 6 + key_len / 4;
+
+	kernel_neon_begin();
+	aesbs_convert_key(ctx->bskey.key, rk.key_enc, ctx->rounds);
+	kernel_neon_end();
+
+	return 0;
+}
+
+static int xts_init(struct crypto_skcipher *tfm)
+{
+	struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	ctx->tweak_tfm = crypto_alloc_cipher("aes", 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(ctx->tweak_tfm))
+		return PTR_ERR(ctx->tweak_tfm);
+
+	return 0;
+}
+
+static void xts_exit(struct crypto_skcipher *tfm)
+{
+	struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	crypto_free_cipher(ctx->tweak_tfm);
+}
+
+static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
+			    unsigned int key_len)
+{
+	struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct crypto_aes_ctx rk;
+	int err;
+
+	err = xts_verify_key(tfm, in_key, key_len);
+	if (err)
+		return err;
+
+	err = crypto_cipher_setkey(ctx->tweak_tfm, in_key + key_len / 2,
+				   key_len / 2);
+	if (err)
+		return err;
+
+	err = crypto_aes_expand_key(&rk, in_key, key_len / 2);
+	if (err)
+		return err;
+
+	ctx->rounds = 6 + key_len / 8;
+
+	kernel_neon_begin();
+	aesbs_convert_key(ctx->bskey.key, rk.key_enc, ctx->rounds);
+	kernel_neon_end();
+
+	return 0;
+}
+
+static int __ecb_crypt(struct skcipher_request *req,
+		       void (*fn)(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks))
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	kernel_neon_begin();
+	while (walk.nbytes >= AES_BLOCK_SIZE) {
+		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+
+		if (walk.nbytes < walk.total)
+			blocks = round_down(blocks,
+					    walk.chunksize / AES_BLOCK_SIZE);
+
+		fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->bskey.key,
+		   ctx->rounds, blocks);
+		err = skcipher_walk_done(&walk,
+					 walk.nbytes - blocks * AES_BLOCK_SIZE);
+	}
+	kernel_neon_end();
+
+	return err;
+}
+
+static int ecb_encrypt(struct skcipher_request *req)
+{
+	return __ecb_crypt(req, aesbs_ecb_encrypt);
+}
+
+static int ecb_decrypt(struct skcipher_request *req)
+{
+	return __ecb_crypt(req, aesbs_ecb_decrypt);
+}
+
+static int __xts_crypt(struct skcipher_request *req,
+		       void (*fn)(u8 out[], u8 const in[], u8 const rk[],
+				  int rounds, int blocks, u8 iv[]))
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct aesbs_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	crypto_cipher_encrypt_one(ctx->tweak_tfm, walk.iv, walk.iv);
+
+	kernel_neon_begin();
+	while (walk.nbytes >= AES_BLOCK_SIZE) {
+		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+
+		if (walk.nbytes < walk.total)
+			blocks = round_down(blocks,
+					    walk.chunksize / AES_BLOCK_SIZE);
+
+		fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->bskey.key,
+		   ctx->rounds, blocks, walk.iv);
+		err = skcipher_walk_done(&walk,
+					 walk.nbytes - blocks * AES_BLOCK_SIZE);
+	}
+	kernel_neon_end();
+
+	return err;
+}
+
+static int xts_encrypt(struct skcipher_request *req)
+{
+	return __xts_crypt(req, aesbs_xts_encrypt);
+}
+
+static int xts_decrypt(struct skcipher_request *req)
+{
+	return __xts_crypt(req, aesbs_xts_decrypt);
+}
+
+static int ctr_encrypt(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	kernel_neon_begin();
+	while (walk.nbytes > 0) {
+		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+		bool final = (walk.total % AES_BLOCK_SIZE) != 0;
+
+		if (walk.nbytes < walk.total) {
+			blocks = round_down(blocks,
+					    walk.chunksize / AES_BLOCK_SIZE);
+			final = false;
+		}
+
+		aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+				  ctx->bskey.key, ctx->rounds, blocks, walk.iv,
+				  final);
+
+		if (final) {
+			u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
+			u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
+
+			if (dst != src)
+				memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
+			crypto_xor(dst, walk.iv, walk.total % AES_BLOCK_SIZE);
+
+			err = skcipher_walk_done(&walk, 0);
+			break;
+		}
+		err = skcipher_walk_done(&walk,
+					 walk.nbytes - blocks * AES_BLOCK_SIZE);
+	}
+	kernel_neon_end();
+
+	return err;
+}
+
+static struct skcipher_alg aes_algs[] = { {
+	.base.cra_name		= "ecb(aes)",
+	.base.cra_driver_name	= "ecb-aes-neonbs",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= AES_BLOCK_SIZE,
+	.base.cra_ctxsize	= sizeof(struct aesbs_ctx),
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= AES_MIN_KEY_SIZE,
+	.max_keysize		= AES_MAX_KEY_SIZE,
+	.chunksize		= 8 * AES_BLOCK_SIZE,
+	.setkey			= aesbs_setkey,
+	.encrypt		= ecb_encrypt,
+	.decrypt		= ecb_decrypt,
+}, {
+	.base.cra_name		= "xts(aes)",
+	.base.cra_driver_name	= "xts-aes-neonbs",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= AES_BLOCK_SIZE,
+	.base.cra_ctxsize	= sizeof(struct aesbs_xts_ctx),
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= 2 * AES_MIN_KEY_SIZE,
+	.max_keysize		= 2 * AES_MAX_KEY_SIZE,
+	.chunksize		= 8 * AES_BLOCK_SIZE,
+	.ivsize			= AES_BLOCK_SIZE,
+	.setkey			= aesbs_xts_setkey,
+	.encrypt		= xts_encrypt,
+	.decrypt		= xts_decrypt,
+	.init			= xts_init,
+	.exit			= xts_exit,
+}, {
+	.base.cra_name		= "ctr(aes)",
+	.base.cra_driver_name	= "ctr-aes-neonbs",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= 1,
+	.base.cra_ctxsize	= sizeof(struct aesbs_ctx),
+	.base.cra_module	= THIS_MODULE,
+
+	.min_keysize		= AES_MIN_KEY_SIZE,
+	.max_keysize		= AES_MAX_KEY_SIZE,
+	.chunksize		= 8 * AES_BLOCK_SIZE,
+	.ivsize			= AES_BLOCK_SIZE,
+	.setkey			= aesbs_setkey,
+	.encrypt		= ctr_encrypt,
+	.decrypt		= ctr_encrypt,
+} };
+
+static int __init aes_init(void)
+{
+	return crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
+}
+
+static void aes_exit(void)
+{
+	crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
+}
+
+module_init(aes_init);
+module_exit(aes_exit);
-- 
2.7.4
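
The next_tweak macro in the patch above appears to multiply the 128-bit
XTS tweak by x in GF(2^128): both 64-bit halves are shifted left by one,
the bit leaving the low half is carried into the high half, and the bit
leaving the high half is folded back into the low half as 0x87.  A minimal
userspace C sketch of that step, assuming the tweak is held as two
little-endian 64-bit halves; struct tweak128 and xts_next_tweak() are
hypothetical names that do not appear in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container: 'lo' holds bits 0..63, 'hi' holds bits 64..127. */
struct tweak128 {
	uint64_t lo;
	uint64_t hi;
};

/* Multiply the tweak by x modulo x^128 + x^7 + x^2 + x + 1 (0x87). */
struct tweak128 xts_next_tweak(struct tweak128 t)
{
	struct tweak128 out;

	/* Shift the low half; reduce with 0x87 if bit 127 was set. */
	out.lo = (t.lo << 1) ^ ((t.hi >> 63) ? 0x87 : 0);
	/* Shift the high half; carry in bit 63 of the low half. */
	out.hi = (t.hi << 1) | (t.lo >> 63);
	return out;
}
```

Starting from { lo = 1, hi = 0 }, one step gives { lo = 2, hi = 0 }; a
tweak with only bit 127 set comes back as { lo = 0x87, hi = 0 }.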


* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Andy Lutomirski @ 2016-12-12 18:34 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-hardening@lists.openwall.com, Herbert Xu,
	Andrew Lutomirski, Stephan Mueller
In-Reply-To: <20161209230851.GB64048@google.com>

On Fri, Dec 9, 2016 at 3:08 PM, Eric Biggers <ebiggers3@gmail.com> wrote:
> In the 4.9 kernel, virtually-mapped stacks will be supported and enabled by
> default on x86_64.  This has been exposing a number of problems in which
> on-stack buffers are being passed into the crypto API, which to support crypto
> accelerators operates on 'struct page' rather than on virtual memory.

Here's my status.

>         drivers/crypto/bfin_crc.c:351
>         drivers/crypto/qce/sha.c:299
>         drivers/crypto/sahara.c:973,988
>         drivers/crypto/talitos.c:1910
>         drivers/crypto/qce/sha.c:325

I have a patch to make these depend on !VMAP_STACK.

>         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
>         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
>         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
>         drivers/crypto/ccp/ccp-crypto-aes.c:94

According to Herbert, these are fine.  I'm personally less convinced
since I'm very confused as to what "async" means in the crypto code,
but I'm going to leave these alone.

>
> And these other places do crypto operations on buffers clearly on the stack:
>
>         drivers/usb/wusbcore/crypto.c:264
>         security/keys/encrypted-keys/encrypted.c:500

I have a patch.

>         drivers/net/wireless/intersil/orinoco/mic.c:72

I have a patch to convert this to, drumroll please:

    priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
                          CRYPTO_ALG_ASYNC);

Herbert, I'm at a loss as to what a "shash" that's "ASYNC" even means.
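
A note on the (type, mask) arguments, offered as a sketch rather than an
authoritative answer: the pair acts as a filter on the implementation's
flags, not as a request for a feature.  An implementation is accepted only
if its flags agree with 'type' on every bit set in 'mask', so type 0 with
mask CRYPTO_ALG_ASYNC selects implementations whose ASYNC bit is clear,
i.e. synchronous ones.  The userspace model below illustrates the rule;
alg_matches() and MODEL_CRYPTO_ALG_ASYNC are hypothetical names, and the
flag value is assumed to match the kernel's definition:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's CRYPTO_ALG_ASYNC flag bit. */
#define MODEL_CRYPTO_ALG_ASYNC	0x00000080u

/*
 * Hypothetical model of the lookup filter: an implementation is acceptable
 * when its flags equal 'type' on every bit selected by 'mask'.
 */
int alg_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}
```

Under this model a synchronous implementation (flags 0) passes the
(0, ASYNC) filter and an asynchronous one does not; with mask 0 both pass.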

>         net/ceph/crypto.c:182

This:

    size_t zero_padding = (0x10 - (src_len & 0x0f));

is an amazing line of code...

But this driver uses CBC and wants to do synchronous crypto, and I
don't think that the crypto API supports real synchronous crypto using
CBC, so I'm going to let someone else fix this.
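
For reference, a minimal userspace sketch of what the zero_padding
expression quoted above evaluates to.  Only the expression itself comes
from the quoted line; the helper name zero_padding_for() is hypothetical.
A length that is already a multiple of 16 yields a full extra 16 bytes of
padding rather than 0:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper around the expression quoted from net/ceph/crypto.c. */
size_t zero_padding_for(size_t src_len)
{
	/* Bytes needed to reach the next 16-byte boundary, in 1..16. */
	return 0x10 - (src_len & 0x0f);
}

/* Small self-check covering the boundary cases. */
void zero_padding_selfcheck(void)
{
	assert(zero_padding_for(5) == 11);	/* 5 + 11 = 16 */
	assert(zero_padding_for(15) == 1);	/* 15 + 1 = 16 */
	assert(zero_padding_for(16) == 16);	/* aligned: whole extra block */
	assert(zero_padding_for(17) == 15);	/* 17 + 15 = 32 */
}
```

The result is never 0, so the padded length always grows by at least one
byte.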

>         net/rxrpc/rxkad.c:737,1000

Herbert, can you fix this?

>         fs/cifs/smbencrypt.c:96

I have a patch.


My pile is here:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=crypto

I'll send out the patches soon.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .


* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Gary R Hook @ 2016-12-12 18:45 UTC (permalink / raw)
  To: Andy Lutomirski, Eric Biggers
  Cc: linux-crypto, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kernel-hardening@lists.openwall.com, Herbert Xu,
	Andrew Lutomirski, Stephan Mueller
In-Reply-To: <CALCETrWfa5VJQNu3XjeFhF0cDFWF+M-dPwsT_7dzO5YSxsneGg@mail.gmail.com>

On 12/12/2016 12:34 PM, Andy Lutomirski wrote:

<...snip...>

>
> I have a patch to make these depend on !VMAP_STACK.
>
>>         drivers/crypto/ccp/ccp-crypto-aes-cmac.c:105,119,142
>>         drivers/crypto/ccp/ccp-crypto-sha.c:95,109,124
>>         drivers/crypto/ccp/ccp-crypto-aes-xts.c:162
>>         drivers/crypto/ccp/ccp-crypto-aes.c:94
>
> According to Herbert, these are fine.  I'm personally less convinced
> since I'm very confused as to what "async" means in the crypto code,
> but I'm going to leave these alone.

I went back through the code, and AFAICT every argument to sg_init_one() in
the above-cited files is a buffer that is part of the request context, which
is allocated by the crypto framework and therefore will never be on the
stack. Right?

I don't (as yet) see a need for any patch to these. Someone correct me if
I'm missing something.

<...snip...>

-- 
This is my day job. Follow me at:
IG/Twitter/Facebook: @grhookphoto
IG/Twitter/Facebook: @grhphotographer



* [PATCH] wusbcore: Fix one more crypto-on-the-stack bug
From: Andy Lutomirski @ 2016-12-12 20:52 UTC (permalink / raw)
  To: linux-kernel, linux-usb, gregkh
  Cc: Eric Biggers, linux-crypto, Herbert Xu, Stephan Mueller,
	Andy Lutomirski

The driver put a constant buffer of all zeros on the stack and
pointed a scatterlist entry at it.  This doesn't work with virtual
stacks.  Make the buffer static to fix it.

Cc: stable@vger.kernel.org # 4.9 only
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 drivers/usb/wusbcore/crypto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/wusbcore/crypto.c b/drivers/usb/wusbcore/crypto.c
index 79451f7ef1b7..a7e007a0cd49 100644
--- a/drivers/usb/wusbcore/crypto.c
+++ b/drivers/usb/wusbcore/crypto.c
@@ -216,7 +216,7 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc,
 	struct scatterlist sg[4], sg_dst;
 	void *dst_buf;
 	size_t dst_size;
-	const u8 bzero[16] = { 0 };
+	static const u8 bzero[16] = { 0 };
 	u8 iv[crypto_skcipher_ivsize(tfm_cbc)];
 	size_t zero_padding;
 
-- 
2.9.3


* [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: Andy Lutomirski @ 2016-12-12 20:53 UTC (permalink / raw)
  To: linux-kernel, linux-usb, dhowells, keyrings
  Cc: Eric Biggers, linux-crypto, Herbert Xu, Stephan Mueller,
	Andy Lutomirski
In-Reply-To: <8c273c9c41f51b34bb3115086f1d776895580637.1481575835.git.luto@kernel.org>

The driver put a constant buffer of all zeros on the stack and
pointed a scatterlist entry at it in two places.  This doesn't work
with virtual stacks.  Use a static 16-byte buffer of zeros instead.

Cc: stable@vger.kernel.org # 4.9 only
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 security/keys/encrypted-keys/encrypted.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index 17a06105ccb6..fab2fb864002 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -46,6 +46,7 @@ static const char key_format_default[] = "default";
 static const char key_format_ecryptfs[] = "ecryptfs";
 static unsigned int ivsize;
 static int blksize;
+static const char zero_pad[16] = {0};
 
 #define KEY_TRUSTED_PREFIX_LEN (sizeof (KEY_TRUSTED_PREFIX) - 1)
 #define KEY_USER_PREFIX_LEN (sizeof (KEY_USER_PREFIX) - 1)
@@ -481,7 +482,6 @@ static int derived_key_encrypt(struct encrypted_key_payload *epayload,
 	unsigned int encrypted_datalen;
 	u8 iv[AES_BLOCK_SIZE];
 	unsigned int padlen;
-	char pad[16];
 	int ret;
 
 	encrypted_datalen = roundup(epayload->decrypted_datalen, blksize);
@@ -493,11 +493,10 @@ static int derived_key_encrypt(struct encrypted_key_payload *epayload,
 		goto out;
 	dump_decrypted_data(epayload);
 
-	memset(pad, 0, sizeof pad);
 	sg_init_table(sg_in, 2);
 	sg_set_buf(&sg_in[0], epayload->decrypted_data,
 		   epayload->decrypted_datalen);
-	sg_set_buf(&sg_in[1], pad, padlen);
+	sg_set_buf(&sg_in[1], zero_pad, padlen);
 
 	sg_init_table(sg_out, 1);
 	sg_set_buf(sg_out, epayload->encrypted_data, encrypted_datalen);
@@ -584,7 +583,6 @@ static int derived_key_decrypt(struct encrypted_key_payload *epayload,
 	struct skcipher_request *req;
 	unsigned int encrypted_datalen;
 	u8 iv[AES_BLOCK_SIZE];
-	char pad[16];
 	int ret;
 
 	encrypted_datalen = roundup(epayload->decrypted_datalen, blksize);
@@ -594,13 +592,12 @@ static int derived_key_decrypt(struct encrypted_key_payload *epayload,
 		goto out;
 	dump_encrypted_data(epayload, encrypted_datalen);
 
-	memset(pad, 0, sizeof pad);
 	sg_init_table(sg_in, 1);
 	sg_init_table(sg_out, 2);
 	sg_set_buf(sg_in, epayload->encrypted_data, encrypted_datalen);
 	sg_set_buf(&sg_out[0], epayload->decrypted_data,
 		   epayload->decrypted_datalen);
-	sg_set_buf(&sg_out[1], pad, sizeof pad);
+	sg_set_buf(&sg_out[1], zero_pad, sizeof zero_pad);
 
 	memcpy(iv, epayload->iv, sizeof(iv));
 	skcipher_request_set_crypt(req, sg_in, sg_out, encrypted_datalen, iv);
-- 
2.9.3


* [PATCH] cifs: Fix smbencrypt() to stop pointing a scatterlist at the stack
From: Andy Lutomirski @ 2016-12-12 20:54 UTC (permalink / raw)
  To: linux-kernel, linux-usb, sfrench
  Cc: Eric Biggers, linux-crypto, Herbert Xu, Stephan Mueller,
	linux-cifs, Andy Lutomirski
In-Reply-To: <8c273c9c41f51b34bb3115086f1d776895580637.1481575835.git.luto@kernel.org>

smbencrypt() points a scatterlist to the stack, which breaks if
CONFIG_VMAP_STACK=y.

Fix it by switching to crypto_cipher_encrypt_one().  The new code
should be considerably faster as an added benefit.

This code is nearly identical to some code that Eric Biggers
suggested.

Cc: stable@vger.kernel.org # 4.9 only
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

Compile-tested only.

 fs/cifs/smbencrypt.c | 40 ++++++++--------------------------------
 1 file changed, 8 insertions(+), 32 deletions(-)

diff --git a/fs/cifs/smbencrypt.c b/fs/cifs/smbencrypt.c
index 699b7868108f..c12bffefa3c9 100644
--- a/fs/cifs/smbencrypt.c
+++ b/fs/cifs/smbencrypt.c
@@ -23,7 +23,7 @@
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */
 
-#include <crypto/skcipher.h>
+#include <linux/crypto.h>
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
@@ -69,46 +69,22 @@ str_to_key(unsigned char *str, unsigned char *key)
 static int
 smbhash(unsigned char *out, const unsigned char *in, unsigned char *key)
 {
-	int rc;
 	unsigned char key2[8];
-	struct crypto_skcipher *tfm_des;
-	struct scatterlist sgin, sgout;
-	struct skcipher_request *req;
+	struct crypto_cipher *tfm_des;
 
 	str_to_key(key, key2);
 
-	tfm_des = crypto_alloc_skcipher("ecb(des)", 0, CRYPTO_ALG_ASYNC);
+	tfm_des = crypto_alloc_cipher("des", 0, 0);
 	if (IS_ERR(tfm_des)) {
-		rc = PTR_ERR(tfm_des);
-		cifs_dbg(VFS, "could not allocate des crypto API\n");
-		goto smbhash_err;
-	}
-
-	req = skcipher_request_alloc(tfm_des, GFP_KERNEL);
-	if (!req) {
-		rc = -ENOMEM;
 		cifs_dbg(VFS, "could not allocate des crypto API\n");
-		goto smbhash_free_skcipher;
+		return PTR_ERR(tfm_des);
 	}
 
-	crypto_skcipher_setkey(tfm_des, key2, 8);
-
-	sg_init_one(&sgin, in, 8);
-	sg_init_one(&sgout, out, 8);
+	crypto_cipher_setkey(tfm_des, key2, 8);
+	crypto_cipher_encrypt_one(tfm_des, out, in);
+	crypto_free_cipher(tfm_des);
 
-	skcipher_request_set_callback(req, 0, NULL, NULL);
-	skcipher_request_set_crypt(req, &sgin, &sgout, 8, NULL);
-
-	rc = crypto_skcipher_encrypt(req);
-	if (rc)
-		cifs_dbg(VFS, "could not encrypt crypt key rc: %d\n", rc);
-
-	skcipher_request_free(req);
-
-smbhash_free_skcipher:
-	crypto_free_skcipher(tfm_des);
-smbhash_err:
-	return rc;
+	return 0;
 }
 
 static int
-- 
2.9.3

^ permalink raw reply related

* [PATCH] crypto: Make a few drivers depend on !VMAP_STACK
From: Andy Lutomirski @ 2016-12-12 20:55 UTC (permalink / raw)
  To: linux-kernel, linux-usb
  Cc: Eric Biggers, linux-crypto, Herbert Xu, Stephan Mueller,
	Andy Lutomirski
In-Reply-To: <8c273c9c41f51b34bb3115086f1d776895580637.1481575835.git.luto@kernel.org>

Eric Biggers found several crypto drivers that point scatterlists at
the stack.  These drivers should never load on x86, but, for future
safety, make them depend on !VMAP_STACK.

No -stable backport should be needed as no released kernel
configuration should be affected.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 drivers/crypto/Kconfig | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 4d2b81f2b223..481e67e54ffd 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -245,7 +245,7 @@ config CRYPTO_DEV_TALITOS
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_HASH
 	select HW_RANDOM
-	depends on FSL_SOC
+	depends on FSL_SOC && !VMAP_STACK
 	help
 	  Say 'Y' here to use the Freescale Security Engine (SEC)
 	  to offload cryptographic algorithm computation.
@@ -357,7 +357,7 @@ config CRYPTO_DEV_PICOXCELL
 
 config CRYPTO_DEV_SAHARA
 	tristate "Support for SAHARA crypto accelerator"
-	depends on ARCH_MXC && OF
+	depends on ARCH_MXC && OF && !VMAP_STACK
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AES
 	select CRYPTO_ECB
@@ -410,7 +410,7 @@ endif # if CRYPTO_DEV_UX500
 
 config CRYPTO_DEV_BFIN_CRC
 	tristate "Support for Blackfin CRC hardware"
-	depends on BF60x
+	depends on BF60x && !VMAP_STACK
 	help
 	  Newer Blackfin processors have CRC hardware. Select this if you
 	  want to use the Blackfin CRC module.
@@ -487,7 +487,7 @@ source "drivers/crypto/qat/Kconfig"
 
 config CRYPTO_DEV_QCE
 	tristate "Qualcomm crypto engine accelerator"
-	depends on (ARCH_QCOM || COMPILE_TEST) && HAS_DMA && HAS_IOMEM
+	depends on (ARCH_QCOM || COMPILE_TEST) && HAS_DMA && HAS_IOMEM && !VMAP_STACK
 	select CRYPTO_AES
 	select CRYPTO_DES
 	select CRYPTO_ECB
-- 
2.9.3

^ permalink raw reply related

* [PATCH] orinoco: Use shash instead of ahash for MIC calculations
From: Andy Lutomirski @ 2016-12-12 20:55 UTC (permalink / raw)
  To: linux-kernel, linux-usb, linux-wireless
  Cc: Eric Biggers, linux-crypto, Herbert Xu, Stephan Mueller,
	Andy Lutomirski
In-Reply-To: <8c273c9c41f51b34bb3115086f1d776895580637.1481575835.git.luto@kernel.org>

Eric Biggers pointed out that the orinoco driver pointed scatterlists
at the stack.

Fix it by switching from ahash to shash.  The result should be
simpler, faster, and more correct.

Cc: stable@vger.kernel.org # 4.9 only
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

Compile-tested only.

 drivers/net/wireless/intersil/orinoco/mic.c     | 44 +++++++++++++++----------
 drivers/net/wireless/intersil/orinoco/mic.h     |  3 +-
 drivers/net/wireless/intersil/orinoco/orinoco.h |  4 +--
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/net/wireless/intersil/orinoco/mic.c b/drivers/net/wireless/intersil/orinoco/mic.c
index bc7397d709d3..08bc7822f820 100644
--- a/drivers/net/wireless/intersil/orinoco/mic.c
+++ b/drivers/net/wireless/intersil/orinoco/mic.c
@@ -16,7 +16,7 @@
 /********************************************************************/
 int orinoco_mic_init(struct orinoco_private *priv)
 {
-	priv->tx_tfm_mic = crypto_alloc_ahash("michael_mic", 0,
+	priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
 					      CRYPTO_ALG_ASYNC);
 	if (IS_ERR(priv->tx_tfm_mic)) {
 		printk(KERN_DEBUG "orinoco_mic_init: could not allocate "
@@ -25,7 +25,7 @@ int orinoco_mic_init(struct orinoco_private *priv)
 		return -ENOMEM;
 	}
 
-	priv->rx_tfm_mic = crypto_alloc_ahash("michael_mic", 0,
+	priv->rx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
 					      CRYPTO_ALG_ASYNC);
 	if (IS_ERR(priv->rx_tfm_mic)) {
 		printk(KERN_DEBUG "orinoco_mic_init: could not allocate "
@@ -40,17 +40,16 @@ int orinoco_mic_init(struct orinoco_private *priv)
 void orinoco_mic_free(struct orinoco_private *priv)
 {
 	if (priv->tx_tfm_mic)
-		crypto_free_ahash(priv->tx_tfm_mic);
+		crypto_free_shash(priv->tx_tfm_mic);
 	if (priv->rx_tfm_mic)
-		crypto_free_ahash(priv->rx_tfm_mic);
+		crypto_free_shash(priv->rx_tfm_mic);
 }
 
-int orinoco_mic(struct crypto_ahash *tfm_michael, u8 *key,
+int orinoco_mic(struct crypto_shash *tfm_michael, u8 *key,
 		u8 *da, u8 *sa, u8 priority,
 		u8 *data, size_t data_len, u8 *mic)
 {
-	AHASH_REQUEST_ON_STACK(req, tfm_michael);
-	struct scatterlist sg[2];
+	SHASH_DESC_ON_STACK(desc, tfm_michael);
 	u8 hdr[ETH_HLEN + 2]; /* size of header + padding */
 	int err;
 
@@ -67,18 +66,27 @@ int orinoco_mic(struct crypto_ahash *tfm_michael, u8 *key,
 	hdr[ETH_ALEN * 2 + 2] = 0;
 	hdr[ETH_ALEN * 2 + 3] = 0;
 
-	/* Use scatter gather to MIC header and data in one go */
-	sg_init_table(sg, 2);
-	sg_set_buf(&sg[0], hdr, sizeof(hdr));
-	sg_set_buf(&sg[1], data, data_len);
+	desc->tfm = tfm_michael;
+	desc->flags = 0;
 
-	if (crypto_ahash_setkey(tfm_michael, key, MIC_KEYLEN))
-		return -1;
+	err = crypto_shash_setkey(tfm_michael, key, MIC_KEYLEN);
+	if (err)
+		return err;
+
+	err = crypto_shash_init(desc);
+	if (err)
+		return err;
+
+	err = crypto_shash_update(desc, hdr, sizeof(hdr));
+	if (err)
+		return err;
+
+	err = crypto_shash_update(desc, data, data_len);
+	if (err)
+		return err;
+
+	err = crypto_shash_final(desc, mic);
+	shash_desc_zero(desc);
 
-	ahash_request_set_tfm(req, tfm_michael);
-	ahash_request_set_callback(req, 0, NULL, NULL);
-	ahash_request_set_crypt(req, sg, mic, data_len + sizeof(hdr));
-	err = crypto_ahash_digest(req);
-	ahash_request_zero(req);
 	return err;
 }
diff --git a/drivers/net/wireless/intersil/orinoco/mic.h b/drivers/net/wireless/intersil/orinoco/mic.h
index ce731d05cc98..e8724e889219 100644
--- a/drivers/net/wireless/intersil/orinoco/mic.h
+++ b/drivers/net/wireless/intersil/orinoco/mic.h
@@ -6,6 +6,7 @@
 #define _ORINOCO_MIC_H_
 
 #include <linux/types.h>
+#include <crypto/hash.h>
 
 #define MICHAEL_MIC_LEN 8
 
@@ -15,7 +16,7 @@ struct crypto_ahash;
 
 int orinoco_mic_init(struct orinoco_private *priv);
 void orinoco_mic_free(struct orinoco_private *priv);
-int orinoco_mic(struct crypto_ahash *tfm_michael, u8 *key,
+int orinoco_mic(struct crypto_shash *tfm_michael, u8 *key,
 		u8 *da, u8 *sa, u8 priority,
 		u8 *data, size_t data_len, u8 *mic);
 
diff --git a/drivers/net/wireless/intersil/orinoco/orinoco.h b/drivers/net/wireless/intersil/orinoco/orinoco.h
index 2f0c84b1c440..5fa1c3e3713f 100644
--- a/drivers/net/wireless/intersil/orinoco/orinoco.h
+++ b/drivers/net/wireless/intersil/orinoco/orinoco.h
@@ -152,8 +152,8 @@ struct orinoco_private {
 	u8 *wpa_ie;
 	int wpa_ie_len;
 
-	struct crypto_ahash *rx_tfm_mic;
-	struct crypto_ahash *tx_tfm_mic;
+	struct crypto_shash *rx_tfm_mic;
+	struct crypto_shash *tx_tfm_mic;
 
 	unsigned int wpa_enabled:1;
 	unsigned int tkip_cm_active:1;
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-12 21:17 UTC (permalink / raw)
  To: Eric Biggers
  Cc: kernel-hardening, LKML, Linux Crypto Mailing List, Linus Torvalds,
	George Spelvin, Scott Bauer, Andi Kleen, Andy Lutomirski, Greg KH,
	Jean-Philippe Aumasson, Daniel J . Bernstein
In-Reply-To: <20161212054229.GA31382@zzz>

Hey Eric,

Lots of good points; thanks for the review. Responses are inline below.

On Mon, Dec 12, 2016 at 6:42 AM, Eric Biggers <ebiggers3@gmail.com> wrote:
> Maybe add to the help text for CONFIG_TEST_HASH that it now tests siphash too?

Good call. Will do.

> This assumes the key and message buffers are aligned to __alignof__(u64).
> Unless that's going to be a clearly documented requirement for callers, you
> should use get_unaligned_le64() instead.  And you can pass a 'u8 *' directly to
> get_unaligned_le64(), no need for a helper function.

I had thought about that briefly, but just sort of figured most people
were passing in aligned variables... but that's a pretty bad
assumption to make especially for 64-bit alignment. I'll switch to
using the get_unaligned functions.

[As a side note, I wonder if crypto/chacha20_generic.c should be using
the unaligned functions instead too, at least for the iv reading...]
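
For illustration only, here is a rough userspace sketch of what the
unaligned helper buys us. load_le64() is a made-up stand-in for
get_unaligned_le64(), not the kernel implementation; it assembles the
value byte by byte, so neither the pointer's alignment nor the host's
byte order matters:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for get_unaligned_le64(): build the 64-bit
 * value from individual bytes, least significant byte first. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; --i)
		v = (v << 8) | p[i];
	return v;
}

/* Read from an odd offset, where dereferencing a cast (uint64_t *)
 * pointer would be undefined behaviour on strict-alignment hardware. */
static void check_unaligned(void)
{
	uint8_t buf[9] = { 0xff, 1, 2, 3, 4, 5, 6, 7, 8 };

	assert(load_le64(buf + 1) == 0x0807060504030201ULL);
}
```

A byte loop like this is the slow-but-safe form; the real helpers can
collapse to a single load on architectures that tolerate it.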

> It makes sense for this to return a u64, but that means the cpu_to_le64() is
> wrong, since u64 indicates CPU endianness.  It should just return 'b'.

At first I was very opposed to making this change, since by returning
a value with an explicit byte order, you can cast to u8 and have
uniform indexed byte access across platforms. But of course this
doesn't make any sense, since it's returning a u64, and it makes all
other bitwise operations non-uniform anyway.  I checked with JP
(co-creator of siphash, CC'd) and he confirmed your suspicion that it
was just to make the test vector comparison easier and for some
byte-wise uniformity, but that it's not strictly necessary. So, I've
removed that last cpu_to_le64, and I've also refactored those test
vectors to be written as ULL literals, so that a simple == integer
comparison will work across platforms.
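
As a sketch of why the native u64 return loses nothing: uniform
per-byte access is still available through shifts, which act on the
value rather than on its memory layout. u64_byte() below is a
hypothetical helper for illustration, and the byte array is the first
output vector as it appears in the reference siphash24.c:

```c
#include <assert.h>
#include <stdint.h>

/* Byte i of a native u64, counted from the least significant end.
 * The result is the same on little- and big-endian CPUs. */
static uint8_t u64_byte(uint64_t v, unsigned int i)
{
	return (uint8_t)(v >> (8 * i));
}

/* First reference output, written as an integer literal. */
static const uint64_t vec0 = 0x726fdb47dd0e0e31ULL;

/* The same output in the reference code's byte-wise form. */
static const uint8_t vec0_bytes[8] = {
	0x31, 0x0e, 0x0e, 0xdd, 0x47, 0xdb, 0x6f, 0x72
};

/* The integer literal and the byte sequence describe the same value. */
static void check_vec0(void)
{
	unsigned int i;

	for (i = 0; i < 8; ++i)
		assert(u64_byte(vec0, i) == vec0_bytes[i]);
}
```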

> Can you mention in a comment where the test vectors came from?

Sure, will do.


> If you make the output really be CPU-endian like I'm suggesting then this will
> need to be something like:
>
>         if (out != get_unaligned_le64(test_vectors[i])) {
>
> Or else make the test vectors be an array of u64.

Yep, I wound up doing that.

Thanks Eric! Will submit a v3 soon if nobody else has comments.

Jason

^ permalink raw reply

* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Linus Torvalds @ 2016-12-12 21:37 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9qSW1U3dU+VjV8UBz=XOMfpbTkOCyrz74VnQTNcJW_FUw@mail.gmail.com>

On Sun, Dec 11, 2016 at 9:48 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> I modified the test to hash data of size 0 through 7 repeatedly
> 100000000 times, and benchmarked that a few times on a Skylake laptop.
> The `load_unaligned_zeropad & bytemask_from_count` version was
> consistently 7% slower.
>
> I then modified it again to simply hash a 4 byte constant repeatedly
> 1000000000 times. The `load_unaligned_zeropad & bytemask_from_count`
> version was around 6% faster. I tried again with a 7 byte constant and
> got more or less a similar result.
>
> Then I tried with a 1 byte constant, and found that the
> `load_unaligned_zeropad & bytemask_from_count` version was slower.
>
> So, it would seem that between the `if (left)` and the `switch
> (left)`, there's the same number of branches.

Interesting.

For the dcache code (which is where that trick comes from), we used to
have a loop (rather than the duff's device thing), and it performed
badly due to the consistently badly predicted branch of the loop. But
I never compared it against the duff's device version.

I guess you could try to just remove the "if (left)" test entirely, if
it is at least partly the mispredict. It should do the right thing
even with a zero count, and it might schedule the code better. Code
size _should_ be better with the byte mask model (which won't matter
in the hot loop example, since it will all be cached, possibly even in
the uop cache for really tight benchmark loops).
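
A userspace model of the byte mask idea, as a sketch only. bytemask()
and tail_masked() are illustrative names; the model assumes a
little-endian view of the word and a buffer with eight readable bytes,
whereas the kernel's load_unaligned_zeropad() also copes with a load
that runs off the end of a page:

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `count` bytes set; valid for count in 0..7. */
static uint64_t bytemask(unsigned int count)
{
	return ~(~0ULL << (count * 8));
}

/* Load a full word and keep only the first `left` bytes. No branch on
 * `left` is needed: a zero count simply yields a zero mask. */
static uint64_t tail_masked(const uint8_t *p, unsigned int left)
{
	uint64_t w = 0;
	int i;

	for (i = 7; i >= 0; --i)
		w = (w << 8) | p[i];
	return w & bytemask(left);
}

static void check_mask(void)
{
	const uint8_t buf[8] = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88 };

	assert(bytemask(0) == 0);
	assert(bytemask(3) == 0xffffffULL);
	assert(tail_masked(buf, 0) == 0);
	assert(tail_masked(buf, 3) == 0x332211ULL);
}
```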

             Linus

^ permalink raw reply

* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-12 21:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Jean-Philippe Aumasson,
	Daniel J . Bernstein

Hi Linus,

> I guess you could try to just remove the "if (left)" test entirely, if
> it is at least partly the mispredict. It should do the right thing
> even with a zero count, and it might schedule the code better. Code
> size _should_ be better with the byte mask model (which won't matter
> in the hot loop example, since it will all be cached, possibly even in
> the uop cache for really tight benchmark loops).

Originally I had just forgotten the `if (left)`, and had the same
sub-par benchmarks. In the v3 revision that I'm working on at the
moment, I'm using your dcache trick for cases 3,5,6,7 and
short-circuiting cases 1,2,4 to just directly access those bytes as
integers. For the 32-bit case, I do something similar, but built
inside of the duff's device. This should give optimal performance for
the most popular use cases, which involve hashing "some stuff" plus a
leftover u16 (port number?) or u32 (ipv4 addr?).

#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
       switch (left) {
       case 0: break;
       case 1: b |= data[0]; break;
       case 2: b |= get_unaligned_le16(data); break;
       case 4: b |= get_unaligned_le32(data); break;
       default:
               b |= le64_to_cpu(load_unaligned_zeropad(data) &
                                bytemask_from_count(left));
               break;
       }
#else
       switch (left) {
       case 7: b |= ((u64)data[6]) << 48;
       case 6: b |= ((u64)data[5]) << 40;
       case 5: b |= ((u64)data[4]) << 32;
       case 4: b |= get_unaligned_le32(data); break;
       case 3: b |= ((u64)data[2]) << 16;
       case 2: b |= get_unaligned_le16(data); break;
       case 1: b |= data[0];
       }
#endif

It seems like this might be best of all worlds?
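
To convince myself the fall-through ordering in the #else branch is
right, here is a small userspace sketch that checks it against a plain
byte loop. le16(), le32(), tail_switch() and tail_loop() are
illustrative names, not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t le16(const uint8_t *p)
{
	return (uint64_t)p[0] | (uint64_t)p[1] << 8;
}

static uint64_t le32(const uint8_t *p)
{
	return le16(p) | (uint64_t)p[2] << 16 | (uint64_t)p[3] << 24;
}

/* Tail bytes via the fall-through switch: 5..7 fall into the 32-bit
 * load, 3 falls into the 16-bit load. */
static uint64_t tail_switch(const uint8_t *data, unsigned int left)
{
	uint64_t b = 0;

	switch (left) {
	case 7: b |= (uint64_t)data[6] << 48; /* fall through */
	case 6: b |= (uint64_t)data[5] << 40; /* fall through */
	case 5: b |= (uint64_t)data[4] << 32; /* fall through */
	case 4: b |= le32(data); break;
	case 3: b |= (uint64_t)data[2] << 16; /* fall through */
	case 2: b |= le16(data); break;
	case 1: b |= data[0];
	}
	return b;
}

/* Reference: place each tail byte at its little-endian position. */
static uint64_t tail_loop(const uint8_t *data, unsigned int left)
{
	uint64_t b = 0;
	unsigned int i;

	for (i = 0; i < left; ++i)
		b |= (uint64_t)data[i] << (8 * i);
	return b;
}

static void check_tails(void)
{
	const uint8_t data[7] = { 1, 2, 3, 4, 5, 6, 7 };
	unsigned int left;

	for (left = 0; left < 8; ++left)
		assert(tail_switch(data, left) == tail_loop(data, left));
}
```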

Jason

^ permalink raw reply

* Re: [PATCH] wusbcore: Fix one more crypto-on-the-stack bug
From: Greg KH @ 2016-12-12 21:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, linux-usb, Eric Biggers, linux-crypto, Herbert Xu,
	Stephan Mueller
In-Reply-To: <8c273c9c41f51b34bb3115086f1d776895580637.1481575835.git.luto@kernel.org>

On Mon, Dec 12, 2016 at 12:52:45PM -0800, Andy Lutomirski wrote:
> The driver put a constant buffer of all zeros on the stack and
> pointed a scatterlist entry at it.  This doesn't work with virtual
> stacks.  Make the buffer static to fix it.
> 
> Cc: stable@vger.kernel.org # 4.9 only
> Reported-by: Eric Biggers <ebiggers3@gmail.com>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  drivers/usb/wusbcore/crypto.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/wusbcore/crypto.c b/drivers/usb/wusbcore/crypto.c
> index 79451f7ef1b7..a7e007a0cd49 100644
> --- a/drivers/usb/wusbcore/crypto.c
> +++ b/drivers/usb/wusbcore/crypto.c
> @@ -216,7 +216,7 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc,
>  	struct scatterlist sg[4], sg_dst;
>  	void *dst_buf;
>  	size_t dst_size;
> -	const u8 bzero[16] = { 0 };
> +	static const u8 bzero[16] = { 0 };

Hm, can static memory handle DMA?  That's a requirement of the USB
stack.  Does this data later end up being sent down to a USB host
controller?

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH v2] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-12 21:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9o3otY8oKW1TGDWM23j4yz3PVvZViuwmfJ+szpWbm2BfA@mail.gmail.com>

On Mon, Dec 12, 2016 at 10:44 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> #if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
>        switch (left) {
>        case 0: break;
>        case 1: b |= data[0]; break;
>        case 2: b |= get_unaligned_le16(data); break;
>        case 4: b |= get_unaligned_le32(data); break;
>        default:
>                b |= le64_to_cpu(load_unaligned_zeropad(data) &
> bytemask_from_count(left));
>                break;
>        }
> #else
>        switch (left) {
>        case 7: b |= ((u64)data[6]) << 48;
>        case 6: b |= ((u64)data[5]) << 40;
>        case 5: b |= ((u64)data[4]) << 32;
>        case 4: b |= get_unaligned_le32(data); break;
>        case 3: b |= ((u64)data[2]) << 16;
>        case 2: b |= get_unaligned_le16(data); break;
>        case 1: b |= data[0];
>        }
> #endif

As it turns out, perhaps unsurprisingly, the code generation here is
really not nice, resulting in many branches instead of a computed
jump. I'll submit v3 with just a branch-less load_unaligned_zeropad
for the 64-bit/dcache case and the duff's device for the other case.

^ permalink raw reply

* Re: [PATCH v6 2/2] crypto: add virtio-crypto driver
From: Michael S. Tsirkin @ 2016-12-12 22:05 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Gonglei (Arei), linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org, Luonengjun, stefanha@redhat.com,
	Huangweidong (C), Wubin (H), xin.zeng@intel.com, Claudio Fontana,
	pasic@linux.vnet.ibm.com, davem@davemloft.net,
	Zhoujian (jay, Euler)
In-Reply-To: <20161212105407.GA3033@gondor.apana.org.au>

On Mon, Dec 12, 2016 at 06:54:07PM +0800, Herbert Xu wrote:
> On Mon, Dec 12, 2016 at 06:25:12AM +0000, Gonglei (Arei) wrote:
> > Hi, Michael & Herbert
> > 
> > Because the virtio-crypto device emulation had been in QEMU 2.8,
> > would you please merge the virtio-crypto driver for 4.10 if no other
> > comments? If so, Miachel pls ack and/or review the patch, then
> > Herbert will take it (I asked him last week). Thank you!
> > 
> > Ps: Note on 4.10 merge window timing from Linus
> >  https://lkml.org/lkml/2016/12/7/506
> > 
> > Dec 23rd is the deadline for 4.10 merge window.
> 
> Sorry but it's too late for 4.10.  It needed to have been in my
> tree before the merge window opened to make it for this cycle.
> 
> Cheers,


Objections to me merging this? I'm preparing my tree right now.

> -- 
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH v3] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-12 22:18 UTC (permalink / raw)
  To: Linus Torvalds, kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Eric Biggers
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein
In-Reply-To: <CA+55aFymjmEPNx8ZwhxtiE=nPG_5gbkzUQhdRAwTareuNcV=tA@mail.gmail.com>

SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function.

SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?

There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then proceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector.

Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.

(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist modern cryptanalysis.)

There are a number of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch up.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Daniel J. Bernstein <djb@cr.yp.to>
---
Changes from v2->v3:

  - The unaligned helpers are now used for reading from u8* arrays.
  - Linus' trick with load_unaligned_zeropad has been implemented for
    64-bit/dcache platforms.
  - Non 64-bit/dcache platforms now use a more optimized duff's device
    for shortcutting certain sized left-overs.
  - The Kconfig help text for the test now mentions siphash.
  - The function now returns a native-endian byte sequence inside a
    u64, which is more correct. As well, the test vectors are now
    represented as u64 literals, rather than byte sequences.
  - The origin of the test vectors is now inside a comment.
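
For anyone who wants to sanity-check the vectors outside the kernel,
below is a rough userspace model of the same construction. It is a
sketch, not part of the patch: siphash24_model(), le64() and
model_vector() are illustrative names, and the tail is gathered with a
simple byte loop rather than either of the optimized paths. The key
and input follow lib/test_siphash.c (k[i] = i, in[i] = i).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));	/* r is never 0 or 64 here */
}

static uint64_t le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; --i)
		v = (v << 8) | p[i];
	return v;
}

#define SIPROUND \
	do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

static uint64_t siphash24_model(const uint8_t *data, size_t len,
				const uint8_t key[16])
{
	uint64_t k0 = le64(key), k1 = le64(key + 8);
	uint64_t v0 = 0x736f6d6570736575ULL ^ k0;
	uint64_t v1 = 0x646f72616e646f6dULL ^ k1;
	uint64_t v2 = 0x6c7967656e657261ULL ^ k0;
	uint64_t v3 = 0x7465646279746573ULL ^ k1;
	uint64_t b = (uint64_t)len << 56;
	size_t full = len & ~(size_t)7;	/* bytes covered by whole words */
	size_t i;

	/* Compression: two rounds per 64-bit message word. */
	for (i = 0; i < full; i += 8) {
		uint64_t m = le64(data + i);

		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}
	/* Last word: remaining bytes in the low end, length in the top byte. */
	for (i = full; i < len; ++i)
		b |= (uint64_t)data[i] << (8 * (i - full));
	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;
	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Hash the first `len` bytes of 0, 1, 2, ... under the key 0..15. */
static uint64_t model_vector(size_t len)
{
	uint8_t in[64], k[16];
	size_t i;

	for (i = 0; i < 16; ++i)
		k[i] = (uint8_t)i;
	for (i = 0; i < 64; ++i)
		in[i] = (uint8_t)i;
	return siphash24_model(in, len, k);
}

/* Spot-check against entries 0 and 1 of the test vector table. */
static void check_vectors(void)
{
	assert(model_vector(0) == 0x726fdb47dd0e0e31ULL);
	assert(model_vector(1) == 0x74f839c593dc67fdULL);
}
```

In a hashtable the caller would reduce the 64-bit result to a bucket
index, with the 16-byte key generated randomly and kept secret; that
part is left out of the sketch.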


 include/linux/siphash.h | 20 +++++++++++++
 lib/Kconfig.debug       |  6 ++--
 lib/Makefile            |  5 ++--
 lib/siphash.c           | 75 +++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 74 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 175 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..6623b3090645
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,20 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+
+enum siphash_lengths {
+	SIPHASH24_KEY_LEN = 16
+};
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a6c8db1d62f6..2a1797704b41 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1823,9 +1823,9 @@ config TEST_HASH
 	tristate "Perform selftest on hash functions"
 	default n
 	help
-	  Enable this option to test the kernel's integer (<linux/hash,h>)
-	  and string (<linux/stringhash.h>) hash functions on boot
-	  (or module load).
+	  Enable this option to test the kernel's integer (<linux/hash.h>),
+	  string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+	  hash functions on boot (or module load).
 
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
-	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+	 earlycpio.o seq_buf.o siphash.o \
+	 nmi_backtrace.o nodemask.o win_minmax.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..b259a3295c50
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,75 @@
+/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
+ * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+	do { \
+	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+	} while(0)
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN])
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 k0 = get_unaligned_le64(key);
+	u64 k1 = get_unaligned_le64(key + sizeof(u64));
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= k1;
+	v2 ^= k0;
+	v1 ^= k1;
+	v0 ^= k0;
+	for (; data != end; data += sizeof(u64)) {
+		m = get_unaligned_le64(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	b |= le64_to_cpu(load_unaligned_zeropad(data) & bytemask_from_count(left));
+#else
+	switch (left) {
+	case 7: b |= ((u64)data[6]) << 48;
+	case 6: b |= ((u64)data[5]) << 40;
+	case 5: b |= ((u64)data[4]) << 32;
+	case 4: b |= get_unaligned_le32(data); break;
+	case 3: b |= ((u64)data[2]) << 16;
+	case 2: b |= get_unaligned_le16(data); break;
+	case 1: b |= data[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash24);
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 000000000000..336298aaa33b
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,74 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from official reference source available at:
+ *     https://131002.net/siphash/siphash24.c
+ */
+static const u64 test_vectors[64] = {
+	0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+	0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+	0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+	0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+	0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+	0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+	0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+	0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+	0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+	0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+	0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+	0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+	0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+	0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+	0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+	0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+	0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+	0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+	0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+	0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+	0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+	0x958a324ceb064572ULL
+};
+
+static int __init siphash_test_init(void)
+{
+	u8 in[64], k[16], i;
+	int ret = 0;
+
+	for (i = 0; i < 16; ++i)
+		k[i] = i;
+	for (i = 0; i < 64; ++i) {
+		in[i] = i;
+		if (siphash24(in, i, k) != test_vectors[i]) {
+			pr_info("self-test %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+	}
+	if (!ret)
+		pr_info("self-tests: pass\n");
+	return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: David Howells @ 2016-12-12 22:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: dhowells, linux-kernel, linux-usb, keyrings, Eric Biggers,
	linux-crypto, Herbert Xu, Stephan Mueller
In-Reply-To: <e958f214e8885968be8045ffde813ac339b81178.1481575835.git.luto@kernel.org>

Andy Lutomirski <luto@kernel.org> wrote:

> +static const char zero_pad[16] = {0};

Isn't there a global page of zeros or something that we can share?  Also, you
shouldn't explicitly initialise it, so that it stays in .bss.

> -	sg_set_buf(&sg_out[1], pad, sizeof pad);
> +	sg_set_buf(&sg_out[1], zero_pad, sizeof zero_pad);

Can you put brackets on the sizeof?

Thanks,
David

^ permalink raw reply

