Linux cryptographic layer development
* Re: [PATCH v1 2/2] crypto: mediatek - add DT bindings documentation
From: Matthias Brugger @ 2016-12-05 10:18 UTC (permalink / raw)
  To: Ryder Lee, Herbert Xu, David S. Miller
  Cc: devicetree, linux-mediatek, linux-kernel, linux-crypto,
	linux-arm-kernel, Sean Wang, Roy Luo
In-Reply-To: <1480921284-45827-3-git-send-email-ryder.lee@mediatek.com>



On 05/12/16 08:01, Ryder Lee wrote:
> Add DT bindings documentation for the crypto driver
>
> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
> ---
>  .../devicetree/bindings/crypto/mediatek-crypto.txt | 32 ++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
>
> diff --git a/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
> new file mode 100644
> index 0000000..8b1db08
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
> @@ -0,0 +1,32 @@
> +MediaTek cryptographic accelerators
> +
> +Required properties:
> +- compatible: Should be "mediatek,mt7623-crypto"

Do you know how much the crypto engine for mt7623/mt2701/mt8521p differs
from, let's say, the one in mt8173 or mt6797?
Do these SoCs have a crypto engine? If so, and the engines are quite
similar, we might think of adding a generic mtk-crypto binding plus
SoC-specific bindings.

Regards,
Matthias

> +- reg: Address and length of the register set for the device
> +- interrupts: Should contain the five crypto engines interrupts in numeric
> +	order. These are global system and four descriptor rings.
> +- clocks: the clock used by the core
> +- clock-names: the names of the clock listed in the clocks property. These are
> +	"ethif", "cryp"
> +- power-domains: Must contain a reference to the PM domain.
> +
> +
> +Optional properties:
> +- interrupt-parent: Should be the phandle for the interrupt controller
> +  that services interrupts for this device
> +
> +
> +Example:
> +	crypto: crypto@1b240000 {
> +		compatible = "mediatek,mt7623-crypto";
> +		reg = <0 0x1b240000 0 0x20000>;
> +		interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_LOW>,
> +			     <GIC_SPI 83 IRQ_TYPE_LEVEL_LOW>,
> +			     <GIC_SPI 84 IRQ_TYPE_LEVEL_LOW>,
> +			     <GIC_SPI 91 IRQ_TYPE_LEVEL_LOW>,
> +			     <GIC_SPI 97 IRQ_TYPE_LEVEL_LOW>;
> +		clocks = <&topckgen CLK_TOP_ETHIF_SEL>,
> +			 <&ethsys CLK_ETHSYS_CRYPTO>;
> +		clock-names = "ethif","cryp";
> +		power-domains = <&scpsys MT2701_POWER_DOMAIN_ETH>;
> +	};
>


* Re: [PATCH v2 0/6] crypto: ARM/arm64 CRC-T10DIF/CRC32/CRC32C roundup
From: Ard Biesheuvel @ 2016-12-05  9:20 UTC (permalink / raw)
  To: linux-crypto@vger.kernel.org, Herbert Xu; +Cc: Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

On 4 December 2016 at 11:54, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This v2 combines the CRC-T10DIF and CRC32 implementations for both ARM and
> arm64 that I sent out a couple of weeks ago, and adds support to the latter
> for CRC32C.
>

Please don't apply yet. There is an issue in the 32-bit ARM code I
only spotted just now.


> Ard Biesheuvel (6):
>   crypto: testmgr - avoid overlap in chunked tests
>   crypto: testmgr - add/enhance test cases for CRC-T10DIF
>   crypto: arm64/crct10dif - port x86 SSE implementation to arm64
>   crypto: arm/crct10dif - port x86 SSE implementation to ARM
>   crypto: arm64/crc32 - accelerated support based on x86 SSE
>     implementation
>   crypto: arm/crc32 - accelerated support based on x86 SSE
>     implementation
>
>  arch/arm/crypto/Kconfig               |  10 +
>  arch/arm/crypto/Makefile              |   4 +
>  arch/arm/crypto/crc32-ce-core.S       | 306 +++++++++++++++++
>  arch/arm/crypto/crc32-ce-glue.c       | 195 +++++++++++
>  arch/arm/crypto/crct10dif-ce-core.S   | 349 ++++++++++++++++++++
>  arch/arm/crypto/crct10dif-ce-glue.c   |  95 ++++++
>  arch/arm64/crypto/Kconfig             |  11 +
>  arch/arm64/crypto/Makefile            |   6 +
>  arch/arm64/crypto/crc32-ce-core.S     | 266 +++++++++++++++
>  arch/arm64/crypto/crc32-ce-glue.c     | 188 +++++++++++
>  arch/arm64/crypto/crct10dif-ce-core.S | 317 ++++++++++++++++++
>  arch/arm64/crypto/crct10dif-ce-glue.c |  91 +++++
>  crypto/testmgr.c                      |   2 +-
>  crypto/testmgr.h                      |  70 ++--
>  14 files changed, 1881 insertions(+), 29 deletions(-)
>  create mode 100644 arch/arm/crypto/crc32-ce-core.S
>  create mode 100644 arch/arm/crypto/crc32-ce-glue.c
>  create mode 100644 arch/arm/crypto/crct10dif-ce-core.S
>  create mode 100644 arch/arm/crypto/crct10dif-ce-glue.c
>  create mode 100644 arch/arm64/crypto/crc32-ce-core.S
>  create mode 100644 arch/arm64/crypto/crc32-ce-glue.c
>  create mode 100644 arch/arm64/crypto/crct10dif-ce-core.S
>  create mode 100644 arch/arm64/crypto/crct10dif-ce-glue.c
>
> --
> 2.7.4
>
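
For readers unfamiliar with the checksum this series accelerates, here is a
minimal bitwise reference sketch of CRC-T10DIF in plain C. It is not taken
from the series. It assumes the usual T10 DIF parameters (polynomial 0x8bb7,
zero initial value, no bit reflection, no final XOR), and the function name
crc_t10dif_bitwise is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-T10DIF reference (assumed parameters: poly 0x8bb7,
 * MSB first, init 0, no final XOR). Slow, but easy to compare an
 * accelerated implementation against.
 */
uint16_t crc_t10dif_bitwise(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		/* Fold the next byte into the top of the 16-bit state. */
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8bb7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}
```

Because the state is simply carried from one call to the next, feeding the
data in several chunks should give the same value as a single call. The
widely published check value for the ASCII string "123456789" is 0xd0db.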


* [PATCH] crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
From: Horia Geantă @ 2016-12-05  9:06 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David S. Miller, linux-crypto, Dan Douglass, Alison Wang

Start with a clean slate before dealing with bit 16 (pointer size)
of Master Configuration Register.
This fixes the case of AArch64 boot loader + AArch32 kernel, when
the boot loader might set MCFGR[PS] and kernel would fail to clear it.

Cc: <stable@vger.kernel.org>
Reported-by: Alison Wang <alison.wang@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
---
 drivers/crypto/caam/ctrl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index be62a7f482ac..0a6ca3919270 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -556,8 +556,9 @@ static int caam_probe(struct platform_device *pdev)
 	 * Enable DECO watchdogs and, if this is a PHYS_ADDR_T_64BIT kernel,
 	 * long pointers in master configuration register
 	 */
-	clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK, MCFGR_AWCACHE_CACH |
-		      MCFGR_AWCACHE_BUFF | MCFGR_WDENABLE | MCFGR_LARGE_BURST |
+	clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK | MCFGR_LONG_PTR,
+		      MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
+		      MCFGR_WDENABLE | MCFGR_LARGE_BURST |
 		      (sizeof(dma_addr_t) == sizeof(u64) ? MCFGR_LONG_PTR : 0));
 
 	/*
-- 
2.4.4
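
The effect of the fix can be illustrated with a small userspace sketch of a
clear-then-set register update. This is not the driver code: clrsetbits(),
config_old() and config_new() are hypothetical stand-ins, and the cache field
position is invented for the example. Only bit 16 for the pointer size follows
the commit message.

```c
#include <stdint.h>

/* Hypothetical cache field; only the bit-16 pointer-size flag is from the patch. */
#define CFG_CACHE_MASK	(0xfu << 8)
#define CFG_CACHE_ON	(0x3u << 8)
#define CFG_LONG_PTR	(1u << 16)

/* Userspace stand-in for a clear-then-set register helper. */
uint32_t clrsetbits(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}

/* Before the fix: LONG_PTR is not in the clear mask, so a stale bit survives. */
uint32_t config_old(uint32_t reg, int long_ptr)
{
	return clrsetbits(reg, CFG_CACHE_MASK,
			  CFG_CACHE_ON | (long_ptr ? CFG_LONG_PTR : 0));
}

/* After the fix: LONG_PTR is cleared first and set again only if wanted. */
uint32_t config_new(uint32_t reg, int long_ptr)
{
	return clrsetbits(reg, CFG_CACHE_MASK | CFG_LONG_PTR,
			  CFG_CACHE_ON | (long_ptr ? CFG_LONG_PTR : 0));
}
```

With a register value that already has bit 16 set (as a 64-bit boot loader
might leave it), config_old(reg, 0) keeps the bit while config_new(reg, 0)
clears it.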


* [PATCH v2 0/2] CESA: Fixes for STD ahash requests
From: Romain Perier @ 2016-12-05  8:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: linux-crypto, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Gregory Clement, Thomas Petazzoni, stable, Nadav Haklai,
	Ofer Heifetz


This patch set fixes two issues with STD ahash requests. First, the step
function copies the operation template to the SRAM twice, which is not
needed. Second, the step function copies creq->state to the engine for
every type of request, even when the request is a fragment of the initial
request and is being re-launched. This can corrupt the request context in
some cases.

Romain Perier (2):
  crypto: marvell - Don't copy hash operation twice into the SRAM
  crypto: marvell - Don't corrupt state of an STD req for re-stepped
    ahash

 drivers/crypto/marvell/hash.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

-- 

v1: https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg22113.html

Changes in v2:
 - Rephrased commit message for PATCH 2/2
 - Added Cc: stable for both commits
 - Fixed coding style issue in PATCH 2/2
 - Rephrased the cover letter

2.9.3


* [PATCH v2 2/2] crypto: marvell - Don't corrupt state of an STD req for re-stepped ahash
From: Romain Perier @ 2016-12-05  8:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: linux-crypto, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Gregory Clement, Thomas Petazzoni, stable, Nadav Haklai,
	Ofer Heifetz
In-Reply-To: <20161205085639.21034-1-romain.perier@free-electrons.com>

mv_cesa_ahash_std_step() copies the creq->state into the SRAM at each
step, but this is only required on the first one. By doing that, we
overwrite the engine state, and get erroneous results when the crypto
request is split in several chunks to fit in the internal SRAM.

This commit changes the function to copy the state only on the first
step.

Fixes: commit 2786cee8e50b ("crypto: marvell - Move SRAM I/O op...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Cc: <stable@vger.kernel.org>
---
 drivers/crypto/marvell/hash.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index fbbcbf8..317cf02 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -168,9 +168,11 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	mv_cesa_adjust_op(engine, &creq->op_tmpl);
 	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
 
-	digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
-	for (i = 0; i < digsize / 4; i++)
-		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
+	if (!sreq->offset) {
+		digsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
+		for (i = 0; i < digsize / 4; i++)
+			writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
+	}
 
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
-- 
2.9.3
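
The failure mode can be reproduced with a toy model in plain C. This is only a
sketch under stated assumptions: struct toy_engine, toy_engine_run() and
toy_hash() are hypothetical, and the multiply-and-add step stands in for the
real hash engine. The point is that the engine keeps its own running state
between chunks, so reloading the initial state on every step discards the
earlier chunks.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy engine with one running-state register (stand-in for the digest registers). */
struct toy_engine {
	uint32_t state;
};

/* Toy update step, not a real hash: state = state * 31 + byte. */
void toy_engine_run(struct toy_engine *e, const uint8_t *p, size_t n)
{
	while (n--)
		e->state = e->state * 31u + *p++;
}

/*
 * Hash 'len' bytes in pieces of at most 'chunk' bytes.
 * reload_every_step != 0 models the old behaviour (state written on
 * each step); 0 models the fix (state written only when offset == 0).
 */
uint32_t toy_hash(const uint8_t *data, size_t len, size_t chunk,
		  uint32_t init, int reload_every_step)
{
	struct toy_engine e = { init };
	size_t offset = 0;

	if (chunk == 0)
		chunk = 1;	/* avoid an endless loop on bad input */

	while (offset < len) {
		size_t n = len - offset < chunk ? len - offset : chunk;

		if (offset == 0 || reload_every_step)
			e.state = init;
		toy_engine_run(&e, data + offset, n);
		offset += n;
	}
	return e.state;
}
```

With reload_every_step set to 0, the result does not depend on the chunk size.
With it set to 1 and a chunk size smaller than the input, only the last chunk
contributes to the result.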


* [PATCH v2 1/2] crypto: marvell - Don't copy hash operation twice into the SRAM
From: Romain Perier @ 2016-12-05  8:56 UTC (permalink / raw)
  To: Boris Brezillon, Arnaud Ebalard
  Cc: linux-crypto, Jason Cooper, Andrew Lunn, Sebastian Hesselbarth,
	Gregory Clement, Thomas Petazzoni, stable, Nadav Haklai,
	Ofer Heifetz
In-Reply-To: <20161205085639.21034-1-romain.perier@free-electrons.com>

No need to copy the template of a hash operation twice into the SRAM
from the step function.

Fixes: commit 85030c5168f1 ("crypto: marvell - Add support for chai...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Cc: <stable@vger.kernel.org>
---
 drivers/crypto/marvell/hash.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/crypto/marvell/hash.c b/drivers/crypto/marvell/hash.c
index 2a92605..fbbcbf8 100644
--- a/drivers/crypto/marvell/hash.c
+++ b/drivers/crypto/marvell/hash.c
@@ -172,9 +172,6 @@ static void mv_cesa_ahash_std_step(struct ahash_request *req)
 	for (i = 0; i < digsize / 4; i++)
 		writel_relaxed(creq->state[i], engine->regs + CESA_IVDIG(i));
 
-	mv_cesa_adjust_op(engine, &creq->op_tmpl);
-	memcpy_toio(engine->sram, &creq->op_tmpl, sizeof(creq->op_tmpl));
-
 	if (creq->cache_ptr)
 		memcpy_toio(engine->sram + CESA_SA_DATA_SRAM_OFFSET,
 			    creq->cache, creq->cache_ptr);
-- 
2.9.3


* Re: [PATCH v1 1/2] Add crypto driver support for some MediaTek chips
From: Corentin Labbe @ 2016-12-05  8:52 UTC (permalink / raw)
  To: Ryder Lee
  Cc: devicetree, Herbert Xu, Sean Wang, linux-kernel, Roy Luo,
	linux-mediatek, linux-crypto, Matthias Brugger, David S. Miller,
	linux-arm-kernel
In-Reply-To: <1480921284-45827-2-git-send-email-ryder.lee@mediatek.com>

Hello

I have two minor comments.

On Mon, Dec 05, 2016 at 03:01:23PM +0800, Ryder Lee wrote:
> This adds support for the MediaTek hardware accelerator on
> mt7623/mt2701/mt8521p SoC.
> 
> This driver currently implement:
> - SHA1 and SHA2 family(HMAC) hash alogrithms.

There is a typo for algorithms.

[...]
> +/**
> + * struct mtk_desc - DMA descriptor
> + * @hdr:	the descriptor control header
> + * @buf:	DMA address of input buffer segment
> + * @ct:		DMA address of command token that control operation flow
> + * @ct_hdr:	the command token control header
> + * @tag:	the user-defined field
> + * @tfm:	DMA address of transform state
> + * @bound:	align descriptors offset boundary
> + *
> + * Structure passed to the crypto engine to describe where source
> + * data needs to be fetched and how it needs to be processed.
> + */
> +struct mtk_desc {
> +	u32 hdr;
> +	u32 buf;
> +	u32 ct;
> +	u32 ct_hdr;
> +	u32 tag;
> +	u32 tfm;
> +	u32 bound[2];
> +};

Have you tested this descriptor with both BE and LE kernels?

Regards
Corentin Labbe
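
To make the endianness concern concrete, here is a userspace sketch of
building the descriptor image byte by byte, so that the result does not depend
on the host byte order. It assumes the engine fetches little-endian words,
which the patch does not state. struct desc_cpu, put_le32(), get_le32() and
desc_to_wire() are hypothetical names. In the kernel, the usual approach would
be __le32 fields filled with cpu_to_le32().

```c
#include <stdint.h>

/* Same eight-word layout as the quoted struct mtk_desc, in CPU byte order. */
struct desc_cpu {
	uint32_t hdr, buf, ct, ct_hdr, tag, tfm, bound[2];
};

/* Store one 32-bit word in little-endian order on any host. */
void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/* Load one little-endian 32-bit word on any host. */
uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Build the 32-byte image the device would fetch (assumed little-endian). */
void desc_to_wire(uint8_t out[32], const struct desc_cpu *d)
{
	const uint32_t w[8] = { d->hdr, d->buf, d->ct, d->ct_hdr,
				d->tag, d->tfm, d->bound[0], d->bound[1] };

	for (int i = 0; i < 8; i++)
		put_le32(out + 4 * i, w[i]);
}
```

A plain u32 store, as in the quoted struct, produces this image only on a
little-endian kernel. On a big-endian kernel the bytes of each word would be
reversed unless they are converted explicitly.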


* [PATCH v1 2/2] crypto: mediatek - add DT bindings documentation
From: Ryder Lee @ 2016-12-05  7:01 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Matthias Brugger
  Cc: devicetree, linux-mediatek, linux-kernel, linux-crypto,
	linux-arm-kernel, Sean Wang, Roy Luo, Ryder Lee
In-Reply-To: <1480921284-45827-1-git-send-email-ryder.lee@mediatek.com>

Add DT bindings documentation for the crypto driver

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 .../devicetree/bindings/crypto/mediatek-crypto.txt | 32 ++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/mediatek-crypto.txt

diff --git a/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
new file mode 100644
index 0000000..8b1db08
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
@@ -0,0 +1,32 @@
+MediaTek cryptographic accelerators
+
+Required properties:
+- compatible: Should be "mediatek,mt7623-crypto"
+- reg: Address and length of the register set for the device
+- interrupts: Should contain the five crypto engines interrupts in numeric
+	order. These are global system and four descriptor rings.
+- clocks: the clock used by the core
+- clock-names: the names of the clock listed in the clocks property. These are
+	"ethif", "cryp"
+- power-domains: Must contain a reference to the PM domain.
+
+
+Optional properties:
+- interrupt-parent: Should be the phandle for the interrupt controller
+  that services interrupts for this device
+
+
+Example:
+	crypto: crypto@1b240000 {
+		compatible = "mediatek,mt7623-crypto";
+		reg = <0 0x1b240000 0 0x20000>;
+		interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 83 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 84 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 91 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 97 IRQ_TYPE_LEVEL_LOW>;
+		clocks = <&topckgen CLK_TOP_ETHIF_SEL>,
+			 <&ethsys CLK_ETHSYS_CRYPTO>;
+		clock-names = "ethif","cryp";
+		power-domains = <&scpsys MT2701_POWER_DOMAIN_ETH>;
+	};
-- 
1.9.1


* [PATCH v1 1/2] Add crypto driver support for some MediaTek chips
From: Ryder Lee @ 2016-12-05  7:01 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Matthias Brugger
  Cc: devicetree, linux-mediatek, linux-kernel, linux-crypto,
	linux-arm-kernel, Sean Wang, Roy Luo, Ryder Lee
In-Reply-To: <1480921284-45827-1-git-send-email-ryder.lee@mediatek.com>

This adds support for the MediaTek hardware accelerator on
mt7623/mt2701/mt8521p SoC.

This driver currently implement:
- SHA1 and SHA2 family(HMAC) hash alogrithms.
- AES block cipher in CBC/ECB mode with 128/192/256-bit keys.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 drivers/crypto/Kconfig                 |   17 +
 drivers/crypto/Makefile                |    1 +
 drivers/crypto/mediatek/Makefile       |    2 +
 drivers/crypto/mediatek/mtk-aes.c      |  763 +++++++++++++++++
 drivers/crypto/mediatek/mtk-platform.c |  580 +++++++++++++
 drivers/crypto/mediatek/mtk-platform.h |  235 ++++++
 drivers/crypto/mediatek/mtk-regs.h     |  194 +++++
 drivers/crypto/mediatek/mtk-sha.c      | 1423 ++++++++++++++++++++++++++++++++
 8 files changed, 3215 insertions(+)
 create mode 100644 drivers/crypto/mediatek/Makefile
 create mode 100644 drivers/crypto/mediatek/mtk-aes.c
 create mode 100644 drivers/crypto/mediatek/mtk-platform.c
 create mode 100644 drivers/crypto/mediatek/mtk-platform.h
 create mode 100644 drivers/crypto/mediatek/mtk-regs.h
 create mode 100644 drivers/crypto/mediatek/mtk-sha.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 4d2b81f..ad0a00b 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -553,6 +553,23 @@ config CRYPTO_DEV_ROCKCHIP
 	  This driver interfaces with the hardware crypto accelerator.
 	  Supporting cbc/ecb chainmode, and aes/des/des3_ede cipher mode.
 
+config CRYPTO_DEV_MEDIATEK
+	tristate "MediaTek's Cryptographic Engine driver"
+	depends on ARM && (ARCH_MEDIATEK || COMPILE_TEST)
+	select NEON
+	select KERNEL_MODE_NEON
+	select ARM_CRYPTO
+	select CRYPTO_AES
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_SHA1_ARM_NEON
+	select CRYPTO_SHA256_ARM
+	select CRYPTO_SHA512_ARM
+	select CRYPTO_HMAC
+	help
+	  This driver allows you to utilize the hardware crypto accelerator
+	  which can be found on the MT7623, MT2701, MT8521p, etc.
+	  Select this if you want to use it for AES/SHA1/SHA2 algorithms.
+
 source "drivers/crypto/chelsio/Kconfig"
 
 endif # CRYPTO_HW
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index ad7250f..272b51a 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_DEV_IMGTEC_HASH) += img-hash.o
 obj-$(CONFIG_CRYPTO_DEV_IXP4XX) += ixp4xx_crypto.o
 obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o
 obj-$(CONFIG_CRYPTO_DEV_MARVELL_CESA) += marvell/
+obj-$(CONFIG_CRYPTO_DEV_MEDIATEK) += mediatek/
 obj-$(CONFIG_CRYPTO_DEV_MXS_DCP) += mxs-dcp.o
 obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
 n2_crypto-y := n2_core.o n2_asm.o
diff --git a/drivers/crypto/mediatek/Makefile b/drivers/crypto/mediatek/Makefile
new file mode 100644
index 0000000..187be79
--- /dev/null
+++ b/drivers/crypto/mediatek/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_CRYPTO_DEV_MEDIATEK) += mtk-crypto.o
+mtk-crypto-objs:= mtk-platform.o mtk-aes.o mtk-sha.o
diff --git a/drivers/crypto/mediatek/mtk-aes.c b/drivers/crypto/mediatek/mtk-aes.c
new file mode 100644
index 0000000..0208981
--- /dev/null
+++ b/drivers/crypto/mediatek/mtk-aes.c
@@ -0,0 +1,763 @@
+/*
+ * Cryptographic API.
+ *
+ * Support for MediaTek AES hardware accelerator.
+ *
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Some ideas are from atmel-aes.c drivers.
+ */
+
+#include <crypto/aes.h>
+#include <crypto/algapi.h>
+#include <crypto/scatterwalk.h>
+#include <linux/dma-mapping.h>
+#include <linux/scatterlist.h>
+#include "mtk-platform.h"
+#include "mtk-regs.h"
+
+#define AES_QUEUE_LENGTH	512
+#define AES_BUFFER_ORDER	2
+#define AES_BUFFER_SIZE		((PAGE_SIZE << AES_BUFFER_ORDER) \
+				& ~(AES_BLOCK_SIZE - 1))
+
+/* AES command token */
+#define AES_CT_SIZE_ECB		2
+#define AES_CT_SIZE_CBC		3
+#define AES_CT_CTRL_HDR		0x00220000
+#define AES_COMMAND0		0x05000000
+#define AES_COMMAND1		0x2d060000
+#define AES_COMMAND2		0xe4a63806
+
+/* AES transform information */
+#define AES_TFM_ECB		(0x0 << 0)
+#define AES_TFM_CBC		(0x1 << 0)
+#define AES_TFM_DECRYPT		(0x5 << 0)
+#define AES_TFM_ENCRYPT		(0x4 << 0)
+#define AES_TFM_SIZE(x)		((x) << 8)
+#define AES_TFM_128BITS		(0xb << 16)
+#define AES_TFM_192BITS		(0xd << 16)
+#define AES_TFM_256BITS		(0xf << 16)
+#define AES_TFM_FULL_IV		(0xf << 5)
+
+/* AES flags */
+#define AES_FLAGS_MODE_MSK	GENMASK(2, 0)
+#define AES_FLAGS_ECB		BIT(0)
+#define AES_FLAGS_CBC		BIT(1)
+#define AES_FLAGS_ENCRYPT	BIT(2)
+#define AES_FLAGS_BUSY		BIT(3)
+
+/**
+ * AES command token(CT) is a set of hardware instructions that
+ * are used to control crypto engine AES processing flow.
+ */
+struct mtk_aes_ct {
+	u32 ct_ctrl0;
+	u32 ct_ctrl1;
+	u32 ct_ctrl2;
+};
+
+/**
+ * AES transform state (tfm) is used to define the AES transform state
+ * and contains all keys and initial vectors.
+ */
+struct mtk_aes_tfm {
+	u32 tfm_ctrl0;
+	u32 tfm_ctrl1;
+	/* store keys and IVs */
+	u8 state[AES_KEYSIZE_256 + AES_BLOCK_SIZE] __aligned(sizeof(u32));
+};
+
+/**
+ * mtk_aes_info consists of command token and transform state of AES,
+ * which should be encapsulated in command and result descriptors.
+ * The packet processing engine requires this information to do:
+ *
+ * - Commands decoding and control of the crypto engine’s data path.
+ * - Coordinating hardware data fetch and store operations.
+ * - Result token construction and output.
+ */
+struct mtk_aes_info {
+	struct mtk_aes_ct ct;
+	struct mtk_aes_tfm tfm;
+};
+
+struct mtk_aes_reqctx {
+	u64 mode;
+};
+
+struct mtk_aes_ctx {
+	struct mtk_cryp *cryp;
+	struct mtk_aes_info info;
+	u32 keylen;
+
+	unsigned long flags;
+};
+
+struct mtk_aes_drv {
+	struct list_head dev_list;
+	/* device list lock */
+	spinlock_t lock;
+};
+
+static struct mtk_aes_drv mtk_aes = {
+	.dev_list = LIST_HEAD_INIT(mtk_aes.dev_list),
+	.lock = __SPIN_LOCK_UNLOCKED(mtk_aes.lock),
+};
+
+static inline u32 mtk_aes_read(struct mtk_cryp *cryp, u32 offset)
+{
+	return readl_relaxed(cryp->base + offset);
+}
+
+static inline void mtk_aes_write(struct mtk_cryp *cryp,
+				 u32 offset, u32 value)
+{
+	writel_relaxed(value, cryp->base + offset);
+}
+
+static struct mtk_cryp *mtk_aes_find_dev(struct mtk_aes_ctx *ctx)
+{
+	struct mtk_cryp *cryp = NULL;
+	struct mtk_cryp *tmp;
+
+	spin_lock_bh(&mtk_aes.lock);
+	if (!ctx->cryp) {
+		list_for_each_entry(tmp, &mtk_aes.dev_list, aes_list) {
+			cryp = tmp;
+			break;
+		}
+		ctx->cryp = cryp;
+	} else {
+		cryp = ctx->cryp;
+	}
+	spin_unlock_bh(&mtk_aes.lock);
+
+	return cryp;
+}
+
+static inline size_t mtk_aes_padlen(size_t len)
+{
+	len &= AES_BLOCK_SIZE - 1;
+	return len ? AES_BLOCK_SIZE - len : 0;
+}
+
+static bool mtk_aes_check_aligned(struct scatterlist *sg,
+				  size_t len, struct mtk_aes_dma *dma)
+{
+	int nents;
+
+	if (!IS_ALIGNED(len, AES_BLOCK_SIZE))
+		return false;
+
+	for (nents = 0; sg; sg = sg_next(sg), ++nents) {
+		if (!IS_ALIGNED(sg->offset, sizeof(u32)))
+			return false;
+
+		if (len <= sg->length) {
+			if (!IS_ALIGNED(len, AES_BLOCK_SIZE))
+				return false;
+
+			dma->nents = nents + 1;
+			dma->remainder = sg->length - len;
+			sg->length = len;
+			return true;
+		}
+
+		if (!IS_ALIGNED(sg->length, AES_BLOCK_SIZE))
+			return false;
+
+		len -= sg->length;
+	}
+
+	return false;
+}
+
+/* Initialize and map transform information of AES */
+static int mtk_aes_info_map(struct mtk_cryp *cryp,
+			    struct mtk_aes *aes, size_t len)
+{
+	struct mtk_aes_ctx *ctx = crypto_ablkcipher_ctx(
+			crypto_ablkcipher_reqtfm(aes->req));
+	struct mtk_aes_info *info = aes->info;
+	struct mtk_aes_ct *ct = &info->ct;
+	struct mtk_aes_tfm *tfm = &info->tfm;
+	u32 keylen = ctx->keylen;
+
+	aes->ct_hdr = AES_CT_CTRL_HDR | len;
+	ct->ct_ctrl0 = AES_COMMAND0 | len;
+	ct->ct_ctrl1 = AES_COMMAND1;
+
+	if (aes->flags & AES_FLAGS_ENCRYPT)
+		tfm->tfm_ctrl0 = AES_TFM_ENCRYPT;
+	else
+		tfm->tfm_ctrl0 = AES_TFM_DECRYPT;
+
+	if (aes->flags & AES_FLAGS_CBC) {
+		aes->ct_size = AES_CT_SIZE_CBC;
+		ct->ct_ctrl2 = AES_COMMAND2;
+
+		tfm->tfm_ctrl0 |=
+			AES_TFM_SIZE(SIZE_IN_WORDS(keylen + AES_BLOCK_SIZE));
+		tfm->tfm_ctrl1 = AES_TFM_CBC;
+		tfm->tfm_ctrl1 |= AES_TFM_FULL_IV;
+
+		memcpy(tfm->state + keylen, aes->req->info, AES_BLOCK_SIZE);
+	} else if (aes->flags & AES_FLAGS_ECB) {
+		aes->ct_size = AES_CT_SIZE_ECB;
+		tfm->tfm_ctrl0 |= AES_TFM_SIZE(SIZE_IN_WORDS(keylen));
+		tfm->tfm_ctrl1 = AES_TFM_ECB;
+	}
+
+	if (keylen == AES_KEYSIZE_128)
+		tfm->tfm_ctrl0 |= AES_TFM_128BITS;
+	else if (keylen == AES_KEYSIZE_256)
+		tfm->tfm_ctrl0 |= AES_TFM_256BITS;
+	else if (keylen == AES_KEYSIZE_192)
+		tfm->tfm_ctrl0 |= AES_TFM_192BITS;
+
+	aes->ct_dma = dma_map_single(cryp->dev, info, sizeof(*info),
+					DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(cryp->dev, aes->ct_dma))) {
+		dev_err(cryp->dev, "dma %zu bytes error\n", sizeof(*info));
+		return -EINVAL;
+	}
+	aes->tfm_dma = aes->ct_dma + sizeof(*ct);
+
+	return 0;
+}
+
+static int mtk_aes_xmit(struct mtk_cryp *cryp, struct mtk_aes *aes)
+{
+	struct mtk_ring *ring = cryp->ring[aes->id];
+	struct mtk_desc *cmd = NULL, *res = NULL;
+	struct scatterlist *ssg, *dsg;
+	u32 len = aes->src.sg_len;
+	int nents;
+
+	/* Fill command and result descriptors */
+	for (nents = 0; nents < len; ++nents) {
+		ssg = &aes->src.sg[nents];
+		dsg = &aes->dst.sg[nents];
+
+		cmd = ring->cmd_base + ring->pos;
+		res = ring->res_base + ring->pos;
+
+		res->hdr = MTK_DESC_BUF_LEN(dsg->length);
+		res->buf = sg_dma_address(dsg);
+
+		cmd->hdr = MTK_DESC_BUF_LEN(ssg->length);
+		cmd->buf = sg_dma_address(ssg);
+
+		if (nents == 0) {
+			res->hdr |= MTK_DESC_FIRST;
+			cmd->hdr |= MTK_DESC_FIRST;
+			cmd->hdr |= MTK_DESC_CT_LEN(aes->ct_size);
+			cmd->ct = aes->ct_dma;
+			cmd->ct_hdr = aes->ct_hdr;
+			cmd->tfm = aes->tfm_dma;
+		}
+
+		if (++ring->pos == MTK_MAX_DESC_NUM)
+			ring->pos = 0;
+	}
+
+	cmd->hdr |= MTK_DESC_LAST;
+	res->hdr |= MTK_DESC_LAST;
+
+	/*
+	 * make sure that all changes to the dma ring are done before we
+	 * start engine.
+	 */
+	wmb();
+	/* Start DMA transfer */
+	mtk_aes_write(cryp, RDR_PREP_COUNT(aes->id), MTK_DESC_CNT(len));
+	mtk_aes_write(cryp, CDR_PREP_COUNT(aes->id), MTK_DESC_CNT(len));
+
+	return -EINPROGRESS;
+}
+
+static inline void mtk_aes_restore_sg(const struct mtk_aes_dma *dma)
+{
+	struct scatterlist *sg = dma->sg;
+	int nents = dma->nents;
+
+	if (!dma->remainder)
+		return;
+
+	while (--nents > 0 && sg)
+		sg = sg_next(sg);
+
+	if (!sg)
+		return;
+
+	sg->length += dma->remainder;
+}
+
+static int mtk_aes_map(struct mtk_cryp *cryp, struct mtk_aes *aes)
+{
+	struct scatterlist *src = aes->req->src;
+	struct scatterlist *dst = aes->req->dst;
+	size_t len = aes->req->nbytes;
+	size_t padlen = 0;
+	bool src_aligned, dst_aligned;
+
+	aes->total = len;
+	aes->src.sg = src;
+	aes->dst.sg = dst;
+	aes->real_dst = dst;
+
+	src_aligned = mtk_aes_check_aligned(src, len, &aes->src);
+	if (src == dst)
+		dst_aligned = src_aligned;
+	else
+		dst_aligned = mtk_aes_check_aligned(dst, len, &aes->dst);
+
+	if (!src_aligned || !dst_aligned) {
+		padlen = mtk_aes_padlen(len);
+
+		if (len + padlen > AES_BUFFER_SIZE)
+			return -ENOMEM;
+
+		if (!src_aligned) {
+			sg_copy_to_buffer(src, sg_nents(src), aes->buf, len);
+			aes->src.sg = &aes->aligned_sg;
+			aes->src.nents = 1;
+			aes->src.remainder = 0;
+		}
+
+		if (!dst_aligned) {
+			aes->dst.sg = &aes->aligned_sg;
+			aes->dst.nents = 1;
+			aes->dst.remainder = 0;
+		}
+
+		sg_init_table(&aes->aligned_sg, 1);
+		sg_set_buf(&aes->aligned_sg, aes->buf, len + padlen);
+	}
+
+	if (aes->src.sg == aes->dst.sg) {
+		aes->src.sg_len = dma_map_sg(cryp->dev, aes->src.sg,
+				aes->src.nents, DMA_BIDIRECTIONAL);
+		aes->dst.sg_len = aes->src.sg_len;
+		if (unlikely(!aes->src.sg_len))
+			return -EFAULT;
+	} else {
+		aes->src.sg_len = dma_map_sg(cryp->dev, aes->src.sg,
+				aes->src.nents, DMA_TO_DEVICE);
+		if (unlikely(!aes->src.sg_len))
+			return -EFAULT;
+
+		aes->dst.sg_len = dma_map_sg(cryp->dev, aes->dst.sg,
+				aes->dst.nents, DMA_FROM_DEVICE);
+		if (unlikely(!aes->dst.sg_len)) {
+			dma_unmap_sg(cryp->dev, aes->src.sg,
+				     aes->src.nents, DMA_TO_DEVICE);
+			return -EFAULT;
+		}
+	}
+
+	return mtk_aes_info_map(cryp, aes, len + padlen);
+}
+
+static int mtk_aes_handle_queue(struct mtk_cryp *cryp, u8 id,
+				struct ablkcipher_request *req)
+{
+	struct mtk_aes *aes = cryp->aes[id];
+	struct crypto_async_request *areq, *backlog;
+	struct mtk_aes_reqctx *rctx;
+	struct mtk_aes_ctx *ctx;
+	unsigned long flags;
+	int err, ret = 0;
+
+	spin_lock_irqsave(&aes->lock, flags);
+	if (req)
+		ret = ablkcipher_enqueue_request(&aes->queue, req);
+	if (aes->flags & AES_FLAGS_BUSY) {
+		spin_unlock_irqrestore(&aes->lock, flags);
+		return ret;
+	}
+	backlog = crypto_get_backlog(&aes->queue);
+	areq = crypto_dequeue_request(&aes->queue);
+	if (areq)
+		aes->flags |= AES_FLAGS_BUSY;
+	spin_unlock_irqrestore(&aes->lock, flags);
+
+	if (!areq)
+		return ret;
+
+	if (backlog)
+		backlog->complete(backlog, -EINPROGRESS);
+
+	req = ablkcipher_request_cast(areq);
+	ctx = crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req));
+	rctx = ablkcipher_request_ctx(req);
+	rctx->mode &= AES_FLAGS_MODE_MSK;
+	/* assign new request to device */
+	aes->req = req;
+	aes->info = &ctx->info;
+	aes->flags = (aes->flags & ~AES_FLAGS_MODE_MSK) | rctx->mode;
+
+	err = mtk_aes_map(cryp, aes);
+	if (err)
+		return err;
+
+	return mtk_aes_xmit(cryp, aes);
+}
+
+static void mtk_aes_unmap(struct mtk_cryp *cryp, struct mtk_aes *aes)
+{
+	dma_unmap_single(cryp->dev, aes->ct_dma,
+			 sizeof(struct mtk_aes_info), DMA_TO_DEVICE);
+
+	if (aes->src.sg == aes->dst.sg) {
+		dma_unmap_sg(cryp->dev, aes->src.sg,
+			     aes->src.nents, DMA_BIDIRECTIONAL);
+
+		if (aes->src.sg != &aes->aligned_sg)
+			mtk_aes_restore_sg(&aes->src);
+	} else {
+		dma_unmap_sg(cryp->dev, aes->dst.sg,
+			     aes->dst.nents, DMA_FROM_DEVICE);
+
+		if (aes->dst.sg != &aes->aligned_sg)
+			mtk_aes_restore_sg(&aes->dst);
+
+		dma_unmap_sg(cryp->dev, aes->src.sg,
+			     aes->src.nents, DMA_TO_DEVICE);
+
+		if (aes->src.sg != &aes->aligned_sg)
+			mtk_aes_restore_sg(&aes->src);
+	}
+
+	if (aes->dst.sg == &aes->aligned_sg)
+		sg_copy_from_buffer(aes->real_dst,
+				    sg_nents(aes->real_dst),
+				    aes->buf, aes->total);
+}
+
+static inline void mtk_aes_complete(struct mtk_cryp *cryp,
+				    struct mtk_aes *aes)
+{
+	aes->flags &= ~AES_FLAGS_BUSY;
+
+	aes->req->base.complete(&aes->req->base, 0);
+
+	/* handle new request */
+	mtk_aes_handle_queue(cryp, aes->id, NULL);
+}
+
+/* Check and set the AES key to transform state's buffer */
+static int mtk_aes_setkey(struct crypto_ablkcipher *tfm,
+			  const u8 *key, u32 keylen)
+{
+	struct mtk_aes_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+	u8 *state = ctx->info.tfm.state;
+
+	if (keylen != AES_KEYSIZE_128 &&
+	    keylen != AES_KEYSIZE_192 &&
+	    keylen != AES_KEYSIZE_256) {
+		crypto_ablkcipher_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+
+	ctx->keylen = keylen;
+	memcpy(state, key, keylen);
+
+	return 0;
+}
+
+static int mtk_aes_crypt(struct ablkcipher_request *req, u64 mode)
+{
+	struct mtk_aes_ctx *ctx = crypto_ablkcipher_ctx(
+			crypto_ablkcipher_reqtfm(req));
+	struct mtk_aes_reqctx *rctx = ablkcipher_request_ctx(req);
+
+	rctx->mode = mode;
+
+	return mtk_aes_handle_queue(ctx->cryp,
+			!(mode & AES_FLAGS_ENCRYPT), req);
+}
+
+static int mtk_ecb_encrypt(struct ablkcipher_request *req)
+{
+	return mtk_aes_crypt(req, AES_FLAGS_ENCRYPT | AES_FLAGS_ECB);
+}
+
+static int mtk_ecb_decrypt(struct ablkcipher_request *req)
+{
+	return mtk_aes_crypt(req, AES_FLAGS_ECB);
+}
+
+static int mtk_cbc_encrypt(struct ablkcipher_request *req)
+{
+	return mtk_aes_crypt(req, AES_FLAGS_ENCRYPT | AES_FLAGS_CBC);
+}
+
+static int mtk_cbc_decrypt(struct ablkcipher_request *req)
+{
+	return mtk_aes_crypt(req, AES_FLAGS_CBC);
+}
+
+static int mtk_aes_cra_init(struct crypto_tfm *tfm)
+{
+	struct mtk_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+	struct mtk_cryp *cryp = NULL;
+
+	tfm->crt_ablkcipher.reqsize = sizeof(struct mtk_aes_reqctx);
+
+	cryp = mtk_aes_find_dev(ctx);
+	if (!cryp) {
+		pr_err("can't find crypto device\n");
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static struct crypto_alg aes_algs[] = {
+{
+	.cra_name		=	"cbc(aes)",
+	.cra_driver_name	=	"cbc-aes-mtk",
+	.cra_priority		=	400,
+	.cra_flags		=	CRYPTO_ALG_TYPE_ABLKCIPHER |
+						CRYPTO_ALG_ASYNC,
+	.cra_init		=	mtk_aes_cra_init,
+	.cra_blocksize		=	AES_BLOCK_SIZE,
+	.cra_ctxsize		=	sizeof(struct mtk_aes_ctx),
+	.cra_alignmask		=	15,
+	.cra_type		=	&crypto_ablkcipher_type,
+	.cra_module		=	THIS_MODULE,
+	.cra_u.ablkcipher	=	{
+		.min_keysize	=	AES_MIN_KEY_SIZE,
+		.max_keysize	=	AES_MAX_KEY_SIZE,
+		.setkey		=	mtk_aes_setkey,
+		.encrypt	=	mtk_cbc_encrypt,
+		.decrypt	=	mtk_cbc_decrypt,
+		.ivsize		=	AES_BLOCK_SIZE,
+	}
+},
+{
+	.cra_name		=	"ecb(aes)",
+	.cra_driver_name	=	"ecb-aes-mtk",
+	.cra_priority		=	400,
+	.cra_flags		=	CRYPTO_ALG_TYPE_ABLKCIPHER |
+						CRYPTO_ALG_ASYNC,
+	.cra_init		=	mtk_aes_cra_init,
+	.cra_blocksize		=	AES_BLOCK_SIZE,
+	.cra_ctxsize		=	sizeof(struct mtk_aes_ctx),
+	.cra_alignmask		=	15,
+	.cra_type		=	&crypto_ablkcipher_type,
+	.cra_module		=	THIS_MODULE,
+	.cra_u.ablkcipher	=	{
+		.min_keysize	=	AES_MIN_KEY_SIZE,
+		.max_keysize	=	AES_MAX_KEY_SIZE,
+		.setkey		=	mtk_aes_setkey,
+		.encrypt	=	mtk_ecb_encrypt,
+		.decrypt	=	mtk_ecb_decrypt,
+	}
+},
+};
+
+static void mtk_aes_enc_task(unsigned long data)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)data;
+	struct mtk_aes *aes = cryp->aes[0];
+
+	mtk_aes_unmap(cryp, aes);
+	mtk_aes_complete(cryp, aes);
+}
+
+static void mtk_aes_dec_task(unsigned long data)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)data;
+	struct mtk_aes *aes = cryp->aes[1];
+
+	mtk_aes_unmap(cryp, aes);
+	mtk_aes_complete(cryp, aes);
+}
+
+static irqreturn_t mtk_aes_enc_irq(int irq, void *dev_id)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)dev_id;
+	struct mtk_aes *aes = cryp->aes[0];
+	u32 val = mtk_aes_read(cryp, RDR_STAT(RING0));
+
+	mtk_aes_write(cryp, RDR_STAT(RING0), val);
+
+	if (likely(AES_FLAGS_BUSY & aes->flags)) {
+		mtk_aes_write(cryp, RDR_PROC_COUNT(RING0), MTK_DESC_CNT_CLR);
+		mtk_aes_write(cryp, RDR_THRESH(RING0), MTK_RDR_THRESH_DEF);
+
+		tasklet_schedule(&aes->task);
+	} else {
+		dev_warn(cryp->dev, "AES interrupt when no active requests.\n");
+	}
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t mtk_aes_dec_irq(int irq, void *dev_id)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)dev_id;
+	struct mtk_aes *aes = cryp->aes[1];
+	u32 val = mtk_aes_read(cryp, RDR_STAT(RING1));
+
+	mtk_aes_write(cryp, RDR_STAT(RING1), val);
+
+	if (likely(AES_FLAGS_BUSY & aes->flags)) {
+		mtk_aes_write(cryp, RDR_PROC_COUNT(RING1), MTK_DESC_CNT_CLR);
+		mtk_aes_write(cryp, RDR_THRESH(RING1), MTK_RDR_THRESH_DEF);
+
+		tasklet_schedule(&aes->task);
+	} else {
+		dev_warn(cryp->dev, "AES interrupt when no active requests.\n");
+	}
+	return IRQ_HANDLED;
+}
+
+/*
+ * The purpose of creating encryption and decryption records is
+ * to process outbound/inbound data in parallel, which can improve
+ * performance in most use cases, such as IPsec VPN, especially
+ * under heavy network traffic.
+ */
+static int mtk_aes_record_init(struct mtk_cryp *cryp)
+{
+	struct mtk_aes **aes = cryp->aes;
+	int i, err = -ENOMEM;
+
+	for (i = 0; i < RECORD_NUM; i++) {
+		aes[i] = kzalloc(sizeof(**aes), GFP_KERNEL);
+		if (!aes[i])
+			goto err_cleanup;
+
+		aes[i]->buf = (void *)__get_free_pages(GFP_KERNEL,
+						AES_BUFFER_ORDER);
+		if (!aes[i]->buf)
+			goto err_cleanup;
+
+		aes[i]->id = i;
+
+		spin_lock_init(&aes[i]->lock);
+		crypto_init_queue(&aes[i]->queue, AES_QUEUE_LENGTH);
+	}
+
+	tasklet_init(&aes[0]->task, mtk_aes_enc_task, (unsigned long)cryp);
+	tasklet_init(&aes[1]->task, mtk_aes_dec_task, (unsigned long)cryp);
+
+	return 0;
+
+err_cleanup:
+	for (; i--; ) {
+		free_pages((unsigned long)aes[i]->buf, AES_BUFFER_ORDER);
+		kfree(aes[i]);
+	}
+
+	return err;
+}
+
+static void mtk_aes_record_free(struct mtk_cryp *cryp)
+{
+	int i;
+
+	for (i = 0; i < RECORD_NUM; i++) {
+		tasklet_kill(&cryp->aes[i]->task);
+		free_pages((unsigned long)cryp->aes[i]->buf, AES_BUFFER_ORDER);
+		kfree(cryp->aes[i]);
+	}
+}
+
+static void mtk_aes_unregister_algs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(aes_algs); i++)
+		crypto_unregister_alg(&aes_algs[i]);
+}
+
+static int mtk_aes_register_algs(void)
+{
+	int err, i, j;
+
+	for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
+		err = crypto_register_alg(&aes_algs[i]);
+		if (err)
+			goto err_aes_algs;
+	}
+
+	return 0;
+
+err_aes_algs:
+	for (j = 0; j < i; j++)
+		crypto_unregister_alg(&aes_algs[j]);
+
+	return err;
+}
+
+int mtk_cipher_alg_register(struct mtk_cryp *cryp)
+{
+	int ret;
+
+	INIT_LIST_HEAD(&cryp->aes_list);
+
+	/* Initialize two cipher records */
+	ret = mtk_aes_record_init(cryp);
+	if (ret)
+		goto err_record;
+
+	/* Ring0 irq is used by encryption record */
+	ret = devm_request_irq(cryp->dev, cryp->irq[RING0], mtk_aes_enc_irq,
+			       IRQF_TRIGGER_LOW, "mtk-aes", cryp);
+	if (ret) {
+		dev_err(cryp->dev, "unable to request AES encryption irq.\n");
+		goto err_res;
+	}
+
+	/* Ring1 irq is used by decryption record */
+	ret = devm_request_irq(cryp->dev, cryp->irq[RING1], mtk_aes_dec_irq,
+			       IRQF_TRIGGER_LOW, "mtk-aes", cryp);
+	if (ret) {
+		dev_err(cryp->dev, "unable to request AES decryption irq.\n");
+		goto err_res;
+	}
+
+	/* Enable ring0 and ring1 interrupts */
+	mtk_aes_write(cryp, AIC_ENABLE_SET(RING0), MTK_IRQ_RDR0);
+	mtk_aes_write(cryp, AIC_ENABLE_SET(RING1), MTK_IRQ_RDR1);
+
+	spin_lock(&mtk_aes.lock);
+	list_add_tail(&cryp->aes_list, &mtk_aes.dev_list);
+	spin_unlock(&mtk_aes.lock);
+
+	ret = mtk_aes_register_algs();
+	if (ret)
+		goto err_algs;
+
+	return 0;
+
+err_algs:
+	spin_lock(&mtk_aes.lock);
+	list_del(&cryp->aes_list);
+	spin_unlock(&mtk_aes.lock);
+err_res:
+	mtk_aes_record_free(cryp);
+err_record:
+
+	dev_err(cryp->dev, "mtk-aes initialization failed.\n");
+	return ret;
+}
+
+void mtk_cipher_alg_release(struct mtk_cryp *cryp)
+{
+	spin_lock(&mtk_aes.lock);
+	list_del(&cryp->aes_list);
+	spin_unlock(&mtk_aes.lock);
+
+	mtk_aes_unregister_algs();
+	mtk_aes_record_free(cryp);
+}
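
Note on the error path in mtk_aes_record_init() above: the err_cleanup loop
starts at record i - 1, so when __get_free_pages() fails for record i, the
aes[i] that kzalloc() has just returned does not appear to be freed. The
user-space model below is only a sketch of one way to unwind without that
gap. model_alloc(), model_free(), fail_at and run_with_failure() are
hypothetical helpers invented for this illustration and are not part of the
driver.

```c
#include <assert.h>
#include <stdlib.h>

#define RECORD_NUM 2	/* mirrors RING_MAX / 2 in the patch */

struct record {
	void *buf;
};

/* Hypothetical fault injection: the Nth allocation (1-based) fails. */
static int fail_at;
static int alloc_calls;
static int live_allocs;

static void *model_alloc(size_t size)
{
	void *p;

	if (++alloc_calls == fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Returns 0 on success, -1 on failure; on failure nothing stays allocated. */
static int record_init(struct record **rec)
{
	int i;

	for (i = 0; i < RECORD_NUM; i++) {
		rec[i] = model_alloc(sizeof(*rec[i]));
		if (!rec[i])
			goto err_cleanup;

		rec[i]->buf = model_alloc(4096);
		if (!rec[i]->buf) {
			/* Release the half-built record before unwinding. */
			model_free(rec[i]);
			rec[i] = NULL;
			goto err_cleanup;
		}
	}
	return 0;

err_cleanup:
	/* Unwind the fully built records below index i. */
	while (i--) {
		model_free(rec[i]->buf);
		model_free(rec[i]);
		rec[i] = NULL;
	}
	return -1;
}

/* Runs record_init() with the Nth allocation failing; returns the leak count. */
static int run_with_failure(int nth)
{
	struct record *rec[RECORD_NUM] = { NULL };
	int i;

	fail_at = nth;
	alloc_calls = 0;
	live_allocs = 0;

	if (record_init(rec) == 0) {
		for (i = 0; i < RECORD_NUM; i++) {
			model_free(rec[i]->buf);
			model_free(rec[i]);
		}
	}
	return live_allocs;
}
```

In the driver the same idea would amount to a kfree(aes[i]) before the
second "goto err_cleanup".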
diff --git a/drivers/crypto/mediatek/mtk-platform.c b/drivers/crypto/mediatek/mtk-platform.c
new file mode 100644
index 0000000..25025fe
--- /dev/null
+++ b/drivers/crypto/mediatek/mtk-platform.c
@@ -0,0 +1,580 @@
+/*
+ * Support for MediaTek cryptographic accelerator.
+ *
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ */
+
+#include <linux/clk.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include "mtk-platform.h"
+#include "mtk-regs.h"
+
+#define MTK_BURST_SIZE(x, y)		(((x) & ~0xf0) | ((y) << 4))
+#define MTK_DESC_SIZE_SET(x)		((x) << 0)
+#define MTK_DESC_OFFSET_SET(x)		((x) << 16)
+#define MTK_DFSE_RING_ID(x)		(((x) >> 12) & 0xf)
+#define MTK_DSE_MIN_DATA(x)		((x) << 0)
+#define MTK_DSE_MAX_DATA(x)		((x) << 8)
+#define MTK_DFE_MIN_DATA(x)		((x) << 0)
+#define MTK_DFE_MAX_DATA(x)		((x) << 8)
+#define MTK_DFE_MIN_CTRL(x)		((x) << 16)
+#define MTK_DFE_MAX_CTRL(x)		((x) << 24)
+#define MTK_FETCH_SIZE_SET(x)		((x) << 0)
+#define MTK_FETCH_THRESH_SET(x)		((x) << 16)
+#define MTK_IN_BUF_MIN_THRESH(x)	((x) << 8)
+#define MTK_IN_BUF_MAX_THRESH(x)	((x) << 12)
+#define MTK_OUT_BUF_MIN_THRESH(x)	((x) << 0)
+#define MTK_OUT_BUF_MAX_THRESH(x)	((x) << 4)
+#define MTK_CMD_FIFO_SIZE(x)		(((x) >> 8) & 0xf)
+#define MTK_RES_FIFO_SIZE(x)		(((x) >> 12) & 0xf)
+#define MTK_HIA_DATA_WIDTH(x)		(((x) >> 25) & 0x3)
+#define MTK_HIA_DMA_LENGTH(x)		(((x) >> 20) & 0x1f)
+#define MTK_IN_TBUF_SIZE(x)		(((x) >> 4) & 0xf)
+#define MTK_IN_DBUF_SIZE(x)		(((x) >> 8) & 0xf)
+#define MTK_OUT_DBUF_SIZE(x)		(((x) >> 16) & 0xf)
+#define MTK_AIC_INT_NUM(x)		((x) & 0x3f)
+#define MTK_AIC_VER_GET(x)		((x) & 0x0ff0ffff)
+#define MTK_PE_TOKEN_CTRL_DEF		0x00014004
+#define MTK_PE_INT_CTRL_DEF		0xc00f400f
+#define MTK_PRNG_CTRL_EN		BIT(0)
+#define MTK_PRNG_CTRL_AUTO		BIT(1)
+#define MTK_TOKEN_TIMEOUT_EN		BIT(22)
+#define MTK_OVL_IRQ_EN			BIT(25)
+#define MTK_ATP_PRESENT			BIT(30)
+#define MTK_DFSE_THR_CTRL_EN		BIT(30)
+#define MTK_DFSE_THR_CTRL_RESET		BIT(31)
+#define MTK_HIA_SIGNATURE		((u16)0x35ca)
+#define MTK_CDR_STAT_CLR		0x1f
+#define MTK_RDR_STAT_CLR		0xff
+#define MTK_AIC_VER11			0x011036C9
+#define MTK_AIC_VER12			0x012036C9
+#define MTK_AIC_GLOBAL_CLR		0x7FF00000
+#define MTK_DFSE_IDLE			0xf
+
+/**
+ * This engine is an integrated security subsystem to accelerate
+ * cryptographic functions and protocols to off-load the host processor.
+ *
+ * Hardware modules are briefly introduced below:
+ *
+ * Host Interface Adapter (HIA) - the main interface between the host
+ * system and the hardware subsystem. It is responsible for attaching
+ * the processing engine to the specific host bus interface and provides
+ * a standardized software view for offloading tasks to the engine.
+ *
+ * Command Descriptor Ring Manager (CDR Manager) - keeps track of how many
+ * CDs the host has prepared in the CDR. It monitors the fill level of its
+ * CD-FIFO and, if there is sufficient space for the next block of
+ * descriptors, fires off a DMA request to fetch a block of CDs.
+ *
+ * Data Fetch Engine (DFE) - responsible for parsing the CD and
+ * setting up the required control and packet data DMA transfers from
+ * system memory to the processing engine.
+ *
+ * Result Descriptor Ring Manager (RDR Manager) - same as the CDR Manager,
+ * but for result descriptors. It also handles the RD updates under
+ * control of the DSE. For each packet data segment
+ * processed, the DSE triggers the RDR Manager to write the updated RD.
+ * If triggered to update, the RDR Manager sets up a DMA operation to
+ * copy the RD from the DSE to the correct location in the RDR.
+ *
+ * Data Store Engine (DSE) - responsible for parsing the prepared RD
+ * and setting up the required control and packet data DMA transfers from
+ * the processing engine to system memory.
+ *
+ * Advanced Interrupt Controllers (AICs) - receive interrupt request signals
+ * from various sources and combine them into one interrupt output. The AICs
+ * are used as follows:
+ * - One for the HIA global and processing engine interrupts.
+ * - The others for the descriptor ring interrupts.
+ */
+
+/* Cryptographic engine capabilities */
+struct mtk_sys_cap {
+	/* host interface adapter */
+	u32 hia_ver;
+	u32 hia_opt;
+	/* packet engine */
+	u32 pkt_eng_opt;
+	/* global hardware */
+	u32 hw_opt;
+};
+
+static void mtk_desc_ring_link(struct mtk_cryp *cryp, u32 mask)
+{
+	/* Assign rings to DFE/DSE thread and enable it */
+	writel(MTK_DFSE_THR_CTRL_EN | mask, cryp->base + DFE_THR_CTRL);
+	writel(MTK_DFSE_THR_CTRL_EN | mask, cryp->base + DSE_THR_CTRL);
+}
+
+static void mtk_dfe_dse_buf_setup(struct mtk_cryp *cryp,
+				  struct mtk_sys_cap *cap)
+{
+	u32 width = MTK_HIA_DATA_WIDTH(cap->hia_opt) + 2;
+	u32 len = MTK_HIA_DMA_LENGTH(cap->hia_opt) - 1;
+	u32 ipbuf = min(MTK_IN_DBUF_SIZE(cap->hw_opt) + width, len);
+	u32 opbuf = min(MTK_OUT_DBUF_SIZE(cap->hw_opt) + width, len);
+	u32 itbuf = min(MTK_IN_TBUF_SIZE(cap->hw_opt) + width, len);
+	u32 val;
+
+	val = MTK_DFE_MIN_DATA(ipbuf - 1) | MTK_DFE_MAX_DATA(ipbuf) |
+		MTK_DFE_MIN_CTRL(itbuf - 1) | MTK_DFE_MAX_CTRL(itbuf);
+	writel(val, cryp->base + DFE_CFG);
+
+	val = MTK_DFE_MIN_DATA(opbuf - 1) | MTK_DFE_MAX_DATA(opbuf);
+	writel(val, cryp->base + DSE_CFG);
+
+	val = MTK_IN_BUF_MIN_THRESH(ipbuf - 1) | MTK_IN_BUF_MAX_THRESH(ipbuf);
+	writel(val, cryp->base + PE_IN_DBUF_THRESH);
+
+	val = MTK_IN_BUF_MIN_THRESH(itbuf - 1) | MTK_IN_BUF_MAX_THRESH(itbuf);
+	writel(val, cryp->base + PE_IN_TBUF_THRESH);
+
+	val = MTK_OUT_BUF_MIN_THRESH(opbuf - 1) | MTK_OUT_BUF_MAX_THRESH(opbuf);
+	writel(val, cryp->base + PE_OUT_DBUF_THRESH);
+
+	writel(0, cryp->base + PE_OUT_TBUF_THRESH);
+	writel(0, cryp->base + PE_OUT_BUF_CTRL);
+}
+
+static int mtk_dfe_dse_state_check(struct mtk_cryp *cryp)
+{
+	int ret = -EINVAL;
+	u32 val;
+
+	/* Check for completion of all DMA transfers */
+	val = readl(cryp->base + DFE_THR_STAT);
+	if (MTK_DFSE_RING_ID(val) == MTK_DFSE_IDLE) {
+		val = readl(cryp->base + DSE_THR_STAT);
+		if (MTK_DFSE_RING_ID(val) == MTK_DFSE_IDLE)
+			ret = 0;
+	}
+
+	if (!ret) {
+		/* Take DFE/DSE thread out of reset */
+		writel(0, cryp->base + DFE_THR_CTRL);
+		writel(0, cryp->base + DSE_THR_CTRL);
+	} else {
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static int mtk_dfe_dse_reset(struct mtk_cryp *cryp)
+{
+	int err;
+
+	/* Reset DSE/DFE and correct system priorities for all rings. */
+	writel(MTK_DFSE_THR_CTRL_RESET, cryp->base + DFE_THR_CTRL);
+	writel(0, cryp->base + DFE_PRIO_0);
+	writel(0, cryp->base + DFE_PRIO_1);
+	writel(0, cryp->base + DFE_PRIO_2);
+	writel(0, cryp->base + DFE_PRIO_3);
+
+	writel(MTK_DFSE_THR_CTRL_RESET, cryp->base + DSE_THR_CTRL);
+	writel(0, cryp->base + DSE_PRIO_0);
+	writel(0, cryp->base + DSE_PRIO_1);
+	writel(0, cryp->base + DSE_PRIO_2);
+	writel(0, cryp->base + DSE_PRIO_3);
+
+	err = mtk_dfe_dse_state_check(cryp);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static void mtk_cmd_desc_ring_setup(struct mtk_cryp *cryp,
+				    int i, struct mtk_sys_cap *cap)
+{
+	/* Number of full descriptors that fit in the FIFO, minus one */
+	u32 count =
+		((1 << MTK_CMD_FIFO_SIZE(cap->hia_opt)) / MTK_DESC_SIZE) - 1;
+	u32 size = count * MTK_DESC_OFFSET;
+	u32 thresh = count * MTK_DESC_SIZE;
+	u32 val;
+
+	/* Temporarily disable external triggering */
+	writel(0, cryp->base + CDR_CFG(i));
+
+	/* Clear CDR count */
+	writel(MTK_DESC_CNT_CLR, cryp->base + CDR_PREP_COUNT(i));
+	writel(MTK_DESC_CNT_CLR, cryp->base + CDR_PROC_COUNT(i));
+
+	writel(0, cryp->base + CDR_PREP_PNTR(i));
+	writel(0, cryp->base + CDR_PROC_PNTR(i));
+	writel(0, cryp->base + CDR_DMA_CFG(i));
+
+	/* Configure command ring host address space */
+	writel(0, cryp->base + CDR_BASE_ADDR_HI(i));
+	writel(cryp->ring[i]->cmd_dma, cryp->base + CDR_BASE_ADDR_LO(i));
+
+	writel(MTK_MAX_RING_SIZE, cryp->base + CDR_RING_SIZE(i));
+
+	/* Clear and disable all CDR interrupts */
+	writel(MTK_CDR_STAT_CLR, cryp->base + CDR_STAT(i));
+
+	/*
+	 * Set command descriptor offset and enable additional
+	 * token present in descriptor.
+	 */
+	val = MTK_DESC_SIZE_SET(MTK_DESC_SIZE) |
+		MTK_DESC_OFFSET_SET(MTK_DESC_OFFSET) |
+		MTK_ATP_PRESENT;
+	writel(val, cryp->base + CDR_DESC_SIZE(i));
+
+	val = MTK_FETCH_SIZE_SET(size) | MTK_FETCH_THRESH_SET(thresh);
+	writel(val, cryp->base + CDR_CFG(i));
+}
+
+static void mtk_res_desc_ring_setup(struct mtk_cryp *cryp,
+				    int i, struct mtk_sys_cap *cap)
+{
+	u32 rndup = 2;
+	u32 count = ((1 << MTK_RES_FIFO_SIZE(cap->hia_opt)) / rndup) - 1;
+	u32 size = count * MTK_DESC_OFFSET;
+	u32 thresh = count * rndup;
+	u32 val;
+
+	writel(0, cryp->base + RDR_CFG(i));
+
+	writel(MTK_DESC_CNT_CLR, cryp->base + RDR_PREP_COUNT(i));
+	writel(MTK_DESC_CNT_CLR, cryp->base + RDR_PROC_COUNT(i));
+
+	writel(0, cryp->base + RDR_PREP_PNTR(i));
+	writel(0, cryp->base + RDR_PROC_PNTR(i));
+	writel(0, cryp->base + RDR_DMA_CFG(i));
+
+	writel(0, cryp->base + RDR_BASE_ADDR_HI(i));
+	writel(cryp->ring[i]->res_dma, cryp->base + RDR_BASE_ADDR_LO(i));
+
+	writel(MTK_MAX_RING_SIZE, cryp->base + RDR_RING_SIZE(i));
+	writel(MTK_RDR_STAT_CLR, cryp->base + RDR_STAT(i));
+
+	/*
+	 * The RDR manager generates update interrupts on a per-completed-packet
+	 * basis, and the rd_proc_thresh_irq interrupt is fired when
+	 * proc_pkt_count for the RDR exceeds the number of packets.
+	 */
+	writel(MTK_RDR_THRESH_DEF, cryp->base + RDR_THRESH(i));
+
+	/*
+	 * Configure a threshold and time-out value for the processed
+	 * result descriptors (or complete packets) that are written to
+	 * the RDR.
+	 */
+	val = MTK_DESC_SIZE_SET(MTK_DESC_SIZE) |
+		MTK_DESC_OFFSET_SET(MTK_DESC_OFFSET);
+	writel(val, cryp->base + RDR_DESC_SIZE(i));
+
+	/*
+	 * Configure HIA fetch size and fetch threshold that are used to
+	 * fetch blocks of multiple descriptors.
+	 */
+	val = MTK_FETCH_SIZE_SET(size) |
+		MTK_FETCH_THRESH_SET(thresh) |
+		MTK_OVL_IRQ_EN;
+	writel(val, cryp->base + RDR_CFG(i));
+}
+
+static int mtk_packet_engine_setup(struct mtk_cryp *cryp)
+{
+	struct mtk_sys_cap cap;
+	int i, err;
+	u32 val;
+
+	cap.hia_ver = readl(cryp->base + HIA_VERSION);
+	cap.hia_opt = readl(cryp->base + HIA_OPTIONS);
+	cap.hw_opt = readl(cryp->base + EIP97_OPTIONS);
+
+	if ((u16)cap.hia_ver != MTK_HIA_SIGNATURE)
+		return -EINVAL;
+
+	/* Configure endianness conversion method for master (DMA) interface */
+	writel(0, cryp->base + EIP97_MST_CTRL);
+
+	/* Set HIA burst size */
+	val = readl(cryp->base + HIA_MST_CTRL);
+	writel(MTK_BURST_SIZE(val, 5), cryp->base + HIA_MST_CTRL);
+
+	err = mtk_dfe_dse_reset(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Failed to reset DFE and DSE.\n");
+		return err;
+	}
+
+	mtk_dfe_dse_buf_setup(cryp, &cap);
+
+	/* Enable the 4 rings for the packet engines. */
+	mtk_desc_ring_link(cryp, 0xf);
+
+	for (i = 0; i < RING_MAX; i++) {
+		mtk_cmd_desc_ring_setup(cryp, i, &cap);
+		mtk_res_desc_ring_setup(cryp, i, &cap);
+	}
+
+	val = MTK_PE_TOKEN_CTRL_DEF | MTK_TOKEN_TIMEOUT_EN;
+	writel(val, cryp->base + PE_TOKEN_CTRL_STAT);
+
+	/* Clear all pending interrupts */
+	writel(MTK_AIC_GLOBAL_CLR, cryp->base + AIC_G_ACK);
+	writel(MTK_PE_INT_CTRL_DEF, cryp->base + PE_INTERRUPT_CTRL_STAT);
+
+	return 0;
+}
+
+static int mtk_aic_cap_check(struct mtk_cryp *cryp, int hw)
+{
+	u32 val;
+
+	if (hw == RING_MAX)
+		val = readl(cryp->base + AIC_G_VERSION);
+	else
+		val = readl(cryp->base + AIC_VERSION(hw));
+
+	val = MTK_AIC_VER_GET(val);
+	if (val != MTK_AIC_VER11 && val != MTK_AIC_VER12)
+		return -ENXIO;
+
+	if (hw == RING_MAX)
+		val = readl(cryp->base + AIC_G_OPTIONS);
+	else
+		val = readl(cryp->base + AIC_OPTIONS(hw));
+
+	val = MTK_AIC_INT_NUM(val);
+	if (!val || val > 32)
+		return -ENXIO;
+
+	return 0;
+}
+
+static int mtk_aic_init(struct mtk_cryp *cryp, int hw)
+{
+	int err;
+
+	err = mtk_aic_cap_check(cryp, hw);
+	if (err)
+		return err;
+
+	/* Disable all interrupts and set initial configuration */
+	if (hw == RING_MAX) {
+		writel(0, cryp->base + AIC_G_ENABLE_CTRL);
+		writel(0, cryp->base + AIC_G_POL_CTRL);
+		writel(0, cryp->base + AIC_G_TYPE_CTRL);
+		writel(0, cryp->base + AIC_G_ENABLE_SET);
+	} else {
+		writel(0, cryp->base + AIC_ENABLE_CTRL(hw));
+		writel(0, cryp->base + AIC_POL_CTRL(hw));
+		writel(0, cryp->base + AIC_TYPE_CTRL(hw));
+		writel(0, cryp->base + AIC_ENABLE_SET(hw));
+	}
+
+	return 0;
+}
+
+static int mtk_accelerator_init(struct mtk_cryp *cryp)
+{
+	int i, err;
+
+	/* Initialize advanced interrupt controller(AIC) */
+	for (i = 0; i < IRQ_NUM; i++) {
+		err = mtk_aic_init(cryp, i);
+		if (err) {
+			dev_err(cryp->dev, "Failed to initialize AIC.\n");
+			return err;
+		}
+	}
+
+	/* Initialize packet engine */
+	err = mtk_packet_engine_setup(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Failed to configure packet engine.\n");
+		return err;
+	}
+
+	return 0;
+}
+
+static void mtk_desc_dma_free(struct mtk_cryp *cryp)
+{
+	int i;
+
+	for (i = 0; i < RING_MAX; i++) {
+		dma_free_coherent(cryp->dev, MTK_MAX_RING_SIZE,
+				  cryp->ring[i]->res_base,
+				  cryp->ring[i]->res_dma);
+		dma_free_coherent(cryp->dev, MTK_MAX_RING_SIZE,
+				  cryp->ring[i]->cmd_base,
+				  cryp->ring[i]->cmd_dma);
+		kfree(cryp->ring[i]);
+	}
+}
+
+static int mtk_desc_ring_alloc(struct mtk_cryp *cryp)
+{
+	struct mtk_ring **ring = cryp->ring;
+	int i, err = -ENOMEM;
+
+	for (i = 0; i < RING_MAX; i++) {
+		ring[i] = kzalloc(sizeof(**ring), GFP_KERNEL);
+		if (!ring[i])
+			goto err_cleanup;
+
+		ring[i]->cmd_base = dma_zalloc_coherent(cryp->dev,
+					   MTK_MAX_RING_SIZE, &ring[i]->cmd_dma,
+					   GFP_KERNEL);
+		if (!ring[i]->cmd_base)
+			goto err_cleanup;
+
+		ring[i]->res_base = dma_zalloc_coherent(cryp->dev,
+					   MTK_MAX_RING_SIZE, &ring[i]->res_dma,
+					   GFP_KERNEL);
+		if (!ring[i]->res_base)
+			goto err_cleanup;
+	}
+	return 0;
+
+err_cleanup:
+	for (; i--; ) {
+		dma_free_coherent(cryp->dev, MTK_MAX_RING_SIZE,
+				  ring[i]->res_base, ring[i]->res_dma);
+		dma_free_coherent(cryp->dev, MTK_MAX_RING_SIZE,
+				  ring[i]->cmd_base, ring[i]->cmd_dma);
+		kfree(ring[i]);
+	}
+	return err;
+}
+
+static int mtk_crypto_probe(struct platform_device *pdev)
+{
+	struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	struct mtk_cryp *cryp;
+	int i, err;
+
+	cryp = devm_kzalloc(&pdev->dev, sizeof(*cryp), GFP_KERNEL);
+	if (!cryp)
+		return -ENOMEM;
+
+	cryp->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(cryp->base))
+		return PTR_ERR(cryp->base);
+
+	for (i = 0; i < IRQ_NUM; i++) {
+		cryp->irq[i] = platform_get_irq(pdev, i);
+		if (cryp->irq[i] < 0) {
+			dev_err(&pdev->dev, "no IRQ:%d resource info\n", i);
+			return -ENXIO;
+		}
+	}
+
+	cryp->clk_ethif = devm_clk_get(&pdev->dev, "ethif");
+	cryp->clk_cryp = devm_clk_get(&pdev->dev, "cryp");
+	if (IS_ERR(cryp->clk_ethif) || IS_ERR(cryp->clk_cryp))
+		return -EPROBE_DEFER;
+
+	cryp->dev = &pdev->dev;
+	pm_runtime_enable(cryp->dev);
+	pm_runtime_get_sync(cryp->dev);
+
+	err = clk_prepare_enable(cryp->clk_ethif);
+	if (err)
+		goto err_clk_ethif;
+
+	err = clk_prepare_enable(cryp->clk_cryp);
+	if (err)
+		goto err_clk_cryp;
+
+	err = mtk_desc_ring_alloc(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Unable to allocate descriptor rings.\n");
+		goto err_resource;
+	}
+
+	err = mtk_accelerator_init(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Failed to initialize cryptographic engine.\n");
+		goto err_engine;
+	}
+
+	err = mtk_cipher_alg_register(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Unable to register MTK-AES.\n");
+		goto err_cipher;
+	}
+
+	err = mtk_hash_alg_register(cryp);
+	if (err) {
+		dev_err(cryp->dev, "Unable to register MTK-SHA.\n");
+		goto err_hash;
+	}
+
+	platform_set_drvdata(pdev, cryp);
+	return 0;
+
+err_hash:
+	mtk_cipher_alg_release(cryp);
+err_cipher:
+	mtk_dfe_dse_reset(cryp);
+err_engine:
+	mtk_desc_dma_free(cryp);
+err_resource:
+	clk_disable_unprepare(cryp->clk_cryp);
+err_clk_cryp:
+	clk_disable_unprepare(cryp->clk_ethif);
+err_clk_ethif:
+	pm_runtime_put_sync(cryp->dev);
+	pm_runtime_disable(cryp->dev);
+
+	return err;
+}
+
+static int mtk_crypto_remove(struct platform_device *pdev)
+{
+	struct mtk_cryp *cryp = platform_get_drvdata(pdev);
+
+	mtk_hash_alg_release(cryp);
+	mtk_cipher_alg_release(cryp);
+	mtk_desc_dma_free(cryp);
+
+	clk_disable_unprepare(cryp->clk_cryp);
+	clk_disable_unprepare(cryp->clk_ethif);
+
+	pm_runtime_put_sync(cryp->dev);
+	pm_runtime_disable(cryp->dev);
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
+static const struct of_device_id of_crypto_id[] = {
+	{ .compatible = "mediatek,mt7623-crypto" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, of_crypto_id);
+
+static struct platform_driver mtk_crypto_driver = {
+	.probe = mtk_crypto_probe,
+	.remove = mtk_crypto_remove,
+	.driver = {
+		   .name = "mtk-crypto",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_crypto_id,
+	},
+};
+module_platform_driver(mtk_crypto_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Ryder Lee <ryder.lee@mediatek.com>");
+MODULE_DESCRIPTION("Cryptographic accelerator driver for MediaTek SoC");
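
Worked example for the fetch-size arithmetic in mtk_cmd_desc_ring_setup()
above. This is a user-space sketch, not driver code: the macros are copied
from the patch, while cdr_count(), cdr_cfg_value() and the HIA_OPTIONS value
0x500 (command FIFO size field = 5, i.e. 32 words) are assumptions made
purely for illustration. The real field value has to be read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: a descriptor slot is 8 words, 6 of which are used. */
#define MTK_DESC_OFFSET		8u
#define MTK_DESC_SIZE		(MTK_DESC_OFFSET - 2)

#define MTK_CMD_FIFO_SIZE(x)	(((x) >> 8) & 0xf)
#define MTK_FETCH_SIZE_SET(x)	((x) << 0)
#define MTK_FETCH_THRESH_SET(x)	((x) << 16)

/* Number of full descriptors that fit in the command FIFO, minus one. */
static uint32_t cdr_count(uint32_t hia_opt)
{
	return ((1u << MTK_CMD_FIFO_SIZE(hia_opt)) / MTK_DESC_SIZE) - 1;
}

/* Value the patch would write to CDR_CFG(i) for a given HIA_OPTIONS word. */
static uint32_t cdr_cfg_value(uint32_t hia_opt)
{
	uint32_t count = cdr_count(hia_opt);
	uint32_t size = count * MTK_DESC_OFFSET;	/* words fetched per block */
	uint32_t thresh = count * MTK_DESC_SIZE;	/* FIFO space needed first */

	return MTK_FETCH_SIZE_SET(size) | MTK_FETCH_THRESH_SET(thresh);
}
```

With the assumed 32-word FIFO this gives count = 32 / 6 - 1 = 4, a fetch
size of 32 words and a threshold of 24 words. A FIFO size field below 3
would make the count wrap around; like the patch, the sketch does not guard
against that.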
diff --git a/drivers/crypto/mediatek/mtk-platform.h b/drivers/crypto/mediatek/mtk-platform.h
new file mode 100644
index 0000000..e9651f1
--- /dev/null
+++ b/drivers/crypto/mediatek/mtk-platform.h
@@ -0,0 +1,235 @@
+/*
+ * Support for MediaTek cryptographic accelerator.
+ *
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ */
+
+#ifndef __MTK_PLATFORM_H_
+#define __MTK_PLATFORM_H_
+
+#include <crypto/internal/hash.h>
+#include <linux/crypto.h>
+#include <linux/interrupt.h>
+
+#define MTK_RDR_THRESH_DEF	0x800001
+
+#define MTK_IRQ_RDR0		BIT(1)
+#define MTK_IRQ_RDR1		BIT(3)
+#define MTK_IRQ_RDR2		BIT(5)
+#define MTK_IRQ_RDR3		BIT(7)
+
+#define MTK_DESC_CNT_CLR	BIT(31)
+#define MTK_DESC_LAST		BIT(22)
+#define MTK_DESC_FIRST		BIT(23)
+#define MTK_DESC_BUF_LEN(x)	((x) & 0x1ffff)
+#define MTK_DESC_CT_LEN(x)	(((x) & 0xff) << 24)
+
+#define SIZE_IN_WORDS(x)	((x) >> 2)
+
+/**
+ * Ring 0/1 are used by AES encrypt and decrypt.
+ * Ring 2/3 are used by SHA.
+ */
+enum {
+	RING0 = 0,
+	RING1,
+	RING2,
+	RING3,
+	RING_MAX,
+};
+
+#define RECORD_NUM		(RING_MAX / 2)
+#define IRQ_NUM			5
+
+/**
+ * struct mtk_desc - DMA descriptor
+ * @hdr:	the descriptor control header
+ * @buf:	DMA address of input buffer segment
+ * @ct:		DMA address of command token that controls operation flow
+ * @ct_hdr:	the command token control header
+ * @tag:	the user-defined field
+ * @tfm:	DMA address of transform state
+ * @bound:	align descriptors offset boundary
+ *
+ * Structure passed to the crypto engine to describe where source
+ * data needs to be fetched and how it needs to be processed.
+ */
+struct mtk_desc {
+	u32 hdr;
+	u32 buf;
+	u32 ct;
+	u32 ct_hdr;
+	u32 tag;
+	u32 tfm;
+	u32 bound[2];
+};
+
+/**
+ * struct mtk_ring - Descriptor ring
+ * @cmd_base:	pointer to command descriptor ring base
+ * @cmd_dma:	DMA address of command descriptor ring
+ * @res_base:	pointer to result descriptor ring base
+ * @res_dma:	DMA address of result descriptor ring
+ * @pos:	current position in the ring
+ *
+ * A descriptor ring is a circular buffer that is used to manage
+ * one or more descriptors. There are two types of descriptor rings:
+ * the command descriptor ring and result descriptor ring.
+ */
+struct mtk_ring {
+	struct mtk_desc *cmd_base;
+	dma_addr_t cmd_dma;
+	struct mtk_desc *res_base;
+	dma_addr_t res_dma;
+	u32 pos;
+};
+
+#define MTK_MAX_DESC_NUM	512
+#define MTK_DESC_OFFSET		SIZE_IN_WORDS(sizeof(struct mtk_desc))
+#define MTK_DESC_SIZE		(MTK_DESC_OFFSET - 2)
+#define MTK_MAX_RING_SIZE	((sizeof(struct mtk_desc) * MTK_MAX_DESC_NUM))
+#define MTK_DESC_CNT(x)		((MTK_DESC_OFFSET * (x)) << 2)
+
+/**
+ * struct mtk_aes_dma - Structure that holds sg list info
+ * @sg:		pointer to scatter-gather list
+ * @nents:	number of entries in the sg list
+ * @remainder:	remainder of sg list
+ * @sg_len:	number of entries in the sg mapped list
+ */
+struct mtk_aes_dma {
+	struct scatterlist *sg;
+	int nents;
+	u32 remainder;
+	u32 sg_len;
+};
+
+/**
+ * struct mtk_aes - AES operation record
+ * @queue:	crypto request queue
+ * @req:	pointer to ablkcipher request
+ * @task:	the tasklet used in the AES interrupt
+ * @src:	the structure that holds source sg list info
+ * @dst:	the structure that holds destination sg list info
+ * @aligned_sg:	the scatterlist used for alignment
+ * @real_dst:	pointer to the destination sg list
+ * @total:	request buffer length
+ * @buf:	pointer to page buffer
+ * @info:	pointer to AES transform state and command token
+ * @ct_hdr:	AES command token control field
+ * @ct_size:	size of AES command token
+ * @ct_dma:	DMA address of AES command token
+ * @tfm_dma:	DMA address of AES transform state
+ * @id:		record identification
+ * @flags:	describes the AES operation state
+ * @lock:	the ablkcipher queue lock
+ *
+ * Structure used to record AES execution state.
+ */
+struct mtk_aes {
+	struct crypto_queue queue;
+	struct ablkcipher_request *req;
+	struct tasklet_struct task;
+	struct mtk_aes_dma src;
+	struct mtk_aes_dma dst;
+
+	struct scatterlist aligned_sg;
+	struct scatterlist *real_dst;
+
+	size_t total;
+	void *buf;
+
+	void *info;
+	u32 ct_hdr;
+	u32 ct_size;
+	dma_addr_t ct_dma;
+	dma_addr_t tfm_dma;
+
+	u8 id;
+	unsigned long flags;
+	/* queue lock */
+	spinlock_t lock;
+};
+
+/**
+ * struct mtk_sha - SHA operation record
+ * @queue:	crypto request queue
+ * @req:	pointer to ahash request
+ * @task:	the tasklet used in the SHA interrupt
+ * @info:	pointer to SHA transform state and command token
+ * @ct_hdr:	SHA command token control field
+ * @ct_size:	size of SHA command token
+ * @ct_dma:	DMA address of SHA command token
+ * @tfm_dma:	DMA address of SHA transform state
+ * @id:		record identification
+ * @flags:	describes the SHA operation state
+ * @lock:	the ahash queue lock
+ *
+ * Structure used to record SHA execution state.
+ */
+struct mtk_sha {
+	struct crypto_queue queue;
+	struct ahash_request *req;
+	struct tasklet_struct task;
+
+	void *info;
+	u32 ct_hdr;
+	u32 ct_size;
+	dma_addr_t ct_dma;
+	dma_addr_t tfm_dma;
+
+	u8 id;
+	unsigned long flags;
+	/* queue lock */
+	spinlock_t lock;
+};
+
+/**
+ * struct mtk_cryp - Cryptographic device
+ * @base:	pointer to mapped register I/O base
+ * @dev:	pointer to device
+ * @clk_ethif:	pointer to ethif clock
+ * @clk_cryp:	pointer to crypto clock
+ * @irq:	global system and rings IRQ
+ * @ring:	pointer to descriptor rings
+ * @aes:	pointer to execution state of AES
+ * @sha:	pointer to execution state of SHA; each record maps to a ring
+ * @aes_list:	device list of AES
+ * @sha_list:	device list of SHA
+ * @tmp:	pointer to temporary buffer for internal use
+ * @tmp_dma:	DMA address of temporary buffer
+ * @rec:	used to select the SHA record for a tfm
+ *
+ * Structure storing cryptographic device information.
+ */
+struct mtk_cryp {
+	void __iomem *base;
+	struct device *dev;
+	struct clk *clk_ethif;
+	struct clk *clk_cryp;
+	int irq[IRQ_NUM];
+
+	struct mtk_ring *ring[RING_MAX];
+	struct mtk_aes *aes[RECORD_NUM];
+	struct mtk_sha *sha[RECORD_NUM];
+
+	struct list_head aes_list;
+	struct list_head sha_list;
+
+	void *tmp;
+	dma_addr_t tmp_dma;
+	bool rec;
+};
+
+int mtk_cipher_alg_register(struct mtk_cryp *cryp);
+void mtk_cipher_alg_release(struct mtk_cryp *cryp);
+int mtk_hash_alg_register(struct mtk_cryp *cryp);
+void mtk_hash_alg_release(struct mtk_cryp *cryp);
+
+#endif /* __MTK_PLATFORM_H_ */
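
Worked example for the descriptor size macros in mtk-platform.h above. The
block is a user-space sketch that mirrors struct mtk_desc with uint32_t in
place of u32 and evaluates the macros as written in the patch. desc_bytes()
is a hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of struct mtk_desc from the patch (u32 -> uint32_t). */
struct mtk_desc {
	uint32_t hdr;
	uint32_t buf;
	uint32_t ct;
	uint32_t ct_hdr;
	uint32_t tag;
	uint32_t tfm;
	uint32_t bound[2];
};

#define SIZE_IN_WORDS(x)	((x) >> 2)
#define MTK_MAX_DESC_NUM	512
#define MTK_DESC_OFFSET		SIZE_IN_WORDS(sizeof(struct mtk_desc))
#define MTK_DESC_SIZE		(MTK_DESC_OFFSET - 2)
#define MTK_MAX_RING_SIZE	((sizeof(struct mtk_desc) * MTK_MAX_DESC_NUM))
#define MTK_DESC_CNT(x)		((MTK_DESC_OFFSET * (x)) << 2)

/* Eight 32-bit words per slot; bound[] only aligns the slot boundary. */
_Static_assert(sizeof(struct mtk_desc) == 32, "descriptor slot is 32 bytes");

/* Hypothetical helper: number of bytes covered by n descriptor slots. */
static size_t desc_bytes(size_t n)
{
	return MTK_DESC_CNT(n);
}
```

So each slot is 8 words apart (MTK_DESC_OFFSET), 6 of those words carry
descriptor data (MTK_DESC_SIZE), and one ring of 512 slots occupies 16 KiB
of coherent DMA memory.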
diff --git a/drivers/crypto/mediatek/mtk-regs.h b/drivers/crypto/mediatek/mtk-regs.h
new file mode 100644
index 0000000..94f4eb8
--- /dev/null
+++ b/drivers/crypto/mediatek/mtk-regs.h
@@ -0,0 +1,194 @@
+/*
+ * Support for MediaTek cryptographic accelerator.
+ *
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ */
+
+#ifndef __MTK_REGS_H__
+#define __MTK_REGS_H__
+
+/* HIA, Command Descriptor Ring Manager */
+#define CDR_BASE_ADDR_LO(x)		(0x0 + ((x) << 12))
+#define CDR_BASE_ADDR_HI(x)		(0x4 + ((x) << 12))
+#define CDR_DATA_BASE_ADDR_LO(x)	(0x8 + ((x) << 12))
+#define CDR_DATA_BASE_ADDR_HI(x)	(0xC + ((x) << 12))
+#define CDR_ACD_BASE_ADDR_LO(x)		(0x10 + ((x) << 12))
+#define CDR_ACD_BASE_ADDR_HI(x)		(0x14 + ((x) << 12))
+#define CDR_RING_SIZE(x)		(0x18 + ((x) << 12))
+#define CDR_DESC_SIZE(x)		(0x1C + ((x) << 12))
+#define CDR_CFG(x)			(0x20 + ((x) << 12))
+#define CDR_DMA_CFG(x)			(0x24 + ((x) << 12))
+#define CDR_THRESH(x)			(0x28 + ((x) << 12))
+#define CDR_PREP_COUNT(x)		(0x2C + ((x) << 12))
+#define CDR_PROC_COUNT(x)		(0x30 + ((x) << 12))
+#define CDR_PREP_PNTR(x)		(0x34 + ((x) << 12))
+#define CDR_PROC_PNTR(x)		(0x38 + ((x) << 12))
+#define CDR_STAT(x)			(0x3C + ((x) << 12))
+
+/* HIA, Result Descriptor Ring Manager */
+#define RDR_BASE_ADDR_LO(x)		(0x800 + ((x) << 12))
+#define RDR_BASE_ADDR_HI(x)		(0x804 + ((x) << 12))
+#define RDR_DATA_BASE_ADDR_LO(x)	(0x808 + ((x) << 12))
+#define RDR_DATA_BASE_ADDR_HI(x)	(0x80C + ((x) << 12))
+#define RDR_ACD_BASE_ADDR_LO(x)		(0x810 + ((x) << 12))
+#define RDR_ACD_BASE_ADDR_HI(x)		(0x814 + ((x) << 12))
+#define RDR_RING_SIZE(x)		(0x818 + ((x) << 12))
+#define RDR_DESC_SIZE(x)		(0x81C + ((x) << 12))
+#define RDR_CFG(x)			(0x820 + ((x) << 12))
+#define RDR_DMA_CFG(x)			(0x824 + ((x) << 12))
+#define RDR_THRESH(x)			(0x828 + ((x) << 12))
+#define RDR_PREP_COUNT(x)		(0x82C + ((x) << 12))
+#define RDR_PROC_COUNT(x)		(0x830 + ((x) << 12))
+#define RDR_PREP_PNTR(x)		(0x834 + ((x) << 12))
+#define RDR_PROC_PNTR(x)		(0x838 + ((x) << 12))
+#define RDR_STAT(x)			(0x83C + ((x) << 12))
+
+/* HIA, Ring AIC */
+#define AIC_POL_CTRL(x)			(0xE000 - ((x) << 12))
+#define AIC_TYPE_CTRL(x)		(0xE004 - ((x) << 12))
+#define AIC_ENABLE_CTRL(x)		(0xE008 - ((x) << 12))
+#define AIC_RAW_STAT(x)			(0xE00C - ((x) << 12))
+#define AIC_ENABLE_SET(x)		(0xE00C - ((x) << 12))
+#define AIC_ENABLED_STAT(x)		(0xE010 - ((x) << 12))
+#define AIC_ACK(x)			(0xE010 - ((x) << 12))
+#define AIC_ENABLE_CLR(x)		(0xE014 - ((x) << 12))
+#define AIC_OPTIONS(x)			(0xE018 - ((x) << 12))
+#define AIC_VERSION(x)			(0xE01C - ((x) << 12))
+
+/* HIA, Global AIC */
+#define AIC_G_POL_CTRL			0xF800
+#define AIC_G_TYPE_CTRL			0xF804
+#define AIC_G_ENABLE_CTRL		0xF808
+#define AIC_G_RAW_STAT			0xF80C
+#define AIC_G_ENABLE_SET		0xF80C
+#define AIC_G_ENABLED_STAT		0xF810
+#define AIC_G_ACK			0xF810
+#define AIC_G_ENABLE_CLR		0xF814
+#define AIC_G_OPTIONS			0xF818
+#define AIC_G_VERSION			0xF81C
+
+/* HIA, Data Fetch Engine */
+#define DFE_CFG				0xF000
+#define DFE_PRIO_0			0xF010
+#define DFE_PRIO_1			0xF014
+#define DFE_PRIO_2			0xF018
+#define DFE_PRIO_3			0xF01C
+
+/* HIA, Data Fetch Engine access monitoring for CDR */
+#define DFE_RING_REGION_LO(x)		(0xF080 + ((x) << 3))
+#define DFE_RING_REGION_HI(x)		(0xF084 + ((x) << 3))
+
+/* HIA, Data Fetch Engine thread control and status for thread */
+#define DFE_THR_CTRL			0xF200
+#define DFE_THR_STAT			0xF204
+#define DFE_THR_DESC_CTRL		0xF208
+#define DFE_THR_DESC_DPTR_LO		0xF210
+#define DFE_THR_DESC_DPTR_HI		0xF214
+#define DFE_THR_DESC_ACDPTR_LO		0xF218
+#define DFE_THR_DESC_ACDPTR_HI		0xF21C
+
+/* HIA, Data Store Engine */
+#define DSE_CFG				0xF400
+#define DSE_PRIO_0			0xF410
+#define DSE_PRIO_1			0xF414
+#define DSE_PRIO_2			0xF418
+#define DSE_PRIO_3			0xF41C
+
+/* HIA, Data Store Engine access monitoring for RDR */
+#define DSE_RING_REGION_LO(x)		(0xF480 + ((x) << 3))
+#define DSE_RING_REGION_HI(x)		(0xF484 + ((x) << 3))
+
+/* HIA, Data Store Engine thread control and status for thread */
+#define DSE_THR_CTRL			0xF600
+#define DSE_THR_STAT			0xF604
+#define DSE_THR_DESC_CTRL		0xF608
+#define DSE_THR_DESC_DPTR_LO		0xF610
+#define DSE_THR_DESC_DPTR_HI		0xF614
+#define DSE_THR_DESC_S_DPTR_LO		0xF618
+#define DSE_THR_DESC_S_DPTR_HI		0xF61C
+#define DSE_THR_ERROR_STAT		0xF620
+
+/* HIA Global */
+#define HIA_MST_CTRL			0xFFF4
+#define HIA_OPTIONS			0xFFF8
+#define HIA_VERSION			0xFFFC
+
+/* Processing Engine Input Side, Processing Engine */
+#define PE_IN_DBUF_THRESH		0x10000
+#define PE_IN_TBUF_THRESH		0x10100
+
+/* Packet Engine Configuration / Status Registers */
+#define PE_TOKEN_CTRL_STAT		0x11000
+#define PE_FUNCTION_EN			0x11004
+#define PE_CONTEXT_CTRL			0x11008
+#define PE_CONTEXT_STAT			0x1100C
+#define PE_INTERRUPT_CTRL_STAT		0x11010
+#define PE_OUT_TRANS_CTRL_STAT		0x11018
+#define PE_OUT_BUF_CTRL			0x1101C
+
+/* Packet Engine PRNG Registers */
+#define PE_PRNG_STAT			0x11040
+#define PE_PRNG_CTRL			0x11044
+#define PE_PRNG_SEED_L			0x11048
+#define PE_PRNG_SEED_H			0x1104C
+#define PE_PRNG_KEY_0_L			0x11050
+#define PE_PRNG_KEY_0_H			0x11054
+#define PE_PRNG_KEY_1_L			0x11058
+#define PE_PRNG_KEY_1_H			0x1105C
+#define PE_PRNG_RES_0			0x11060
+#define PE_PRNG_RES_1			0x11064
+#define PE_PRNG_RES_2			0x11068
+#define PE_PRNG_RES_3			0x1106C
+#define PE_PRNG_LFSR_L			0x11070
+#define PE_PRNG_LFSR_H			0x11074
+
+/* Packet Engine AIC */
+#define PE_EIP96_AIC_POL_CTRL		0x113C0
+#define PE_EIP96_AIC_TYPE_CTRL		0x113C4
+#define PE_EIP96_AIC_ENABLE_CTRL	0x113C8
+#define PE_EIP96_AIC_RAW_STAT		0x113CC
+#define PE_EIP96_AIC_ENABLE_SET		0x113CC
+#define PE_EIP96_AIC_ENABLED_STAT	0x113D0
+#define PE_EIP96_AIC_ACK		0x113D0
+#define PE_EIP96_AIC_ENABLE_CLR		0x113D4
+#define PE_EIP96_AIC_OPTIONS		0x113D8
+#define PE_EIP96_AIC_VERSION		0x113DC
+
+/* Packet Engine Options & Version Registers */
+#define PE_EIP96_OPTIONS		0x113F8
+#define PE_EIP96_VERSION		0x113FC
+
+/* Processing Engine Output Side */
+#define PE_OUT_DBUF_THRESH		0x11C00
+#define PE_OUT_TBUF_THRESH		0x11D00
+
+/* Processing Engine Local AIC */
+#define PE_AIC_POL_CTRL			0x11F00
+#define PE_AIC_TYPE_CTRL		0x11F04
+#define PE_AIC_ENABLE_CTRL		0x11F08
+#define PE_AIC_RAW_STAT			0x11F0C
+#define PE_AIC_ENABLE_SET		0x11F0C
+#define PE_AIC_ENABLED_STAT		0x11F10
+#define PE_AIC_ENABLE_CLR		0x11F14
+#define PE_AIC_OPTIONS			0x11F18
+#define PE_AIC_VERSION			0x11F1C
+
+/* Processing Engine General Configuration and Version */
+#define PE_IN_FLIGHT			0x11FF0
+#define PE_OPTIONS			0x11FF8
+#define PE_VERSION			0x11FFC
+
+/* EIP-97 - Global */
+#define EIP97_CLOCK_STATE		0x1FFE4
+#define EIP97_FORCE_CLOCK_ON		0x1FFE8
+#define EIP97_FORCE_CLOCK_OFF		0x1FFEC
+#define EIP97_MST_CTRL			0x1FFF4
+#define EIP97_OPTIONS			0x1FFF8
+#define EIP97_VERSION			0x1FFFC
+#endif /* __MTK_REGS_H__ */
diff --git a/drivers/crypto/mediatek/mtk-sha.c b/drivers/crypto/mediatek/mtk-sha.c
new file mode 100644
index 0000000..191dee2
--- /dev/null
+++ b/drivers/crypto/mediatek/mtk-sha.c
@@ -0,0 +1,1423 @@
+/*
+ * Cryptographic API.
+ *
+ * Support for MediaTek SHA1/SHA2 hardware accelerator.
+ *
+ * Copyright (c) 2016 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Some ideas are from atmel-sha.c and omap-sham.c drivers.
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/internal/hash.h>
+#include <crypto/scatterwalk.h>
+#include <crypto/sha.h>
+#include <linux/crypto.h>
+#include <linux/dma-mapping.h>
+#include <linux/scatterlist.h>
+#include "mtk-platform.h"
+#include "mtk-regs.h"
+
+#define SHA_ALIGN_MSK		(sizeof(u32) - 1)
+#define SHA_QUEUE_SIZE		512
+#define SHA_TMP_STATE_SIZE	512
+
+#define SHA_DATA_LEN_MSK	GENMASK(16, 0)
+#define SHA_BUFFER_LEN		((u32)PAGE_SIZE)
+
+#define SHA_OP_UPDATE		1
+#define SHA_OP_FINAL		2
+
+/* SHA command token */
+#define SHA_CT_SIZE		5
+#define SHA_CT_CTRL_HDR		0x02220000
+#define SHA_COMMAND0		0x03020000
+#define SHA_COMMAND1		0x21060000
+#define SHA_COMMAND2		0xe0e63802
+
+/* SHA transform information */
+#define SHA_TFM_HASH		(0x2 << 0)
+#define SHA_TFM_DIG_TYPE	(0x1 << 21)
+#define SHA_TFM_SIZE(x)		((x) << 8)
+#define SHA_TFM_START		(0x1 << 4)
+#define SHA_TFM_CONTINUE	(0x1 << 5)
+#define SHA_TFM_HASH_STORE	(0x1 << 19)
+#define SHA_TFM_SHA1		(0x2 << 23)
+#define SHA_TFM_SHA256		(0x3 << 23)
+#define SHA_TFM_SHA224		(0x4 << 23)
+#define SHA_TFM_SHA512		(0x5 << 23)
+#define SHA_TFM_SHA384		(0x6 << 23)
+#define SHA_TFM_DIGEST(x)	(((x) & 0xf) << 24)
+
+/* SHA flags */
+#define SHA_FLAGS_BUSY		BIT(0)
+#define SHA_FLAGS_FINAL		BIT(1)
+#define SHA_FLAGS_FINUP		BIT(2)
+#define SHA_FLAGS_SG		BIT(3)
+#define SHA_FLAGS_ALGO_MASK	GENMASK(8, 4)
+#define SHA_FLAGS_SHA1		BIT(4)
+#define SHA_FLAGS_SHA224	BIT(5)
+#define SHA_FLAGS_SHA256	BIT(6)
+#define SHA_FLAGS_SHA384	BIT(7)
+#define SHA_FLAGS_SHA512	BIT(8)
+#define SHA_FLAGS_HMAC		BIT(9)
+#define SHA_FLAGS_PAD		BIT(10)
+
+/*
+ * The SHA command token (CT) is a set of hardware instructions that
+ * control the engine's SHA processing flow. It also contains the
+ * first two words of the transform state.
+ */
+struct mtk_sha_ct {
+	u32 tfm_ctrl0;
+	u32 tfm_ctrl1;
+	u32 ct_ctrl0;
+	u32 ct_ctrl1;
+	u32 ct_ctrl2;
+};
+
+/*
+ * The SHA transform state (tfm) holds the transform control words
+ * and stores the result digest produced by the crypto engine.
+ */
+struct mtk_sha_tfm {
+	u32 tfm_ctrl0;
+	u32 tfm_ctrl1;
+	/* store result digests */
+	u8 digest[SHA512_DIGEST_SIZE] __aligned(sizeof(u32));
+};
+
+/*
+ * mtk_sha_info consists of the SHA command token and transform state.
+ * Its role is similar to that of mtk_aes_info.
+ */
+struct mtk_sha_info {
+	struct mtk_sha_ct ct;
+	struct mtk_sha_tfm tfm;
+};
+
+struct mtk_sha_reqctx {
+	struct mtk_sha_info info;
+	unsigned long flags;
+	unsigned long op;
+
+	u64 digcnt;
+	bool start;
+	size_t bufcnt;
+	dma_addr_t dma_addr;
+
+	/* walk state */
+	struct scatterlist *sg;
+	u32 offset;	/* offset in current sg */
+	u32 total;	/* total request */
+	size_t ds;
+	size_t bs;
+
+	u8 *buffer;
+};
+
+struct mtk_sha_hmac_ctx {
+	struct crypto_shash	*shash;
+	u8 ipad[SHA512_BLOCK_SIZE] __aligned(sizeof(u32));
+	u8 opad[SHA512_BLOCK_SIZE] __aligned(sizeof(u32));
+};
+
+struct mtk_sha_ctx {
+	struct mtk_cryp *cryp;
+	unsigned long flags;
+	u8 id;
+	u8 buf[SHA_BUFFER_LEN] __aligned(sizeof(u32));
+
+	struct mtk_sha_hmac_ctx	base[0];
+};
+
+struct mtk_sha_drv {
+	struct list_head dev_list;
+	/* device list lock */
+	spinlock_t lock;
+};
+
+static struct mtk_sha_drv mtk_sha = {
+	.dev_list = LIST_HEAD_INIT(mtk_sha.dev_list),
+	.lock = __SPIN_LOCK_UNLOCKED(mtk_sha.lock),
+};
+
+static int mtk_sha_handle_queue(struct mtk_cryp *cryp, u8 id,
+				struct ahash_request *req);
+
+static inline u32 mtk_sha_read(struct mtk_cryp *cryp, u32 offset)
+{
+	return readl_relaxed(cryp->base + offset);
+}
+
+static inline void mtk_sha_write(struct mtk_cryp *cryp,
+				 u32 offset, u32 value)
+{
+	writel_relaxed(value, cryp->base + offset);
+}
+
+static struct mtk_cryp *mtk_sha_find_dev(struct mtk_sha_ctx *tctx)
+{
+	struct mtk_cryp *cryp = NULL;
+	struct mtk_cryp *tmp;
+
+	spin_lock_bh(&mtk_sha.lock);
+	if (!tctx->cryp) {
+		list_for_each_entry(tmp, &mtk_sha.dev_list, sha_list) {
+			cryp = tmp;
+			break;
+		}
+		tctx->cryp = cryp;
+	} else {
+		cryp = tctx->cryp;
+	}
+
+	/*
+	 * Assign the record id to the tfm in round-robin fashion, which
+	 * binds the tfm to the corresponding descriptor rings.
+	 */
+	if (cryp) {
+		tctx->id = cryp->rec;
+		cryp->rec = !cryp->rec;
+	}
+
+	spin_unlock_bh(&mtk_sha.lock);
+
+	return cryp;
+}
+
+static int mtk_sha_append_sg(struct mtk_sha_reqctx *ctx)
+{
+	size_t count;
+
+	while ((ctx->bufcnt < SHA_BUFFER_LEN) && ctx->total) {
+		count = min(ctx->sg->length - ctx->offset, ctx->total);
+		count = min(count, SHA_BUFFER_LEN - ctx->bufcnt);
+
+		if (!count) {
+			/*
+			 * count is zero either because the buffer is full or
+			 * because the sg length is 0. In the latter case,
+			 * check if there is another sg in the list: a 0 length
+			 * sg doesn't necessarily mean the end of the sg list.
+			 */
+			if ((ctx->sg->length == 0) && !sg_is_last(ctx->sg)) {
+				ctx->sg = sg_next(ctx->sg);
+				continue;
+			} else {
+				break;
+			}
+		}
+
+		scatterwalk_map_and_copy(ctx->buffer + ctx->bufcnt, ctx->sg,
+					 ctx->offset, count, 0);
+
+		ctx->bufcnt += count;
+		ctx->offset += count;
+		ctx->total -= count;
+
+		if (ctx->offset == ctx->sg->length) {
+			ctx->sg = sg_next(ctx->sg);
+			if (ctx->sg)
+				ctx->offset = 0;
+			else
+				ctx->total = 0;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * The purpose of this padding is to ensure that the padded message is a
+ * multiple of 512 bits (SHA1/SHA224/SHA256) or 1024 bits (SHA384/SHA512).
+ * The bit "1" is appended at the end of the message followed by
+ * "padlen-1" zero bits. Then a 64 bits block (SHA1/SHA224/SHA256) or
+ * 128 bits block (SHA384/SHA512) equals to the message length in bits
+ * is appended.
+ *
+ * For SHA1/SHA224/SHA256, padlen is calculated as followed:
+ *  - if message length < 56 bytes then padlen = 56 - message length
+ *  - else padlen = 64 + 56 - message length
+ *
+ * For SHA384/SHA512, padlen is calculated as followed:
+ *  - if message length < 112 bytes then padlen = 112 - message length
+ *  - else padlen = 128 + 112 - message length
+ */
+static void mtk_sha_fill_padding(struct mtk_sha_reqctx *ctx, u32 len)
+{
+	u32 index, padlen;
+	__be64 bits[2];
+	u64 size = ctx->digcnt;
+
+	size += ctx->bufcnt;
+	size += len;
+
+	bits[1] = cpu_to_be64(size << 3);
+	bits[0] = cpu_to_be64(size >> 61);
+
+	if (ctx->flags & (SHA_FLAGS_SHA384 | SHA_FLAGS_SHA512)) {
+		index = ctx->bufcnt & 0x7f;
+		padlen = (index < 112) ? (112 - index) : ((128 + 112) - index);
+		*(ctx->buffer + ctx->bufcnt) = 0x80;
+		memset(ctx->buffer + ctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(ctx->buffer + ctx->bufcnt + padlen, bits, 16);
+		ctx->bufcnt += padlen + 16;
+		ctx->flags |= SHA_FLAGS_PAD;
+	} else {
+		index = ctx->bufcnt & 0x3f;
+		padlen = (index < 56) ? (56 - index) : ((64 + 56) - index);
+		*(ctx->buffer + ctx->bufcnt) = 0x80;
+		memset(ctx->buffer + ctx->bufcnt + 1, 0, padlen - 1);
+		memcpy(ctx->buffer + ctx->bufcnt + padlen, &bits[1], 8);
+		ctx->bufcnt += padlen + 8;
+		ctx->flags |= SHA_FLAGS_PAD;
+	}
+}
+
+/* Initialize basic transform information of SHA */
+static void mtk_sha_info_init(struct mtk_sha *sha,
+			      struct mtk_sha_reqctx *ctx)
+{
+	struct mtk_sha_info *info = sha->info;
+	struct mtk_sha_ct *ct = &info->ct;
+	struct mtk_sha_tfm *tfm = &info->tfm;
+
+	sha->ct_hdr = SHA_CT_CTRL_HDR;
+	sha->ct_size = SHA_CT_SIZE;
+
+	tfm->tfm_ctrl0 = SHA_TFM_HASH | SHA_TFM_DIG_TYPE |
+			 SHA_TFM_SIZE(SIZE_IN_WORDS(ctx->ds));
+
+	switch (ctx->flags & SHA_FLAGS_ALGO_MASK) {
+	case SHA_FLAGS_SHA1:
+		tfm->tfm_ctrl0 |= SHA_TFM_SHA1;
+		break;
+	case SHA_FLAGS_SHA224:
+		tfm->tfm_ctrl0 |= SHA_TFM_SHA224;
+		break;
+	case SHA_FLAGS_SHA256:
+		tfm->tfm_ctrl0 |= SHA_TFM_SHA256;
+		break;
+	case SHA_FLAGS_SHA384:
+		tfm->tfm_ctrl0 |= SHA_TFM_SHA384;
+		break;
+	case SHA_FLAGS_SHA512:
+		tfm->tfm_ctrl0 |= SHA_TFM_SHA512;
+		break;
+
+	default:
+		/* Should not happen... */
+		return;
+	}
+
+	tfm->tfm_ctrl1 = SHA_TFM_HASH_STORE;
+	ct->tfm_ctrl0 = tfm->tfm_ctrl0 | SHA_TFM_CONTINUE | SHA_TFM_START;
+	ct->tfm_ctrl1 = tfm->tfm_ctrl1;
+
+	ct->ct_ctrl0 = SHA_COMMAND0;
+	ct->ct_ctrl1 = SHA_COMMAND1;
+	ct->ct_ctrl2 = SHA_COMMAND2 | SHA_TFM_DIGEST(SIZE_IN_WORDS(ctx->ds));
+}
+
+/* Update input data length of transform information and map it. */
+static int mtk_sha_info_map(struct mtk_cryp *cryp,
+			    struct mtk_sha *sha, size_t len)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(sha->req);
+	struct mtk_sha_info *info = sha->info;
+	struct mtk_sha_ct *ct = &info->ct;
+
+	if (ctx->start)
+		ctx->start = false;
+	else
+		ct->tfm_ctrl0 &= ~SHA_TFM_START;
+
+	sha->ct_hdr = (sha->ct_hdr & ~SHA_DATA_LEN_MSK) | len;
+	ct->ct_ctrl0 = (ct->ct_ctrl0 & ~SHA_DATA_LEN_MSK) | len;
+
+	ctx->digcnt += len;
+
+	sha->ct_dma = dma_map_single(cryp->dev, info, sizeof(*info),
+				      DMA_BIDIRECTIONAL);
+	if (unlikely(dma_mapping_error(cryp->dev, sha->ct_dma))) {
+		dev_err(cryp->dev, "dma %zu bytes error\n", sizeof(*info));
+		return -EINVAL;
+	}
+	sha->tfm_dma = sha->ct_dma + sizeof(*ct);
+
+	return 0;
+}
+
+/*
+ * Because of a hardware limitation, we must pre-calculate the inner
+ * and outer digests, which need to be processed first by the engine,
+ * and then apply the result digest to the input message. These complex
+ * hashing procedures limit HMAC performance, so we use a software
+ * fallback.
+ */
+static int mtk_sha_finish_hmac(struct ahash_request *req)
+{
+	struct mtk_sha_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
+	struct mtk_sha_hmac_ctx *bctx = tctx->base;
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	SHASH_DESC_ON_STACK(shash, bctx->shash);
+
+	shash->tfm = bctx->shash;
+	shash->flags = 0; /* not CRYPTO_TFM_REQ_MAY_SLEEP */
+
+	return crypto_shash_init(shash) ?:
+	       crypto_shash_update(shash, bctx->opad, ctx->bs) ?:
+	       crypto_shash_finup(shash, req->result, ctx->ds, req->result);
+}
+
+/* Initialize request context */
+static int mtk_sha_init(struct ahash_request *req)
+{
+	struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+	struct mtk_sha_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	ctx->flags = 0;
+	ctx->ds = crypto_ahash_digestsize(tfm);
+
+	switch (ctx->ds) {
+	case SHA1_DIGEST_SIZE:
+		ctx->flags |= SHA_FLAGS_SHA1;
+		ctx->bs = SHA1_BLOCK_SIZE;
+		break;
+	case SHA224_DIGEST_SIZE:
+		ctx->flags |= SHA_FLAGS_SHA224;
+		ctx->bs = SHA224_BLOCK_SIZE;
+		break;
+	case SHA256_DIGEST_SIZE:
+		ctx->flags |= SHA_FLAGS_SHA256;
+		ctx->bs = SHA256_BLOCK_SIZE;
+		break;
+	case SHA384_DIGEST_SIZE:
+		ctx->flags |= SHA_FLAGS_SHA384;
+		ctx->bs = SHA384_BLOCK_SIZE;
+		break;
+	case SHA512_DIGEST_SIZE:
+		ctx->flags |= SHA_FLAGS_SHA512;
+		ctx->bs = SHA512_BLOCK_SIZE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ctx->bufcnt = 0;
+	ctx->digcnt = 0;
+	ctx->buffer = tctx->buf;
+	ctx->start = true;
+
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		struct mtk_sha_hmac_ctx *bctx = tctx->base;
+
+		memcpy(ctx->buffer, bctx->ipad, ctx->bs);
+		ctx->bufcnt = ctx->bs;
+		ctx->flags |= SHA_FLAGS_HMAC;
+	}
+
+	return 0;
+}
+
+static int mtk_sha_xmit(struct mtk_cryp *cryp, struct mtk_sha *sha,
+			dma_addr_t addr, size_t len)
+{
+	struct mtk_ring *ring = cryp->ring[sha->id];
+	struct mtk_desc *cmd = ring->cmd_base + ring->pos;
+	struct mtk_desc *res = ring->res_base + ring->pos;
+	int err;
+
+	err = mtk_sha_info_map(cryp, sha, len);
+	if (err)
+		return err;
+
+	/* Fill command and result descriptors */
+	res->hdr = MTK_DESC_FIRST | MTK_DESC_LAST |
+		    MTK_DESC_BUF_LEN(len);
+
+	res->buf = cryp->tmp_dma;
+
+	cmd->hdr = MTK_DESC_FIRST | MTK_DESC_LAST |
+		    MTK_DESC_BUF_LEN(len) |
+		    MTK_DESC_CT_LEN(sha->ct_size);
+
+	cmd->buf = addr;
+	cmd->ct = sha->ct_dma;
+	cmd->ct_hdr = sha->ct_hdr;
+	cmd->tfm = sha->tfm_dma;
+
+	if (++ring->pos == MTK_MAX_DESC_NUM)
+		ring->pos = 0;
+
+	/*
+	 * Make sure that all changes to the DMA ring are done before we
+	 * start the engine.
+	 */
+	wmb();
+	/* Start DMA transfer */
+	mtk_sha_write(cryp, RDR_PREP_COUNT(sha->id), MTK_DESC_CNT(1));
+	mtk_sha_write(cryp, CDR_PREP_COUNT(sha->id), MTK_DESC_CNT(1));
+
+	return -EINPROGRESS;
+}
+
+static int mtk_sha_xmit2(struct mtk_cryp *cryp, struct mtk_sha *sha,
+			 struct mtk_sha_reqctx *ctx, size_t len1, size_t len2)
+{
+	struct mtk_ring *ring = cryp->ring[sha->id];
+	struct mtk_desc *cmd = ring->cmd_base + ring->pos;
+	struct mtk_desc *res = ring->res_base + ring->pos;
+	int err;
+
+	err = mtk_sha_info_map(cryp, sha, len1 + len2);
+	if (err)
+		return err;
+
+	/* Fill command and result descriptors */
+	res->hdr = MTK_DESC_BUF_LEN(len1) | MTK_DESC_FIRST;
+	res->buf = cryp->tmp_dma;
+
+	cmd->hdr = MTK_DESC_BUF_LEN(len1) | MTK_DESC_FIRST |
+					MTK_DESC_CT_LEN(sha->ct_size);
+	cmd->buf = sg_dma_address(ctx->sg);
+	cmd->ct = sha->ct_dma;
+	cmd->ct_hdr = sha->ct_hdr;
+	cmd->tfm = sha->tfm_dma;
+
+	if (++ring->pos == MTK_MAX_DESC_NUM)
+		ring->pos = 0;
+
+	cmd = ring->cmd_base + ring->pos;
+	res = ring->res_base + ring->pos;
+
+	res->hdr = MTK_DESC_BUF_LEN(len2) | MTK_DESC_LAST;
+	res->buf = cryp->tmp_dma;
+
+	cmd->hdr = MTK_DESC_BUF_LEN(len2) | MTK_DESC_LAST;
+	cmd->buf = ctx->dma_addr;
+
+	if (++ring->pos == MTK_MAX_DESC_NUM)
+		ring->pos = 0;
+
+	/*
+	 * Make sure that all changes to the DMA ring are done before we
+	 * start the engine.
+	 */
+	wmb();
+	/* Start DMA transfer */
+	mtk_sha_write(cryp, RDR_PREP_COUNT(sha->id), MTK_DESC_CNT(2));
+	mtk_sha_write(cryp, CDR_PREP_COUNT(sha->id), MTK_DESC_CNT(2));
+
+	return -EINPROGRESS;
+}
+
+static int mtk_sha_dma_map(struct mtk_cryp *cryp, struct mtk_sha *sha,
+			   struct mtk_sha_reqctx *ctx, size_t count)
+{
+	ctx->dma_addr = dma_map_single(cryp->dev, ctx->buffer,
+				SHA_BUFFER_LEN, DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(cryp->dev, ctx->dma_addr))) {
+		dev_err(cryp->dev, "dma map error\n");
+		return -EINVAL;
+	}
+
+	ctx->flags &= ~SHA_FLAGS_SG;
+
+	return mtk_sha_xmit(cryp, sha, ctx->dma_addr, count);
+}
+
+static int mtk_sha_update_slow(struct mtk_cryp *cryp, struct mtk_sha *sha)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(sha->req);
+	size_t count;
+	u32 final;
+
+	mtk_sha_append_sg(ctx);
+
+	final = (ctx->flags & SHA_FLAGS_FINUP) && !ctx->total;
+
+	dev_dbg(cryp->dev, "slow: bufcnt: %zu\n", ctx->bufcnt);
+
+	if (final) {
+		sha->flags |= SHA_FLAGS_FINAL;
+		mtk_sha_fill_padding(ctx, 0);
+	}
+
+	if (final || (ctx->bufcnt == SHA_BUFFER_LEN && ctx->total)) {
+		count = ctx->bufcnt;
+		ctx->bufcnt = 0;
+
+		return mtk_sha_dma_map(cryp, sha, ctx, count);
+	}
+	return 0;
+}
+
+static int mtk_sha_update_start(struct mtk_cryp *cryp, struct mtk_sha *sha)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(sha->req);
+	u32 len, final, tail;
+	struct scatterlist *sg;
+
+	if (!ctx->total)
+		return 0;
+
+	if (ctx->bufcnt || ctx->offset)
+		return mtk_sha_update_slow(cryp, sha);
+
+	sg = ctx->sg;
+
+	if (!IS_ALIGNED(sg->offset, sizeof(u32)))
+		return mtk_sha_update_slow(cryp, sha);
+
+	if (!sg_is_last(sg) && !IS_ALIGNED(sg->length, ctx->bs))
+		/* size is not ctx->bs aligned */
+		return mtk_sha_update_slow(cryp, sha);
+
+	len = min(ctx->total, sg->length);
+
+	if (sg_is_last(sg)) {
+		if (!(ctx->flags & SHA_FLAGS_FINUP)) {
+			/* not the final data: length must be ctx->bs aligned */
+			tail = len & (ctx->bs - 1);
+			len -= tail;
+		}
+	}
+
+	ctx->total -= len;
+	ctx->offset = len; /* offset where to start slow */
+
+	final = (ctx->flags & SHA_FLAGS_FINUP) && !ctx->total;
+
+	/* Add padding */
+	if (final) {
+		size_t count;
+
+		tail = len & (ctx->bs - 1);
+		len -= tail;
+		ctx->total += tail;
+		ctx->offset = len; /* offset where to start slow */
+
+		sg = ctx->sg;
+		mtk_sha_append_sg(ctx);
+		mtk_sha_fill_padding(ctx, len);
+
+		ctx->dma_addr = dma_map_single(cryp->dev, ctx->buffer,
+			SHA_BUFFER_LEN, DMA_TO_DEVICE);
+		if (unlikely(dma_mapping_error(cryp->dev, ctx->dma_addr))) {
+			dev_err(cryp->dev, "dma map bytes error\n");
+			return -EINVAL;
+		}
+
+		sha->flags |= SHA_FLAGS_FINAL;
+		count = ctx->bufcnt;
+		ctx->bufcnt = 0;
+
+		if (len == 0) {
+			ctx->flags &= ~SHA_FLAGS_SG;
+			return mtk_sha_xmit(cryp, sha, ctx->dma_addr, count);
+
+		} else {
+			ctx->sg = sg;
+			if (!dma_map_sg(cryp->dev, ctx->sg, 1, DMA_TO_DEVICE)) {
+				dev_err(cryp->dev, "dma_map_sg error\n");
+				return -EINVAL;
+			}
+
+			ctx->flags |= SHA_FLAGS_SG;
+			return mtk_sha_xmit2(cryp, sha, ctx, len, count);
+		}
+	}
+
+	if (!dma_map_sg(cryp->dev, ctx->sg, 1, DMA_TO_DEVICE)) {
+		dev_err(cryp->dev, "dma_map_sg error\n");
+		return -EINVAL;
+	}
+
+	ctx->flags |= SHA_FLAGS_SG;
+
+	return mtk_sha_xmit(cryp, sha, sg_dma_address(ctx->sg), len);
+}
+
+static int mtk_sha_final_req(struct mtk_cryp *cryp, struct mtk_sha *sha)
+{
+	struct ahash_request *req = sha->req;
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+	size_t count;
+
+	mtk_sha_fill_padding(ctx, 0);
+
+	sha->flags |= SHA_FLAGS_FINAL;
+	count = ctx->bufcnt;
+	ctx->bufcnt = 0;
+
+	return mtk_sha_dma_map(cryp, sha, ctx, count);
+}
+
+/* copy ready hash (+ finalize hmac) */
+static int mtk_sha_finish(struct ahash_request *req)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+	u8 *digest = ctx->info.tfm.digest;
+
+	memcpy(req->result, digest, ctx->ds);
+
+	if (ctx->flags & SHA_FLAGS_HMAC)
+		return mtk_sha_finish_hmac(req);
+
+	return 0;
+}
+
+static void mtk_sha_finish_req(struct mtk_cryp *cryp,
+			       struct mtk_sha *sha, int err)
+{
+	if (likely(!err && (SHA_FLAGS_FINAL & sha->flags)))
+		err = mtk_sha_finish(sha->req);
+
+	sha->flags &= ~(SHA_FLAGS_BUSY | SHA_FLAGS_FINAL);
+
+	sha->req->base.complete(&sha->req->base, err);
+
+	/* handle new request */
+	mtk_sha_handle_queue(cryp, sha->id - RING2, NULL);
+}
+
+static int mtk_sha_handle_queue(struct mtk_cryp *cryp, u8 id,
+				struct ahash_request *req)
+{
+	struct mtk_sha *sha = cryp->sha[id];
+	struct crypto_async_request *async_req, *backlog;
+	struct mtk_sha_reqctx *ctx;
+	unsigned long flags;
+	int err = 0, ret = 0;
+
+	spin_lock_irqsave(&sha->lock, flags);
+	if (req)
+		ret = ahash_enqueue_request(&sha->queue, req);
+
+	if (SHA_FLAGS_BUSY & sha->flags) {
+		spin_unlock_irqrestore(&sha->lock, flags);
+		return ret;
+	}
+
+	backlog = crypto_get_backlog(&sha->queue);
+	async_req = crypto_dequeue_request(&sha->queue);
+	if (async_req)
+		sha->flags |= SHA_FLAGS_BUSY;
+	spin_unlock_irqrestore(&sha->lock, flags);
+
+	if (!async_req)
+		return ret;
+
+	if (backlog)
+		backlog->complete(backlog, -EINPROGRESS);
+
+	req = ahash_request_cast(async_req);
+	ctx = ahash_request_ctx(req);
+
+	sha->req = req;
+	sha->info = &ctx->info;
+
+	mtk_sha_info_init(sha, ctx);
+
+	if (ctx->op == SHA_OP_UPDATE) {
+		err = mtk_sha_update_start(cryp, sha);
+		if (err != -EINPROGRESS && (ctx->flags & SHA_FLAGS_FINUP))
+			/* no final() after finup() */
+			err = mtk_sha_final_req(cryp, sha);
+	} else if (ctx->op == SHA_OP_FINAL) {
+		err = mtk_sha_final_req(cryp, sha);
+	}
+
+	if (unlikely(err != -EINPROGRESS))
+		/* task will not finish it, so do it here */
+		mtk_sha_finish_req(cryp, sha, err);
+
+	return ret;
+}
+
+static int mtk_sha_enqueue(struct ahash_request *req, u32 op)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+	struct mtk_sha_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
+
+	ctx->op = op;
+
+	return mtk_sha_handle_queue(tctx->cryp, tctx->id, req);
+}
+
+static void mtk_sha_unmap(struct mtk_cryp *cryp, struct mtk_sha *sha)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(sha->req);
+
+	dma_unmap_single(cryp->dev, sha->ct_dma,
+			 sizeof(struct mtk_sha_info), DMA_BIDIRECTIONAL);
+
+	if (ctx->flags & SHA_FLAGS_SG) {
+		dma_unmap_sg(cryp->dev, ctx->sg, 1, DMA_TO_DEVICE);
+		if (ctx->sg->length == ctx->offset) {
+			ctx->sg = sg_next(ctx->sg);
+			if (ctx->sg)
+				ctx->offset = 0;
+		}
+		if (ctx->flags & SHA_FLAGS_PAD) {
+			dma_unmap_single(cryp->dev, ctx->dma_addr,
+					 SHA_BUFFER_LEN, DMA_TO_DEVICE);
+		}
+	} else {
+		dma_unmap_single(cryp->dev, ctx->dma_addr,
+				 SHA_BUFFER_LEN, DMA_TO_DEVICE);
+	}
+}
+
+static void mtk_sha_complete(struct mtk_cryp *cryp, struct mtk_sha *sha)
+{
+	int err = 0;
+
+	err = mtk_sha_update_start(cryp, sha);
+	if (err != -EINPROGRESS)
+		mtk_sha_finish_req(cryp, sha, err);
+}
+
+static int mtk_sha_update(struct ahash_request *req)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	ctx->total = req->nbytes;
+	ctx->sg = req->src;
+	ctx->offset = 0;
+
+	if ((ctx->bufcnt + ctx->total < SHA_BUFFER_LEN) &&
+	    !(ctx->flags & SHA_FLAGS_FINUP))
+		return mtk_sha_append_sg(ctx);
+
+	return mtk_sha_enqueue(req, SHA_OP_UPDATE);
+}
+
+static int mtk_sha_final(struct ahash_request *req)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	ctx->flags |= SHA_FLAGS_FINUP;
+
+	if (ctx->flags & SHA_FLAGS_PAD)
+		return mtk_sha_finish(req);
+
+	return mtk_sha_enqueue(req, SHA_OP_FINAL);
+}
+
+static int mtk_sha_finup(struct ahash_request *req)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+	int err1, err2;
+
+	ctx->flags |= SHA_FLAGS_FINUP;
+
+	err1 = mtk_sha_update(req);
+	if (err1 == -EINPROGRESS || err1 == -EBUSY)
+		return err1;
+	/*
+	 * final() must always be called to clean up resources,
+	 * even if update() failed
+	 */
+	err2 = mtk_sha_final(req);
+
+	return err1 ?: err2;
+}
+
+static int mtk_sha_digest(struct ahash_request *req)
+{
+	return mtk_sha_init(req) ?: mtk_sha_finup(req);
+}
+
+static int mtk_sha_setkey(struct crypto_ahash *tfm,
+			  const unsigned char *key, u32 keylen)
+{
+	struct mtk_sha_ctx *tctx = crypto_ahash_ctx(tfm);
+	struct mtk_sha_hmac_ctx *bctx = tctx->base;
+	size_t bs = crypto_shash_blocksize(bctx->shash);
+	size_t ds = crypto_shash_digestsize(bctx->shash);
+	int err, i;
+
+	SHASH_DESC_ON_STACK(shash, bctx->shash);
+
+	shash->tfm = bctx->shash;
+	shash->flags = crypto_shash_get_flags(bctx->shash) &
+			CRYPTO_TFM_REQ_MAY_SLEEP;
+
+	if (keylen > bs) {
+		err = crypto_shash_digest(shash, key, keylen, bctx->ipad);
+		if (err)
+			return err;
+		keylen = ds;
+	} else {
+		memcpy(bctx->ipad, key, keylen);
+	}
+
+	memset(bctx->ipad + keylen, 0, bs - keylen);
+	memcpy(bctx->opad, bctx->ipad, bs);
+
+	for (i = 0; i < bs; i++) {
+		bctx->ipad[i] ^= 0x36;
+		bctx->opad[i] ^= 0x5c;
+	}
+
+	return 0;
+}
+
+static int mtk_sha_export(struct ahash_request *req, void *out)
+{
+	const struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	memcpy(out, ctx, sizeof(*ctx));
+	return 0;
+}
+
+static int mtk_sha_import(struct ahash_request *req, const void *in)
+{
+	struct mtk_sha_reqctx *ctx = ahash_request_ctx(req);
+
+	memcpy(ctx, in, sizeof(*ctx));
+	return 0;
+}
+
+static int mtk_sha_cra_init_alg(struct crypto_tfm *tfm,
+				const char *alg_base)
+{
+	struct mtk_sha_ctx *tctx = crypto_tfm_ctx(tfm);
+	struct mtk_cryp *cryp = NULL;
+
+	cryp = mtk_sha_find_dev(tctx);
+	if (!cryp)
+		return -ENODEV;
+
+	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
+				 sizeof(struct mtk_sha_reqctx));
+
+	if (alg_base) {
+		struct mtk_sha_hmac_ctx *bctx = tctx->base;
+
+		tctx->flags |= SHA_FLAGS_HMAC;
+		bctx->shash = crypto_alloc_shash(alg_base, 0,
+					CRYPTO_ALG_NEED_FALLBACK);
+		if (IS_ERR(bctx->shash)) {
+			pr_err("base driver %s could not be loaded.\n",
+			       alg_base);
+
+			return PTR_ERR(bctx->shash);
+		}
+	}
+	return 0;
+}
+
+static int mtk_sha_cra_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, NULL);
+}
+
+static int mtk_sha_cra_sha1_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, "sha1");
+}
+
+static int mtk_sha_cra_sha224_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, "sha224");
+}
+
+static int mtk_sha_cra_sha256_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, "sha256");
+}
+
+static int mtk_sha_cra_sha384_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, "sha384");
+}
+
+static int mtk_sha_cra_sha512_init(struct crypto_tfm *tfm)
+{
+	return mtk_sha_cra_init_alg(tfm, "sha512");
+}
+
+static void mtk_sha_cra_exit(struct crypto_tfm *tfm)
+{
+	struct mtk_sha_ctx *tctx = crypto_tfm_ctx(tfm);
+
+	if (tctx->flags & SHA_FLAGS_HMAC) {
+		struct mtk_sha_hmac_ctx *bctx = tctx->base;
+
+		crypto_free_shash(bctx->shash);
+	}
+}
+
+static struct ahash_alg algs_sha1_sha224_sha256[] = {
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.halg.digestsize	= SHA1_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "sha1",
+		.cra_driver_name	= "mtk-sha1",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC,
+		.cra_blocksize		= SHA1_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.halg.digestsize	= SHA224_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "sha224",
+		.cra_driver_name	= "mtk-sha224",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC,
+		.cra_blocksize		= SHA224_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.halg.digestsize	= SHA256_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "sha256",
+		.cra_driver_name	= "mtk-sha256",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC,
+		.cra_blocksize		= SHA256_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.setkey		= mtk_sha_setkey,
+	.halg.digestsize	= SHA1_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "hmac(sha1)",
+		.cra_driver_name	= "mtk-hmac-sha1",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_blocksize		= SHA1_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx) +
+					sizeof(struct mtk_sha_hmac_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_sha1_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.setkey		= mtk_sha_setkey,
+	.halg.digestsize	= SHA224_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "hmac(sha224)",
+		.cra_driver_name	= "mtk-hmac-sha224",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_blocksize		= SHA224_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx) +
+					sizeof(struct mtk_sha_hmac_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_sha224_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.setkey		= mtk_sha_setkey,
+	.halg.digestsize	= SHA256_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "hmac(sha256)",
+		.cra_driver_name	= "mtk-hmac-sha256",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_blocksize		= SHA256_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx) +
+					sizeof(struct mtk_sha_hmac_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_sha256_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+};
+
+static struct ahash_alg algs_sha384_sha512[] = {
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.halg.digestsize	= SHA384_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "sha384",
+		.cra_driver_name	= "mtk-sha384",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC,
+		.cra_blocksize		= SHA384_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.halg.digestsize	= SHA512_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "sha512",
+		.cra_driver_name	= "mtk-sha512",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC,
+		.cra_blocksize		= SHA512_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.setkey		= mtk_sha_setkey,
+	.halg.digestsize	= SHA384_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "hmac(sha384)",
+		.cra_driver_name	= "mtk-hmac-sha384",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_blocksize		= SHA384_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx) +
+					sizeof(struct mtk_sha_hmac_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_sha384_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+{
+	.init		= mtk_sha_init,
+	.update		= mtk_sha_update,
+	.final		= mtk_sha_final,
+	.finup		= mtk_sha_finup,
+	.digest		= mtk_sha_digest,
+	.export		= mtk_sha_export,
+	.import		= mtk_sha_import,
+	.setkey		= mtk_sha_setkey,
+	.halg.digestsize	= SHA512_DIGEST_SIZE,
+	.halg.statesize = sizeof(struct mtk_sha_reqctx),
+	.halg.base	= {
+		.cra_name		= "hmac(sha512)",
+		.cra_driver_name	= "mtk-hmac-sha512",
+		.cra_priority		= 400,
+		.cra_flags		= CRYPTO_ALG_ASYNC |
+					  CRYPTO_ALG_NEED_FALLBACK,
+		.cra_blocksize		= SHA512_BLOCK_SIZE,
+		.cra_ctxsize		= sizeof(struct mtk_sha_ctx) +
+					sizeof(struct mtk_sha_hmac_ctx),
+		.cra_alignmask		= SHA_ALIGN_MSK,
+		.cra_module		= THIS_MODULE,
+		.cra_init		= mtk_sha_cra_sha512_init,
+		.cra_exit		= mtk_sha_cra_exit,
+	}
+},
+};
+
+static void mtk_sha_task0(unsigned long data)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)data;
+	struct mtk_sha *sha = cryp->sha[0];
+
+	mtk_sha_unmap(cryp, sha);
+	mtk_sha_complete(cryp, sha);
+}
+
+static void mtk_sha_task1(unsigned long data)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)data;
+	struct mtk_sha *sha = cryp->sha[1];
+
+	mtk_sha_unmap(cryp, sha);
+	mtk_sha_complete(cryp, sha);
+}
+
+static irqreturn_t mtk_sha_ring2_irq(int irq, void *dev_id)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)dev_id;
+	struct mtk_sha *sha = cryp->sha[0];
+	u32 val = mtk_sha_read(cryp, RDR_STAT(RING2));
+
+	mtk_sha_write(cryp, RDR_STAT(RING2), val);
+
+	if (likely((SHA_FLAGS_BUSY & sha->flags))) {
+		mtk_sha_write(cryp, RDR_PROC_COUNT(RING2), MTK_DESC_CNT_CLR);
+		mtk_sha_write(cryp, RDR_THRESH(RING2), MTK_RDR_THRESH_DEF);
+
+		tasklet_schedule(&sha->task);
+	} else {
+		dev_warn(cryp->dev, "SHA interrupt when no active requests.\n");
+	}
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t mtk_sha_ring3_irq(int irq, void *dev_id)
+{
+	struct mtk_cryp *cryp = (struct mtk_cryp *)dev_id;
+	struct mtk_sha *sha = cryp->sha[1];
+	u32 val = mtk_sha_read(cryp, RDR_STAT(RING3));
+
+	mtk_sha_write(cryp, RDR_STAT(RING3), val);
+
+	if (likely((SHA_FLAGS_BUSY & sha->flags))) {
+		mtk_sha_write(cryp, RDR_PROC_COUNT(RING3), MTK_DESC_CNT_CLR);
+		mtk_sha_write(cryp, RDR_THRESH(RING3), MTK_RDR_THRESH_DEF);
+
+		tasklet_schedule(&sha->task);
+	} else {
+		dev_warn(cryp->dev, "SHA interrupt when no active requests.\n");
+	}
+	return IRQ_HANDLED;
+}
+
+/*
+ * Two SHA records are used to get extra performance.
+ * It is similar to mtk_aes_record_init().
+ */
+static int mtk_sha_record_init(struct mtk_cryp *cryp)
+{
+	struct mtk_sha **sha = cryp->sha;
+	int i, err = -ENOMEM;
+
+	for (i = 0; i < RECORD_NUM; i++) {
+		sha[i] = kzalloc(sizeof(**sha), GFP_KERNEL);
+		if (!sha[i])
+			goto err_cleanup;
+
+		sha[i]->id = i + RING2;
+
+		spin_lock_init(&sha[i]->lock);
+		crypto_init_queue(&sha[i]->queue, SHA_QUEUE_SIZE);
+	}
+
+	tasklet_init(&sha[0]->task, mtk_sha_task0, (unsigned long)cryp);
+	tasklet_init(&sha[1]->task, mtk_sha_task1, (unsigned long)cryp);
+
+	cryp->rec = 1;
+
+	return 0;
+
+err_cleanup:
+	for (; i--; )
+		kfree(sha[i]);
+	return err;
+}
+
+static void mtk_sha_record_free(struct mtk_cryp *cryp)
+{
+	int i;
+
+	for (i = 0; i < RECORD_NUM; i++) {
+		tasklet_kill(&cryp->sha[i]->task);
+		kfree(cryp->sha[i]);
+	}
+}
+
+static void mtk_sha_unregister_algs(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(algs_sha1_sha224_sha256); i++)
+		crypto_unregister_ahash(&algs_sha1_sha224_sha256[i]);
+
+	for (i = 0; i < ARRAY_SIZE(algs_sha384_sha512); i++)
+		crypto_unregister_ahash(&algs_sha384_sha512[i]);
+}
+
+static int mtk_sha_register_algs(void)
+{
+	int err, i;
+
+	for (i = 0; i < ARRAY_SIZE(algs_sha1_sha224_sha256); i++) {
+		err = crypto_register_ahash(&algs_sha1_sha224_sha256[i]);
+		if (err)
+			goto err_sha_224_256_algs;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(algs_sha384_sha512); i++) {
+		err = crypto_register_ahash(&algs_sha384_sha512[i]);
+		if (err)
+			goto err_sha_384_512_algs;
+	}
+
+	return 0;
+
+err_sha_384_512_algs:
+	for (; i--; )
+		crypto_unregister_ahash(&algs_sha384_sha512[i]);
+	i = ARRAY_SIZE(algs_sha1_sha224_sha256);
+err_sha_224_256_algs:
+	for (; i--; )
+		crypto_unregister_ahash(&algs_sha1_sha224_sha256[i]);
+
+	return err;
+}
+
+int mtk_hash_alg_register(struct mtk_cryp *cryp)
+{
+	int err;
+
+	INIT_LIST_HEAD(&cryp->sha_list);
+
+	/* Initialize two hash records */
+	err = mtk_sha_record_init(cryp);
+	if (err)
+		goto err_record;
+
+	/* Ring2 irq is used by SHA record0 */
+	err = devm_request_irq(cryp->dev, cryp->irq[RING2],
+			       mtk_sha_ring2_irq, IRQF_TRIGGER_LOW,
+			       "mtk-sha", cryp);
+	if (err) {
+		dev_err(cryp->dev, "unable to request sha irq0.\n");
+		goto err_res;
+	}
+
+	/* Ring3 irq is used by SHA record1 */
+	err = devm_request_irq(cryp->dev, cryp->irq[RING3],
+			       mtk_sha_ring3_irq, IRQF_TRIGGER_LOW,
+			       "mtk-sha", cryp);
+	if (err) {
+		dev_err(cryp->dev, "unable to request sha irq1.\n");
+		goto err_res;
+	}
+
+	/* enable ring2 and ring3 interrupt for hash */
+	mtk_sha_write(cryp, AIC_ENABLE_SET(RING2), MTK_IRQ_RDR2);
+	mtk_sha_write(cryp, AIC_ENABLE_SET(RING3), MTK_IRQ_RDR3);
+
+	cryp->tmp = dma_alloc_coherent(cryp->dev, SHA_TMP_STATE_SIZE,
+					&cryp->tmp_dma, GFP_KERNEL);
+	if (!cryp->tmp) {
+		dev_err(cryp->dev, "unable to allocate tmp buffer.\n");
+		err = -ENOMEM;
+		goto err_res;
+	}
+
+	spin_lock(&mtk_sha.lock);
+	list_add_tail(&cryp->sha_list, &mtk_sha.dev_list);
+	spin_unlock(&mtk_sha.lock);
+
+	err = mtk_sha_register_algs();
+	if (err)
+		goto err_algs;
+
+	return 0;
+
+err_algs:
+	spin_lock(&mtk_sha.lock);
+	list_del(&cryp->sha_list);
+	spin_unlock(&mtk_sha.lock);
+	dma_free_coherent(cryp->dev, SHA_TMP_STATE_SIZE,
+			  cryp->tmp, cryp->tmp_dma);
+err_res:
+	mtk_sha_record_free(cryp);
+err_record:
+
+	dev_err(cryp->dev, "mtk-sha initialization failed.\n");
+	return err;
+}
+
+void mtk_hash_alg_release(struct mtk_cryp *cryp)
+{
+	spin_lock(&mtk_sha.lock);
+	list_del(&cryp->sha_list);
+	spin_unlock(&mtk_sha.lock);
+
+	mtk_sha_unregister_algs();
+	dma_free_coherent(cryp->dev, SHA_TMP_STATE_SIZE,
+			  cryp->tmp, cryp->tmp_dma);
+	mtk_sha_record_free(cryp);
+}
-- 
1.9.1
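
For readers following the error handling in mtk_sha_register_algs() above:
the "for (; i--; )" loops unwind only the entries that were registered
before the failing one. Below is a minimal user-space sketch of that idiom.
fake_register(), fake_unregister(), reset_state(), fail_at and NUM_ALGS are
hypothetical stand-ins for illustration; they are not part of the driver or
of the kernel crypto API.

```c
#include <assert.h>

#define NUM_ALGS 4	/* hypothetical table size */

/* Hypothetical stand-ins for crypto_register_ahash()/crypto_unregister_ahash(). */
static int registered[NUM_ALGS];
static int fail_at = -1;	/* index whose registration fails, or -1 for none */

static void reset_state(int fail_index)
{
	int i;

	for (i = 0; i < NUM_ALGS; i++)
		registered[i] = 0;
	fail_at = fail_index;
}

static int fake_register(int idx)
{
	if (idx == fail_at)
		return -1;
	registered[idx] = 1;
	return 0;
}

static void fake_unregister(int idx)
{
	assert(registered[idx]);	/* only undo what was actually done */
	registered[idx] = 0;
}

/* Register every entry; on failure, unwind the ones already registered. */
static int register_all(void)
{
	int err, i;

	for (i = 0; i < NUM_ALGS; i++) {
		err = fake_register(i);
		if (err)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* i is the failing index: "i--" tests the old value, then steps to i - 1 */
	for (; i--; )
		fake_unregister(i);
	return err;
}

static int count_registered(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_ALGS; i++)
		n += registered[i];
	return n;
}
```

With reset_state(2) followed by register_all(), the call returns -1 and
entries 1 and 0 are unregistered in that order, so nothing stays registered.
A failure at index 0 skips the unwind loop entirely.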

^ permalink raw reply related

* [PATCH v1 0/2] Add MediaTek crypto accelerator driver
From: Ryder Lee @ 2016-12-05  7:01 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller, Matthias Brugger
  Cc: devicetree, linux-mediatek, linux-kernel, linux-crypto,
	linux-arm-kernel, Sean Wang, Roy Luo, Ryder Lee

Hello,

This adds support for the MediaTek hardware accelerator on the
mt7623 SoC.

This driver currently implements:
- SHA1 and SHA2 family (HMAC) hash algorithms.
- AES block cipher in CBC/ECB mode with 128/192/256-bit keys.
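
As a side note on the key sizes: AES defines exactly three key lengths,
128, 192 and 256 bits (16, 24 and 32 bytes). A minimal user-space sketch of
the length check a setkey path would typically perform is shown below. The
names aes_keylen_valid() and aes_key_bits() are hypothetical and are not
taken from this driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if keylen_bytes is one of the three AES key sizes. */
static bool aes_keylen_valid(size_t keylen_bytes)
{
	switch (keylen_bytes) {
	case 16:	/* AES-128 */
	case 24:	/* AES-192 */
	case 32:	/* AES-256 */
		return true;
	default:
		return false;
	}
}

/* Convert an already validated key length from bytes to bits. */
static unsigned int aes_key_bits(size_t keylen_bytes)
{
	assert(aes_keylen_valid(keylen_bytes));
	return (unsigned int)(keylen_bytes * 8);
}
```

Any other length, for example 25 bytes, is rejected by aes_keylen_valid().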

Changes since v1:
- remove EXPORT_SYMBOL
- remove unused PRNG setting
- sort headers in alphabetical order
- add a definition for IRQ number
- replace ambiguous definition
- add more annotation and function comment
- add COMPILE_TEST in Kconfig


Ryder Lee (2):
  Add crypto driver support for some MediaTek chips
  crypto: mediatek - add DT bindings documentation

 .../devicetree/bindings/crypto/mediatek-crypto.txt |   32 +
 drivers/crypto/Kconfig                             |   17 +
 drivers/crypto/Makefile                            |    1 +
 drivers/crypto/mediatek/Makefile                   |    2 +
 drivers/crypto/mediatek/mtk-aes.c                  |  763 +++++++++++
 drivers/crypto/mediatek/mtk-platform.c             |  580 ++++++++
 drivers/crypto/mediatek/mtk-platform.h             |  235 ++++
 drivers/crypto/mediatek/mtk-regs.h                 |  194 +++
 drivers/crypto/mediatek/mtk-sha.c                  | 1423 ++++++++++++++++++++
 9 files changed, 3247 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
 create mode 100644 drivers/crypto/mediatek/Makefile
 create mode 100644 drivers/crypto/mediatek/mtk-aes.c
 create mode 100644 drivers/crypto/mediatek/mtk-platform.c
 create mode 100644 drivers/crypto/mediatek/mtk-platform.h
 create mode 100644 drivers/crypto/mediatek/mtk-regs.h
 create mode 100644 drivers/crypto/mediatek/mtk-sha.c

-- 
1.9.1

^ permalink raw reply

* Re: [PATCH] crypto: rsa - fix a potential race condition in build
From: Herbert Xu @ 2016-12-05  6:48 UTC (permalink / raw)
  To: Yang Shi; +Cc: davem, linux-crypto, linux-kernel
In-Reply-To: <1480722064-31714-1-git-send-email-yang.shi@windriver.com>

On Fri, Dec 02, 2016 at 03:41:04PM -0800, Yang Shi wrote:
> When building the kernel with RSA enabled in a multithreaded build, the
> compile failure below might occur:
> 
> | /buildarea/kernel-source/crypto/rsa_helper.c:18:28: fatal error: rsapubkey-asn1.h: No such file or directory
> | #include "rsapubkey-asn1.h"
> | ^
> | compilation terminated.
> | CC crypto/rsa-pkcs1pad.o
> | CC crypto/algboss.o
> | CC crypto/testmgr.o
> | make[3]: *** [/buildarea/kernel-source/scripts/Makefile.build:289: crypto/rsa_helper.o] Error 1
> | make[3]: *** Waiting for unfinished jobs....
> | make[2]: *** [/buildarea/kernel-source/Makefile:969: crypto] Error 2
> | make[1]: *** [Makefile:150: sub-make] Error 2
> | make: *** [Makefile:24: __sub-make] Error 2
> 
> The header file is not generated before rsa_helper is compiled, so
> add a dependency to avoid this issue.
> 
> Signed-off-by: Yang Shi <yang.shi@windriver.com>

This should already be fixed in the latest crypto tree.  Could
you please double-check?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Crypto Fixes for 4.9
From: Herbert Xu @ 2016-12-05  6:37 UTC (permalink / raw)
  To: Linus Torvalds, David S. Miller, Linux Kernel Mailing List,
	Linux Crypto Mailing List
In-Reply-To: <20161119102748.GA4277@gondor.apana.org.au>

Hi Linus:

This push fixes the following issues:

- Intermittent build failure in RSA.
- Memory corruption in chelsio crypto driver.
- Regression in DRBG due to vmalloced stack.


Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus


David Michael (1):
      crypto: rsa - Add Makefile dependencies to fix parallel builds

Harsh Jain (1):
      crypto: chcr - Fix memory corruption

Stephan Mueller (1):
      crypto: drbg - prevent invalid SG mappings

 crypto/Makefile                    |    1 +
 crypto/drbg.c                      |   29 ++++++++++++++++++++++++-----
 drivers/crypto/chelsio/chcr_algo.h |    3 ++-
 include/crypto/drbg.h              |    2 ++
 4 files changed, 29 insertions(+), 6 deletions(-)

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* RE: [PATCH v5 1/1] crypto: add virtio-crypto driver
From: Gonglei (Arei) @ 2016-12-05  3:12 UTC (permalink / raw)
  To: kbuild test robot, sam@ravnborg.org, davem@davemloft.net
  Cc: kbuild-all@01.org, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org, Luonengjun, mst@redhat.com,
	stefanha@redhat.com, Huangweidong (C), Wubin (H),
	xin.zeng@intel.com, Claudio Fontana, herbert@gondor.apana.org.au,
	pasic@linux.vnet.ibm.com
In-Reply-To: <201612041032.loAEWLIy%fengguang.wu@intel.com>

I don't think the root cause of those warnings was introduced by the virtio-crypto driver.

What's your opinion, Sam and David?

Thanks,
-Gonglei


> -----Original Message-----
> From: kbuild test robot [mailto:lkp@intel.com]
> Sent: Sunday, December 04, 2016 10:40 AM
> Subject: Re: [PATCH v5 1/1] crypto: add virtio-crypto driver
> 
> Hi Gonglei,
> 
> [auto build test ERROR on cryptodev/master]
> [also build test ERROR on v4.9-rc7 next-20161202]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Gonglei/crypto-add-virtio-crypto-driver/20161202-190424
> base:
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
> master
> config: sparc64-allyesconfig (attached as .config)
> compiler: sparc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
>         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         make.cross ARCH=sparc64
> 
> All errors (new ones prefixed by >>):
> 
>    In file included from arch/sparc/include/asm/topology.h:4:0,
>                     from include/linux/topology.h:35,
>                     from include/linux/gfp.h:8,
>                     from include/linux/kmod.h:22,
>                     from include/linux/module.h:13,
>                     from drivers/crypto/virtio/virtio_crypto_mgr.c:21:
>    drivers/crypto/virtio/virtio_crypto_common.h: In function 'virtio_crypto_get_current_node':
> >> arch/sparc/include/asm/topology_64.h:44:44: error: implicit declaration of function 'cpu_data' [-Werror=implicit-function-declaration]
>     #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
>                                                ^
>    drivers/crypto/virtio/virtio_crypto_common.h:116:9: note: in expansion of macro 'topology_physical_package_id'
>      return topology_physical_package_id(smp_processor_id());
>             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >> arch/sparc/include/asm/topology_64.h:44:57: error: request for member 'proc_id' in something not a structure or union
>     #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
>                                                             ^
>    drivers/crypto/virtio/virtio_crypto_common.h:116:9: note: in expansion of macro 'topology_physical_package_id'
>      return topology_physical_package_id(smp_processor_id());
>             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    cc1: some warnings being treated as errors
> 
> vim +/cpu_data +44 arch/sparc/include/asm/topology_64.h
> 
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  28
> 9d079337 arch/sparc/include/asm/topology_64.h David Miller 2009-01-11  29  #define cpumask_of_pcibus(bus)	\
> 9d079337 arch/sparc/include/asm/topology_64.h David Miller 2009-01-11  30  	(pcibus_to_node(bus) == -1 ? \
> e9b37512 arch/sparc/include/asm/topology_64.h Rusty Russell 2009-03-16  31  	 cpu_all_mask : \
> 9d079337 arch/sparc/include/asm/topology_64.h David Miller 2009-01-11  32  	 cpumask_of_node(pcibus_to_node(bus)))
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  33
> 52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta 2015-11-02  34  int __node_distance(int, int);
> 52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta 2015-11-02  35  #define node_distance(a, b) __node_distance(a, b)
> 52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta 2015-11-02  36
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  37  #else /* CONFIG_NUMA */
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  38
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  39  #include <asm-generic/topology.h>
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  40
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  41  #endif /* !(CONFIG_NUMA) */
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  42
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  43  #ifdef CONFIG_SMP
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17 @44  #define topology_physical_package_id(cpu)	(cpu_data(cpu).proc_id)
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  45  #define topology_core_id(cpu)	(cpu_data(cpu).core_id)
> acc455cf arch/sparc/include/asm/topology_64.h chris hyser 2015-04-22  46  #define topology_core_cpumask(cpu)	(&cpu_core_sib_map[cpu])
> 06931e62 arch/sparc/include/asm/topology_64.h Bartosz Golaszewski 2015-05-26  47  #define topology_sibling_cpumask(cpu)	(&per_cpu(cpu_sibling_map, cpu))
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  48  #endif /* CONFIG_SMP */
> f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg 2008-07-17  49
> 3905c54f arch/sparc/include/asm/topology_64.h Stephen Rothwell 2011-04-12  50  extern cpumask_t cpu_core_map[NR_CPUS];
> acc455cf arch/sparc/include/asm/topology_64.h chris hyser 2015-04-22  51  extern cpumask_t cpu_core_sib_map[NR_CPUS];
> 3905c54f arch/sparc/include/asm/topology_64.h Stephen Rothwell 2011-04-12  52  static inline const struct cpumask *cpu_coregroup_mask(int cpu)
> 
> :::::: The code at line 44 was first introduced by commit
> :::::: f5e706ad886b6a5eb59637830110b09ccebf01c5 sparc: join the remaining header files
> 
> :::::: TO: Sam Ravnborg <sam@ravnborg.org>
> :::::: CC: David S. Miller <davem@davemloft.net>
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply

* Re: [PATCH v2 2/3] crypto: brcm: Add Broadcom SPU driver
From: Raveendra Padasalagi @ 2016-12-04 17:56 UTC (permalink / raw)
  To: Rob Rice
  Cc: Herbert Xu, David S. Miller, Rob Herring, Mark Rutland,
	linux-crypto, devicetree, linux-kernel, Ray Jui, Scott Branden,
	Jon Mason, bcm-kernel-feedback-list, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Steve Lin
In-Reply-To: <1480714499-1476-3-git-send-email-rob.rice@broadcom.com>

Hi Rob,

For all HMAC implementations of the hash algorithms, I can see that the
driver computes the inner and outer hashes using the software
implementation of each hash, but Kconfig does not select these
algorithms by default, so the driver fails to allocate them.

So please select all the supported hash algorithms by default in Kconfig.
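
To illustrate why the software hashes are needed: HMAC hashes the key XORed
with an inner pad ahead of the message, then hashes the key XORed with an
outer pad ahead of that first digest. Below is a minimal user-space sketch of
the pad derivation only, assuming a 64-byte block size (as for SHA-1 and
SHA-256) and a key no longer than one block. hmac_derive_pads() and
HMAC_BLOCK_SIZE are hypothetical names for illustration and are not taken
from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HMAC_BLOCK_SIZE 64	/* assumed: SHA-1/SHA-256 block size in bytes */

/*
 * Derive the HMAC inner and outer pad blocks from a key that fits in one
 * block. Longer keys must first be hashed down; that step is omitted here.
 * Returns 0 on success, -1 if the key is longer than one block.
 */
static int hmac_derive_pads(const unsigned char *key, size_t keylen,
			    unsigned char ipad[HMAC_BLOCK_SIZE],
			    unsigned char opad[HMAC_BLOCK_SIZE])
{
	size_t i;

	assert(key && ipad && opad);

	if (keylen > HMAC_BLOCK_SIZE)
		return -1;

	/* Zero-pad the key to a full block, then copy it for the outer pad. */
	memset(ipad, 0, HMAC_BLOCK_SIZE);
	memcpy(ipad, key, keylen);
	memcpy(opad, ipad, HMAC_BLOCK_SIZE);

	for (i = 0; i < HMAC_BLOCK_SIZE; i++) {
		ipad[i] ^= 0x36;	/* inner pad constant */
		opad[i] ^= 0x5c;	/* outer pad constant */
	}
	return 0;
}
```

Each pad block is then fed through the underlying hash, which is the step
that fails if the corresponding software algorithm is not built in.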

-Raveendra

On Sat, Dec 3, 2016 at 3:04 AM, Rob Rice <rob.rice@broadcom.com> wrote:
> Add Broadcom Secure Processing Unit (SPU) crypto driver for SPU
> hardware crypto offload. The driver supports ablkcipher, ahash,
> and aead symmetric crypto operations.
>
> Signed-off-by: Steve Lin <steven.lin1@broadcom.com>
> Signed-off-by: Rob Rice <rob.rice@broadcom.com>
> ---
>  drivers/crypto/Kconfig      |   11 +
>  drivers/crypto/Makefile     |    1 +
>  drivers/crypto/bcm/Makefile |   15 +
>  drivers/crypto/bcm/cipher.c | 4943 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/crypto/bcm/cipher.h |  472 +++++
>  drivers/crypto/bcm/spu.c    | 1252 +++++++++++
>  drivers/crypto/bcm/spu.h    |  288 +++
>  drivers/crypto/bcm/spu2.c   | 1402 ++++++++++++
>  drivers/crypto/bcm/spu2.h   |  228 ++
>  drivers/crypto/bcm/spum.h   |  174 ++
>  drivers/crypto/bcm/util.c   |  584 +++++
>  drivers/crypto/bcm/util.h   |  117 +
>  12 files changed, 9487 insertions(+)
>  create mode 100644 drivers/crypto/bcm/Makefile
>  create mode 100644 drivers/crypto/bcm/cipher.c
>  create mode 100644 drivers/crypto/bcm/cipher.h
>  create mode 100644 drivers/crypto/bcm/spu.c
>  create mode 100644 drivers/crypto/bcm/spu.h
>  create mode 100644 drivers/crypto/bcm/spu2.c
>  create mode 100644 drivers/crypto/bcm/spu2.h
>  create mode 100644 drivers/crypto/bcm/spum.h
>  create mode 100644 drivers/crypto/bcm/util.c
>  create mode 100644 drivers/crypto/bcm/util.h
>
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 4d2b81f..dd870ec 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -555,4 +555,15 @@ config CRYPTO_DEV_ROCKCHIP
>
>  source "drivers/crypto/chelsio/Kconfig"
>
> +config CRYPTO_DEV_BCM_SPU
> +       tristate "Broadcom symmetric crypto/hash acceleration support"
> +       depends on ARCH_BCM_IPROC
> +       depends on BCM_PDC_MBOX
> +       default m
> +       select CRYPTO_DES

Add "select CRYPTO_MD5", "select CRYPTO_SHA1", etc. for all of the
algorithms for which the driver allocates/uses a software implementation.

> +       help
> +       This driver provides support for Broadcom crypto acceleration using the
> +       Secure Processing Unit (SPU). The SPU driver registers ablkcipher,
> +       ahash, and aead algorithms with the kernel cryptographic API.
> +
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index ad7250f..2702650 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -32,3 +32,4 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
>  obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
>  obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
>  obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
> +obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
> diff --git a/drivers/crypto/bcm/Makefile b/drivers/crypto/bcm/Makefile
> new file mode 100644
> index 0000000..13cb80e
> --- /dev/null
> +++ b/drivers/crypto/bcm/Makefile
> @@ -0,0 +1,15 @@
> +# File: drivers/crypto/bcm/Makefile
> +#
> +# Makefile for crypto acceleration files for Broadcom SPU driver
> +#
> +# Uncomment to enable debug tracing in the SPU driver.
> +# CFLAGS_util.o := -DDEBUG
> +# CFLAGS_cipher.o := -DDEBUG
> +# CFLAGS_spu.o := -DDEBUG
> +# CFLAGS_spu2.o := -DDEBUG
> +
> +obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) := bcm_crypto_spu.o
> +
> +bcm_crypto_spu-objs :=  util.o spu.o spu2.o cipher.o
> +
> +ccflags-y += -I. -DBCMDRIVER
> diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c
> new file mode 100644
> index 0000000..f6bbb06
> --- /dev/null
> +++ b/drivers/crypto/bcm/cipher.c
> @@ -0,0 +1,4943 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +#include <linux/err.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/interrupt.h>
> +#include <linux/platform_device.h>
> +#include <linux/scatterlist.h>
> +#include <linux/crypto.h>
> +#include <linux/kthread.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/sched.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/io.h>
> +#include <linux/bitops.h>
> +
> +#include <crypto/algapi.h>
> +#include <crypto/aead.h>
> +#include <crypto/internal/aead.h>
> +#include <crypto/aes.h>
> +#include <crypto/des.h>
> +#include <crypto/sha.h>
> +#include <crypto/md5.h>
> +#include <crypto/authenc.h>
> +#include <crypto/skcipher.h>
> +#include <crypto/hash.h>
> +#include <crypto/aes.h>
> +#include <crypto/sha3.h>
> +
> +#include "util.h"
> +#include "cipher.h"
> +#include "spu.h"
> +#include "spum.h"
> +#include "spu2.h"
> +
> +/* ================= Device Structure ================== */
> +
> +struct device_private iproc_priv;
> +
> +/* ==================== Parameters ===================== */
> +
> +int flow_debug_logging;
> +int packet_debug_logging;
> +int debug_logging_sleep;
> +
> +module_param(flow_debug_logging, int, 0644);
> +MODULE_PARM_DESC(flow_debug_logging, "Enable Flow Debug Logging");
> +
> +module_param(packet_debug_logging, int, 0644);
> +MODULE_PARM_DESC(packet_debug_logging, "Enable Packet Debug Logging");
> +
> +module_param(debug_logging_sleep, int, 0644);
> +MODULE_PARM_DESC(debug_logging_sleep, "Packet Debug Logging Sleep");
> +
> +/*
> + * The value of these module parameters is used to set the priority for each
> + * algo type when this driver registers algos with the kernel crypto API.
> + * To use a priority other than the default, set the priority in the insmod or
> + * modprobe. Changing the module priority after init time has no effect.
> + *
> + * The default priorities are chosen to be lower (less preferred) than ARMv8 CE
> + * algos, but more preferred than generic software algos.
> + */
> +static int cipher_pri = 150;
> +static int hash_pri = 100;
> +static int aead_pri = 150;
> +
> +module_param(cipher_pri, int, 0644);
> +MODULE_PARM_DESC(cipher_pri, "Priority for cipher algos");
> +module_param(hash_pri, int, 0644);
> +MODULE_PARM_DESC(hash_pri, "Priority for hash algos");
> +module_param(aead_pri, int, 0644);
> +MODULE_PARM_DESC(aead_pri, "Priority for AEAD algos");
> +
> +#define MAX_SPUS 16
> +
> +/* A type 3 BCM header, expected to precede the SPU header for SPU-M.
> + * Bits 3 and 4 in the first byte encode the channel number (the dma ringset).
> + * 0x60 - ring 0
> + * 0x68 - ring 1
> + * 0x70 - ring 2
> + * 0x78 - ring 3
> + */
> +char BCMHEADER[] = { 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x28 };
> +/*
> + * Some SPU hw does not use BCM header on SPU messages. So BCM_HDR_LEN
> + * is set dynamically after reading SPU type from device tree.
> + */
> +#define BCM_HDR_LEN  iproc_priv.bcm_hdr_len
> +
> +/* min and max time to sleep before retrying when mbox queue is full. usec */
> +#define MBOX_SLEEP_MIN  800
> +#define MBOX_SLEEP_MAX 1000
> +
> +static void handle_ablkcipher_resp(struct iproc_reqctx_s *rctx);
> +static void handle_ahash_resp(struct iproc_reqctx_s *rctx);
> +static int ahash_req_done(struct iproc_reqctx_s *rctx);
> +static void handle_aead_resp(struct iproc_reqctx_s *rctx);
> +
> +/**
> + * select_channel() - Select a SPU channel to handle a crypto request. Selects
> + * channel in round robin order.
> + *
> + * Return:  channel index
> + */
> +static u8 select_channel(void)
> +{
> +       u8 chan_idx = atomic_inc_return(&iproc_priv.next_chan);
> +
> +       return chan_idx % iproc_priv.spu.num_chan;
> +}
> +
> +/**
> + * spu_ablkcipher_rx_sg_create() - Build up the scatterlist of buffers used to
> + * receive a SPU response message for an ablkcipher request. Includes buffers to
> + * catch SPU message headers and the response data.
> + * @mssg:      mailbox message containing the receive sg
> + * @rctx:      crypto request context
> + * @rx_frag_num: number of scatterlist elements required to hold the
> + *             SPU response message
> + * @chunksize: Number of bytes of response data expected
> + * @stat_pad_len: Number of bytes required to pad the STAT field to
> + *             a 4-byte boundary
> + * Returns:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int
> +spu_ablkcipher_rx_sg_create(struct brcm_message *mssg,
> +                           struct iproc_reqctx_s *rctx,
> +                           u8 rx_frag_num,
> +                           unsigned int chunksize, u32 stat_pad_len)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 datalen;            /* Number of bytes of response data expected */
> +
> +       mssg->spu.dst = kcalloc(rx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (!mssg->spu.dst)
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.dst;
> +       sg_init_table(sg, rx_frag_num);
> +       /* Space for SPU message header */
> +       sg_set_buf(sg++, rctx->msg_buf.spu_resp_hdr, ctx->spu_resp_hdr_len);
> +
> +       /* If XTS tweak in payload, add buffer to receive encrypted tweak */
> +       if ((ctx->cipher.mode == CIPHER_MODE_XTS) &&
> +           spu->spu_xts_tweak_in_payload())
> +               sg_set_buf(sg++, rctx->msg_buf.c.supdt_tweak,
> +                          SPU_XTS_TWEAK_SIZE);
> +
> +       /* Copy in each dst sg entry from request, up to chunksize */
> +       datalen = spu_msg_sg_add(&sg, &rctx->dst_sg, &rctx->dst_skip,
> +                                rctx->dst_nents, chunksize);
> +       if (datalen < chunksize) {
> +               dev_err(dev,
> +                       "%s(): failed to copy dst sg to mbox msg. chunksize %u, datalen %u",
> +                       __func__, chunksize, datalen);
> +               return -EFAULT;
> +       }
> +
> +       if (ctx->cipher.alg == CIPHER_ALG_RC4)
> +               /* Add buffer to catch 260-byte SUPDT field for RC4 */
> +               sg_set_buf(sg++, rctx->msg_buf.c.supdt_tweak, SPU_SUPDT_LEN);
> +
> +       if (stat_pad_len)
> +               sg_set_buf(sg++, rctx->msg_buf.rx_stat_pad, stat_pad_len);
> +
> +       memset(rctx->msg_buf.rx_stat, 0, SPU_RX_STATUS_LEN);
> +       sg_set_buf(sg, rctx->msg_buf.rx_stat, spu->spu_rx_status_len());
> +
> +       return 0;
> +}
> +
> +/**
> + * spu_ablkcipher_tx_sg_create() - Build up the scatterlist of buffers used to
> + * send a SPU request message for an ablkcipher request. Includes SPU message
> + * headers and the request data.
> + * @mssg:      mailbox message containing the transmit sg
> + * @rctx:      crypto request context
> + * @tx_frag_num: number of scatterlist elements required to construct the
> + *             SPU request message
> + * @chunksize: Number of bytes of request data
> + * @pad_len:   Number of pad bytes
> + * Return:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int
> +spu_ablkcipher_tx_sg_create(struct brcm_message *mssg,
> +                           struct iproc_reqctx_s *rctx,
> +                           u8 tx_frag_num, unsigned int chunksize, u32 pad_len)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 datalen;            /* Number of bytes of request data added */
> +       u32 stat_len;
> +
> +       mssg->spu.src = kcalloc(tx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (unlikely(!mssg->spu.src))
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.src;
> +       sg_init_table(sg, tx_frag_num);
> +
> +       sg_set_buf(sg++, rctx->msg_buf.bcm_spu_req_hdr,
> +                  BCM_HDR_LEN + ctx->spu_req_hdr_len);
> +
> +       /* if XTS tweak in payload, copy from IV (where crypto API puts it) */
> +       if ((ctx->cipher.mode == CIPHER_MODE_XTS) &&
> +           spu->spu_xts_tweak_in_payload())
> +               sg_set_buf(sg++, rctx->msg_buf.iv_ctr, SPU_XTS_TWEAK_SIZE);
> +
> +       /* Copy in each src sg entry from request, up to chunksize */
> +       datalen = spu_msg_sg_add(&sg, &rctx->src_sg, &rctx->src_skip,
> +                                rctx->src_nents, chunksize);
> +       if (unlikely(datalen < chunksize)) {
> +               dev_err(dev, "%s(): failed to copy src sg to mbox msg\n",
> +                       __func__);
> +               return -EFAULT;
> +       }
> +
> +       if (pad_len)
> +               sg_set_buf(sg++, rctx->msg_buf.spu_req_pad, pad_len);
> +
> +       stat_len = spu->spu_tx_status_len();
> +       if (stat_len) {
> +               memset(rctx->msg_buf.tx_stat, 0, stat_len);
> +               sg_set_buf(sg, rctx->msg_buf.tx_stat, stat_len);
> +       }
> +       return 0;
> +}
> +
> +/**
> + * handle_ablkcipher_req() - Submit as much of a block cipher request as fits in
> + * a single SPU request message, starting at the current position in the request
> + * data.
> + * @rctx:      Crypto request context
> + *
> + * This may be called on the crypto API thread, or, when a request is so large
> + * it must be broken into multiple SPU messages, on the thread used to invoke
> + * the response callback. When requests are broken into multiple SPU
> + * messages, we assume subsequent messages depend on previous results, and
> + * thus always wait for previous results before submitting the next message.
> + * Because requests are submitted in lock step like this, there is no need
> + * to synchronize access to request data structures.
> + *
> + * Return: -EINPROGRESS: request has been accepted and result will be returned
> + *                      asynchronously
> + *         Any other value indicates an error
> + */
> +static int handle_ablkcipher_req(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct ablkcipher_request *req =
> +           container_of(areq, struct ablkcipher_request, base);
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       struct spu_cipher_parms cipher_parms;
> +       int err = 0;
> +       unsigned int chunksize = 0;     /* Num bytes of request to submit */
> +       int remaining = 0;      /* Bytes of request still to process */
> +       int chunk_start;        /* Beginning of data for current SPU msg */
> +
> +       /* IV or ctr value to use in this SPU msg */
> +       u8 local_iv_ctr[MAX_IV_SIZE];
> +       u32 stat_pad_len;       /* num bytes to align status field */
> +       u32 pad_len;            /* total length of all padding */
> +       bool update_key = false;
> +       struct brcm_message *mssg;      /* mailbox message */
> +       int retry_cnt = 0;
> +
> +       /* number of entries in src and dst sg in mailbox message. */
> +       u8 rx_frag_num = 2;     /* response header and STATUS */
> +       u8 tx_frag_num = 1;     /* request header */
> +
> +       flow_log("%s\n", __func__);
> +
> +       cipher_parms.alg = ctx->cipher.alg;
> +       cipher_parms.mode = ctx->cipher.mode;
> +       cipher_parms.type = ctx->cipher_type;
> +       cipher_parms.key_len = ctx->enckeylen;
> +       cipher_parms.key_buf = ctx->enckey;
> +       cipher_parms.iv_buf = local_iv_ctr;
> +       cipher_parms.iv_len = rctx->iv_ctr_len;
> +
> +       mssg = &rctx->mb_mssg;
> +       chunk_start = rctx->src_sent;
> +       remaining = rctx->total_todo - chunk_start;
> +
> +       /* determine the chunk we are breaking off and update the indexes */
> +       if ((ctx->max_payload != SPU_MAX_PAYLOAD_INF) &&
> +           (remaining > ctx->max_payload))
> +               chunksize = ctx->max_payload;
> +       else
> +               chunksize = remaining;
> +
> +       rctx->src_sent += chunksize;
> +       rctx->total_sent = rctx->src_sent;
> +
> +       /* Count number of sg entries to be included in this request */
> +       rctx->src_nents = spu_sg_count(rctx->src_sg, rctx->src_skip, chunksize);
> +       rctx->dst_nents = spu_sg_count(rctx->dst_sg, rctx->dst_skip, chunksize);
> +
> +       if ((ctx->cipher.mode == CIPHER_MODE_CBC) &&
> +           rctx->is_encrypt && chunk_start)
> +               /*
> +                * Encrypting a non-first chunk. Copy last block of
> +                * previous result to IV for this chunk.
> +                */
> +               sg_copy_part_to_buf(req->dst, rctx->msg_buf.iv_ctr,
> +                                   rctx->iv_ctr_len,
> +                                   chunk_start - rctx->iv_ctr_len);
> +
> +       if (rctx->iv_ctr_len) {
> +               /* get our local copy of the iv */
> +               __builtin_memcpy(local_iv_ctr, rctx->msg_buf.iv_ctr,
> +                                rctx->iv_ctr_len);
> +
> +               /* generate the next IV if possible */
> +               if ((ctx->cipher.mode == CIPHER_MODE_CBC) &&
> +                   !rctx->is_encrypt) {
> +                       /*
> +                        * CBC Decrypt: next IV is the last ciphertext block in
> +                        * this chunk
> +                        */
> +                       sg_copy_part_to_buf(req->src, rctx->msg_buf.iv_ctr,
> +                                           rctx->iv_ctr_len,
> +                                           rctx->src_sent - rctx->iv_ctr_len);
> +               } else if (ctx->cipher.mode == CIPHER_MODE_CTR) {
> +                       /*
> +                        * The SPU hardware increments the counter once for
> +                        * each AES block of 16 bytes. So update the counter
> +                        * for the next chunk, if there is one. Note that for
> +                        * this chunk, the counter has already been copied to
> +                        * local_iv_ctr. We can assume a block size of 16,
> +                        * because we only support CTR mode for AES, not for
> +                        * any other cipher alg.
> +                        */
> +                       add_to_ctr(rctx->msg_buf.iv_ctr, chunksize >> 4);
> +               }
> +       }
> +
> +       if (ctx->cipher.alg == CIPHER_ALG_RC4) {
> +               rx_frag_num++;
> +               if (chunk_start) {
> +                       /*
> +                        * for non-first RC4 chunks, use SUPDT from previous
> +                        * response as key for this chunk.
> +                        */
> +                       cipher_parms.key_buf = rctx->msg_buf.c.supdt_tweak;
> +                       update_key = true;
> +                       cipher_parms.type = CIPHER_TYPE_UPDT;
> +               } else if (!rctx->is_encrypt) {
> +                       /*
> +                        * First RC4 chunk. For decrypt, key in pre-built msg
> +                        * header may have been changed if encrypt required
> +                        * multiple chunks. So revert the key to the
> +                        * ctx->enckey value.
> +                        */
> +                       update_key = true;
> +                       cipher_parms.type = CIPHER_TYPE_INIT;
> +               }
> +       }
> +
> +       if (ctx->max_payload == SPU_MAX_PAYLOAD_INF)
> +               flow_log("max_payload infinite\n");
> +       else
> +               flow_log("max_payload %u\n", ctx->max_payload);
> +
> +       flow_log("sent:%u start:%u remains:%u size:%u\n",
> +                rctx->src_sent, chunk_start, remaining, chunksize);
> +
> +       /* Copy SPU header template created at setkey time */
> +       memcpy(rctx->msg_buf.bcm_spu_req_hdr, ctx->bcm_spu_req_hdr,
> +              sizeof(rctx->msg_buf.bcm_spu_req_hdr));
> +
> +       /*
> +        * Pass SUPDT field as key. Key field in finish() call is only used
> +        * when update_key has been set above for RC4. Will be ignored in
> +        * all other cases.
> +        */
> +       spu->spu_cipher_req_finish(rctx->msg_buf.bcm_spu_req_hdr + BCM_HDR_LEN,
> +                                  ctx->spu_req_hdr_len, !(rctx->is_encrypt),
> +                                  &cipher_parms, update_key, chunksize);
> +
> +       atomic64_add(chunksize, &iproc_priv.bytes_out);
> +
> +       stat_pad_len = spu->spu_wordalign_padlen(chunksize);
> +       if (stat_pad_len)
> +               rx_frag_num++;
> +       pad_len = stat_pad_len;
> +       if (pad_len) {
> +               tx_frag_num++;
> +               spu->spu_request_pad(rctx->msg_buf.spu_req_pad, 0,
> +                                    0, ctx->auth.alg, ctx->auth.mode,
> +                                    rctx->total_sent, stat_pad_len);
> +       }
> +
> +       spu->spu_dump_msg_hdr(rctx->msg_buf.bcm_spu_req_hdr + BCM_HDR_LEN,
> +                             ctx->spu_req_hdr_len);
> +       packet_log("payload:\n");
> +       dump_sg(rctx->src_sg, rctx->src_skip, chunksize);
> +       packet_dump("   pad: ", rctx->msg_buf.spu_req_pad, pad_len);
> +
> +       /*
> +        * Build mailbox message containing SPU request msg and rx buffers
> +        * to catch response message
> +        */
> +       memset(mssg, 0, sizeof(*mssg));
> +       mssg->type = BRCM_MESSAGE_SPU;
> +       mssg->ctx = rctx;       /* Will be returned in response */
> +
> +       /* Create rx scatterlist to catch result */
> +       rx_frag_num += rctx->dst_nents;
> +
> +       if ((ctx->cipher.mode == CIPHER_MODE_XTS) &&
> +           spu->spu_xts_tweak_in_payload())
> +               rx_frag_num++;  /* extra sg to insert tweak */
> +
> +       err = spu_ablkcipher_rx_sg_create(mssg, rctx, rx_frag_num, chunksize,
> +                                         stat_pad_len);
> +       if (err)
> +               return err;
> +
> +       /* Create tx scatterlist containing SPU request message */
> +       tx_frag_num += rctx->src_nents;
> +       if (spu->spu_tx_status_len())
> +               tx_frag_num++;
> +
> +       if ((ctx->cipher.mode == CIPHER_MODE_XTS) &&
> +           spu->spu_xts_tweak_in_payload())
> +               tx_frag_num++;  /* extra sg to insert tweak */
> +
> +       err = spu_ablkcipher_tx_sg_create(mssg, rctx, tx_frag_num, chunksize,
> +                                         pad_len);
> +       if (err)
> +               return err;
> +
> +       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx], mssg);
> +       if (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) {
> +               while ((err == -ENOBUFS) && (retry_cnt < SPU_MB_RETRY_MAX)) {
> +                       /*
> +                        * Mailbox queue is full. Since MAY_SLEEP is set, assume
> +                        * not in atomic context and we can wait and try again.
> +                        */
> +                       retry_cnt++;
> +                       usleep_range(MBOX_SLEEP_MIN, MBOX_SLEEP_MAX);
> +                       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx],
> +                                               mssg);
> +                       atomic_inc(&iproc_priv.mb_no_spc);
> +               }
> +       }
> +       if (unlikely(err < 0)) {
> +               atomic_inc(&iproc_priv.mb_send_fail);
> +               return err;
> +       }
> +
> +       return -EINPROGRESS;
> +}
> +
> +/**
> + * handle_ablkcipher_resp() - Process a block cipher SPU response. Updates the
> + * total received count for the request and updates global stats.
> + * @rctx:      Crypto request context
> + */
> +static void handle_ablkcipher_resp(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +#ifdef DEBUG
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct ablkcipher_request *req = ablkcipher_request_cast(areq);
> +#endif
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 payload_len;
> +
> +       /* See how much data was returned */
> +       payload_len = spu->spu_payload_length(rctx->msg_buf.spu_resp_hdr);
> +
> +       /*
> +        * In XTS mode, the first SPU_XTS_TWEAK_SIZE bytes may be the
> +        * encrypted tweak ("i") value; we don't count those.
> +        */
> +       if ((ctx->cipher.mode == CIPHER_MODE_XTS) &&
> +           spu->spu_xts_tweak_in_payload() &&
> +           (payload_len >= SPU_XTS_TWEAK_SIZE))
> +               payload_len -= SPU_XTS_TWEAK_SIZE;
> +
> +       atomic64_add(payload_len, &iproc_priv.bytes_in);
> +
> +       flow_log("%s() offset: %u, bd_len: %u BD:\n",
> +                __func__, rctx->total_received, payload_len);
> +
> +       dump_sg(req->dst, rctx->total_received, payload_len);
> +       if (ctx->cipher.alg == CIPHER_ALG_RC4)
> +               packet_dump("  supdt ", rctx->msg_buf.c.supdt_tweak,
> +                           SPU_SUPDT_LEN);
> +
> +       rctx->total_received += payload_len;
> +       if (rctx->total_received == rctx->total_todo) {
> +               atomic_inc(&iproc_priv.op_counts[SPU_OP_CIPHER]);
> +               atomic_inc(
> +                  &iproc_priv.cipher_cnt[ctx->cipher.alg][ctx->cipher.mode]);
> +       }
> +}
> +
> +/**
> + * spu_ahash_rx_sg_create() - Build up the scatterlist of buffers used to
> + * receive a SPU response message for an ahash request.
> + * @mssg:      mailbox message containing the receive sg
> + * @rctx:      crypto request context
> + * @rx_frag_num: number of scatterlist elements required to hold the
> + *             SPU response message
> + * @digestsize: length of hash digest, in bytes
> + * @stat_pad_len: Number of bytes required to pad the STAT field to
> + *             a 4-byte boundary
> + * Return:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int
> +spu_ahash_rx_sg_create(struct brcm_message *mssg,
> +                      struct iproc_reqctx_s *rctx,
> +                      u8 rx_frag_num, unsigned int digestsize,
> +                      u32 stat_pad_len)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +
> +       mssg->spu.dst = kcalloc(rx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (!mssg->spu.dst)
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.dst;
> +       sg_init_table(sg, rx_frag_num);
> +       /* Space for SPU message header */
> +       sg_set_buf(sg++, rctx->msg_buf.spu_resp_hdr, ctx->spu_resp_hdr_len);
> +
> +       /* Space for digest */
> +       sg_set_buf(sg++, rctx->msg_buf.digest, digestsize);
> +
> +       if (stat_pad_len)
> +               sg_set_buf(sg++, rctx->msg_buf.rx_stat_pad, stat_pad_len);
> +
> +       memset(rctx->msg_buf.rx_stat, 0, SPU_RX_STATUS_LEN);
> +       sg_set_buf(sg, rctx->msg_buf.rx_stat, spu->spu_rx_status_len());
> +       return 0;
> +}
> +
> +/**
> + * spu_ahash_tx_sg_create() -  Build up the scatterlist of buffers used to send
> + * a SPU request message for an ahash request. Includes SPU message headers and
> + * the request data.
> + * @mssg:      mailbox message containing the transmit sg
> + * @rctx:      crypto request context
> + * @tx_frag_num: number of scatterlist elements required to construct the
> + *             SPU request message
> + * @spu_hdr_len: length in bytes of SPU message header
> + * @hash_carry_len: Number of bytes of data carried over from previous req
> + * @new_data_len: Number of bytes of new request data
> + * @pad_len:   Number of pad bytes
> + * Return:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int
> +spu_ahash_tx_sg_create(struct brcm_message *mssg,
> +                      struct iproc_reqctx_s *rctx,
> +                      u8 tx_frag_num,
> +                      u32 spu_hdr_len,
> +                      unsigned int hash_carry_len,
> +                      unsigned int new_data_len, u32 pad_len)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       u32 datalen;            /* Number of bytes of request data added */
> +       u32 stat_len;
> +
> +       mssg->spu.src = kcalloc(tx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (!mssg->spu.src)
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.src;
> +       sg_init_table(sg, tx_frag_num);
> +
> +       sg_set_buf(sg++, rctx->msg_buf.bcm_spu_req_hdr,
> +                  BCM_HDR_LEN + spu_hdr_len);
> +
> +       if (hash_carry_len)
> +               sg_set_buf(sg++, rctx->hash_carry, hash_carry_len);
> +
> +       if (new_data_len) {
> +               /* Copy in each src sg entry from request, up to new_data_len */
> +               datalen = spu_msg_sg_add(&sg, &rctx->src_sg, &rctx->src_skip,
> +                                        rctx->src_nents, new_data_len);
> +               if (datalen < new_data_len) {
> +                       dev_err(dev,
> +                               "%s(): failed to copy src sg to mbox msg\n",
> +                               __func__);
> +                       return -EFAULT;
> +               }
> +       }
> +
> +       if (pad_len)
> +               sg_set_buf(sg++, rctx->msg_buf.spu_req_pad, pad_len);
> +
> +       stat_len = spu->spu_tx_status_len();
> +       if (stat_len) {
> +               memset(rctx->msg_buf.tx_stat, 0, stat_len);
> +               sg_set_buf(sg, rctx->msg_buf.tx_stat, stat_len);
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * handle_ahash_req() - Process an asynchronous hash request from the crypto
> + * API.
> + * @rctx:  Crypto request context
> + *
> + * Builds a SPU request message embedded in a mailbox message and submits the
> + * mailbox message on a selected mailbox channel. The SPU request message is
> + * constructed as a scatterlist, including entries from the crypto API's
> + * src scatterlist to avoid copying the data to be hashed. This function is
> + * called either on the thread from the crypto API, or, in the case that the
> + * crypto API request is too large to fit in a single SPU request message,
> + * on the thread that invokes the receive callback with a response message.
> + * Because some operations require the response from one chunk before the next
> + * chunk can be submitted, we always wait for the response for the previous
> + * chunk before submitting the next chunk. Because requests are submitted in
> + * lock step like this, there is no need to synchronize access to request data
> + * structures.
> + *
> + * Return:
> + *   -EINPROGRESS: request has been submitted to SPU and response will be
> + *                returned asynchronously
> + *   -EAGAIN:      non-final request included a small amount of data, which for
> + *                efficiency we did not submit to the SPU, but instead stored
> + *                to be submitted to the SPU with the next part of the request
> + *   other:        an error code
> + */
> +static int handle_ahash_req(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct ahash_request *req = ahash_request_cast(areq);
> +       struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
> +       struct crypto_tfm *tfm = crypto_ahash_tfm(ahash);
> +       unsigned int blocksize = crypto_tfm_alg_blocksize(tfm);
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +
> +       /* number of bytes still to be hashed in this req */
> +       unsigned int nbytes_to_hash = 0;
> +       int err = 0;
> +       unsigned int chunksize = 0;     /* length of hash carry + new data */
> +       /*
> +        * length of new data, not from hash carry, to be submitted in
> +        * this hw request
> +        */
> +       unsigned int new_data_len;
> +
> +       unsigned int chunk_start = 0;
> +       u32 db_size;     /* Length of data field, incl gcm and hash padding */
> +       int pad_len = 0; /* total pad len, including gcm, hash, stat padding */
> +       u32 data_pad_len = 0;   /* length of GCM/CCM padding */
> +       u32 stat_pad_len = 0;   /* length of padding to align STATUS word */
> +       struct brcm_message *mssg;      /* mailbox message */
> +       struct spu_request_opts req_opts;
> +       struct spu_cipher_parms cipher_parms;
> +       struct spu_hash_parms hash_parms;
> +       struct spu_aead_parms aead_parms;
> +       unsigned int local_nbuf;
> +       u32 spu_hdr_len;
> +       unsigned int digestsize;
> +       u16 rem = 0;
> +       int retry_cnt = 0;
> +
> +       /*
> +        * number of entries in src and dst sg. Always includes SPU msg header.
> +        * rx always includes a buffer to catch digest and STATUS.
> +        */
> +       u8 rx_frag_num = 3;
> +       u8 tx_frag_num = 1;
> +
> +       flow_log("total_todo %u, total_sent %u\n",
> +                rctx->total_todo, rctx->total_sent);
> +
> +       memset(&req_opts, 0, sizeof(req_opts));
> +       memset(&cipher_parms, 0, sizeof(cipher_parms));
> +       memset(&hash_parms, 0, sizeof(hash_parms));
> +       memset(&aead_parms, 0, sizeof(aead_parms));
> +
> +       req_opts.bd_suppress = true;
> +       hash_parms.alg = ctx->auth.alg;
> +       hash_parms.mode = ctx->auth.mode;
> +       hash_parms.type = HASH_TYPE_NONE;
> +       hash_parms.key_buf = (u8 *)ctx->authkey;
> +       hash_parms.key_len = ctx->authkeylen;
> +
> +       /*
> +        * For hash algorithms, the assignment below looks a bit odd, but
> +        * it is needed for the AES-XCBC and AES-CMAC hash algorithms
> +        * to differentiate between 128-, 192-, and 256-bit key values.
> +        * The hash algorithm is selected based on the key size.
> +        * For example, for a 128-bit key, the hash algorithm is AES-128.
> +        */
> +       cipher_parms.type = ctx->cipher_type;
> +
> +       mssg = &rctx->mb_mssg;
> +       chunk_start = rctx->src_sent;
> +
> +       /*
> +        * Compute the amount remaining to hash. This may include data
> +        * carried over from previous requests.
> +        */
> +       nbytes_to_hash = rctx->total_todo - rctx->total_sent;
> +       chunksize = nbytes_to_hash;
> +       if ((ctx->max_payload != SPU_MAX_PAYLOAD_INF) &&
> +           (chunksize > ctx->max_payload))
> +               chunksize = ctx->max_payload;
> +
> +       /*
> +        * If this is not a final request and the request data is not a multiple
> +        * of a full block, then simply park the extra data and prefix it to the
> +        * data for the next request.
> +        */
> +       if (!rctx->is_final) {
> +               u8 *dest = rctx->hash_carry + rctx->hash_carry_len;
> +               u16 new_len;  /* len of data to add to hash carry */
> +
> +               rem = chunksize % blocksize;   /* remainder */
> +               if (rem) {
> +                       /* chunksize not a multiple of blocksize */
> +                       chunksize -= rem;
> +                       if (chunksize == 0) {
> +                               /* Don't have a full block to submit to hw */
> +                               new_len = rem - rctx->hash_carry_len;
> +                               sg_copy_part_to_buf(req->src, dest, new_len,
> +                                                   rctx->src_sent);
> +                               rctx->hash_carry_len = rem;
> +                               flow_log("Exiting with hash carry len: %u\n",
> +                                        rctx->hash_carry_len);
> +                               packet_dump("  buf: ",
> +                                           rctx->hash_carry,
> +                                           rctx->hash_carry_len);
> +                               return -EAGAIN;
> +                       }
> +               }
> +       }
> +
> +       /* if we have hash carry, then prefix it to the data in this request */
> +       local_nbuf = rctx->hash_carry_len;
> +       rctx->hash_carry_len = 0;
> +       if (local_nbuf)
> +               tx_frag_num++;
> +       new_data_len = chunksize - local_nbuf;
> +
> +       /* Count number of sg entries to be used in this request */
> +       rctx->src_nents = spu_sg_count(rctx->src_sg, rctx->src_skip,
> +                                      new_data_len);
> +
> +       /* AES hashing keeps key size in type field, so need to copy it here */
> +       if (hash_parms.alg == HASH_ALG_AES)
> +               hash_parms.type = cipher_parms.type;
> +       else
> +               hash_parms.type = spu->spu_hash_type(rctx->total_sent);
> +
> +       digestsize = spu->spu_digest_size(ctx->digestsize, ctx->auth.alg,
> +                                         hash_parms.type);
> +       hash_parms.digestsize = digestsize;
> +
> +       /* update the indexes */
> +       rctx->total_sent += chunksize;
> +       /* if you sent a prebuf then that wasn't from this req->src */
> +       rctx->src_sent += new_data_len;
> +
> +       if ((rctx->total_sent == rctx->total_todo) && rctx->is_final)
> +               hash_parms.pad_len = spu->spu_hash_pad_len(hash_parms.alg,
> +                                                          hash_parms.mode,
> +                                                          chunksize,
> +                                                          blocksize);
> +
> +       /*
> +        * If a non-first chunk, then include the digest returned from the
> +        * previous chunk so that hw can add to it (except for AES types).
> +        */
> +       if ((hash_parms.type == HASH_TYPE_UPDT) &&
> +           (hash_parms.alg != HASH_ALG_AES)) {
> +               hash_parms.key_buf = rctx->incr_hash;
> +               hash_parms.key_len = digestsize;
> +       }
> +
> +       atomic64_add(chunksize, &iproc_priv.bytes_out);
> +
> +       flow_log("%s() final: %u nbuf: %u ",
> +                __func__, rctx->is_final, local_nbuf);
> +
> +       if (ctx->max_payload == SPU_MAX_PAYLOAD_INF)
> +               flow_log("max_payload infinite\n");
> +       else
> +               flow_log("max_payload %u\n", ctx->max_payload);
> +
> +       flow_log("chunk_start: %u chunk_size: %u\n", chunk_start, chunksize);
> +
> +       /* Prepend SPU header with type 3 BCM header */
> +       memcpy(rctx->msg_buf.bcm_spu_req_hdr, BCMHEADER, BCM_HDR_LEN);
> +
> +       hash_parms.prebuf_len = local_nbuf;
> +       spu_hdr_len = spu->spu_create_request(rctx->msg_buf.bcm_spu_req_hdr +
> +                                             BCM_HDR_LEN,
> +                                             &req_opts, &cipher_parms,
> +                                             &hash_parms, &aead_parms,
> +                                             new_data_len);
> +
> +       if (spu_hdr_len == 0) {
> +               pr_err("Failed to create SPU request header\n");
> +               return -EFAULT;
> +       }
> +
> +       /*
> +        * Determine total length of padding required. Put all padding in one
> +        * buffer.
> +        */
> +       data_pad_len = spu->spu_gcm_ccm_pad_len(ctx->cipher.mode, chunksize);
> +       db_size = spu_real_db_size(0, 0, local_nbuf, new_data_len,
> +                                  0, 0, hash_parms.pad_len);
> +       if (spu->spu_tx_status_len())
> +               stat_pad_len = spu->spu_wordalign_padlen(db_size);
> +       if (stat_pad_len)
> +               rx_frag_num++;
> +       pad_len = hash_parms.pad_len + data_pad_len + stat_pad_len;
> +       if (pad_len) {
> +               tx_frag_num++;
> +               spu->spu_request_pad(rctx->msg_buf.spu_req_pad, data_pad_len,
> +                                    hash_parms.pad_len, ctx->auth.alg,
> +                                    ctx->auth.mode, rctx->total_sent,
> +                                    stat_pad_len);
> +       }
> +
> +       spu->spu_dump_msg_hdr(rctx->msg_buf.bcm_spu_req_hdr + BCM_HDR_LEN,
> +                             spu_hdr_len);
> +       packet_dump("    prebuf: ", rctx->hash_carry, local_nbuf);
> +       flow_log("Data:\n");
> +       dump_sg(rctx->src_sg, rctx->src_skip, new_data_len);
> +       packet_dump("   pad: ", rctx->msg_buf.spu_req_pad, pad_len);
> +
> +       /*
> +        * Build mailbox message containing SPU request msg and rx buffers
> +        * to catch response message
> +        */
> +       memset(mssg, 0, sizeof(*mssg));
> +       mssg->type = BRCM_MESSAGE_SPU;
> +       mssg->ctx = rctx;       /* Will be returned in response */
> +
> +       /* Create rx scatterlist to catch result */
> +       err = spu_ahash_rx_sg_create(mssg, rctx, rx_frag_num, digestsize,
> +                                    stat_pad_len);
> +       if (err)
> +               return err;
> +
> +       /* Create tx scatterlist containing SPU request message */
> +       tx_frag_num += rctx->src_nents;
> +       if (spu->spu_tx_status_len())
> +               tx_frag_num++;
> +       err = spu_ahash_tx_sg_create(mssg, rctx, tx_frag_num, spu_hdr_len,
> +                                    local_nbuf, new_data_len, pad_len);
> +       if (err)
> +               return err;
> +
> +       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx], mssg);
> +       if (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) {
> +               while ((err == -ENOBUFS) && (retry_cnt < SPU_MB_RETRY_MAX)) {
> +                       /*
> +                        * Mailbox queue is full. Since MAY_SLEEP is set, assume
> +                        * not in atomic context and we can wait and try again.
> +                        */
> +                       retry_cnt++;
> +                       usleep_range(MBOX_SLEEP_MIN, MBOX_SLEEP_MAX);
> +                       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx],
> +                                               mssg);
> +                       atomic_inc(&iproc_priv.mb_no_spc);
> +               }
> +       }
> +       if (err < 0) {
> +               atomic_inc(&iproc_priv.mb_send_fail);
> +               return err;
> +       }
> +       return -EINPROGRESS;
> +}
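
The send-and-retry sequence at the end of this function is repeated in
handle_aead_req() further down. A minimal userspace sketch of the same
bounded retry, factored into one helper, might look like the following.
Everything in it (send_with_retry(), the send_fn/sleep_fn callbacks,
RETRY_MAX, fake_send(), fake_backoff()) is hypothetical and only stands in
for mbox_send_message(), usleep_range() and SPU_MB_RETRY_MAX; it is not the
driver's API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define RETRY_MAX 10    /* hypothetical stand-in for SPU_MB_RETRY_MAX */

/* Hypothetical send callback: returns 0 on success or a negative errno. */
typedef int (*send_fn)(void *msg);

/* Hypothetical backoff hook, so the sketch needs no real sleeping. */
typedef void (*sleep_fn)(void);

/*
 * Send once. If the queue is full (-ENOBUFS) and the caller may sleep,
 * back off and retry up to RETRY_MAX times. Returns the last send result
 * and reports the number of retries through *retries.
 */
static int send_with_retry(send_fn send, sleep_fn backoff, void *msg,
                           bool may_sleep, int *retries)
{
        int err = send(msg);

        *retries = 0;
        if (!may_sleep)
                return err;

        while (err == -ENOBUFS && *retries < RETRY_MAX) {
                (*retries)++;
                backoff();
                err = send(msg);
        }
        return err;
}

/* Fake sender: reports a full queue for the next fail_count calls. */
static int fail_count;

static int fake_send(void *msg)
{
        (void)msg;
        if (fail_count > 0) {
                fail_count--;
                return -ENOBUFS;
        }
        return 0;
}

static void fake_backoff(void)
{
}
```

With may_sleep false the helper returns the first result unchanged, which
corresponds to the path in the patch where CRYPTO_TFM_REQ_MAY_SLEEP is not
set.
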
> +
> +/**
> + * handle_ahash_resp() - Process a SPU response message for a hash request.
> + * Checks if the entire crypto API request has been processed, and if so,
> + * invokes post processing on the result.
> + * @rctx: Crypto request context
> + */
> +static void handle_ahash_resp(struct iproc_reqctx_s *rctx)
> +{
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +#ifdef DEBUG
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct ahash_request *req = ahash_request_cast(areq);
> +       struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
> +       unsigned int blocksize =
> +               crypto_tfm_alg_blocksize(crypto_ahash_tfm(ahash));
> +#endif
> +       /*
> +        * Save hash to use as input to next op if incremental. Might be copying
> +        * too much, but that's easier than figuring out actual digest size here
> +        */
> +       memcpy(rctx->incr_hash, rctx->msg_buf.digest, MAX_DIGEST_SIZE);
> +
> +       flow_log("%s() blocksize:%u digestsize:%u\n",
> +                __func__, blocksize, ctx->digestsize);
> +
> +       atomic64_add(ctx->digestsize, &iproc_priv.bytes_in);
> +
> +       if (rctx->is_final && (rctx->total_sent == rctx->total_todo))
> +               ahash_req_done(rctx);
> +}
> +
> +/**
> + * spu_hmac_outer_hash() - Request synchronous software compute of the outer
> + * hash for an HMAC request.
> + * @req:  The HMAC request from the crypto API
> + * @ctx:  The session context
> + *
> + * Return: 0 if synchronous hash operation successful
> + *         -EINVAL if the hash algo is unrecognized
> + *         any other value indicates an error
> + */
> +static int spu_hmac_outer_hash(struct ahash_request *req,
> +                              struct iproc_ctx_s *ctx)
> +{
> +       struct crypto_ahash *ahash = crypto_ahash_reqtfm(req);
> +       unsigned int blocksize =
> +               crypto_tfm_alg_blocksize(crypto_ahash_tfm(ahash));
> +       int rc;
> +
> +       switch (ctx->auth.alg) {
> +       case HASH_ALG_MD5:
> +               rc = do_shash("md5", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       case HASH_ALG_SHA1:
> +               rc = do_shash("sha1", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       case HASH_ALG_SHA224:
> +               rc = do_shash("sha224", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       case HASH_ALG_SHA256:
> +               rc = do_shash("sha256", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       case HASH_ALG_SHA384:
> +               rc = do_shash("sha384", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       case HASH_ALG_SHA512:
> +               rc = do_shash("sha512", req->result, ctx->opad, blocksize,
> +                             req->result, ctx->digestsize, NULL, 0);
> +               break;
> +       default:
> +               pr_err("%s() Error : unknown hmac type\n", __func__);
> +               rc = -EINVAL;
> +       }
> +       return rc;
> +}
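
All six cases differ only in the algorithm name passed to do_shash(). One
possible way to collapse the switch is a lookup table indexed by the hash
algorithm. The sketch below is standalone C; enum hash_alg_sketch and
hash_alg_to_shash_name() are made-up names, and the real HASH_ALG_* values
in the driver may be numbered differently.

```c
#include <stddef.h>

/* Hypothetical mirror of the driver's hash algorithm enum. */
enum hash_alg_sketch {
        ALG_NONE,
        ALG_MD5,
        ALG_SHA1,
        ALG_SHA224,
        ALG_SHA256,
        ALG_SHA384,
        ALG_SHA512,
        ALG_LAST
};

/* Software hash name for each algorithm; unlisted entries stay NULL. */
static const char *const shash_names[ALG_LAST] = {
        [ALG_MD5]    = "md5",
        [ALG_SHA1]   = "sha1",
        [ALG_SHA224] = "sha224",
        [ALG_SHA256] = "sha256",
        [ALG_SHA384] = "sha384",
        [ALG_SHA512] = "sha512",
};

/* Returns the hash name, or NULL if the algorithm is not recognized. */
static const char *hash_alg_to_shash_name(unsigned int alg)
{
        if (alg >= ALG_LAST)
                return NULL;
        return shash_names[alg];
}
```

A caller would then make a single do_shash() call with the returned name and
fall back to -EINVAL when the lookup yields NULL.
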
> +
> +/**
> + * ahash_req_done() - Process a hash result from the SPU hardware.
> + * @rctx: Crypto request context
> + *
> + * Return: 0 if successful
> + *         < 0 if an error
> + */
> +static int ahash_req_done(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct ahash_request *req = ahash_request_cast(areq);
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       int err;
> +
> +       memcpy(req->result, rctx->msg_buf.digest, ctx->digestsize);
> +
> +       if (spu->spu_type == SPU_TYPE_SPUM) {
> +               /* byte swap the output from the UPDT function to network byte
> +                * order
> +                */
> +               if (ctx->auth.alg == HASH_ALG_MD5) {
> +                       __swab32s((u32 *)req->result);
> +                       __swab32s(((u32 *)req->result) + 1);
> +                       __swab32s(((u32 *)req->result) + 2);
> +                       __swab32s(((u32 *)req->result) + 3);
> +                       __swab32s(((u32 *)req->result) + 4);
> +               }
> +       }
> +
> +       flow_dump("  digest ", req->result, ctx->digestsize);
> +
> +       /* if this is an HMAC then do the outer hash */
> +       if (rctx->is_sw_hmac) {
> +               err = spu_hmac_outer_hash(req, ctx);
> +               if (err < 0)
> +                       return err;
> +               flow_dump("  hmac: ", req->result, ctx->digestsize);
> +       }
> +
> +       if (rctx->is_sw_hmac || ctx->auth.mode == HASH_MODE_HMAC) {
> +               atomic_inc(&iproc_priv.op_counts[SPU_OP_HMAC]);
> +               atomic_inc(&iproc_priv.hmac_cnt[ctx->auth.alg]);
> +       } else {
> +               atomic_inc(&iproc_priv.op_counts[SPU_OP_HASH]);
> +               atomic_inc(&iproc_priv.hash_cnt[ctx->auth.alg]);
> +       }
> +
> +       return 0;
> +}
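
An MD5 digest is 16 bytes, i.e. four 32-bit words, while the MD5 block above
swaps five. Deriving the word count from the digest size avoids hard-coding
it. A minimal sketch, assuming the intent is to reverse the bytes of each
32-bit word of the digest in place (digest_swab32() and swab32_sketch() are
hypothetical helpers, not kernel functions):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of one 32-bit value. */
static uint32_t swab32_sketch(uint32_t x)
{
        return ((x & 0x000000ffu) << 24) |
               ((x & 0x0000ff00u) << 8)  |
               ((x & 0x00ff0000u) >> 8)  |
               ((x & 0xff000000u) >> 24);
}

/*
 * Byte-swap every complete 32-bit word of the digest in place. Bytes past
 * the last complete word are left alone. memcpy() is used so the buffer
 * does not have to be 4-byte aligned.
 */
static void digest_swab32(uint8_t *digest, size_t digestsize)
{
        size_t off;

        for (off = 0; off + sizeof(uint32_t) <= digestsize;
             off += sizeof(uint32_t)) {
                uint32_t w;

                memcpy(&w, digest + off, sizeof(w));
                w = swab32_sketch(w);
                memcpy(digest + off, &w, sizeof(w));
        }
}
```

In the driver this would correspond to a loop bounded by ctx->digestsize
rather than five explicit __swab32s() calls.
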
> +
> +/**
> + * spu_aead_rx_sg_create() - Build up the scatterlist of buffers used to receive
> + * a SPU response message for an AEAD request. Includes buffers to catch SPU
> + * message headers and the response data.
> + * @mssg:      mailbox message containing the receive sg
> + * @req:       AEAD request from the crypto API
> + * @rctx:      crypto request context
> + * @rx_frag_num: number of scatterlist elements required to hold the
> + *             SPU response message
> + * @assoc_len: Length of associated data included in the crypto request
> + * @ret_iv_len: Length of IV returned in response
> + * @resp_len:  Number of bytes of response data expected to be written to
> + *              dst buffer from crypto API
> + * @digestsize: Length of hash digest, in bytes
> + * @stat_pad_len: Number of bytes required to pad the STAT field to
> + *             a 4-byte boundary
> + * Return:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int spu_aead_rx_sg_create(struct brcm_message *mssg,
> +                                struct aead_request *req,
> +                                struct iproc_reqctx_s *rctx,
> +                                u8 rx_frag_num,
> +                                unsigned int assoc_len,
> +                                u32 ret_iv_len, unsigned int resp_len,
> +                                unsigned int digestsize, u32 stat_pad_len)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 datalen;            /* Number of bytes of response data expected */
> +       u32 assoc_buf_len;
> +       u8 data_padlen = 0;
> +
> +       if (ctx->is_rfc4543) {
> +               /* RFC4543: only pad after data, not after AAD */
> +               data_padlen = spu->spu_gcm_ccm_pad_len(ctx->cipher.mode,
> +                                                         assoc_len + resp_len);
> +               assoc_buf_len = assoc_len;
> +       } else {
> +               data_padlen = spu->spu_gcm_ccm_pad_len(ctx->cipher.mode,
> +                                                         resp_len);
> +               assoc_buf_len = spu->spu_assoc_resp_len(ctx->cipher.mode,
> +                                               assoc_len, ret_iv_len,
> +                                               rctx->is_encrypt);
> +       }
> +
> +       if (ctx->cipher.mode == CIPHER_MODE_CCM)
> +               /* ICV (after data) must be in the next 32-bit word for CCM */
> +               data_padlen += spu->spu_wordalign_padlen(assoc_buf_len +
> +                                                        resp_len +
> +                                                        data_padlen);
> +
> +       if (data_padlen)
> +               /* have to catch gcm pad in separate buffer */
> +               rx_frag_num++;
> +
> +       mssg->spu.dst = kcalloc(rx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (!mssg->spu.dst)
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.dst;
> +       sg_init_table(sg, rx_frag_num);
> +
> +       /* Space for SPU message header */
> +       sg_set_buf(sg++, rctx->msg_buf.spu_resp_hdr, ctx->spu_resp_hdr_len);
> +
> +       if (assoc_buf_len) {
> +               /*
> +                * Don't write directly to req->dst, because SPU may pad the
> +                * assoc data in the response
> +                */
> +               memset(rctx->msg_buf.a.resp_aad, 0, assoc_buf_len);
> +               sg_set_buf(sg++, rctx->msg_buf.a.resp_aad, assoc_buf_len);
> +       }
> +
> +       /*
> +        * Copy in each dst sg entry from request, up to chunksize.
> +        * dst sg catches just the data. digest caught in separate buf.
> +        */
> +       datalen = spu_msg_sg_add(&sg, &rctx->dst_sg, &rctx->dst_skip,
> +                                rctx->dst_nents, resp_len);
> +       if (datalen < (resp_len)) {
> +               dev_err(dev,
> +                       "%s(): failed to copy dst sg to mbox msg. expected len %u, datalen %u",
> +                       __func__, resp_len, datalen);
> +               return -EFAULT;
> +       }
> +
> +       /* If GCM/CCM data is padded, catch padding in separate buffer */
> +       if (data_padlen) {
> +               memset(rctx->msg_buf.a.gcmpad, 0, data_padlen);
> +               sg_set_buf(sg++, rctx->msg_buf.a.gcmpad, data_padlen);
> +       }
> +
> +       /* Always catch ICV in separate buffer */
> +       sg_set_buf(sg++, rctx->msg_buf.digest, digestsize);
> +
> +       flow_log("stat_pad_len %u\n", stat_pad_len);
> +       if (stat_pad_len) {
> +               memset(rctx->msg_buf.rx_stat_pad, 0, stat_pad_len);
> +               sg_set_buf(sg++, rctx->msg_buf.rx_stat_pad, stat_pad_len);
> +       }
> +
> +       memset(rctx->msg_buf.rx_stat, 0, SPU_RX_STATUS_LEN);
> +       sg_set_buf(sg, rctx->msg_buf.rx_stat, spu->spu_rx_status_len());
> +
> +       return 0;
> +}
> +
> +/**
> + * spu_aead_tx_sg_create() - Build up the scatterlist of buffers used to send a
> + * SPU request message for an AEAD request. Includes SPU message headers and the
> + * request data.
> + * @mssg:      mailbox message containing the transmit sg
> + * @rctx:      crypto request context
> + * @tx_frag_num: number of scatterlist elements required to construct the
> + *             SPU request message
> + * @spu_hdr_len: length of SPU message header in bytes
> + * @assoc:     crypto API associated data scatterlist
> + * @assoc_len: length of associated data
> + * @assoc_nents: number of scatterlist entries containing assoc data
> + * @aead_iv_len: length of AEAD IV, if included
> + * @chunksize: Number of bytes of request data
> + * @aad_pad_len: Number of bytes of padding at end of AAD. For GCM/CCM.
> + * @pad_len:   Number of pad bytes
> + * @incl_icv:  If true, write separate ICV buffer after data and
> + *              any padding
> + * Return:
> + *   0 if successful
> + *   < 0 if an error
> + */
> +static int spu_aead_tx_sg_create(struct brcm_message *mssg,
> +                                struct iproc_reqctx_s *rctx,
> +                                u8 tx_frag_num,
> +                                u32 spu_hdr_len,
> +                                struct scatterlist *assoc,
> +                                unsigned int assoc_len,
> +                                int assoc_nents,
> +                                unsigned int aead_iv_len,
> +                                unsigned int chunksize,
> +                                u32 aad_pad_len, u32 pad_len, bool incl_icv)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct scatterlist *sg; /* used to build sgs in mbox message */
> +       struct scatterlist *assoc_sg = assoc;
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 datalen;            /* Number of bytes of data to write */
> +       u32 written;            /* Number of bytes of data written */
> +       u32 assoc_offset = 0;
> +       u32 stat_len;
> +
> +       mssg->spu.src = kcalloc(tx_frag_num, sizeof(struct scatterlist),
> +                               rctx->gfp);
> +       if (!mssg->spu.src)
> +               return -ENOMEM;
> +
> +       sg = mssg->spu.src;
> +       sg_init_table(sg, tx_frag_num);
> +
> +       sg_set_buf(sg++, rctx->msg_buf.bcm_spu_req_hdr,
> +                  BCM_HDR_LEN + spu_hdr_len);
> +
> +       if (assoc_len) {
> +               /* Copy in each associated data sg entry from request */
> +               written = spu_msg_sg_add(&sg, &assoc_sg, &assoc_offset,
> +                                        assoc_nents, assoc_len);
> +               if (written < assoc_len) {
> +                       dev_err(dev,
> +                               "%s(): failed to copy assoc sg to mbox msg",
> +                               __func__);
> +                       return -EFAULT;
> +               }
> +       }
> +
> +       if (aead_iv_len)
> +               sg_set_buf(sg++, rctx->msg_buf.iv_ctr, aead_iv_len);
> +
> +       if (aad_pad_len) {
> +               memset(rctx->msg_buf.a.req_aad_pad, 0, aad_pad_len);
> +               sg_set_buf(sg++, rctx->msg_buf.a.req_aad_pad, aad_pad_len);
> +       }
> +
> +       datalen = chunksize;
> +       if ((chunksize > ctx->digestsize) && incl_icv)
> +               datalen -= ctx->digestsize;
> +       if (datalen) {
> +               /* For aead, a single msg should consume the entire src sg */
> +               written = spu_msg_sg_add(&sg, &rctx->src_sg, &rctx->src_skip,
> +                                        rctx->src_nents, datalen);
> +               if (written < datalen) {
> +                       dev_err(dev, "%s(): failed to copy src sg to mbox msg",
> +                               __func__);
> +                       return -EFAULT;
> +               }
> +       }
> +
> +       if (pad_len) {
> +               memset(rctx->msg_buf.spu_req_pad, 0, pad_len);
> +               sg_set_buf(sg++, rctx->msg_buf.spu_req_pad, pad_len);
> +       }
> +
> +       if (incl_icv)
> +               sg_set_buf(sg++, rctx->msg_buf.digest, ctx->digestsize);
> +
> +       stat_len = spu->spu_tx_status_len();
> +       if (stat_len) {
> +               memset(rctx->msg_buf.tx_stat, 0, stat_len);
> +               sg_set_buf(sg, rctx->msg_buf.tx_stat, stat_len);
> +       }
> +       return 0;
> +}
> +
> +/**
> + * handle_aead_req() - Submit a SPU request message for the next chunk of the
> + * current AEAD request.
> + * @rctx:  Crypto request context
> + *
> + * Unlike other operation types, we assume the length of the request fits in
> + * a single SPU request message. aead_enqueue() makes sure this is true.
> + * Comments for other op types regarding threads apply here as well.
> + *
> + * Unlike incremental hash ops, where the spu returns the entire hash for
> + * truncated algs like sha-224, the SPU returns just the truncated hash in
> + * response to aead requests. So digestsize is always ctx->digestsize here.
> + *
> + * Return: -EINPROGRESS: crypto request has been accepted and result will be
> + *                      returned asynchronously
> + *         Any other value indicates an error
> + */
> +static int handle_aead_req(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct aead_request *req = container_of(areq,
> +                                               struct aead_request, base);
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       int err;
> +       unsigned int chunksize;
> +       unsigned int resp_len;
> +       u32 spu_hdr_len;
> +       u32 db_size;
> +       u32 stat_pad_len;
> +       u32 pad_len;
> +       struct brcm_message *mssg;      /* mailbox message */
> +       struct spu_request_opts req_opts;
> +       struct spu_cipher_parms cipher_parms;
> +       struct spu_hash_parms hash_parms;
> +       struct spu_aead_parms aead_parms;
> +       int assoc_nents = 0;
> +       bool incl_icv = false;
> +       unsigned int digestsize = ctx->digestsize;
> +       int retry_cnt = 0;
> +
> +       /* number of entries in src and dst sg. Always includes SPU msg header.
> +        */
> +       u8 rx_frag_num = 2;     /* and STATUS */
> +       u8 tx_frag_num = 1;
> +
> +       /* doing the whole thing at once */
> +       chunksize = rctx->total_todo;
> +
> +       flow_log("%s: chunksize %u\n", __func__, chunksize);
> +
> +       memset(&req_opts, 0, sizeof(req_opts));
> +       memset(&hash_parms, 0, sizeof(hash_parms));
> +       memset(&aead_parms, 0, sizeof(aead_parms));
> +
> +       req_opts.is_inbound = !(rctx->is_encrypt);
> +       req_opts.auth_first = ctx->auth_first;
> +       req_opts.is_aead = true;
> +       req_opts.is_esp = ctx->is_esp;
> +
> +       cipher_parms.alg = ctx->cipher.alg;
> +       cipher_parms.mode = ctx->cipher.mode;
> +       cipher_parms.type = ctx->cipher_type;
> +       cipher_parms.key_buf = ctx->enckey;
> +       cipher_parms.key_len = ctx->enckeylen;
> +       cipher_parms.iv_buf = rctx->msg_buf.iv_ctr;
> +       cipher_parms.iv_len = rctx->iv_ctr_len;
> +
> +       hash_parms.alg = ctx->auth.alg;
> +       hash_parms.mode = ctx->auth.mode;
> +       hash_parms.type = HASH_TYPE_NONE;
> +       hash_parms.key_buf = (u8 *)ctx->authkey;
> +       hash_parms.key_len = ctx->authkeylen;
> +       hash_parms.digestsize = digestsize;
> +
> +       if ((ctx->auth.alg == HASH_ALG_SHA224) &&
> +           (ctx->authkeylen < SHA224_DIGEST_SIZE))
> +               hash_parms.key_len = SHA224_DIGEST_SIZE;
> +
> +       aead_parms.assoc_size = req->assoclen;
> +       if (ctx->is_esp && !ctx->is_rfc4543) {
> +               /*
> +                * 8-byte IV is included in assoc data in request. SPU2
> +                * expects AAD to include just SPI and seqno. So
> +                * subtract off the IV len.
> +                */
> +               aead_parms.assoc_size -= GCM_ESP_IV_SIZE;
> +
> +               if (rctx->is_encrypt) {
> +                       aead_parms.return_iv = true;
> +                       aead_parms.ret_iv_len = GCM_ESP_IV_SIZE;
> +                       aead_parms.ret_iv_off = GCM_ESP_SALT_SIZE;
> +               }
> +       } else {
> +               aead_parms.ret_iv_len = 0;
> +       }
> +
> +       /*
> +        * Count number of sg entries from the crypto API request that are to
> +        * be included in this mailbox message. For dst sg, don't count space
> +        * for digest. Digest gets caught in a separate buffer and copied back
> +        * to dst sg when processing response.
> +        */
> +       rctx->src_nents = spu_sg_count(rctx->src_sg, rctx->src_skip, chunksize);
> +       rctx->dst_nents = spu_sg_count(rctx->dst_sg, rctx->dst_skip, chunksize);
> +       if (aead_parms.assoc_size)
> +               assoc_nents = spu_sg_count(rctx->assoc, 0,
> +                                          aead_parms.assoc_size);
> +
> +       mssg = &rctx->mb_mssg;
> +
> +       rctx->total_sent = chunksize;
> +       rctx->src_sent = chunksize;
> +       if (spu->spu_assoc_resp_len(ctx->cipher.mode,
> +                                   aead_parms.assoc_size,
> +                                   aead_parms.ret_iv_len,
> +                                   rctx->is_encrypt))
> +               rx_frag_num++;
> +
> +       aead_parms.iv_len = spu->spu_aead_ivlen(ctx->cipher.mode,
> +                                               rctx->iv_ctr_len);
> +
> +       if (ctx->auth.alg == HASH_ALG_AES)
> +               hash_parms.type = ctx->cipher_type;
> +
> +       /* General case AAD padding (CCM and RFC4543 special cases below) */
> +       aead_parms.aad_pad_len = spu->spu_gcm_ccm_pad_len(ctx->cipher.mode,
> +                                                aead_parms.assoc_size);
> +
> +       /* General case data padding (CCM decrypt special case below) */
> +       aead_parms.data_pad_len = spu->spu_gcm_ccm_pad_len(ctx->cipher.mode,
> +                                                          chunksize);
> +
> +       if (ctx->cipher.mode == CIPHER_MODE_CCM) {
> +               /*
> +                * for CCM, AAD len + 2 (rather than AAD len) needs to be
> +                * 128-bit aligned
> +                */
> +               aead_parms.aad_pad_len = spu->spu_gcm_ccm_pad_len(
> +                                        ctx->cipher.mode,
> +                                        aead_parms.assoc_size + 2);
> +
> +               /*
> +                * And when decrypting CCM, need to pad without including
> +                * size of ICV which is tacked on to end of chunk
> +                */
> +               if (!rctx->is_encrypt)
> +                       aead_parms.data_pad_len =
> +                               spu->spu_gcm_ccm_pad_len(ctx->cipher.mode,
> +                                                       chunksize - digestsize);
> +
> +               /* CCM also requires software to rewrite portions of IV: */
> +               spu->spu_ccm_update_iv(digestsize, &cipher_parms, req->assoclen,
> +                                      chunksize, rctx->is_encrypt,
> +                                      ctx->is_esp);
> +       }
> +
> +       if (ctx->is_rfc4543) {
> +               /*
> +                * RFC4543: data is included in AAD, so don't pad after AAD
> +                * and pad data based on both AAD + data size
> +                */
> +               aead_parms.aad_pad_len = 0;
> +               if (!rctx->is_encrypt)
> +                       aead_parms.data_pad_len = spu->spu_gcm_ccm_pad_len(
> +                                       ctx->cipher.mode,
> +                                       aead_parms.assoc_size + chunksize -
> +                                       digestsize);
> +               else
> +                       aead_parms.data_pad_len = spu->spu_gcm_ccm_pad_len(
> +                                       ctx->cipher.mode,
> +                                       aead_parms.assoc_size + chunksize);
> +
> +               req_opts.is_rfc4543 = true;
> +       }
> +
> +       if (spu_req_incl_icv(ctx->cipher.mode, rctx->is_encrypt)) {
> +               incl_icv = true;
> +               tx_frag_num++;
> +               /* Copy ICV from end of src scatterlist to digest buf */
> +               sg_copy_part_to_buf(req->src, rctx->msg_buf.digest, digestsize,
> +                                   req->assoclen + rctx->total_sent -
> +                                   digestsize);
> +       }
> +
> +       atomic64_add(chunksize, &iproc_priv.bytes_out);
> +
> +       flow_log("%s()-sent chunksize:%u hmac_offset:%u\n",
> +                __func__, chunksize, hash_parms.hmac_offset);
> +
> +       /* Prepend SPU header with type 3 BCM header */
> +       memcpy(rctx->msg_buf.bcm_spu_req_hdr, BCMHEADER, BCM_HDR_LEN);
> +
> +       spu_hdr_len = spu->spu_create_request(rctx->msg_buf.bcm_spu_req_hdr +
> +                                             BCM_HDR_LEN, &req_opts,
> +                                             &cipher_parms, &hash_parms,
> +                                             &aead_parms, chunksize);
> +
> +       /* Determine total length of padding. Put all padding in one buffer. */
> +       db_size = spu_real_db_size(aead_parms.assoc_size, aead_parms.iv_len, 0,
> +                                  chunksize, aead_parms.aad_pad_len,
> +                                  aead_parms.data_pad_len, 0);
> +
> +       stat_pad_len = spu->spu_wordalign_padlen(db_size);
> +
> +       if (stat_pad_len)
> +               rx_frag_num++;
> +       pad_len = aead_parms.data_pad_len + stat_pad_len;
> +       if (pad_len) {
> +               tx_frag_num++;
> +               spu->spu_request_pad(rctx->msg_buf.spu_req_pad,
> +                                    aead_parms.data_pad_len, 0,
> +                                    ctx->auth.alg, ctx->auth.mode,
> +                                    rctx->total_sent, stat_pad_len);
> +       }
> +
> +       spu->spu_dump_msg_hdr(rctx->msg_buf.bcm_spu_req_hdr + BCM_HDR_LEN,
> +                             spu_hdr_len);
> +       dump_sg(rctx->assoc, 0, aead_parms.assoc_size);
> +       packet_dump("    aead iv: ", rctx->msg_buf.iv_ctr, aead_parms.iv_len);
> +       packet_log("BD:\n");
> +       dump_sg(rctx->src_sg, rctx->src_skip, chunksize);
> +       packet_dump("   pad: ", rctx->msg_buf.spu_req_pad, pad_len);
> +
> +       /*
> +        * Build mailbox message containing SPU request msg and rx buffers
> +        * to catch response message
> +        */
> +       memset(mssg, 0, sizeof(*mssg));
> +       mssg->type = BRCM_MESSAGE_SPU;
> +       mssg->ctx = rctx;       /* Will be returned in response */
> +
> +       /* Create rx scatterlist to catch result */
> +       rx_frag_num += rctx->dst_nents;
> +       resp_len = chunksize;
> +
> +       /*
> +        * Always catch ICV in separate buffer. Have to for GCM/CCM because of
> +        * padding. Have to for SHA-224 and other truncated SHAs because SPU
> +        * sends entire digest back.
> +        */
> +       rx_frag_num++;
> +
> +       if (((ctx->cipher.mode == CIPHER_MODE_GCM) ||
> +            (ctx->cipher.mode == CIPHER_MODE_CCM)) && !rctx->is_encrypt)
> +               /*
> +                * Input is ciphertext plus ICV, but ICV is not
> +                * included in output.
> +                */
> +               resp_len -= ctx->digestsize;
> +
> +       err = spu_aead_rx_sg_create(mssg, req, rctx, rx_frag_num,
> +                                   aead_parms.assoc_size,
> +                                   aead_parms.ret_iv_len, resp_len, digestsize,
> +                                   stat_pad_len);
> +       if (err)
> +               return err;
> +
> +       /* Create tx scatterlist containing SPU request message */
> +       tx_frag_num += rctx->src_nents;
> +       tx_frag_num += assoc_nents;
> +       if (aead_parms.aad_pad_len)
> +               tx_frag_num++;
> +       if (aead_parms.iv_len)
> +               tx_frag_num++;
> +       if (spu->spu_tx_status_len())
> +               tx_frag_num++;
> +       err = spu_aead_tx_sg_create(mssg, rctx, tx_frag_num, spu_hdr_len,
> +                                   rctx->assoc, aead_parms.assoc_size,
> +                                   assoc_nents, aead_parms.iv_len, chunksize,
> +                                   aead_parms.aad_pad_len, pad_len, incl_icv);
> +       if (err)
> +               return err;
> +
> +       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx], mssg);
> +       if (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) {
> +               while ((err == -ENOBUFS) && (retry_cnt < SPU_MB_RETRY_MAX)) {
> +                       /*
> +                        * Mailbox queue is full. Since MAY_SLEEP is set, assume
> +                        * not in atomic context and we can wait and try again.
> +                        */
> +                       retry_cnt++;
> +                       usleep_range(MBOX_SLEEP_MIN, MBOX_SLEEP_MAX);
> +                       err = mbox_send_message(iproc_priv.mbox[rctx->chan_idx],
> +                                               mssg);
> +                       atomic_inc(&iproc_priv.mb_no_spc);
> +               }
> +       }
> +       if (err < 0) {
> +               atomic_inc(&iproc_priv.mb_send_fail);
> +               return err;
> +       }
> +
> +       return -EINPROGRESS;
> +}
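
The padding rules used in this function (AAD length + 2 padded to a 128-bit
boundary for CCM, STAT padded to a 4-byte boundary) are both instances of
rounding up to a power-of-two boundary. The bodies of
spu_gcm_ccm_pad_len() and spu_wordalign_padlen() are not part of this hunk,
so the following is only a sketch of the arithmetic under that assumption;
pad_to() is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Number of pad bytes needed to round len up to the next multiple of
 * align. align must be a power of two. Returns 0 if len is already
 * aligned.
 */
static uint32_t pad_to(uint32_t len, uint32_t align)
{
        return (align - (len & (align - 1))) & (align - 1);
}
```

For example, a 14-byte AAD under CCM would need pad_to(14 + 2, 16) pad
bytes, and a data block of db_size bytes would need pad_to(db_size, 4)
bytes before STAT.
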
> +
> +/**
> + * handle_aead_resp() - Process a SPU response message for an AEAD request.
> + * @rctx:  Crypto request context
> + */
> +static void handle_aead_resp(struct iproc_reqctx_s *rctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_async_request *areq = rctx->parent;
> +       struct aead_request *req = container_of(areq,
> +                                               struct aead_request, base);
> +       struct iproc_ctx_s *ctx = rctx->ctx;
> +       u32 payload_len;
> +       unsigned int icv_offset;
> +       u32 result_len;
> +
> +       /* See how much data was returned */
> +       payload_len = spu->spu_payload_length(rctx->msg_buf.spu_resp_hdr);
> +       flow_log("payload_len %u\n", payload_len);
> +
> +       /* only count payload */
> +       atomic64_add(payload_len, &iproc_priv.bytes_in);
> +
> +       if (req->assoclen)
> +               packet_dump("  assoc_data ", rctx->msg_buf.a.resp_aad,
> +                           req->assoclen);
> +
> +       /*
> +        * Copy the ICV back to the destination
> +        * buffer. In decrypt case, SPU gives us back the digest, but crypto
> +        * API doesn't expect ICV in dst buffer.
> +        */
> +       result_len = req->cryptlen;
> +       if (rctx->is_encrypt) {
> +               icv_offset = req->assoclen + rctx->total_sent;
> +               packet_dump("  ICV: ", rctx->msg_buf.digest, ctx->digestsize);
> +               flow_log("copying ICV to dst sg at offset %u\n", icv_offset);
> +               sg_copy_part_from_buf(req->dst, rctx->msg_buf.digest,
> +                                     ctx->digestsize, icv_offset);
> +               result_len += ctx->digestsize;
> +       }
> +
> +       packet_log("response data:  ");
> +       dump_sg(req->dst, req->assoclen, result_len);
> +
> +       atomic_inc(&iproc_priv.op_counts[SPU_OP_AEAD]);
> +       if (ctx->cipher.alg == CIPHER_ALG_AES) {
> +               if (ctx->cipher.mode == CIPHER_MODE_CCM)
> +                       atomic_inc(&iproc_priv.aead_cnt[AES_CCM]);
> +               else if (ctx->cipher.mode == CIPHER_MODE_GCM)
> +                       atomic_inc(&iproc_priv.aead_cnt[AES_GCM]);
> +               else
> +                       atomic_inc(&iproc_priv.aead_cnt[AUTHENC]);
> +       } else {
> +               atomic_inc(&iproc_priv.aead_cnt[AUTHENC]);
> +       }
> +}
> +
> +/**
> + * spu_chunk_cleanup() - Do cleanup after processing one chunk of a request
> + * @rctx:  request context
> + *
> + * Mailbox scatterlists are allocated for each chunk. So free them after
> + * processing each chunk.
> + */
> +static void spu_chunk_cleanup(struct iproc_reqctx_s *rctx)
> +{
> +       /* mailbox message used to tx request */
> +       struct brcm_message *mssg = &rctx->mb_mssg;
> +
> +       kfree(mssg->spu.src);
> +       kfree(mssg->spu.dst);
> +       memset(mssg, 0, sizeof(struct brcm_message));
> +}
> +
> +/**
> + * finish_req() - Used to invoke the complete callback from the requester when
> + * a request has been handled asynchronously.
> + * @rctx:  Request context
> + * @err:   Indicates whether the request was successful or not
> + *
> + * Ensures that cleanup has been done for request
> + */
> +static void finish_req(struct iproc_reqctx_s *rctx, int err)
> +{
> +       struct crypto_async_request *areq = rctx->parent;
> +
> +       flow_log("%s() err:%d\n\n", __func__, err);
> +
> +       /* No harm done if already called */
> +       spu_chunk_cleanup(rctx);
> +
> +       if (areq)
> +               areq->complete(areq, err);
> +}
> +
> +/**
> + * spu_rx_callback() - Callback from mailbox framework with a SPU response.
> + * @cl:                mailbox client structure for SPU driver
> + * @msg:       mailbox message containing SPU response
> + */
> +static void spu_rx_callback(struct mbox_client *cl, void *msg)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct brcm_message *mssg = msg;
> +       struct iproc_reqctx_s *rctx;
> +       struct iproc_ctx_s *ctx;
> +       struct crypto_async_request *areq;
> +       int err = 0;
> +
> +       rctx = mssg->ctx;
> +       if (unlikely(!rctx)) {
> +               /*
> +                * This is fatal. Without a request context there is nothing
> +                * to complete, and finish_req() would dereference NULL.
> +                */
> +               dev_err(dev, "%s(): no request context\n", __func__);
> +               return;
> +       }
> +       areq = rctx->parent;
> +       ctx = rctx->ctx;
> +
> +       /* process the SPU status */
> +       err = spu->spu_status_process(rctx->msg_buf.rx_stat);
> +       if (err != 0) {
> +               if (err == SPU_INVALID_ICV)
> +                       atomic_inc(&iproc_priv.bad_icv);
> +               err = -EBADMSG;
> +               goto cb_finish;
> +       }
> +
> +       /* Process the SPU response message */
> +       switch (rctx->ctx->alg->type) {
> +       case CRYPTO_ALG_TYPE_ABLKCIPHER:
> +               handle_ablkcipher_resp(rctx);
> +               break;
> +       case CRYPTO_ALG_TYPE_AHASH:
> +               handle_ahash_resp(rctx);
> +               break;
> +       case CRYPTO_ALG_TYPE_AEAD:
> +               handle_aead_resp(rctx);
> +               break;
> +       default:
> +               err = -EINVAL;
> +               goto cb_finish;
> +       }
> +
> +       /*
> +        * If this response does not complete the request, then send the next
> +        * request chunk.
> +        */
> +       if (rctx->total_sent < rctx->total_todo) {
> +               /* Deallocate anything specific to previous chunk */
> +               spu_chunk_cleanup(rctx);
> +
> +               switch (rctx->ctx->alg->type) {
> +               case CRYPTO_ALG_TYPE_ABLKCIPHER:
> +                       err = handle_ablkcipher_req(rctx);
> +                       break;
> +               case CRYPTO_ALG_TYPE_AHASH:
> +                       err = handle_ahash_req(rctx);
> +                       if (err == -EAGAIN)
> +                               /*
> +                                * we saved data in hash carry, but tell crypto
> +                                * API we successfully completed request.
> +                                */
> +                               err = 0;
> +                       break;
> +               case CRYPTO_ALG_TYPE_AEAD:
> +                       err = handle_aead_req(rctx);
> +                       break;
> +               default:
> +                       err = -EINVAL;
> +               }
> +
> +               if (err == -EINPROGRESS)
> +                       /* Successfully submitted request for next chunk */
> +                       return;
> +       }
> +
> +cb_finish:
> +       finish_req(rctx, err);
> +}
> +
> +/* ==================== Kernel Cryptographic API ==================== */
> +
> +/**
> + * ablkcipher_enqueue() - Handle ablkcipher encrypt or decrypt request.
> + * @req:       Crypto API request
> + * @encrypt:   true if encrypting; false if decrypting
> + *
> + * Return: -EINPROGRESS if request accepted and result will be returned
> + *                     asynchronously
> + *        < 0 if an error
> + */
> +static int ablkcipher_enqueue(struct ablkcipher_request *req, bool encrypt)
> +{
> +       struct iproc_reqctx_s *rctx = ablkcipher_request_ctx(req);
> +       struct iproc_ctx_s *ctx =
> +           crypto_ablkcipher_ctx(crypto_ablkcipher_reqtfm(req));
> +       int err;
> +
> +       flow_log("%s() enc:%u\n", __func__, encrypt);
> +
> +       rctx->gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +       rctx->parent = &req->base;
> +       rctx->is_encrypt = encrypt;
> +       rctx->bd_suppress = false;
> +       rctx->total_todo = req->nbytes;
> +       rctx->src_sent = 0;
> +       rctx->total_sent = 0;
> +       rctx->total_received = 0;
> +       rctx->ctx = ctx;
> +
> +       /* Initialize current position in src and dst scatterlists */
> +       rctx->src_sg = req->src;
> +       rctx->src_nents = 0;
> +       rctx->src_skip = 0;
> +       rctx->dst_sg = req->dst;
> +       rctx->dst_nents = 0;
> +       rctx->dst_skip = 0;
> +
> +       if (ctx->cipher.mode == CIPHER_MODE_CBC ||
> +           ctx->cipher.mode == CIPHER_MODE_CTR ||
> +           ctx->cipher.mode == CIPHER_MODE_OFB ||
> +           ctx->cipher.mode == CIPHER_MODE_XTS ||
> +           ctx->cipher.mode == CIPHER_MODE_GCM ||
> +           ctx->cipher.mode == CIPHER_MODE_CCM) {
> +               rctx->iv_ctr_len =
> +                   crypto_ablkcipher_ivsize(crypto_ablkcipher_reqtfm(req));
> +               memcpy(rctx->msg_buf.iv_ctr, req->info, rctx->iv_ctr_len);
> +       } else {
> +               rctx->iv_ctr_len = 0;
> +       }
> +
> +       /* Choose a SPU to process this request */
> +       rctx->chan_idx = select_channel();
> +       err = handle_ablkcipher_req(rctx);
> +       if (err != -EINPROGRESS)
> +               /* synchronous result */
> +               spu_chunk_cleanup(rctx);
> +
> +       return err;
> +}
> +
> +static int des_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
> +                     unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ablkcipher_ctx(cipher);
> +       u32 tmp[DES_EXPKEY_WORDS];
> +
> +       if (keylen == DES_KEY_SIZE) {
> +               if (des_ekey(tmp, key) == 0) {
> +                       if (crypto_ablkcipher_get_flags(cipher) &
> +                           CRYPTO_TFM_REQ_WEAK_KEY) {
> +                               u32 flags = CRYPTO_TFM_RES_WEAK_KEY;
> +
> +                               crypto_ablkcipher_set_flags(cipher, flags);
> +                               return -EINVAL;
> +                       }
> +               }
> +
> +               ctx->cipher_type = CIPHER_TYPE_DES;
> +       } else {
> +               crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +static int threedes_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
> +                          unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ablkcipher_ctx(cipher);
> +
> +       if (keylen == (DES_KEY_SIZE * 3)) {
> +               const u32 *K = (const u32 *)key;
> +               u32 flags = CRYPTO_TFM_RES_BAD_KEY_SCHED;
> +
> +               if (!((K[0] ^ K[2]) | (K[1] ^ K[3])) ||
> +                   !((K[2] ^ K[4]) | (K[3] ^ K[5]))) {
> +                       crypto_ablkcipher_set_flags(cipher, flags);
> +                       return -EINVAL;
> +               }
> +
> +               ctx->cipher_type = CIPHER_TYPE_3DES;
> +       } else {
> +               crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
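
For readers following the key check in threedes_setkey(): the two XOR/OR expressions evaluate to zero exactly when K1 == K2 or K2 == K3, which is the case where 3DES collapses to single DES. K1 == K3 (two-key 3DES) is still accepted. Below is a minimal standalone sketch of the same test, assuming a 24-byte key. The name des3_key_is_degenerate() is hypothetical and not part of the patch; the sketch uses memcmp() instead of the u32 cast so that it does not depend on the alignment of the key buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DES_KEY_SIZE 8	/* bytes per single-DES key */

/*
 * Hypothetical helper, not part of the patch: returns 1 if a 24-byte
 * 3DES key has K1 == K2 or K2 == K3 (the case the patch rejects with
 * CRYPTO_TFM_RES_BAD_KEY_SCHED), and 0 otherwise.
 */
static int des3_key_is_degenerate(const uint8_t *key)
{
	const uint8_t *k1 = key;
	const uint8_t *k2 = key + DES_KEY_SIZE;
	const uint8_t *k3 = key + 2 * DES_KEY_SIZE;

	assert(key != NULL);

	return memcmp(k1, k2, DES_KEY_SIZE) == 0 ||
	       memcmp(k2, k3, DES_KEY_SIZE) == 0;
}
```
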
> +
> +static int aes_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
> +                     unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ablkcipher_ctx(cipher);
> +
> +       if (ctx->cipher.mode == CIPHER_MODE_XTS)
> +               /* XTS includes two keys of equal length */
> +               keylen = keylen / 2;
> +
> +       switch (keylen) {
> +       case AES_KEYSIZE_128:
> +               ctx->cipher_type = CIPHER_TYPE_AES128;
> +               break;
> +       case AES_KEYSIZE_192:
> +               ctx->cipher_type = CIPHER_TYPE_AES192;
> +               break;
> +       case AES_KEYSIZE_256:
> +               ctx->cipher_type = CIPHER_TYPE_AES256;
> +               break;
> +       default:
> +               crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +               return -EINVAL;
> +       }
> +       WARN_ON((ctx->max_payload != SPU_MAX_PAYLOAD_INF) &&
> +               ((ctx->max_payload % AES_BLOCK_SIZE) != 0));
> +       return 0;
> +}
> +
> +static int rc4_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
> +                     unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ablkcipher_ctx(cipher);
> +       int i;
> +
> +       ctx->enckeylen = ARC4_MAX_KEY_SIZE + ARC4_STATE_SIZE;
> +
> +       ctx->enckey[0] = 0x00;  /* 0x00 */
> +       ctx->enckey[1] = 0x00;  /* i    */
> +       ctx->enckey[2] = 0x00;  /* 0x00 */
> +       ctx->enckey[3] = 0x00;  /* j    */
> +       for (i = 0; i < ARC4_MAX_KEY_SIZE; i++)
> +               ctx->enckey[i + ARC4_STATE_SIZE] = key[i % keylen];
> +
> +       ctx->cipher_type = CIPHER_TYPE_INIT;
> +
> +       return 0;
> +}
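
The key layout built by rc4_setkey() is a short zeroed state header followed by the key repeated cyclically up to the maximum key size. The following is a minimal sketch of that expansion outside the kernel. The function name and the SKETCH_* constants are hypothetical, and their values are assumptions standing in for ARC4_STATE_SIZE and ARC4_MAX_KEY_SIZE. It is also an assumption that keylen == 0 can reach this code; if the algorithm's min_keysize already excludes it, the guard is redundant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_STATE_SIZE   4	/* assumed value of ARC4_STATE_SIZE */
#define SKETCH_MAX_KEY_SIZE 256	/* assumed value of ARC4_MAX_KEY_SIZE */

/*
 * Hypothetical helper, not part of the patch. Fills out[] with the
 * zeroed state bytes followed by the key repeated cyclically.
 * out must hold SKETCH_STATE_SIZE + SKETCH_MAX_KEY_SIZE bytes.
 * Returns 0 on success, -1 if keylen is 0.
 */
static int rc4_expand_key(uint8_t *out, const uint8_t *key, size_t keylen)
{
	size_t i;

	assert(out != NULL && key != NULL);

	if (keylen == 0)
		return -1;	/* the modulo below would divide by zero */

	memset(out, 0, SKETCH_STATE_SIZE);
	for (i = 0; i < SKETCH_MAX_KEY_SIZE; i++)
		out[SKETCH_STATE_SIZE + i] = key[i % keylen];

	return 0;
}
```
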
> +
> +static int ablkcipher_setkey(struct crypto_ablkcipher *cipher, const u8 *key,
> +                            unsigned int keylen)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct iproc_ctx_s *ctx = crypto_ablkcipher_ctx(cipher);
> +       struct spu_cipher_parms cipher_parms;
> +       u32 alloc_len = 0;
> +       int err;
> +
> +       flow_log("ablkcipher_setkey() keylen: %u\n", keylen);
> +       flow_dump("  key: ", key, keylen);
> +
> +       switch (ctx->cipher.alg) {
> +       case CIPHER_ALG_DES:
> +               err = des_setkey(cipher, key, keylen);
> +               break;
> +       case CIPHER_ALG_3DES:
> +               err = threedes_setkey(cipher, key, keylen);
> +               break;
> +       case CIPHER_ALG_AES:
> +               err = aes_setkey(cipher, key, keylen);
> +               break;
> +       case CIPHER_ALG_RC4:
> +               err = rc4_setkey(cipher, key, keylen);
> +               break;
> +       default:
> +               pr_err("%s() Error: unknown cipher alg\n", __func__);
> +               err = -EINVAL;
> +       }
> +       if (err)
> +               return err;
> +
> +       /* RC4 already populated ctx->enckey */
> +       if (ctx->cipher.alg != CIPHER_ALG_RC4) {
> +               memcpy(ctx->enckey, key, keylen);
> +               ctx->enckeylen = keylen;
> +       }
> +       /* SPU needs XTS keys in the reverse order the crypto API presents */
> +       if ((ctx->cipher.alg == CIPHER_ALG_AES) &&
> +           (ctx->cipher.mode == CIPHER_MODE_XTS)) {
> +               unsigned int xts_keylen = keylen / 2;
> +
> +               memcpy(ctx->enckey, key + xts_keylen, xts_keylen);
> +               memcpy(ctx->enckey + xts_keylen, key, xts_keylen);
> +       }
> +
> +       if (spu->spu_type == SPU_TYPE_SPUM)
> +               alloc_len = BCM_HDR_LEN + SPU_HEADER_ALLOC_LEN;
> +       else if (spu->spu_type == SPU_TYPE_SPU2)
> +               alloc_len = BCM_HDR_LEN + SPU2_HEADER_ALLOC_LEN;
> +       memset(ctx->bcm_spu_req_hdr, 0, alloc_len);
> +       cipher_parms.iv_buf = NULL;
> +       cipher_parms.iv_len = crypto_ablkcipher_ivsize(cipher);
> +       flow_log("%s: iv_len %u\n", __func__, cipher_parms.iv_len);
> +
> +       cipher_parms.alg = ctx->cipher.alg;
> +       cipher_parms.mode = ctx->cipher.mode;
> +       cipher_parms.type = ctx->cipher_type;
> +       cipher_parms.key_buf = ctx->enckey;
> +       cipher_parms.key_len = ctx->enckeylen;
> +
> +       /* Prepend SPU request message with BCM header */
> +       memcpy(ctx->bcm_spu_req_hdr, BCMHEADER, BCM_HDR_LEN);
> +       ctx->spu_req_hdr_len =
> +           spu->spu_cipher_req_init(ctx->bcm_spu_req_hdr + BCM_HDR_LEN,
> +                                    &cipher_parms);
> +
> +       ctx->spu_resp_hdr_len = spu->spu_response_hdr_len(ctx->authkeylen,
> +                                                         ctx->enckeylen,
> +                                                         false);
> +
> +       atomic_inc(&iproc_priv.setkey_cnt[SPU_OP_CIPHER]);
> +
> +       return 0;
> +}
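
The XTS handling in ablkcipher_setkey() stores the two key halves in the opposite order from the one in which the crypto API presents them. A minimal sketch of that swap as a standalone function follows. The name xts_swap_key_halves() is hypothetical, and the sketch assumes the destination buffer does not overlap the source key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: copy an XTS key made of
 * two equal-length halves into dst with the halves exchanged.
 * dst must hold keylen bytes and must not overlap key.
 */
static void xts_swap_key_halves(uint8_t *dst, const uint8_t *key,
				size_t keylen)
{
	size_t half = keylen / 2;

	/* An XTS key is two keys of equal length, so keylen must be even */
	assert(keylen % 2 == 0);

	memcpy(dst, key + half, half);	/* second half goes first */
	memcpy(dst + half, key, half);	/* first half goes second */
}
```
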
> +
> +static int ablkcipher_encrypt(struct ablkcipher_request *req)
> +{
> +       flow_log("ablkcipher_encrypt() nbytes:%u\n", req->nbytes);
> +
> +       return ablkcipher_enqueue(req, true);
> +}
> +
> +static int ablkcipher_decrypt(struct ablkcipher_request *req)
> +{
> +       flow_log("ablkcipher_decrypt() nbytes:%u\n", req->nbytes);
> +       return ablkcipher_enqueue(req, false);
> +}
> +
> +static int ahash_enqueue(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       int err = 0;
> +       const char *alg_name;
> +
> +       flow_log("ahash_enqueue() nbytes:%u\n", req->nbytes);
> +
> +       rctx->gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +       rctx->parent = &req->base;
> +       rctx->ctx = ctx;
> +       rctx->bd_suppress = true;
> +       memset(&rctx->mb_mssg, 0, sizeof(struct brcm_message));
> +
> +       /* Initialize position in src scatterlist */
> +       rctx->src_sg = req->src;
> +       rctx->src_skip = 0;
> +       rctx->src_nents = 0;
> +       rctx->dst_sg = NULL;
> +       rctx->dst_skip = 0;
> +       rctx->dst_nents = 0;
> +
> +       /* SPU2 hardware does not compute hash of zero length data */
> +       if ((rctx->is_final == 1) && (rctx->total_todo == 0) &&
> +           (iproc_priv.spu.spu_type == SPU_TYPE_SPU2)) {
> +               alg_name = crypto_tfm_alg_name(crypto_ahash_tfm(tfm));
> +               flow_log("Doing %sfinal %s zero-len hash request in software\n",
> +                        rctx->is_final ? "" : "non-", alg_name);
> +               err = do_shash((unsigned char *)alg_name, req->result,
> +                              NULL, 0, NULL, 0, ctx->authkey,
> +                              ctx->authkeylen);
> +               if (err < 0)
> +                       flow_log("Hash request failed with error %d\n", err);
> +               return err;
> +       }
> +       /* Choose a SPU to process this request */
> +       rctx->chan_idx = select_channel();
> +
> +       err = handle_ahash_req(rctx);
> +       if (err != -EINPROGRESS)
> +               /* synchronous result */
> +               spu_chunk_cleanup(rctx);
> +
> +       if (err == -EAGAIN)
> +               /*
> +                * we saved data in hash carry, but tell crypto API
> +                * we successfully completed request.
> +                */
> +               err = 0;
> +
> +       return err;
> +}
> +
> +static int __ahash_init(struct ahash_request *req)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +
> +       flow_log("%s()\n", __func__);
> +
> +       /* Initialize the context */
> +       rctx->hash_carry_len = 0;
> +       rctx->is_final = 0;
> +
> +       rctx->total_todo = 0;
> +       rctx->src_sent = 0;
> +       rctx->total_sent = 0;
> +       rctx->total_received = 0;
> +
> +       ctx->digestsize = crypto_ahash_digestsize(tfm);
> +       /* If we add a hash whose digest is larger, catch it here. */
> +       WARN_ON(ctx->digestsize > MAX_DIGEST_SIZE);
> +
> +       rctx->is_sw_hmac = false;
> +
> +       ctx->spu_resp_hdr_len = spu->spu_response_hdr_len(ctx->authkeylen, 0,
> +                                                         true);
> +
> +       return 0;
> +}
> +
> +/**
> + * spu_no_incr_hash() - Determine whether incremental hashing is supported.
> + * @ctx:  Crypto session context
> + *
> + * SPU-2 does not support incremental hashing (we'll have to revisit and
> + * condition based on chip revision or device tree entry if future versions do
> + * support incremental hash)
> + *
> + * SPU-M also doesn't support incremental hashing of AES-XCBC
> + *
> + * Return: true if incremental hashing is not supported
> + *         false otherwise
> + */
> +bool spu_no_incr_hash(struct iproc_ctx_s *ctx)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +
> +       if (spu->spu_type == SPU_TYPE_SPU2)
> +               return true;
> +
> +       if ((ctx->auth.alg == HASH_ALG_AES) &&
> +           (ctx->auth.mode == HASH_MODE_XCBC))
> +               return true;
> +
> +       /* Otherwise, incremental hashing is supported */
> +       return false;
> +}
> +
> +static int ahash_init(struct ahash_request *req)
> +{
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       const char *alg_name;
> +       struct crypto_shash *hash;
> +       int ret;
> +       gfp_t gfp;
> +
> +       if (spu_no_incr_hash(ctx)) {
> +               /*
> +                * If we get an incremental hashing request and it's not
> +                * supported by the hardware, we need to handle it in software
> +                * by calling synchronous hash functions.
> +                */
> +               alg_name = crypto_tfm_alg_name(crypto_ahash_tfm(tfm));
> +               hash = crypto_alloc_shash(alg_name, 0, 0);
> +               if (IS_ERR(hash)) {
> +                       ret = PTR_ERR(hash);
> +                       return ret;
> +               }
> +
> +               gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +               ctx->shash = kmalloc(sizeof(*ctx->shash) +
> +                                    crypto_shash_descsize(hash), gfp);
> +               if (!ctx->shash) {
> +                       crypto_free_shash(hash);
> +                       return -ENOMEM;
> +               }
> +               ctx->shash->tfm = hash;
> +               ctx->shash->flags = 0;
> +
> +               /* Set the key using data we already have from setkey */
> +               if (ctx->authkeylen > 0) {
> +                       ret = crypto_shash_setkey(hash, ctx->authkey,
> +                                                 ctx->authkeylen);
> +                       if (ret) {
> +                               crypto_free_shash(hash);
> +                               kfree(ctx->shash);
> +                               return ret;
> +                       }
> +               }
> +
> +               /* Initialize hash w/ this key and other params */
> +               ret = crypto_shash_init(ctx->shash);
> +               if (ret) {
> +                       crypto_free_shash(hash);
> +                       kfree(ctx->shash);
> +                       return ret;
> +               }
> +       } else {
> +               /* Otherwise call the internal function which uses SPU hw */
> +               ret = __ahash_init(req);
> +       }
> +
> +       return ret;
> +}
> +
> +static int __ahash_update(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +
> +       flow_log("ahash_update() nbytes:%u\n", req->nbytes);
> +
> +       if (!req->nbytes)
> +               return 0;
> +       rctx->total_todo += req->nbytes;
> +       rctx->src_sent = 0;
> +
> +       return ahash_enqueue(req);
> +}
> +
> +static int ahash_update(struct ahash_request *req)
> +{
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       u8 *tmpbuf;
> +       int ret;
> +       int nents;
> +       gfp_t gfp;
> +
> +       if (spu_no_incr_hash(ctx)) {
> +               /*
> +                * If we get an incremental hashing request and it's not
> +                * supported by the hardware, we need to handle it in software
> +                * by calling synchronous hash functions.
> +                */
> +               if (req->src)
> +                       nents = sg_nents(req->src);
> +               else
> +                       return -EINVAL;
> +
> +               /* Copy data from req scatterlist to tmp buffer */
> +               gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +               tmpbuf = kmalloc(req->nbytes, gfp);
> +               if (!tmpbuf)
> +                       return -ENOMEM;
> +
> +               if (sg_copy_to_buffer(req->src, nents, tmpbuf, req->nbytes) !=
> +                               req->nbytes) {
> +                       kfree(tmpbuf);
> +                       return -EINVAL;
> +               }
> +
> +               /* Call synchronous update */
> +               ret = crypto_shash_update(ctx->shash, tmpbuf, req->nbytes);
> +               kfree(tmpbuf);
> +       } else {
> +               /* Otherwise call the internal function which uses SPU hw */
> +               ret = __ahash_update(req);
> +       }
> +
> +       return ret;
> +}
> +
> +static int __ahash_final(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +
> +       flow_log("ahash_final() nbytes:%u\n", req->nbytes);
> +
> +       rctx->is_final = 1;
> +
> +       return ahash_enqueue(req);
> +}
> +
> +static int ahash_final(struct ahash_request *req)
> +{
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       int ret;
> +
> +       if (spu_no_incr_hash(ctx)) {
> +               /*
> +                * If we get an incremental hashing request and it's not
> +                * supported by the hardware, we need to handle it in software
> +                * by calling synchronous hash functions.
> +                */
> +               ret = crypto_shash_final(ctx->shash, req->result);
> +
> +               /* Done with hash, can deallocate it now */
> +               crypto_free_shash(ctx->shash->tfm);
> +               kfree(ctx->shash);
> +
> +       } else {
> +               /* Otherwise call the internal function which uses SPU hw */
> +               ret = __ahash_final(req);
> +       }
> +
> +       return ret;
> +}
> +
> +static int __ahash_finup(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +
> +       flow_log("ahash_finup() nbytes:%u\n", req->nbytes);
> +
> +       rctx->total_todo += req->nbytes;
> +       rctx->src_sent = 0;
> +       rctx->is_final = 1;
> +
> +       return ahash_enqueue(req);
> +}
> +
> +static int ahash_finup(struct ahash_request *req)
> +{
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       u8 *tmpbuf;
> +       int ret;
> +       int nents;
> +       gfp_t gfp;
> +
> +       if (spu_no_incr_hash(ctx)) {
> +               /*
> +                * If we get an incremental hashing request and it's not
> +                * supported by the hardware, we need to handle it in software
> +                * by calling synchronous hash functions.
> +                */
> +               if (req->src) {
> +                       nents = sg_nents(req->src);
> +               } else {
> +                       ret = -EINVAL;
> +                       goto ahash_finup_exit;
> +               }
> +
> +               /* Copy data from req scatterlist to tmp buffer */
> +               gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +               tmpbuf = kmalloc(req->nbytes, gfp);
> +               if (!tmpbuf) {
> +                       ret = -ENOMEM;
> +                       goto ahash_finup_exit;
> +               }
> +
> +               if (sg_copy_to_buffer(req->src, nents, tmpbuf, req->nbytes) !=
> +                               req->nbytes) {
> +                       kfree(tmpbuf);
> +                       ret = -EINVAL;
> +                       goto ahash_finup_exit;
> +               }
> +
> +               /* Call synchronous finup */
> +               ret = crypto_shash_finup(ctx->shash, tmpbuf, req->nbytes,
> +                                        req->result);
> +               kfree(tmpbuf);
> +       } else {
> +               /* Otherwise call the internal function which uses SPU hw */
> +               return __ahash_finup(req);
> +       }
> +
> +ahash_finup_exit:
> +       /* Done with hash, can deallocate it now */
> +       crypto_free_shash(ctx->shash->tfm);
> +       kfree(ctx->shash);
> +       return ret;
> +}
> +
> +static int ahash_digest(struct ahash_request *req)
> +{
> +       int err = 0;
> +
> +       flow_log("ahash_digest() nbytes:%u\n", req->nbytes);
> +
> +       /* whole thing at once */
> +       err = __ahash_init(req);
> +       if (!err)
> +               err = __ahash_finup(req);
> +
> +       return err;
> +}
> +
> +static int ahash_setkey(struct crypto_ahash *ahash, const u8 *key,
> +                       unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(ahash);
> +
> +       flow_log("%s() ahash:%p key:%p keylen:%u\n",
> +                __func__, ahash, key, keylen);
> +       flow_dump("  key: ", key, keylen);
> +
> +       if (ctx->auth.alg == HASH_ALG_AES) {
> +               switch (keylen) {
> +               case AES_KEYSIZE_128:
> +                       ctx->cipher_type = CIPHER_TYPE_AES128;
> +                       break;
> +               case AES_KEYSIZE_192:
> +                       ctx->cipher_type = CIPHER_TYPE_AES192;
> +                       break;
> +               case AES_KEYSIZE_256:
> +                       ctx->cipher_type = CIPHER_TYPE_AES256;
> +                       break;
> +               default:
> +                       pr_err("%s() Error: Invalid key length\n", __func__);
> +                       return -EINVAL;
> +               }
> +       } else {
> +               pr_err("%s() Error: unknown hash alg\n", __func__);
> +               return -EINVAL;
> +       }
> +       memcpy(ctx->authkey, key, keylen);
> +       ctx->authkeylen = keylen;
> +
> +       return 0;
> +}
> +
> +static int ahash_export(struct ahash_request *req, void *out)
> +{
> +       const struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +
> +       memcpy(out, rctx, offsetof(struct iproc_reqctx_s, msg_buf));
> +       return 0;
> +}
> +
> +static int ahash_import(struct ahash_request *req, const void *in)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +
> +       memcpy(rctx, in, offsetof(struct iproc_reqctx_s, msg_buf));
> +       return 0;
> +}
> +
> +static int ahash_hmac_setkey(struct crypto_ahash *ahash, const u8 *key,
> +                            unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(ahash);
> +       unsigned int blocksize =
> +               crypto_tfm_alg_blocksize(crypto_ahash_tfm(ahash));
> +       unsigned int digestsize = crypto_ahash_digestsize(ahash);
> +       unsigned int index;
> +       int rc;
> +
> +       flow_log("%s() ahash:%p key:%p keylen:%u blksz:%u digestsz:%u\n",
> +                __func__, ahash, key, keylen, blocksize, digestsize);
> +       flow_dump("  key: ", key, keylen);
> +
> +       if (keylen > blocksize) {
> +               switch (ctx->auth.alg) {
> +               case HASH_ALG_MD5:
> +                       rc = do_shash("md5", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA1:
> +                       rc = do_shash("sha1", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA224:
> +                       rc = do_shash("sha224", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA256:
> +                       rc = do_shash("sha256", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA384:
> +                       rc = do_shash("sha384", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA512:
> +                       rc = do_shash("sha512", ctx->authkey, key, keylen, NULL,
> +                                     0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA3_224:
> +                       rc = do_shash("sha3-224", ctx->authkey, key, keylen,
> +                                     NULL, 0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA3_256:
> +                       rc = do_shash("sha3-256", ctx->authkey, key, keylen,
> +                                     NULL, 0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA3_384:
> +                       rc = do_shash("sha3-384", ctx->authkey, key, keylen,
> +                                     NULL, 0, NULL, 0);
> +                       break;
> +               case HASH_ALG_SHA3_512:
> +                       rc = do_shash("sha3-512", ctx->authkey, key, keylen,
> +                                     NULL, 0, NULL, 0);
> +                       break;
> +               default:
> +                       pr_err("%s() Error: unknown hash alg\n", __func__);
> +                       return -EINVAL;
> +               }
> +               if (rc < 0) {
> +                       pr_err("%s() Error %d computing shash for %s\n",
> +                              __func__, rc, hash_alg_name[ctx->auth.alg]);
> +                       return rc;
> +               }
> +               ctx->authkeylen = digestsize;
> +
> +               flow_log("  keylen > blocksize... hashed\n");
> +               flow_dump("  newkey: ", ctx->authkey, ctx->authkeylen);
> +       } else {
> +               memcpy(ctx->authkey, key, keylen);
> +               ctx->authkeylen = keylen;
> +       }
> +
> +       /*
> +        * Full HMAC operation in SPUM is not verified, so keep the
> +        * generation of IPAD, OPAD and outer hashing in software.
> +        */
> +       if (iproc_priv.spu.spu_type == SPU_TYPE_SPUM) {
> +               memcpy(ctx->ipad, ctx->authkey, ctx->authkeylen);
> +               memset(ctx->ipad + ctx->authkeylen, 0,
> +                      blocksize - ctx->authkeylen);
> +               ctx->authkeylen = 0;
> +               memcpy(ctx->opad, ctx->ipad, blocksize);
> +
> +               for (index = 0; index < blocksize; index++) {
> +                       ctx->ipad[index] ^= 0x36;
> +                       ctx->opad[index] ^= 0x5c;
> +               }
> +
> +               flow_dump("  ipad: ", ctx->ipad, blocksize);
> +               flow_dump("  opad: ", ctx->opad, blocksize);
> +       }
> +       ctx->digestsize = digestsize;
> +       atomic_inc(&iproc_priv.setkey_cnt[SPU_OP_HMAC]);
> +
> +       return 0;
> +}
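
The SPU-M branch of ahash_hmac_setkey() derives the HMAC inner and outer pads in software: the key is zero-padded to the hash block size, then XORed with 0x36 for the inner pad and 0x5c for the outer pad. A minimal standalone sketch of that step follows, assuming the key has already been reduced to at most one block. The function name hmac_make_pads() and the HMAC_*_VALUE macros are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HMAC_IPAD_VALUE 0x36
#define HMAC_OPAD_VALUE 0x5c

/*
 * Hypothetical helper, not part of the patch: derive the HMAC inner and
 * outer pads from a key that is no longer than the hash block size
 * (longer keys must be hashed down first). ipad and opad must each hold
 * blocksize bytes.
 */
static void hmac_make_pads(const uint8_t *key, size_t keylen,
			   size_t blocksize, uint8_t *ipad, uint8_t *opad)
{
	size_t i;

	assert(keylen <= blocksize);

	/* Zero-pad the key to a full block, then reuse it for the outer pad */
	memcpy(ipad, key, keylen);
	memset(ipad + keylen, 0, blocksize - keylen);
	memcpy(opad, ipad, blocksize);

	for (i = 0; i < blocksize; i++) {
		ipad[i] ^= HMAC_IPAD_VALUE;
		opad[i] ^= HMAC_OPAD_VALUE;
	}
}
```
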
> +
> +static int ahash_hmac_init(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       unsigned int blocksize =
> +                       crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
> +       int ret;
> +
> +       flow_log("ahash_hmac_init()\n");
> +
> +       /* init the context as a hash */
> +       ret = ahash_init(req);
> +       if (ret)
> +               return ret;
> +
> +       if (!spu_no_incr_hash(ctx)) {
> +               /* SPU-M can do incr hashing but needs sw for outer HMAC */
> +               rctx->is_sw_hmac = true;
> +               ctx->auth.mode = HASH_MODE_HASH;
> +               /* start with a prepended ipad */
> +               memcpy(rctx->hash_carry, ctx->ipad, blocksize);
> +               rctx->hash_carry_len = blocksize;
> +               rctx->total_todo += blocksize;
> +       }
> +
> +       return 0;
> +}
> +
> +static int ahash_hmac_update(struct ahash_request *req)
> +{
> +       flow_log("ahash_hmac_update() nbytes:%u\n", req->nbytes);
> +
> +       if (!req->nbytes)
> +               return 0;
> +
> +       return ahash_update(req);
> +}
> +
> +static int ahash_hmac_final(struct ahash_request *req)
> +{
> +       flow_log("ahash_hmac_final() nbytes:%u\n", req->nbytes);
> +
> +       return ahash_final(req);
> +}
> +
> +static int ahash_hmac_finup(struct ahash_request *req)
> +{
> +       flow_log("ahash_hmac_finup() nbytes:%u\n", req->nbytes);
> +
> +       return ahash_finup(req);
> +}
> +
> +static int ahash_hmac_digest(struct ahash_request *req)
> +{
> +       struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
> +       struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
> +       unsigned int blocksize =
> +                       crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
> +
> +       flow_log("ahash_hmac_digest() nbytes:%u\n", req->nbytes);
> +
> +       /* Perform initialization and then call finup */
> +       __ahash_init(req);
> +
> +       if (iproc_priv.spu.spu_type == SPU_TYPE_SPU2) {
> +               /*
> +                * SPU2 implements HMAC fully in hardware, so there is
> +                * no need to generate IPAD, OPAD and the outer hash in
> +                * software.
> +                * Only when the hash key length exceeds the hash block
> +                * size does SPU2 expect the key to be hashed first,
> +                * shortened to digest size and fed in as the hash key.
> +                */
> +               rctx->is_sw_hmac = false;
> +               ctx->auth.mode = HASH_MODE_HMAC;
> +       } else {
> +               rctx->is_sw_hmac = true;
> +               ctx->auth.mode = HASH_MODE_HASH;
> +               /* start with a prepended ipad */
> +               memcpy(rctx->hash_carry, ctx->ipad, blocksize);
> +               rctx->hash_carry_len = blocksize;
> +               rctx->total_todo += blocksize;
> +       }
> +
> +       return __ahash_finup(req);
> +}
> +
> +/* aead helpers */
> +
> +static int aead_need_fallback(struct aead_request *req)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_aead *aead = crypto_aead_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(aead);
> +       u32 payload_len;
> +
> +       /*
> +        * SPU hardware cannot handle the AES-GCM/CCM case where plaintext
> +        * and AAD are both 0 bytes long. So use fallback in this case.
> +        */
> +       if (((ctx->cipher.mode == CIPHER_MODE_GCM) ||
> +            (ctx->cipher.mode == CIPHER_MODE_CCM)) &&
> +            (req->cryptlen + req->assoclen) == 0) {
> +               flow_log("%s() AES GCM/CCM needs fallback for 0 len request\n",
> +                        __func__);
> +               return 1;
> +       }
> +
> +       /* SPU-M hardware only supports CCM digest size of 8, 12, or 16 bytes */
> +       if ((ctx->cipher.mode == CIPHER_MODE_CCM) &&
> +           (spu->spu_type == SPU_TYPE_SPUM) &&
> +           (ctx->digestsize != 8) && (ctx->digestsize != 12) &&
> +           (ctx->digestsize != 16)) {
> +               flow_log("%s() AES CCM needs fallback for digest size %d\n",
> +                        __func__, ctx->digestsize);
> +               return 1;
> +       }
> +
> +       /*
> +        * SPU-M on NSP has an issue where AES-CCM hash is not correct
> +        * when AAD size is 0
> +        */
> +       if ((ctx->cipher.mode == CIPHER_MODE_CCM) &&
> +           (spu->spu_subtype == SPU_SUBTYPE_SPUM_NSP) &&
> +           (req->assoclen == 0)) {
> +               flow_log("%s() AES_CCM needs fallback for 0 len AAD on NSP\n",
> +                        __func__);
> +               return 1;
> +       }
> +
> +       payload_len = req->cryptlen;
> +       if (spu->spu_type == SPU_TYPE_SPUM)
> +               payload_len += req->assoclen;
> +
> +       flow_log("%s() payload len: %u\n", __func__, payload_len);
> +
> +       if (ctx->max_payload == SPU_MAX_PAYLOAD_INF)
> +               return 0;
> +       else
> +               return payload_len > ctx->max_payload;
> +}
> +
> +static void aead_complete(struct crypto_async_request *areq, int err)
> +{
> +       struct aead_request *req =
> +           container_of(areq, struct aead_request, base);
> +       struct iproc_reqctx_s *rctx = aead_request_ctx(req);
> +       struct crypto_aead *aead = crypto_aead_reqtfm(req);
> +
> +       flow_log("%s() err:%d\n", __func__, err);
> +
> +       areq->tfm = crypto_aead_tfm(aead);
> +
> +       areq->complete = rctx->old_complete;
> +       areq->data = rctx->old_data;
> +
> +       areq->complete(areq, err);
> +}
> +
> +static int aead_do_fallback(struct aead_request *req, bool is_encrypt)
> +{
> +       struct crypto_aead *aead = crypto_aead_reqtfm(req);
> +       struct crypto_tfm *tfm = crypto_aead_tfm(aead);
> +       struct iproc_reqctx_s *rctx = aead_request_ctx(req);
> +       struct iproc_ctx_s *ctx = crypto_tfm_ctx(tfm);
> +       int err;
> +       u32 req_flags;
> +
> +       flow_log("%s() enc:%u\n", __func__, is_encrypt);
> +
> +       if (ctx->fallback_cipher) {
> +               /* Store the cipher tfm and then use the fallback tfm */
> +               rctx->old_tfm = tfm;
> +               aead_request_set_tfm(req, ctx->fallback_cipher);
> +               /*
> +                * Save the callback and chain ourselves in, so we can restore
> +                * the tfm
> +                */
> +               rctx->old_complete = req->base.complete;
> +               rctx->old_data = req->base.data;
> +               req_flags = aead_request_flags(req);
> +               aead_request_set_callback(req, req_flags, aead_complete, req);
> +               err = is_encrypt ? crypto_aead_encrypt(req) :
> +                   crypto_aead_decrypt(req);
> +
> +               if (err == 0) {
> +                       /*
> +                        * fallback was synchronous (did not return
> +                        * -EINPROGRESS). So restore request state here.
> +                        */
> +                       aead_request_set_callback(req, req_flags,
> +                                                 rctx->old_complete, req);
> +                       req->base.data = rctx->old_data;
> +                       aead_request_set_tfm(req, aead);
> +                       flow_log("%s() fallback completed successfully\n\n",
> +                                __func__);
> +               }
> +       } else {
> +               err = -EINVAL;
> +       }
> +
> +       return err;
> +}
> +
> +static int aead_enqueue(struct aead_request *req, bool is_encrypt)
> +{
> +       struct iproc_reqctx_s *rctx = aead_request_ctx(req);
> +       struct crypto_aead *aead = crypto_aead_reqtfm(req);
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(aead);
> +       int err;
> +
> +       flow_log("%s() enc:%u\n", __func__, is_encrypt);
> +
> +       if (req->assoclen > MAX_ASSOC_SIZE) {
> +               pr_err("%s() Error: associated data too long. (%u > %u bytes)\n",
> +                      __func__, req->assoclen, MAX_ASSOC_SIZE);
> +               return -EINVAL;
> +       }
> +
> +       rctx->gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
> +                      CRYPTO_TFM_REQ_MAY_SLEEP)) ? GFP_KERNEL : GFP_ATOMIC;
> +       rctx->parent = &req->base;
> +       rctx->is_encrypt = is_encrypt;
> +       rctx->bd_suppress = false;
> +       rctx->total_todo = req->cryptlen;
> +       rctx->src_sent = 0;
> +       rctx->total_sent = 0;
> +       rctx->total_received = 0;
> +       rctx->is_sw_hmac = false;
> +       rctx->ctx = ctx;
> +       memset(&rctx->mb_mssg, 0, sizeof(struct brcm_message));
> +
> +       /* assoc data is at start of src sg */
> +       rctx->assoc = req->src;
> +
> +       /*
> +        * Init current position in src scatterlist to be after assoc data.
> +        * src_skip set to buffer offset where data begins. (Assoc data could
> +        * end in the middle of a buffer.)
> +        */
> +       if (spu_sg_at_offset(req->src, req->assoclen, &rctx->src_sg,
> +                            &rctx->src_skip) < 0) {
> +               pr_err("%s() Error: Unable to find start of src data\n",
> +                      __func__);
> +               return -EINVAL;
> +       }
> +
> +       rctx->src_nents = 0;
> +       rctx->dst_nents = 0;
> +       if (req->dst == req->src) {
> +               rctx->dst_sg = rctx->src_sg;
> +               rctx->dst_skip = rctx->src_skip;
> +       } else {
> +               /*
> +                * Expect req->dst to have room for assoc data followed by
> +                * output data and ICV, if encrypt. So initialize dst_sg
> +                * to point beyond assoc len offset.
> +                */
> +               if (spu_sg_at_offset(req->dst, req->assoclen, &rctx->dst_sg,
> +                                    &rctx->dst_skip) < 0) {
> +                       pr_err("%s() Error: Unable to find start of dst data\n",
> +                              __func__);
> +                       return -EINVAL;
> +               }
> +       }
> +
> +       if (ctx->cipher.mode == CIPHER_MODE_CBC ||
> +           ctx->cipher.mode == CIPHER_MODE_CTR ||
> +           ctx->cipher.mode == CIPHER_MODE_OFB ||
> +           ctx->cipher.mode == CIPHER_MODE_XTS ||
> +           ctx->cipher.mode == CIPHER_MODE_GCM) {
> +               rctx->iv_ctr_len =
> +                       ctx->salt_len +
> +                       crypto_aead_ivsize(crypto_aead_reqtfm(req));
> +       } else if (ctx->cipher.mode == CIPHER_MODE_CCM) {
> +               rctx->iv_ctr_len = CCM_AES_IV_SIZE;
> +       } else {
> +               rctx->iv_ctr_len = 0;
> +       }
> +
> +       rctx->hash_carry_len = 0;
> +
> +       flow_log("  src sg: %p\n", req->src);
> +       flow_log("  rctx->src_sg: %p, src_skip %u\n",
> +                rctx->src_sg, rctx->src_skip);
> +       flow_log("  assoc:  %p, assoclen %u\n", rctx->assoc, req->assoclen);
> +       flow_log("  dst sg: %p\n", req->dst);
> +       flow_log("  rctx->dst_sg: %p, dst_skip %u\n",
> +                rctx->dst_sg, rctx->dst_skip);
> +       flow_log("  iv_ctr_len:%u\n", rctx->iv_ctr_len);
> +       flow_dump("  iv: ", req->iv, rctx->iv_ctr_len);
> +       flow_log("  authkeylen:%u\n", ctx->authkeylen);
> +       flow_log("  is_esp: %s\n", ctx->is_esp ? "yes" : "no");
> +
> +       if (ctx->max_payload == SPU_MAX_PAYLOAD_INF)
> +               flow_log("  max_payload infinite\n");
> +       else
> +               flow_log("  max_payload: %u\n", ctx->max_payload);
> +
> +       if (unlikely(aead_need_fallback(req)))
> +               return aead_do_fallback(req, is_encrypt);
> +
> +       /*
> +        * Do memory allocations for request after fallback check, because if we
> +        * do fallback, we won't call finish_req() to dealloc.
> +        */
> +       if (rctx->iv_ctr_len) {
> +               if (ctx->salt_len)
> +                       memcpy(rctx->msg_buf.iv_ctr + ctx->salt_offset,
> +                              ctx->salt, ctx->salt_len);
> +               memcpy(rctx->msg_buf.iv_ctr + ctx->salt_offset + ctx->salt_len,
> +                      req->iv,
> +                      rctx->iv_ctr_len - ctx->salt_len - ctx->salt_offset);
> +       }
> +
> +       rctx->chan_idx = select_channel();
> +       err = handle_aead_req(rctx);
> +       if (err != -EINPROGRESS)
> +               /* synchronous result */
> +               spu_chunk_cleanup(rctx);
> +
> +       return err;
> +}
> +
> +static int aead_authenc_setkey(struct crypto_aead *cipher,
> +                              const u8 *key, unsigned int keylen)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +       struct crypto_tfm *tfm = crypto_aead_tfm(cipher);
> +       struct rtattr *rta = (void *)key;
> +       struct crypto_authenc_key_param *param;
> +       const u8 *origkey = key;
> +       const unsigned int origkeylen = keylen;
> +
> +       int ret = 0;
> +
> +       flow_log("%s() aead:%p key:%p keylen:%u\n", __func__, cipher, key,
> +                keylen);
> +       flow_dump("  key: ", key, keylen);
> +
> +       if (!RTA_OK(rta, keylen))
> +               goto badkey;
> +       if (rta->rta_type != CRYPTO_AUTHENC_KEYA_PARAM)
> +               goto badkey;
> +       if (RTA_PAYLOAD(rta) < sizeof(*param))
> +               goto badkey;
> +
> +       param = RTA_DATA(rta);
> +       ctx->enckeylen = be32_to_cpu(param->enckeylen);
> +
> +       key += RTA_ALIGN(rta->rta_len);
> +       keylen -= RTA_ALIGN(rta->rta_len);
> +
> +       if (keylen < ctx->enckeylen)
> +               goto badkey;
> +       if (ctx->enckeylen > MAX_KEY_SIZE)
> +               goto badkey;
> +
> +       ctx->authkeylen = keylen - ctx->enckeylen;
> +
> +       if (ctx->authkeylen > MAX_KEY_SIZE)
> +               goto badkey;
> +
> +       memcpy(ctx->enckey, key + ctx->authkeylen, ctx->enckeylen);
> +       /* May end up padding auth key. So make sure it's zeroed. */
> +       memset(ctx->authkey, 0, sizeof(ctx->authkey));
> +       memcpy(ctx->authkey, key, ctx->authkeylen);
> +
> +       switch (ctx->alg->cipher_info.alg) {
> +       case CIPHER_ALG_DES:
> +               if (ctx->enckeylen == DES_KEY_SIZE) {
> +                       u32 tmp[DES_EXPKEY_WORDS];
> +                       u32 flags = CRYPTO_TFM_RES_WEAK_KEY;
> +
> +                       if (des_ekey(tmp, key) == 0) {
> +                               if (crypto_aead_get_flags(cipher) &
> +                                   CRYPTO_TFM_REQ_WEAK_KEY) {
> +                                       crypto_aead_set_flags(cipher, flags);
> +                                       return -EINVAL;
> +                               }
> +                       }
> +
> +                       ctx->cipher_type = CIPHER_TYPE_DES;
> +               } else {
> +                       goto badkey;
> +               }
> +               break;
> +       case CIPHER_ALG_3DES:
> +               if (ctx->enckeylen == (DES_KEY_SIZE * 3)) {
> +                       const u32 *K = (const u32 *)key;
> +                       u32 flags = CRYPTO_TFM_RES_BAD_KEY_SCHED;
> +
> +                       if (!((K[0] ^ K[2]) | (K[1] ^ K[3])) ||
> +                           !((K[2] ^ K[4]) | (K[3] ^ K[5]))) {
> +                               crypto_aead_set_flags(cipher, flags);
> +                               return -EINVAL;
> +                       }
> +
> +                       ctx->cipher_type = CIPHER_TYPE_3DES;
> +               } else {
> +                       crypto_aead_set_flags(cipher,
> +                                             CRYPTO_TFM_RES_BAD_KEY_LEN);
> +                       return -EINVAL;
> +               }
> +               break;
> +       case CIPHER_ALG_AES:
> +               switch (ctx->enckeylen) {
> +               case AES_KEYSIZE_128:
> +                       ctx->cipher_type = CIPHER_TYPE_AES128;
> +                       break;
> +               case AES_KEYSIZE_192:
> +                       ctx->cipher_type = CIPHER_TYPE_AES192;
> +                       break;
> +               case AES_KEYSIZE_256:
> +                       ctx->cipher_type = CIPHER_TYPE_AES256;
> +                       break;
> +               default:
> +                       goto badkey;
> +               }
> +               break;
> +       case CIPHER_ALG_RC4:
> +               ctx->cipher_type = CIPHER_TYPE_INIT;
> +               break;
> +       default:
> +               pr_err("%s() Error: Unknown cipher alg\n", __func__);
> +               return -EINVAL;
> +       }
> +
> +       flow_log("  enckeylen:%u authkeylen:%u\n", ctx->enckeylen,
> +                ctx->authkeylen);
> +       flow_dump("  enc: ", ctx->enckey, ctx->enckeylen);
> +       flow_dump("  auth: ", ctx->authkey, ctx->authkeylen);
> +
> +       /* setkey the fallback just in case we need to use it */
> +       if (ctx->fallback_cipher) {
> +               flow_log("  running fallback setkey()\n");
> +
> +               ctx->fallback_cipher->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
> +               ctx->fallback_cipher->base.crt_flags |=
> +                   tfm->crt_flags & CRYPTO_TFM_REQ_MASK;
> +               ret =
> +                   crypto_aead_setkey(ctx->fallback_cipher, origkey,
> +                                      origkeylen);
> +               if (ret) {
> +                       flow_log("  fallback setkey() returned:%d\n", ret);
> +                       tfm->crt_flags &= ~CRYPTO_TFM_RES_MASK;
> +                       tfm->crt_flags |=
> +                           (ctx->fallback_cipher->
> +                            base.crt_flags & CRYPTO_TFM_RES_MASK);
> +               }
> +       }
> +
> +       ctx->spu_resp_hdr_len = spu->spu_response_hdr_len(ctx->authkeylen,
> +                                                         ctx->enckeylen,
> +                                                         false);
> +
> +       atomic_inc(&iproc_priv.setkey_cnt[SPU_OP_AEAD]);
> +
> +       return ret;
> +
> +badkey:
> +       ctx->enckeylen = 0;
> +       ctx->authkeylen = 0;
> +       ctx->digestsize = 0;
> +
> +       crypto_aead_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +       return -EINVAL;
> +}
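The key blob that aead_authenc_setkey() walks above is laid out as a small attribute header, a big-endian encryption key length, the auth key, and finally the encryption key. A stand-alone sketch of that split, with no kernel headers, might look like the following. All SKETCH_* names, struct authenc_split and authenc_split_key() are hypothetical stand-ins for the rtattr macros and CRYPTO_AUTHENC_KEYA_PARAM; the header is assumed to be two host-endian 16-bit fields (length, then type) aligned to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR_LEN    4u    /* two 16-bit fields: length, type */
#define SKETCH_KEYA_PARAM 1u    /* stand-in for CRYPTO_AUTHENC_KEYA_PARAM */
#define SKETCH_ALIGN(n)   (((n) + 3u) & ~3u)

struct authenc_split {
        const uint8_t *authkey;
        size_t authkeylen;
        const uint8_t *enckey;
        size_t enckeylen;
};

/* Returns 0 on success, -1 if the blob is malformed. */
static int authenc_split_key(const uint8_t *key, size_t keylen,
                             struct authenc_split *out)
{
        uint16_t len, type;
        uint32_t enckeylen;
        size_t hdr;

        assert(key && out);

        if (keylen < SKETCH_HDR_LEN)
                return -1;
        memcpy(&len, key, sizeof(len));
        memcpy(&type, key + 2, sizeof(type));

        /* Same three checks as the quoted code: bounds, type, payload. */
        if (len < SKETCH_HDR_LEN || len > keylen)
                return -1;
        if (type != SKETCH_KEYA_PARAM)
                return -1;
        if (len - SKETCH_HDR_LEN < sizeof(enckeylen))
                return -1;

        /* The parameter is a big-endian 32-bit encryption key length. */
        enckeylen = ((uint32_t)key[4] << 24) | ((uint32_t)key[5] << 16) |
                    ((uint32_t)key[6] << 8) | (uint32_t)key[7];

        hdr = SKETCH_ALIGN(len);
        if (hdr > keylen)
                return -1;
        key += hdr;
        keylen -= hdr;

        if (keylen < enckeylen)
                return -1;

        /* Auth key comes first; the encryption key is the tail. */
        out->authkeylen = keylen - enckeylen;
        out->authkey = key;
        out->enckey = key + out->authkeylen;
        out->enckeylen = enckeylen;
        return 0;
}
```

The sketch only returns pointers into the caller's buffer; the driver instead copies both keys into its context and applies the MAX_KEY_SIZE limits before doing so.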
> +
> +static int aead_gcm_ccm_setkey(struct crypto_aead *cipher,
> +                              const u8 *key, unsigned int keylen)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +       struct crypto_tfm *tfm = crypto_aead_tfm(cipher);
> +
> +       int ret = 0;
> +
> +       flow_log("%s() keylen:%u\n", __func__, keylen);
> +       flow_dump("  key: ", key, keylen);
> +
> +       if (!ctx->is_esp)
> +               ctx->digestsize = keylen;
> +
> +       ctx->enckeylen = keylen;
> +       ctx->authkeylen = 0;
> +       memcpy(ctx->enckey, key, ctx->enckeylen);
> +
> +       switch (ctx->enckeylen) {
> +       case AES_KEYSIZE_128:
> +               ctx->cipher_type = CIPHER_TYPE_AES128;
> +               break;
> +       case AES_KEYSIZE_192:
> +               ctx->cipher_type = CIPHER_TYPE_AES192;
> +               break;
> +       case AES_KEYSIZE_256:
> +               ctx->cipher_type = CIPHER_TYPE_AES256;
> +               break;
> +       default:
> +               goto badkey;
> +       }
> +
> +       flow_log("  enckeylen:%u authkeylen:%u\n", ctx->enckeylen,
> +                ctx->authkeylen);
> +       flow_dump("  enc: ", ctx->enckey, ctx->enckeylen);
> +       flow_dump("  auth: ", ctx->authkey, ctx->authkeylen);
> +
> +       /* setkey the fallback just in case we need to use it */
> +       if (ctx->fallback_cipher) {
> +               flow_log("  running fallback setkey()\n");
> +
> +               ctx->fallback_cipher->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
> +               ctx->fallback_cipher->base.crt_flags |=
> +                   tfm->crt_flags & CRYPTO_TFM_REQ_MASK;
> +               ret = crypto_aead_setkey(ctx->fallback_cipher, key,
> +                                        keylen + ctx->salt_len);
> +               if (ret) {
> +                       flow_log("  fallback setkey() returned:%d\n", ret);
> +                       tfm->crt_flags &= ~CRYPTO_TFM_RES_MASK;
> +                       tfm->crt_flags |=
> +                           (ctx->fallback_cipher->
> +                            base.crt_flags & CRYPTO_TFM_RES_MASK);
> +               }
> +       }
> +
> +       ctx->spu_resp_hdr_len = spu->spu_response_hdr_len(ctx->authkeylen,
> +                                                         ctx->enckeylen,
> +                                                         false);
> +
> +       atomic_inc(&iproc_priv.setkey_cnt[SPU_OP_AEAD]);
> +
> +       flow_log("  enckeylen:%u authkeylen:%u\n", ctx->enckeylen,
> +                ctx->authkeylen);
> +
> +       return ret;
> +
> +badkey:
> +       ctx->enckeylen = 0;
> +       ctx->authkeylen = 0;
> +       ctx->digestsize = 0;
> +
> +       crypto_aead_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
> +       return -EINVAL;
> +}
> +
> +/**
> + * aead_gcm_esp_setkey() - setkey() operation for ESP variant of GCM AES.
> + * @cipher: AEAD structure
> + * @key:    Key followed by 4 bytes of salt
> + * @keylen: Length of key plus salt, in bytes
> + *
> + * Extracts salt from key and stores it to be prepended to IV on each request.
> + * Digest is always 16 bytes
> + *
> + * Return: Value from generic gcm setkey.
> + */
> +static int aead_gcm_esp_setkey(struct crypto_aead *cipher,
> +                              const u8 *key, unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +
> +       flow_log("%s\n", __func__);
> +       ctx->salt_len = GCM_ESP_SALT_SIZE;
> +       ctx->salt_offset = GCM_ESP_SALT_OFFSET;
> +       memcpy(ctx->salt, key + keylen - GCM_ESP_SALT_SIZE, GCM_ESP_SALT_SIZE);
> +       keylen -= GCM_ESP_SALT_SIZE;
> +       ctx->digestsize = GCM_ESP_DIGESTSIZE;
> +       ctx->is_esp = true;
> +       flow_dump("salt: ", ctx->salt, GCM_ESP_SALT_SIZE);
> +
> +       return aead_gcm_ccm_setkey(cipher, key, keylen);
> +}
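The salt handling in aead_gcm_esp_setkey() above amounts to peeling a fixed-size tail off the key material. A minimal user-space sketch of that split follows. SKETCH_SALT_SIZE stands in for GCM_ESP_SALT_SIZE (4 bytes per the kernel-doc comment), and struct esp_key_parts and esp_split_salt() are hypothetical names. Unlike the quoted function, the sketch rejects input shorter than the salt; whether the driver needs the same guard depends on what its callers already guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SALT_SIZE 4u     /* stand-in for GCM_ESP_SALT_SIZE */

struct esp_key_parts {
        const uint8_t *key;     /* points into the caller's buffer */
        size_t keylen;          /* key length without the salt */
        uint8_t salt[SKETCH_SALT_SIZE];
};

/* Returns 0 on success, -1 if the input cannot hold a salt. */
static int esp_split_salt(const uint8_t *keymat, size_t len,
                          struct esp_key_parts *out)
{
        assert(keymat && out);

        if (len < SKETCH_SALT_SIZE)
                return -1;

        /* The salt is the last SKETCH_SALT_SIZE bytes of the key material. */
        memcpy(out->salt, keymat + len - SKETCH_SALT_SIZE, SKETCH_SALT_SIZE);
        out->key = keymat;
        out->keylen = len - SKETCH_SALT_SIZE;
        return 0;
}
```

The remaining key bytes are what get passed on to the generic GCM/CCM setkey, and the stored salt is prepended to the per-request IV.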
> +
> +/**
> + * rfc4543_gcm_esp_setkey() - setkey() operation for RFC4543 variant of GCM/GMAC.
> + * @cipher: AEAD structure
> + * @key:    Key followed by 4 bytes of salt
> + * @keylen: Length of key plus salt, in bytes
> + *
> + * Extracts salt from key and stores it to be prepended to IV on each request.
> + * Digest is always 16 bytes
> + *
> + * Return: Value from generic gcm setkey.
> + */
> +static int rfc4543_gcm_esp_setkey(struct crypto_aead *cipher,
> +                                 const u8 *key, unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +
> +       flow_log("%s\n", __func__);
> +       ctx->salt_len = GCM_ESP_SALT_SIZE;
> +       ctx->salt_offset = GCM_ESP_SALT_OFFSET;
> +       memcpy(ctx->salt, key + keylen - GCM_ESP_SALT_SIZE, GCM_ESP_SALT_SIZE);
> +       keylen -= GCM_ESP_SALT_SIZE;
> +       ctx->digestsize = GCM_ESP_DIGESTSIZE;
> +       ctx->is_esp = true;
> +       ctx->is_rfc4543 = true;
> +       flow_dump("salt: ", ctx->salt, GCM_ESP_SALT_SIZE);
> +
> +       return aead_gcm_ccm_setkey(cipher, key, keylen);
> +}
> +
> +/**
> + * aead_ccm_esp_setkey() - setkey() operation for ESP variant of CCM AES.
> + * @cipher: AEAD structure
> + * @key:    Key followed by 4 bytes of salt
> + * @keylen: Length of key plus salt, in bytes
> + *
> + * Extracts salt from key and stores it to be prepended to IV on each request.
> + * Digest is always 16 bytes
> + *
> + * Return: Value from generic ccm setkey.
> + */
> +static int aead_ccm_esp_setkey(struct crypto_aead *cipher,
> +                              const u8 *key, unsigned int keylen)
> +{
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +
> +       flow_log("%s\n", __func__);
> +       ctx->salt_len = CCM_ESP_SALT_SIZE;
> +       ctx->salt_offset = CCM_ESP_SALT_OFFSET;
> +       memcpy(ctx->salt, key + keylen - CCM_ESP_SALT_SIZE, CCM_ESP_SALT_SIZE);
> +       keylen -= CCM_ESP_SALT_SIZE;
> +       ctx->is_esp = true;
> +       flow_dump("salt: ", ctx->salt, CCM_ESP_SALT_SIZE);
> +
> +       return aead_gcm_ccm_setkey(cipher, key, keylen);
> +}
> +
> +static int aead_setauthsize(struct crypto_aead *cipher, unsigned int authsize)
> +{
> +       struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher);
> +       int ret = 0;
> +
> +       flow_log("%s() authkeylen:%u authsize:%u\n",
> +                __func__, ctx->authkeylen, authsize);
> +
> +       ctx->digestsize = authsize;
> +
> +       /* set the authsize on the fallback just in case we need to use it */
> +       if (ctx->fallback_cipher) {
> +               flow_log("  running fallback setauth()\n");
> +
> +               ret = crypto_aead_setauthsize(ctx->fallback_cipher, authsize);
> +               if (ret)
> +                       flow_log("  fallback setauth() returned:%d\n", ret);
> +       }
> +
> +       return ret;
> +}
> +
> +static int aead_encrypt(struct aead_request *req)
> +{
> +       flow_log("%s() cryptlen:%u %08x\n", __func__, req->cryptlen,
> +                req->cryptlen);
> +       dump_sg(req->src, 0, req->cryptlen + req->assoclen);
> +       flow_log("  assoc_len:%u\n", req->assoclen);
> +
> +       return aead_enqueue(req, true);
> +}
> +
> +static int aead_decrypt(struct aead_request *req)
> +{
> +       flow_log("%s() cryptlen:%u\n", __func__, req->cryptlen);
> +       dump_sg(req->src, 0, req->cryptlen + req->assoclen);
> +       flow_log("  assoc_len:%u\n", req->assoclen);
> +
> +       return aead_enqueue(req, false);
> +}
> +
> +/* ==================== Supported Cipher Algorithms ==================== */
> +
> +static struct iproc_alg_s driver_algs[] = {
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "gcm(aes)",
> +                       .cra_driver_name = "gcm-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK
> +                },
> +                .setkey = aead_gcm_ccm_setkey,
> +                .ivsize = GCM_AES_IV_SIZE,
> +                .maxauthsize = AES_BLOCK_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_GCM,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_GCM,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "ccm(aes)",
> +                       .cra_driver_name = "ccm-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK
> +                },
> +                .setkey = aead_gcm_ccm_setkey,
> +                .ivsize = CCM_AES_IV_SIZE,
> +                .maxauthsize = AES_BLOCK_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CCM,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_CCM,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "rfc4106(gcm(aes))",
> +                       .cra_driver_name = "gcm-aes-esp-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK
> +                },
> +                .setkey = aead_gcm_esp_setkey,
> +                .ivsize = GCM_ESP_IV_SIZE,
> +                .maxauthsize = AES_BLOCK_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_GCM,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_GCM,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "rfc4309(ccm(aes))",
> +                       .cra_driver_name = "ccm-aes-esp-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK
> +                },
> +                .setkey = aead_ccm_esp_setkey,
> +                .ivsize = CCM_AES_IV_SIZE,
> +                .maxauthsize = AES_BLOCK_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CCM,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_CCM,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "rfc4543(gcm(aes))",
> +                       .cra_driver_name = "gmac-aes-esp-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK
> +                },
> +                .setkey = rfc4543_gcm_esp_setkey,
> +                .ivsize = GCM_ESP_IV_SIZE,
> +                .maxauthsize = AES_BLOCK_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_GCM,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_GCM,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(md5),cbc(aes))",
> +                       .cra_driver_name = "authenc-hmac-md5-cbc-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = AES_BLOCK_SIZE,
> +                .maxauthsize = MD5_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_MD5,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha1),cbc(aes))",
> +                       .cra_driver_name = "authenc-hmac-sha1-cbc-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = AES_BLOCK_SIZE,
> +                .maxauthsize = SHA1_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA1,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha256),cbc(aes))",
> +                       .cra_driver_name = "authenc-hmac-sha256-cbc-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = AES_BLOCK_SIZE,
> +                .maxauthsize = SHA256_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA256,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(md5),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-md5-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = MD5_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_MD5,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha1),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-sha1-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = SHA1_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA1,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha224),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-sha224-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = SHA224_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA224,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha256),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-sha256-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = SHA256_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA256,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha384),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-sha384-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = SHA384_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA384,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha512),cbc(des))",
> +                       .cra_driver_name = "authenc-hmac-sha512-cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES_BLOCK_SIZE,
> +                .maxauthsize = SHA512_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA512,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(md5),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-md5-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = MD5_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_MD5,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha1),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-sha1-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = SHA1_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA1,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha224),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-sha224-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = SHA224_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA224,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha256),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-sha256-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = SHA256_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA256,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha384),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-sha384-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = SHA384_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA384,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AEAD,
> +        .alg.aead = {
> +                .base = {
> +                       .cra_name = "authenc(hmac(sha512),cbc(des3_ede))",
> +                       .cra_driver_name = "authenc-hmac-sha512-cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_flags = CRYPTO_ALG_NEED_FALLBACK | CRYPTO_ALG_ASYNC
> +                },
> +                .setkey = aead_authenc_setkey,
> +                .ivsize = DES3_EDE_BLOCK_SIZE,
> +                .maxauthsize = SHA512_DIGEST_SIZE,
> +        },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA512,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        .auth_first = 0,
> +        },
> +
> +/* ABLKCIPHER algorithms. */
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ecb(arc4)",
> +                       .cra_driver_name = "ecb-arc4-iproc",
> +                       .cra_blocksize = ARC4_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = ARC4_MIN_KEY_SIZE,
> +                                          .max_keysize = ARC4_MAX_KEY_SIZE,
> +                                          .ivsize = 0,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_RC4,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ofb(des)",
> +                       .cra_driver_name = "ofb-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES_KEY_SIZE,
> +                                          .max_keysize = DES_KEY_SIZE,
> +                                          .ivsize = DES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_OFB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "cbc(des)",
> +                       .cra_driver_name = "cbc-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES_KEY_SIZE,
> +                                          .max_keysize = DES_KEY_SIZE,
> +                                          .ivsize = DES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ecb(des)",
> +                       .cra_driver_name = "ecb-des-iproc",
> +                       .cra_blocksize = DES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES_KEY_SIZE,
> +                                          .max_keysize = DES_KEY_SIZE,
> +                                          .ivsize = 0,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_DES,
> +                        .mode = CIPHER_MODE_ECB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ofb(des3_ede)",
> +                       .cra_driver_name = "ofb-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES3_EDE_KEY_SIZE,
> +                                          .max_keysize = DES3_EDE_KEY_SIZE,
> +                                          .ivsize = DES3_EDE_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_OFB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "cbc(des3_ede)",
> +                       .cra_driver_name = "cbc-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES3_EDE_KEY_SIZE,
> +                                          .max_keysize = DES3_EDE_KEY_SIZE,
> +                                          .ivsize = DES3_EDE_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ecb(des3_ede)",
> +                       .cra_driver_name = "ecb-des3-iproc",
> +                       .cra_blocksize = DES3_EDE_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = DES3_EDE_KEY_SIZE,
> +                                          .max_keysize = DES3_EDE_KEY_SIZE,
> +                                          .ivsize = 0,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_3DES,
> +                        .mode = CIPHER_MODE_ECB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ofb(aes)",
> +                       .cra_driver_name = "ofb-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = AES_MIN_KEY_SIZE,
> +                                          .max_keysize = AES_MAX_KEY_SIZE,
> +                                          .ivsize = AES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_OFB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "cbc(aes)",
> +                       .cra_driver_name = "cbc-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = AES_MIN_KEY_SIZE,
> +                                          .max_keysize = AES_MAX_KEY_SIZE,
> +                                          .ivsize = AES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CBC,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ecb(aes)",
> +                       .cra_driver_name = "ecb-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = AES_MIN_KEY_SIZE,
> +                                          .max_keysize = AES_MAX_KEY_SIZE,
> +                                          .ivsize = 0,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_ECB,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "ctr(aes)",
> +                       .cra_driver_name = "ctr-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          /* .geniv = "chainiv", */
> +                                          .min_keysize = AES_MIN_KEY_SIZE,
> +                                          .max_keysize = AES_MAX_KEY_SIZE,
> +                                          .ivsize = AES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_CTR,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_ABLKCIPHER,
> +        .alg.crypto = {
> +                       .cra_name = "xts(aes)",
> +                       .cra_driver_name = "xts-aes-iproc",
> +                       .cra_blocksize = AES_BLOCK_SIZE,
> +                       .cra_ablkcipher = {
> +                                          .min_keysize = 2 * AES_MIN_KEY_SIZE,
> +                                          .max_keysize = 2 * AES_MAX_KEY_SIZE,
> +                                          .ivsize = AES_BLOCK_SIZE,
> +                                       }
> +                       },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_AES,
> +                        .mode = CIPHER_MODE_XTS,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_NONE,
> +                      .mode = HASH_MODE_NONE,
> +                      },
> +        },
> +
> +/* AHASH algorithms. */
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = MD5_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "md5",
> +                                   .cra_driver_name = "md5-iproc",
> +                                   .cra_blocksize = MD5_BLOCK_WORDS * 4,
> +                                   .cra_flags = CRYPTO_ALG_TYPE_AHASH |
> +                                            CRYPTO_ALG_ASYNC,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_MD5,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = MD5_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(md5)",
> +                                   .cra_driver_name = "hmac-md5-iproc",
> +                                   .cra_blocksize = MD5_BLOCK_WORDS * 4,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_MD5,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA1_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha1",
> +                                   .cra_driver_name = "sha1-iproc",
> +                                   .cra_blocksize = SHA1_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA1,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA1_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha1)",
> +                                   .cra_driver_name = "hmac-sha1-iproc",
> +                                   .cra_blocksize = SHA1_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA1,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA224_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha224",
> +                                   .cra_driver_name = "sha224-iproc",
> +                                   .cra_blocksize = SHA224_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA224,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA224_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha224)",
> +                                   .cra_driver_name = "hmac-sha224-iproc",
> +                                   .cra_blocksize = SHA224_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA224,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA256_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha256",
> +                                   .cra_driver_name = "sha256-iproc",
> +                                   .cra_blocksize = SHA256_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA256,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA256_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha256)",
> +                                   .cra_driver_name = "hmac-sha256-iproc",
> +                                   .cra_blocksize = SHA256_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA256,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA384_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha384",
> +                                   .cra_driver_name = "sha384-iproc",
> +                                   .cra_blocksize = SHA384_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA384,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA384_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha384)",
> +                                   .cra_driver_name = "hmac-sha384-iproc",
> +                                   .cra_blocksize = SHA384_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA384,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA512_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha512",
> +                                   .cra_driver_name = "sha512-iproc",
> +                                   .cra_blocksize = SHA512_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA512,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA512_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha512)",
> +                                   .cra_driver_name = "hmac-sha512-iproc",
> +                                   .cra_blocksize = SHA512_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA512,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_224_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha3-224",
> +                                   .cra_driver_name = "sha3-224-iproc",
> +                                   .cra_blocksize = SHA3_224_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_224,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_224_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha3-224)",
> +                                   .cra_driver_name = "hmac-sha3-224-iproc",
> +                                   .cra_blocksize = SHA3_224_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_224,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_256_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha3-256",
> +                                   .cra_driver_name = "sha3-256-iproc",
> +                                   .cra_blocksize = SHA3_256_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_256,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_256_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha3-256)",
> +                                   .cra_driver_name = "hmac-sha3-256-iproc",
> +                                   .cra_blocksize = SHA3_256_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_256,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_384_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha3-384",
> +                                   .cra_driver_name = "sha3-384-iproc",
> +                                   .cra_blocksize = SHA3_384_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_384,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_384_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha3-384)",
> +                                   .cra_driver_name = "hmac-sha3-384-iproc",
> +                                   .cra_blocksize = SHA3_384_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_384,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_512_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "sha3-512",
> +                                   .cra_driver_name = "sha3-512-iproc",
> +                                   .cra_blocksize = SHA3_512_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_512,
> +                      .mode = HASH_MODE_HASH,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = SHA3_512_DIGEST_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "hmac(sha3-512)",
> +                                   .cra_driver_name = "hmac-sha3-512-iproc",
> +                                   .cra_blocksize = SHA3_512_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_SHA3_512,
> +                      .mode = HASH_MODE_HMAC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = AES_BLOCK_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "xcbc(aes)",
> +                                   .cra_driver_name = "xcbc-aes-iproc",
> +                                   .cra_blocksize = AES_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_XCBC,
> +                      },
> +        },
> +       {
> +        .type = CRYPTO_ALG_TYPE_AHASH,
> +        .alg.hash = {
> +                     .halg.digestsize = AES_BLOCK_SIZE,
> +                     .halg.base = {
> +                                   .cra_name = "cmac(aes)",
> +                                   .cra_driver_name = "cmac-aes-iproc",
> +                                   .cra_blocksize = AES_BLOCK_SIZE,
> +                               }
> +                     },
> +        .cipher_info = {
> +                        .alg = CIPHER_ALG_NONE,
> +                        .mode = CIPHER_MODE_NONE,
> +                        },
> +        .auth_info = {
> +                      .alg = HASH_ALG_AES,
> +                      .mode = HASH_MODE_CMAC,
> +                      },
> +        },
> +};
> +
> +static int generic_cra_init(struct crypto_tfm *tfm,
> +                           struct iproc_alg_s *cipher_alg)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct iproc_ctx_s *ctx = crypto_tfm_ctx(tfm);
> +       unsigned int blocksize = crypto_tfm_alg_blocksize(tfm);
> +
> +       flow_log("%s()\n", __func__);
> +
> +       ctx->alg = cipher_alg;
> +       ctx->cipher = cipher_alg->cipher_info;
> +       ctx->auth = cipher_alg->auth_info;
> +       ctx->auth_first = cipher_alg->auth_first;
> +       ctx->max_payload = spu->spu_ctx_max_payload(ctx->cipher.alg,
> +                                                   ctx->cipher.mode,
> +                                                   blocksize);
> +       ctx->fallback_cipher = NULL;
> +
> +       ctx->enckeylen = 0;
> +       ctx->authkeylen = 0;
> +
> +       atomic_inc(&iproc_priv.stream_count);
> +       atomic_inc(&iproc_priv.session_count);
> +
> +       return 0;
> +}
> +
> +static int ablkcipher_cra_init(struct crypto_tfm *tfm)
> +{
> +       struct crypto_alg *alg = tfm->__crt_alg;
> +       struct iproc_alg_s *cipher_alg;
> +
> +       flow_log("%s()\n", __func__);
> +
> +       tfm->crt_ablkcipher.reqsize = sizeof(struct iproc_reqctx_s);
> +
> +       cipher_alg = container_of(alg, struct iproc_alg_s, alg.crypto);
> +       return generic_cra_init(tfm, cipher_alg);
> +}
> +
> +static int ahash_cra_init(struct crypto_tfm *tfm)
> +{
> +       int err;
> +       struct crypto_alg *alg = tfm->__crt_alg;
> +       struct iproc_alg_s *cipher_alg;
> +
> +       cipher_alg = container_of(__crypto_ahash_alg(alg), struct iproc_alg_s,
> +                                 alg.hash);
> +
> +       err = generic_cra_init(tfm, cipher_alg);
> +       flow_log("%s()\n", __func__);
> +
> +       /*
> +        * export state size has to be < 512 bytes. So don't include msg bufs
> +        * in state size.
> +        */
> +       crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
> +                                sizeof(struct iproc_reqctx_s));
> +
> +       return err;
> +}
> +
> +static int aead_cra_init(struct crypto_aead *aead)
> +{
> +       struct crypto_tfm *tfm = crypto_aead_tfm(aead);
> +       struct iproc_ctx_s *ctx = crypto_tfm_ctx(tfm);
> +       struct crypto_alg *alg = tfm->__crt_alg;
> +       struct aead_alg *aalg = container_of(alg, struct aead_alg, base);
> +       struct iproc_alg_s *cipher_alg = container_of(aalg, struct iproc_alg_s,
> +                                                     alg.aead);
> +
> +       int err = generic_cra_init(tfm, cipher_alg);
> +
> +       flow_log("%s()\n", __func__);
> +
> +       crypto_aead_set_reqsize(aead, sizeof(struct iproc_reqctx_s));
> +       ctx->is_esp = false;
> +       ctx->salt_len = 0;
> +       ctx->salt_offset = 0;
> +
> +       /* random first IV */
> +       get_random_bytes(ctx->iv, MAX_IV_SIZE);
> +       flow_dump("  iv: ", ctx->iv, MAX_IV_SIZE);
> +
> +       if (!err) {
> +               if (alg->cra_flags & CRYPTO_ALG_NEED_FALLBACK) {
> +                       flow_log("%s() creating fallback cipher\n", __func__);
> +
> +                       ctx->fallback_cipher =
> +                           crypto_alloc_aead(alg->cra_name, 0,
> +                                             CRYPTO_ALG_ASYNC |
> +                                             CRYPTO_ALG_NEED_FALLBACK);
> +                       if (IS_ERR(ctx->fallback_cipher)) {
> +                               pr_err("%s() Error: failed to allocate fallback for %s\n",
> +                                      __func__, alg->cra_name);
> +                               return PTR_ERR(ctx->fallback_cipher);
> +                       }
> +               }
> +       }
> +
> +       return err;
> +}
> +
> +static void generic_cra_exit(struct crypto_tfm *tfm)
> +{
> +       atomic_dec(&iproc_priv.session_count);
> +}
> +
> +static void aead_cra_exit(struct crypto_aead *aead)
> +{
> +       struct crypto_tfm *tfm = crypto_aead_tfm(aead);
> +       struct iproc_ctx_s *ctx = crypto_tfm_ctx(tfm);
> +
> +       generic_cra_exit(tfm);
> +
> +       if (ctx->fallback_cipher) {
> +               crypto_free_aead(ctx->fallback_cipher);
> +               ctx->fallback_cipher = NULL;
> +       }
> +}
> +
> +/**
> + * spu_functions_register() - Specify hardware-specific SPU functions based on
> + * SPU type read from device tree.
> + * @dev:       device structure
> + * @spu_type:  SPU hardware generation
> + * @spu_subtype: SPU hardware version
> + */
> +static void spu_functions_register(struct device *dev,
> +                                  enum spu_spu_type spu_type,
> +                                  enum spu_spu_subtype spu_subtype)
> +{
> +       struct spu_hw *spu = &iproc_priv.spu;
> +
> +       if (spu_type == SPU_TYPE_SPUM) {
> +               dev_dbg(dev, "Registering SPUM functions\n");
> +               spu->spu_dump_msg_hdr = spum_dump_msg_hdr;
> +               spu->spu_payload_length = spum_payload_length;
> +               spu->spu_response_hdr_len = spum_response_hdr_len;
> +               spu->spu_hash_pad_len = spum_hash_pad_len;
> +               spu->spu_gcm_ccm_pad_len = spum_gcm_ccm_pad_len;
> +               spu->spu_assoc_resp_len = spum_assoc_resp_len;
> +               spu->spu_aead_ivlen = spum_aead_ivlen;
> +               spu->spu_hash_type = spum_hash_type;
> +               spu->spu_digest_size = spum_digest_size;
> +               spu->spu_create_request = spum_create_request;
> +               spu->spu_cipher_req_init = spum_cipher_req_init;
> +               spu->spu_cipher_req_finish = spum_cipher_req_finish;
> +               spu->spu_request_pad = spum_request_pad;
> +               spu->spu_tx_status_len = spum_tx_status_len;
> +               spu->spu_rx_status_len = spum_rx_status_len;
> +               spu->spu_status_process = spum_status_process;
> +               spu->spu_xts_tweak_in_payload = spum_xts_tweak_in_payload;
> +               spu->spu_ccm_update_iv = spum_ccm_update_iv;
> +               spu->spu_wordalign_padlen = spum_wordalign_padlen;
> +               if (spu_subtype == SPU_SUBTYPE_SPUM_NS2)
> +                       spu->spu_ctx_max_payload = spum_ns2_ctx_max_payload;
> +               else
> +                       spu->spu_ctx_max_payload = spum_nsp_ctx_max_payload;
> +       } else {
> +               dev_dbg(dev, "Registering SPU2 functions\n");
> +               spu->spu_dump_msg_hdr = spu2_dump_msg_hdr;
> +               spu->spu_ctx_max_payload = spu2_ctx_max_payload;
> +               spu->spu_payload_length = spu2_payload_length;
> +               spu->spu_response_hdr_len = spu2_response_hdr_len;
> +               spu->spu_hash_pad_len = spu2_hash_pad_len;
> +               spu->spu_gcm_ccm_pad_len = spu2_gcm_ccm_pad_len;
> +               spu->spu_assoc_resp_len = spu2_assoc_resp_len;
> +               spu->spu_aead_ivlen = spu2_aead_ivlen;
> +               spu->spu_hash_type = spu2_hash_type;
> +               spu->spu_digest_size = spu2_digest_size;
> +               spu->spu_create_request = spu2_create_request;
> +               spu->spu_cipher_req_init = spu2_cipher_req_init;
> +               spu->spu_cipher_req_finish = spu2_cipher_req_finish;
> +               spu->spu_request_pad = spu2_request_pad;
> +               spu->spu_tx_status_len = spu2_tx_status_len;
> +               spu->spu_rx_status_len = spu2_rx_status_len;
> +               spu->spu_status_process = spu2_status_process;
> +               spu->spu_xts_tweak_in_payload = spu2_xts_tweak_in_payload;
> +               spu->spu_ccm_update_iv = spu2_ccm_update_iv;
> +               spu->spu_wordalign_padlen = spu2_wordalign_padlen;
> +       }
> +}
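
As an illustration of the per-generation dispatch that spu_functions_register() sets up, here is a minimal userspace sketch. It is not the driver's code: the demo_* names are hypothetical, the hook table is reduced to a single entry, and the header lengths 8 and 0 simply mirror the bcm_hdr_len values that bcm_spu_probe() assigns for SPU-M and SPU2. It assumes one const table per hardware generation, which is one possible alternative to assigning each hook individually.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum spu_spu_type. */
enum demo_spu_type { DEMO_TYPE_SPUM, DEMO_TYPE_SPU2 };

/* Hypothetical, reduced hook table; the real struct spu_hw has ~20 hooks. */
struct demo_spu_ops {
	const char *name;
	unsigned int (*bcm_hdr_len)(void);
};

static unsigned int demo_spum_bcm_hdr_len(void) { return 8; }
static unsigned int demo_spu2_bcm_hdr_len(void) { return 0; }

/* One immutable table per hardware generation. */
static const struct demo_spu_ops demo_spum_ops = {
	.name = "SPU-M",
	.bcm_hdr_len = demo_spum_bcm_hdr_len,
};

static const struct demo_spu_ops demo_spu2_ops = {
	.name = "SPU2",
	.bcm_hdr_len = demo_spu2_bcm_hdr_len,
};

/* Pick the table once at probe time; callers then go through the hooks. */
static const struct demo_spu_ops *demo_ops_for(enum demo_spu_type type)
{
	assert(type == DEMO_TYPE_SPUM || type == DEMO_TYPE_SPU2);
	return type == DEMO_TYPE_SPUM ? &demo_spum_ops : &demo_spu2_ops;
}
```

With const tables the hooks cannot be changed after probe, and a hook added for one generation but forgotten for the other shows up as a NULL member in one place rather than in a long run of assignments.
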
> +
> +/**
> + * spu_mb_init() - Initialize mailbox client. Request ownership of each mailbox
> + * channel in the device tree.
> + * @dev:  SPU driver device structure
> + *
> + * Return: 0 if successful
> + *        < 0 otherwise
> + */
> +static int spu_mb_init(struct device *dev)
> +{
> +       int i;
> +       struct mbox_client *mcl = &iproc_priv.mcl;
> +       int err;
> +
> +       iproc_priv.mbox = kcalloc(iproc_priv.spu.num_chan,
> +                                 sizeof(struct mbox_chan *), GFP_KERNEL);
> +       if (!iproc_priv.mbox)
> +               return -ENOMEM;
> +
> +       mcl->dev = dev;
> +       mcl->tx_block = false;
> +       mcl->tx_tout = 0;
> +       mcl->knows_txdone = false;
> +       mcl->rx_callback = spu_rx_callback;
> +       mcl->tx_done = NULL;
> +
> +       for (i = 0; i < iproc_priv.spu.num_chan; i++) {
> +               iproc_priv.mbox[i] = mbox_request_channel(mcl, i);
> +               if (IS_ERR(iproc_priv.mbox[i])) {
> +                       err = (int)PTR_ERR(iproc_priv.mbox[i]);
> +                       dev_err(dev,
> +                               "Mbox channel %d request failed with err %d\n",
> +                               i, err);
> +                       iproc_priv.mbox[i] = NULL;
> +                       return err;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +static void spu_mb_release(struct platform_device *pdev)
> +{
> +       int i;
> +
> +       if (!iproc_priv.mbox)
> +               return;
> +
> +       for (i = 0; i < iproc_priv.spu.num_chan; i++)
> +               mbox_free_channel(iproc_priv.mbox[i]);
> +
> +       kfree(iproc_priv.mbox);
> +       iproc_priv.mbox = NULL;
> +}
> +
> +static void spu_counters_init(void)
> +{
> +       int i;
> +       int j;
> +
> +       atomic_set(&iproc_priv.session_count, 0);
> +       atomic_set(&iproc_priv.stream_count, 0);
> +       atomic_set(&iproc_priv.next_chan, (int)iproc_priv.spu.num_chan);
> +       atomic64_set(&iproc_priv.bytes_in, 0);
> +       atomic64_set(&iproc_priv.bytes_out, 0);
> +       for (i = 0; i < SPU_OP_NUM; i++) {
> +               atomic_set(&iproc_priv.op_counts[i], 0);
> +               atomic_set(&iproc_priv.setkey_cnt[i], 0);
> +       }
> +       for (i = 0; i < CIPHER_ALG_LAST; i++)
> +               for (j = 0; j < CIPHER_MODE_LAST; j++)
> +                       atomic_set(&iproc_priv.cipher_cnt[i][j], 0);
> +
> +       for (i = 0; i < HASH_ALG_LAST; i++) {
> +               atomic_set(&iproc_priv.hash_cnt[i], 0);
> +               atomic_set(&iproc_priv.hmac_cnt[i], 0);
> +       }
> +       for (i = 0; i < AEAD_TYPE_LAST; i++)
> +               atomic_set(&iproc_priv.aead_cnt[i], 0);
> +
> +       atomic_set(&iproc_priv.mb_no_spc, 0);
> +       atomic_set(&iproc_priv.mb_send_fail, 0);
> +       atomic_set(&iproc_priv.bad_icv, 0);
> +}
> +
> +static int spu_register_ablkcipher(struct iproc_alg_s *driver_alg)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct crypto_alg *crypto = &driver_alg->alg.crypto;
> +       int err;
> +
> +       /* SPU2 does not support RC4 */
> +       if ((driver_alg->cipher_info.alg == CIPHER_ALG_RC4) &&
> +           (spu->spu_type == SPU_TYPE_SPU2))
> +               return 0;
> +
> +       crypto->cra_module = THIS_MODULE;
> +       crypto->cra_priority = cipher_pri;
> +       crypto->cra_alignmask = 0;
> +       crypto->cra_ctxsize = sizeof(struct iproc_ctx_s);
> +       INIT_LIST_HEAD(&crypto->cra_list);
> +
> +       crypto->cra_init = ablkcipher_cra_init;
> +       crypto->cra_exit = generic_cra_exit;
> +       crypto->cra_type = &crypto_ablkcipher_type;
> +       crypto->cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC |
> +                               CRYPTO_ALG_KERN_DRIVER_ONLY;
> +
> +       crypto->cra_ablkcipher.setkey = ablkcipher_setkey;
> +       crypto->cra_ablkcipher.encrypt = ablkcipher_encrypt;
> +       crypto->cra_ablkcipher.decrypt = ablkcipher_decrypt;
> +
> +       err = crypto_register_alg(crypto);
> +       /* Mark alg as having been registered, if successful */
> +       if (err == 0)
> +               driver_alg->registered = true;
> +       dev_dbg(dev, "  registered ablkcipher %s\n", crypto->cra_driver_name);
> +       return err;
> +}
> +
> +static int spu_register_ahash(struct iproc_alg_s *driver_alg)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct ahash_alg *hash = &driver_alg->alg.hash;
> +       int err;
> +
> +       /* AES-XCBC is the only AES hash type currently supported on SPU-M */
> +       if ((driver_alg->auth_info.alg == HASH_ALG_AES) &&
> +           (driver_alg->auth_info.mode != HASH_MODE_XCBC) &&
> +           (spu->spu_type == SPU_TYPE_SPUM))
> +               return 0;
> +
> +       /* Register SHA3 variants only on SPU2 v2 (not SPU-M or SPU2 v1). */
> +       if ((driver_alg->auth_info.alg >= HASH_ALG_SHA3_224) &&
> +           (spu->spu_subtype != SPU_SUBTYPE_SPU2_V2))
> +               return 0;
> +
> +       hash->halg.base.cra_module = THIS_MODULE;
> +       hash->halg.base.cra_priority = hash_pri;
> +       hash->halg.base.cra_alignmask = 0;
> +       hash->halg.base.cra_ctxsize = sizeof(struct iproc_ctx_s);
> +       hash->halg.base.cra_init = ahash_cra_init;
> +       hash->halg.base.cra_exit = generic_cra_exit;
> +       hash->halg.base.cra_type = &crypto_ahash_type;
> +       hash->halg.base.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC;
> +       /*
> +        * export state size has to be < 512 bytes. So don't include msg bufs
> +        * in state size.
> +        */
> +       hash->halg.statesize = offsetof(struct iproc_reqctx_s, msg_buf);
> +
> +       if (driver_alg->auth_info.mode != HASH_MODE_HMAC) {
> +               hash->setkey = ahash_setkey;
> +               hash->init = ahash_init;
> +               hash->update = ahash_update;
> +               hash->final = ahash_final;
> +               hash->finup = ahash_finup;
> +               hash->digest = ahash_digest;
> +       } else {
> +               hash->setkey = ahash_hmac_setkey;
> +               hash->init = ahash_hmac_init;
> +               hash->update = ahash_hmac_update;
> +               hash->final = ahash_hmac_final;
> +               hash->finup = ahash_hmac_finup;
> +               hash->digest = ahash_hmac_digest;
> +       }
> +       hash->export = ahash_export;
> +       hash->import = ahash_import;
> +
> +       err = crypto_register_ahash(hash);
> +       /* Mark alg as having been registered, if successful */
> +       if (err == 0)
> +               driver_alg->registered = true;
> +       dev_dbg(dev, "  registered ahash %s\n",
> +               hash->halg.base.cra_driver_name);
> +       return err;
> +}
> +
> +static int spu_register_aead(struct iproc_alg_s *driver_alg)
> +{
> +       struct device *dev = &iproc_priv.pdev->dev;
> +       struct aead_alg *aead = &driver_alg->alg.aead;
> +       int err;
> +
> +       aead->base.cra_module = THIS_MODULE;
> +       aead->base.cra_priority = aead_pri;
> +       aead->base.cra_alignmask = 0;
> +       aead->base.cra_ctxsize = sizeof(struct iproc_ctx_s);
> +       INIT_LIST_HEAD(&aead->base.cra_list);
> +
> +       aead->base.cra_flags |= CRYPTO_ALG_TYPE_AEAD | CRYPTO_ALG_ASYNC;
> +       /* setkey set in alg initialization */
> +       aead->setauthsize = aead_setauthsize;
> +       aead->encrypt = aead_encrypt;
> +       aead->decrypt = aead_decrypt;
> +       aead->init = aead_cra_init;
> +       aead->exit = aead_cra_exit;
> +
> +       err = crypto_register_aead(aead);
> +       /* Mark alg as having been registered, if successful */
> +       if (err == 0)
> +               driver_alg->registered = true;
> +       dev_dbg(dev, "  registered aead %s\n", aead->base.cra_driver_name);
> +       return err;
> +}
> +
> +/* register crypto algorithms the device supports */
> +static int spu_algs_register(struct device *dev)
> +{
> +       int i, j;
> +       int err;
> +
> +       for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
> +               switch (driver_algs[i].type) {
> +               case CRYPTO_ALG_TYPE_ABLKCIPHER:
> +                       err = spu_register_ablkcipher(&driver_algs[i]);
> +                       break;
> +               case CRYPTO_ALG_TYPE_AHASH:
> +                       err = spu_register_ahash(&driver_algs[i]);
> +                       break;
> +               case CRYPTO_ALG_TYPE_AEAD:
> +                       err = spu_register_aead(&driver_algs[i]);
> +                       break;
> +               default:
> +                       dev_err(dev,
> +                               "iproc-crypto: unknown alg type: %d\n",
> +                               driver_algs[i].type);
> +                       err = -EINVAL;
> +               }
> +
> +               if (err) {
> +                       dev_err(dev, "alg registration failed with error %d\n",
> +                               err);
> +                       goto err_algs;
> +               }
> +       }
> +
> +       return 0;
> +
> +err_algs:
> +       for (j = 0; j < i; j++) {
> +               /* Skip any algorithm not registered */
> +               if (!driver_algs[j].registered)
> +                       continue;
> +               switch (driver_algs[j].type) {
> +               case CRYPTO_ALG_TYPE_ABLKCIPHER:
> +                       crypto_unregister_alg(&driver_algs[j].alg.crypto);
> +                       driver_algs[j].registered = false;
> +                       break;
> +               case CRYPTO_ALG_TYPE_AHASH:
> +                       crypto_unregister_ahash(&driver_algs[j].alg.hash);
> +                       driver_algs[j].registered = false;
> +                       break;
> +               case CRYPTO_ALG_TYPE_AEAD:
> +                       crypto_unregister_aead(&driver_algs[j].alg.aead);
> +                       driver_algs[j].registered = false;
> +                       break;
> +               }
> +       }
> +       return err;
> +}
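
The register-then-unwind logic in spu_algs_register() can be exercised outside the kernel with a small sketch like the one below. This is only an illustration under stated assumptions: the demo_* names and the supported/fail flags are hypothetical stand-ins for driver_algs[] and the crypto_register_*() calls. It shows why the "registered" flag is needed: entries skipped for the current hardware return 0 without being registered, so the error path must not unregister them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one driver_algs[] entry. */
struct demo_alg {
	const char *name;
	bool supported;   /* false: skipped on this hardware (e.g. RC4 on SPU2) */
	bool fail;        /* true: simulate a registration error */
	bool registered;
};

/* Hypothetical stand-in for crypto_register_*(): 0 on success, < 0 on error. */
static int demo_register(struct demo_alg *alg)
{
	if (!alg->supported)
		return 0;               /* skipped, which is not an error */
	if (alg->fail)
		return -EINVAL;
	alg->registered = true;
	return 0;
}

static void demo_unregister(struct demo_alg *alg)
{
	alg->registered = false;
}

/* Register every entry; on the first error, undo what was registered. */
static int demo_register_all(struct demo_alg *algs, size_t n)
{
	size_t i, j;
	int err;

	assert(algs != NULL);

	for (i = 0; i < n; i++) {
		err = demo_register(&algs[i]);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	for (j = 0; j < i; j++) {
		/* Skip entries that were never registered. */
		if (algs[j].registered)
			demo_unregister(&algs[j]);
	}
	return err;
}
```

The same flag is what lets the remove path walk the whole table without knowing which hardware generation was probed.
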
> +
> +/* ==================== Kernel Platform API ==================== */
> +
> +static struct spu_type_subtype spum_ns2_types = {
> +       SPU_TYPE_SPUM, SPU_SUBTYPE_SPUM_NS2
> +};
> +
> +static struct spu_type_subtype spum_nsp_types = {
> +       SPU_TYPE_SPUM, SPU_SUBTYPE_SPUM_NSP
> +};
> +
> +static struct spu_type_subtype spu2_types = {
> +       SPU_TYPE_SPU2, SPU_SUBTYPE_SPU2_V1
> +};
> +
> +static struct spu_type_subtype spu2_v2_types = {
> +       SPU_TYPE_SPU2, SPU_SUBTYPE_SPU2_V2
> +};
> +
> +static const struct of_device_id bcm_spu_dt_ids[] = {
> +       {
> +               .compatible = "brcm,spum-crypto",
> +               .data = &spum_ns2_types,
> +       },
> +       {
> +               .compatible = "brcm,spum-nsp-crypto",
> +               .data = &spum_nsp_types,
> +       },
> +       {
> +               .compatible = "brcm,spu2-crypto",
> +               .data = &spu2_types,
> +       },
> +       {
> +               .compatible = "brcm,spu2-v2-crypto",
> +               .data = &spu2_v2_types,
> +       },
> +       { /* sentinel */ }
> +};
> +
> +MODULE_DEVICE_TABLE(of, bcm_spu_dt_ids);
> +
> +static int spu_dt_read(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       struct device_node *dn = pdev->dev.of_node;
> +       struct resource *spu_ctrl_regs;
> +       const struct of_device_id *match;
> +       struct spu_type_subtype *matched_spu_type;
> +       void __iomem *spu_reg_vbase[MAX_SPUS];
> +       int i;
> +       int err;
> +
> +       if (!of_device_is_available(dn)) {
> +               dev_crit(dev, "SPU device not available\n");
> +               return -ENODEV;
> +       }
> +
> +       /* Count number of mailbox channels */
> +       spu->num_chan = of_count_phandle_with_args(dn, "mboxes", "#mbox-cells");
> +       dev_dbg(dev, "Device has %d SPU channels\n", spu->num_chan);
> +
> +       match = of_match_device(of_match_ptr(bcm_spu_dt_ids), dev);
> +       if (!match) {
> +               dev_err(dev, "Failed to match device\n");
> +               return -ENODEV;
> +       }
> +       matched_spu_type = (struct spu_type_subtype *)match->data;
> +       spu->spu_type = matched_spu_type->type;
> +       spu->spu_subtype = matched_spu_type->subtype;
> +
> +       /* Read registers and count number of SPUs */
> +       i = 0;
> +       while ((i < MAX_SPUS) && ((spu_ctrl_regs =
> +               platform_get_resource(pdev, IORESOURCE_MEM, i)) != NULL)) {
> +               dev_dbg(dev,
> +                       "SPU %d control register region res.start = %#x, res.end = %#x\n",
> +                       i,
> +                       (unsigned int)spu_ctrl_regs->start,
> +                       (unsigned int)spu_ctrl_regs->end);
> +
> +               spu_reg_vbase[i] = devm_ioremap_resource(dev, spu_ctrl_regs);
> +               if (IS_ERR(spu_reg_vbase[i])) {
> +                       err = PTR_ERR(spu_reg_vbase[i]);
> +                       dev_err(&pdev->dev, "Failed to map registers: %d\n",
> +                               err);
> +                       spu_reg_vbase[i] = NULL;
> +                       return err;
> +               }
> +               i++;
> +       }
> +       spu->num_spu = i;
> +       dev_dbg(dev, "Device has %d SPUs\n", spu->num_spu);
> +
> +       spu->reg_vbase = devm_kcalloc(dev, spu->num_spu,
> +                                     sizeof(*spu->reg_vbase), GFP_KERNEL);
> +       if (!spu->reg_vbase)
> +               return -ENOMEM;
> +       memcpy(spu->reg_vbase, spu_reg_vbase,
> +              spu->num_spu * sizeof(*spu->reg_vbase));
> +
> +       return 0;
> +}
> +
> +int bcm_spu_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct spu_hw *spu = &iproc_priv.spu;
> +       int err = 0;
> +
> +       iproc_priv.pdev = pdev;
> +       platform_set_drvdata(iproc_priv.pdev, &iproc_priv);
> +
> +       err = spu_dt_read(pdev);
> +       if (err < 0)
> +               goto failure;
> +
> +       if (spu->spu_type == SPU_TYPE_SPUM)
> +               iproc_priv.bcm_hdr_len = 8;
> +       else if (spu->spu_type == SPU_TYPE_SPU2)
> +               iproc_priv.bcm_hdr_len = 0;
> +
> +       spu_functions_register(&pdev->dev, spu->spu_type, spu->spu_subtype);
> +
> +       err = spu_mb_init(&pdev->dev);
> +       if (err < 0)
> +               goto failure;
> +
> +       spu_counters_init();
> +
> +       spu_setup_debugfs();
> +
> +       err = spu_algs_register(dev);
> +       if (err < 0)
> +               goto fail_reg;
> +
> +       return 0;
> +
> +fail_reg:
> +       spu_free_debugfs();
> +failure:
> +       spu_mb_release(pdev);
> +       dev_err(dev, "%s failed with error %d.\n", __func__, err);
> +
> +       return err;
> +}
> +
> +int bcm_spu_remove(struct platform_device *pdev)
> +{
> +       int i;
> +       struct device *dev = &pdev->dev;
> +
> +       for (i = 0; i < ARRAY_SIZE(driver_algs); i++) {
> +               /*
> +                * Not all algorithms were registered, depending on whether
> +                * hardware is SPU-M or SPU2.  So here we make sure to skip
> +                * those algorithms that were not previously registered.
> +                */
> +               if (!driver_algs[i].registered)
> +                       continue;
> +
> +               switch (driver_algs[i].type) {
> +               case CRYPTO_ALG_TYPE_ABLKCIPHER:
> +                       crypto_unregister_alg(&driver_algs[i].alg.crypto);
> +                       dev_dbg(dev, "  unregistered cipher %s\n",
> +                               driver_algs[i].alg.crypto.cra_driver_name);
> +                       driver_algs[i].registered = false;
> +                       break;
> +               case CRYPTO_ALG_TYPE_AHASH:
> +                       crypto_unregister_ahash(&driver_algs[i].alg.hash);
> +                       dev_dbg(dev, "  unregistered hash %s\n",
> +                               driver_algs[i].alg.hash.halg.
> +                               base.cra_driver_name);
> +                       driver_algs[i].registered = false;
> +                       break;
> +               case CRYPTO_ALG_TYPE_AEAD:
> +                       crypto_unregister_aead(&driver_algs[i].alg.aead);
> +                       dev_dbg(dev, "  unregistered aead %s\n",
> +                               driver_algs[i].alg.aead.base.cra_driver_name);
> +                       driver_algs[i].registered = false;
> +                       break;
> +               }
> +       }
> +       spu_free_debugfs();
> +       spu_mb_release(pdev);
> +       return 0;
> +}
> +
> +/* ===== Kernel Module API ===== */
> +
> +static struct platform_driver bcm_spu_pdriver = {
> +       .driver = {
> +                  .name = "brcm-spu-crypto",
> +                  .of_match_table = of_match_ptr(bcm_spu_dt_ids),
> +                  },
> +       .probe = bcm_spu_probe,
> +       .remove = bcm_spu_remove,
> +};
> +module_platform_driver(bcm_spu_pdriver);
> +
> +MODULE_AUTHOR("Rob Rice <rob.rice@broadcom.com>");
> +MODULE_DESCRIPTION("Broadcom symmetric crypto offload driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/crypto/bcm/cipher.h b/drivers/crypto/bcm/cipher.h
> new file mode 100644
> index 0000000..2d856bd
> --- /dev/null
> +++ b/drivers/crypto/bcm/cipher.h
> @@ -0,0 +1,472 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +#ifndef _CIPHER_H
> +#define _CIPHER_H
> +
> +#include <linux/atomic.h>
> +#include <linux/mailbox/brcm-message.h>
> +#include <linux/mailbox_client.h>
> +#include <crypto/aes.h>
> +#include <crypto/internal/hash.h>
> +#include <crypto/aead.h>
> +#include <crypto/sha.h>
> +#include <crypto/sha3.h>
> +
> +#include "spu.h"
> +#include "spum.h"
> +#include "spu2.h"
> +
> +#define ARC4_MIN_KEY_SIZE   1
> +#define ARC4_MAX_KEY_SIZE   256
> +#define ARC4_BLOCK_SIZE     1
> +#define ARC4_STATE_SIZE     4
> +
> +#define CCM_AES_IV_SIZE    16
> +#define GCM_AES_IV_SIZE    12
> +#define GCM_ESP_IV_SIZE     8
> +#define CCM_ESP_IV_SIZE     8
> +#define RFC4543_ICV_SIZE   16
> +
> +#define MAX_KEY_SIZE   ARC4_MAX_KEY_SIZE
> +#define MAX_IV_SIZE    AES_BLOCK_SIZE
> +#define MAX_DIGEST_SIZE        SHA3_512_DIGEST_SIZE
> +#define MAX_ASSOC_SIZE 512
> +
> +/* size of salt value for AES-GCM-ESP and AES-CCM-ESP */
> +#define GCM_ESP_SALT_SIZE   4
> +#define CCM_ESP_SALT_SIZE   3
> +#define MAX_SALT_SIZE       GCM_ESP_SALT_SIZE
> +#define GCM_ESP_SALT_OFFSET 0
> +#define CCM_ESP_SALT_OFFSET 1
> +
> +#define GCM_ESP_DIGESTSIZE 16
> +
> +#define MAX_HASH_BLOCK_SIZE SHA512_BLOCK_SIZE
> +
> +/*
> + * Maximum number of bytes from a non-final hash request that can be deferred
> + * until more data is available. With new crypto API framework, this
> + * can be no more than one block of data.
> + */
> +#define HASH_CARRY_MAX  MAX_HASH_BLOCK_SIZE
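
To make the carry rule concrete, here is a rough sketch of how a non-final update could be split into whole blocks to submit now and a remainder to hold back. This is my reading of the comment, not the driver's code: struct carry_split and hash_split_nonfinal() are hypothetical names, and the sketch assumes the carried remainder is always strictly less than one block.

```c
#include <assert.h>
#include <stddef.h>

/* Result of splitting a non-final hash update (hypothetical type). */
struct carry_split {
	size_t send_len;   /* bytes to submit now, a multiple of block_size */
	size_t carry_len;  /* bytes to hold back, always < block_size */
};

/*
 * prev_carry: bytes left over from the previous update
 * new_len:    bytes supplied by this update
 * block_size: hash block size in bytes
 */
struct carry_split hash_split_nonfinal(size_t prev_carry, size_t new_len,
				       size_t block_size)
{
	struct carry_split s;
	size_t total;

	assert(block_size > 0 && prev_carry < block_size);

	total = prev_carry + new_len;
	s.carry_len = total % block_size;
	s.send_len = total - s.carry_len;
	return s;
}
```

With a 64-byte block, a first update of 100 bytes would send 64 and carry 36; a following 100-byte update would then see 136 bytes in total, send 128 and carry 8.
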
> +
> +/* Force at least 4-byte alignment of all SPU message fields */
> +#define SPU_MSG_ALIGN  4
> +
> +/* Number of times to resend mailbox message if mb queue is full */
> +#define SPU_MB_RETRY_MAX  1000
> +
> +/* op_counts[] indexes */
> +enum op_type {
> +       SPU_OP_CIPHER,
> +       SPU_OP_HASH,
> +       SPU_OP_HMAC,
> +       SPU_OP_AEAD,
> +       SPU_OP_NUM
> +};
> +
> +enum spu_spu_type {
> +       SPU_TYPE_SPUM,
> +       SPU_TYPE_SPU2,
> +};
> +
> +/*
> + * SPUM_NS2 and SPUM_NSP are the SPU-M block on Northstar 2 and Northstar Plus,
> + * respectively.
> + */
> +enum spu_spu_subtype {
> +       SPU_SUBTYPE_SPUM_NS2,
> +       SPU_SUBTYPE_SPUM_NSP,
> +       SPU_SUBTYPE_SPU2_V1,
> +       SPU_SUBTYPE_SPU2_V2
> +};
> +
> +struct spu_type_subtype {
> +       enum spu_spu_type type;
> +       enum spu_spu_subtype subtype;
> +};
> +
> +struct cipher_op {
> +       enum spu_cipher_alg alg;
> +       enum spu_cipher_mode mode;
> +};
> +
> +struct auth_op {
> +       enum hash_alg alg;
> +       enum hash_mode mode;
> +};
> +
> +struct iproc_alg_s {
> +       u32 type;
> +       union {
> +               struct crypto_alg crypto;
> +               struct ahash_alg hash;
> +               struct aead_alg aead;
> +       } alg;
> +       struct cipher_op cipher_info;
> +       struct auth_op auth_info;
> +       bool auth_first;
> +       bool registered;
> +};
> +
> +/*
> + * Buffers for a SPU request/reply message pair. All part of one structure to
> + * allow a single alloc per request.
> + */
> +struct spu_msg_buf {
> +       /* Request message fragments */
> +
> +       /*
> +        * SPU request message header. For SPU-M, holds MH, EMH, SCTX, BDESC,
> +        * and BD header. For SPU2, holds FMD, OMD.
> +        */
> +       u8 bcm_spu_req_hdr[ALIGN(SPU2_HEADER_ALLOC_LEN, SPU_MSG_ALIGN)];
> +
> +       /* IV or counter. Size to include salt. Also used for XTS tweak. */
> +       u8 iv_ctr[ALIGN(2 * AES_BLOCK_SIZE, SPU_MSG_ALIGN)];
> +
> +       /* Hash digest. request and response. */
> +       u8 digest[ALIGN(MAX_DIGEST_SIZE, SPU_MSG_ALIGN)];
> +
> +       /* SPU request message padding */
> +       u8 spu_req_pad[ALIGN(SPU_PAD_LEN_MAX, SPU_MSG_ALIGN)];
> +
> +       /* SPU-M request message STATUS field */
> +       u8 tx_stat[ALIGN(SPU_TX_STATUS_LEN, SPU_MSG_ALIGN)];
> +
> +       /* Response message fragments */
> +
> +       /* SPU response message header */
> +       u8 spu_resp_hdr[ALIGN(SPU2_HEADER_ALLOC_LEN, SPU_MSG_ALIGN)];
> +
> +       /* SPU response message STATUS field padding */
> +       u8 rx_stat_pad[ALIGN(SPU_STAT_PAD_MAX, SPU_MSG_ALIGN)];
> +
> +       /* SPU response message STATUS field */
> +       u8 rx_stat[ALIGN(SPU_RX_STATUS_LEN, SPU_MSG_ALIGN)];
> +
> +       union {
> +               /* Buffers only used for ablkcipher */
> +               struct {
> +                       /*
> +                        * Field used for either SUPDT when RC4 is used
> +                        * -OR- tweak value when XTS/AES is used
> +                        */
> +                       u8 supdt_tweak[ALIGN(SPU_SUPDT_LEN, SPU_MSG_ALIGN)];
> +               } c;
> +
> +               /* Buffers only used for aead */
> +               struct {
> +                       /* SPU response pad for GCM data */
> +                       u8 gcmpad[ALIGN(AES_BLOCK_SIZE, SPU_MSG_ALIGN)];
> +
> +                       /* SPU request msg padding for GCM AAD */
> +                       u8 req_aad_pad[ALIGN(SPU_PAD_LEN_MAX, SPU_MSG_ALIGN)];
> +
> +                       /* SPU response data to be discarded */
> +                       u8 resp_aad[ALIGN(MAX_ASSOC_SIZE + MAX_IV_SIZE,
> +                                         SPU_MSG_ALIGN)];
> +               } a;
> +       };
> +};
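
Every buffer in spu_msg_buf is sized with ALIGN(len, SPU_MSG_ALIGN), so each fragment starts on a 4-byte boundary as long as the struct itself does. For readers less familiar with the macro, a small userspace equivalent of the rounding: demo_align_up() and DEMO_MSG_ALIGN are illustrative names, and the power-of-two requirement is made explicit with an assert.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MSG_ALIGN 4  /* mirrors SPU_MSG_ALIGN in the patch */

/* Round len up to the next multiple of align (align: power of two). */
size_t demo_align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}
```

For example, a 13-byte field would occupy 16 bytes in the message buffer, while a 16-byte field stays at 16.
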
> +
> +struct iproc_ctx_s {
> +       u8 enckey[MAX_KEY_SIZE + ARC4_STATE_SIZE];
> +       unsigned int enckeylen;
> +
> +       u8 authkey[MAX_KEY_SIZE + ARC4_STATE_SIZE];
> +       unsigned int authkeylen;
> +
> +       u8 salt[MAX_SALT_SIZE];
> +       unsigned int salt_len;
> +       unsigned int salt_offset;
> +       u8 iv[MAX_IV_SIZE];
> +
> +       unsigned int digestsize;
> +
> +       struct iproc_alg_s *alg;
> +       bool is_esp;
> +
> +       struct cipher_op cipher;
> +       enum spu_cipher_type cipher_type;
> +
> +       struct auth_op auth;
> +       bool auth_first;
> +
> +       /*
> +        * The maximum length in bytes of the payload in a SPU message for this
> +        * context. For SPU-M, the payload is the combination of AAD and data.
> +        * For SPU2, the payload is just data. A value of SPU_MAX_PAYLOAD_INF
> +        * indicates that there is no limit to the length of the SPU message
> +        * payload.
> +        */
> +       unsigned int max_payload;
> +
> +       struct crypto_aead *fallback_cipher;
> +
> +       /* auth_type is determined during processing of request */
> +
> +       u8 ipad[MAX_HASH_BLOCK_SIZE];
> +       u8 opad[MAX_HASH_BLOCK_SIZE];
> +
> +       /*
> +        * Buffer to hold SPU message header template. Template is created at
> +        * setkey time for ablkcipher requests, since most of the fields in the
> +        * header are known at that time. At request time, just fill in a few
> +        * missing pieces related to length of data in the request and IVs, etc.
> +        */
> +       u8 bcm_spu_req_hdr[ALIGN(SPU2_HEADER_ALLOC_LEN, SPU_MSG_ALIGN)];
> +
> +       /* Length of SPU request header */
> +       u16 spu_req_hdr_len;
> +
> +       /* Expected length of SPU response header */
> +       u16 spu_resp_hdr_len;
> +
> +       /*
> +        * shash descriptor - needed to perform incremental hashing in
> +        * software, when hw doesn't support it.
> +        */
> +       struct shash_desc *shash;
> +
> +       bool is_rfc4543;        /* RFC 4543 style of GMAC */
> +};
> +
> +struct iproc_reqctx_s {
> +       /* general context */
> +       struct crypto_async_request *parent;
> +
> +       /* only valid after enqueue() */
> +       struct iproc_ctx_s *ctx;
> +
> +       u8 chan_idx;   /* Mailbox channel to be used to submit this request */
> +
> +       /* total todo, rx'd, and sent for this request */
> +       unsigned int total_todo;
> +       unsigned int total_received;    /* only valid for ablkcipher */
> +       unsigned int total_sent;
> +
> +       /*
> +        * num bytes sent to hw from the src sg in this request. This can differ
> +        * from total_sent for incremental hashing. total_sent includes previous
> +        * init() and update() data. src_sent does not.
> +        */
> +       unsigned int src_sent;
> +       unsigned int hmac_offset;
> +
> +       /*
> +        * For AEAD requests, start of associated data. This will typically
> +        * point to the beginning of the src scatterlist from the request,
> +        * since assoc data is at the beginning of the src scatterlist rather
> +        * than in its own sg.
> +        */
> +       struct scatterlist *assoc;
> +
> +       /*
> +        * scatterlist entry and offset to start of data for next chunk. Crypto
> +        * API src scatterlist for AEAD starts with AAD, if present. For first
> +        * chunk, src_sg is sg entry at beginning of input data (after AAD).
> +        * src_skip begins at the offset in that sg entry where data begins.
> +        */
> +       struct scatterlist *src_sg;
> +       int src_nents;          /* Number of src entries with data */
> +       u32 src_skip;           /* bytes of current sg entry already used */
> +
> +       /*
> +        * Same for destination. For AEAD, if there is AAD, output data must
> +        * be written at offset following AAD.
> +        */
> +       struct scatterlist *dst_sg;
> +       int dst_nents;          /* Number of dst entries with data */
> +       u32 dst_skip;           /* bytes of current sg entry already written */
> +
> +       /* Mailbox message used to send this request to PDC driver */
> +       struct brcm_message mb_mssg;
> +
> +       bool bd_suppress;       /* suppress BD field in SPU response? */
> +
> +       /* cipher context */
> +       bool is_encrypt;
> +
> +       /*
> +        * CBC mode: IV.  CTR mode: counter.  Else empty. Used as a DMA
> +        * buffer for AEAD requests. So allocate as DMAable memory. If IV
> +        * concatenated with salt, includes the salt.
> +        */
> +       u8 *iv_ctr;
> +       /* Length of IV or counter, in bytes */
> +       unsigned int iv_ctr_len;
> +
> +       /*
> +        * Hash requests can be of any size, whether initial, update, or final.
> +        * A non-final request must be submitted to the SPU as an integral
> +        * number of blocks. This may leave data at the end of the request
> +        * that is not a full block. Since the request is non-final, it cannot
> +        * be padded. So, we write the remainder to this hash_carry buffer and
> +        * hold it until the next request arrives. The carry data is then
> +        * submitted at the beginning of the data in the next SPU msg.
> +        * hash_carry_len is the number of bytes currently in hash_carry. These
> +        * fields are only used for ahash requests.
> +        */
> +       u8 hash_carry[HASH_CARRY_MAX];
> +       unsigned int hash_carry_len;
> +       unsigned int is_final;  /* is this the final for the hash op? */
> +
> +       /*
> +        * Digest from incremental hash is saved here to include in next hash
> +        * operation. Cannot be stored in req->result for truncated hashes,
> +        * since result may be sized for final digest. Cannot be saved in
> +        * msg_buf because that gets deleted between incremental hash ops
> +        * and is not saved as part of export().
> +        */
> +       u8 incr_hash[MAX_DIGEST_SIZE];
> +
> +       /* hmac context */
> +       bool is_sw_hmac;
> +
> +       /* aead context */
> +       struct crypto_tfm *old_tfm;
> +       crypto_completion_t old_complete;
> +       void *old_data;
> +
> +       gfp_t gfp;
> +
> +       /* Buffers used to build SPU request and response messages */
> +       /* MUST BE LAST */
> +       struct spu_msg_buf msg_buf;
> +};
> +
> +/*
> + * Structure encapsulates a set of function pointers specific to the type of
> + * SPU hardware running. These functions handle creation and parsing of
> + * SPU request messages and SPU response messages. Includes hardware-specific
> + * values read from device tree.
> + */
> +struct spu_hw {
> +       void (*spu_dump_msg_hdr)(u8 *buf, unsigned int buf_len);
> +       u32 (*spu_ctx_max_payload)(enum spu_cipher_alg cipher_alg,
> +                                  enum spu_cipher_mode cipher_mode,
> +                                  unsigned int blocksize);
> +       u32 (*spu_payload_length)(u8 *spu_hdr);
> +       u16 (*spu_response_hdr_len)(u16 auth_key_len, u16 enc_key_len,
> +                                   bool is_hash);
> +       u16 (*spu_hash_pad_len)(enum hash_alg hash_alg,
> +                               enum hash_mode hash_mode, u32 chunksize,
> +                               u16 hash_block_size);
> +       u32 (*spu_gcm_ccm_pad_len)(enum spu_cipher_mode cipher_mode,
> +                                  unsigned int data_size);
> +       u32 (*spu_assoc_resp_len)(enum spu_cipher_mode cipher_mode,
> +                                 unsigned int assoc_len,
> +                                 unsigned int iv_len, bool is_encrypt);
> +       u8 (*spu_aead_ivlen)(enum spu_cipher_mode cipher_mode,
> +                            u16 iv_len);
> +       enum hash_type (*spu_hash_type)(u32 src_sent);
> +       u32 (*spu_digest_size)(u32 digest_size, enum hash_alg alg,
> +                              enum hash_type);
> +       u32 (*spu_create_request)(u8 *spu_hdr,
> +                                 struct spu_request_opts *req_opts,
> +                                 struct spu_cipher_parms *cipher_parms,
> +                                 struct spu_hash_parms *hash_parms,
> +                                 struct spu_aead_parms *aead_parms,
> +                                 unsigned int data_size);
> +       u16 (*spu_cipher_req_init)(u8 *spu_hdr,
> +                                  struct spu_cipher_parms *cipher_parms);
> +       void (*spu_cipher_req_finish)(u8 *spu_hdr,
> +                                     u16 spu_req_hdr_len,
> +                                     unsigned int is_inbound,
> +                                     struct spu_cipher_parms *cipher_parms,
> +                                     bool update_key,
> +                                     unsigned int data_size);
> +       void (*spu_request_pad)(u8 *pad_start, u32 gcm_padding,
> +                               u32 hash_pad_len, enum hash_alg auth_alg,
> +                               enum hash_mode auth_mode,
> +                               unsigned int total_sent, u32 status_padding);
> +       u8 (*spu_xts_tweak_in_payload)(void);
> +       u8 (*spu_tx_status_len)(void);
> +       u8 (*spu_rx_status_len)(void);
> +       int (*spu_status_process)(u8 *statp);
> +       void (*spu_ccm_update_iv)(unsigned int digestsize,
> +                                 struct spu_cipher_parms *cipher_parms,
> +                                 unsigned int assoclen, unsigned int chunksize,
> +                                 bool is_encrypt, bool is_esp);
> +       u32 (*spu_wordalign_padlen)(u32 data_size);
> +
> +       /* The base virtual address of the SPU hw registers */
> +       void __iomem **reg_vbase;
> +
> +       /* Version of the SPU hardware */
> +       enum spu_spu_type spu_type;
> +
> +       /* Sub-version of the SPU hardware */
> +       enum spu_spu_subtype spu_subtype;
> +
> +       /* The number of SPUs on this platform */
> +       u32 num_spu;
> +
> +       /* The number of SPU channels on this platform */
> +       u32 num_chan;
> +};
> +
> +struct device_private {
> +       struct platform_device *pdev;
> +
> +       struct spu_hw spu;
> +
> +       atomic_t session_count; /* number of streams active */
> +       atomic_t stream_count;  /* monotonic counter for streamID's */
> +
> +       /* Length of BCM header. Set to 0 when hw does not expect BCM HEADER. */
> +       u8 bcm_hdr_len;
> +
> +       /* The index of the channel to use for the next crypto request */
> +       atomic_t next_chan;
> +
> +       struct dentry *debugfs_dir;
> +       struct dentry *debugfs_stats;
> +
> +       /* Number of request bytes processed and result bytes returned */
> +       atomic64_t bytes_in;
> +       atomic64_t bytes_out;
> +
> +       /* Number of operations of each type */
> +       atomic_t op_counts[SPU_OP_NUM];
> +
> +       atomic_t cipher_cnt[CIPHER_ALG_LAST][CIPHER_MODE_LAST];
> +       atomic_t hash_cnt[HASH_ALG_LAST];
> +       atomic_t hmac_cnt[HASH_ALG_LAST];
> +       atomic_t aead_cnt[AEAD_TYPE_LAST];
> +
> +       /* Number of calls to setkey() for each operation type */
> +       atomic_t setkey_cnt[SPU_OP_NUM];
> +
> +       /* Number of times request was resubmitted because mb was full */
> +       atomic_t mb_no_spc;
> +
> +       /* Number of mailbox send failures */
> +       atomic_t mb_send_fail;
> +
> +       /* Number of ICV check failures for AEAD messages */
> +       atomic_t bad_icv;
> +
> +       struct mbox_client mcl;
> +       /* Array of mailbox channel pointers, one for each channel */
> +       struct mbox_chan **mbox;
> +};
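
The next_chan counter suggests requests are spread across mailbox channels. This hunk does not show how it is consumed, so the following is only an assumption: a round-robin pick using C11 atomics in userspace. struct demo_priv and demo_select_channel() are hypothetical; the kernel code would use atomic_t and its helpers instead.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, reduced stand-in for struct device_private. */
struct demo_priv {
	atomic_uint next_chan;   /* monotonically increasing counter */
	unsigned int num_chan;   /* number of mailbox channels */
};

/* Pick the channel for the next request, round-robin (assumed policy). */
unsigned int demo_select_channel(struct demo_priv *p)
{
	assert(p->num_chan > 0);
	/* fetch_add returns the previous value, so the first pick is 0 */
	return atomic_fetch_add(&p->next_chan, 1) % p->num_chan;
}
```

If num_chan is not a power of two, the sequence has one discontinuity when the counter wraps; that should be harmless for load spreading.
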
> +
> +extern struct device_private iproc_priv;
> +
> +#endif
> diff --git a/drivers/crypto/bcm/spu.c b/drivers/crypto/bcm/spu.c
> new file mode 100644
> index 0000000..0331267
> --- /dev/null
> +++ b/drivers/crypto/bcm/spu.c
> @@ -0,0 +1,1252 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +
> +#include "util.h"
> +#include "spu.h"
> +#include "spum.h"
> +#include "cipher.h"
> +
> +/* This array is based on the hash algo type supported in spu.h */
> +char *tag_to_hash_idx[] = { "none", "md5", "sha1", "sha224", "sha256" };
> +
> +char *hash_alg_name[] = { "None", "md5", "sha1", "sha224", "sha256", "aes",
> +       "sha384", "sha512", "sha3_224", "sha3_256", "sha3_384", "sha3_512" };
> +
> +char *aead_alg_name[] = { "ccm(aes)", "gcm(aes)", "authenc" };
> +
> +/* Assumes SPU-M messages are in big endian */
> +void spum_dump_msg_hdr(u8 *buf, unsigned int buf_len)
> +{
> +       u8 *ptr = buf;
> +       struct SPUHEADER *spuh = (struct SPUHEADER *)buf;
> +       unsigned int hash_key_len = 0;
> +       unsigned int hash_state_len = 0;
> +       unsigned int cipher_key_len = 0;
> +       unsigned int iv_len;
> +       u32 pflags;
> +       u32 cflags;
> +       u32 ecf;
> +       u32 cipher_alg;
> +       u32 cipher_mode;
> +       u32 cipher_type;
> +       u32 hash_alg;
> +       u32 hash_mode;
> +       u32 hash_type;
> +       u32 sctx_size;   /* SCTX length in words */
> +       u32 sctx_pl_len; /* SCTX payload length in bytes */
> +
> +       packet_log("\n");
> +       packet_log("SPU Message header %p len: %u\n", buf, buf_len);
> +
> +       /* ========== Decode MH ========== */
> +       packet_log("  MH 0x%08x\n", be32_to_cpu(*((u32 *)ptr)));
> +       if (spuh->mh.flags & MH_SCTX_PRES)
> +               packet_log("    SCTX  present\n");
> +       if (spuh->mh.flags & MH_BDESC_PRES)
> +               packet_log("    BDESC present\n");
> +       if (spuh->mh.flags & MH_MFM_PRES)
> +               packet_log("    MFM   present\n");
> +       if (spuh->mh.flags & MH_BD_PRES)
> +               packet_log("    BD    present\n");
> +       if (spuh->mh.flags & MH_HASH_PRES)
> +               packet_log("    HASH  present\n");
> +       if (spuh->mh.flags & MH_SUPDT_PRES)
> +               packet_log("    SUPDT present\n");
> +       packet_log("    Opcode 0x%02x\n", spuh->mh.op_code);
> +
> +       ptr += sizeof(spuh->mh) + sizeof(spuh->emh);  /* skip emh. unused */
> +
> +       /* ========== Decode SCTX ========== */
> +       if (spuh->mh.flags & MH_SCTX_PRES) {
> +               pflags = be32_to_cpu(spuh->sa.proto_flags);
> +               packet_log("  SCTX[0] 0x%08x\n", pflags);
> +               sctx_size = pflags & SCTX_SIZE;
> +               packet_log("    Size %u words\n", sctx_size);
> +
> +               cflags = be32_to_cpu(spuh->sa.cipher_flags);
> +               packet_log("  SCTX[1] 0x%08x\n", cflags);
> +               packet_log("    Inbound:%lu (1:decrypt/vrfy 0:encrypt/auth)\n",
> +                          (cflags & CIPHER_INBOUND) >> CIPHER_INBOUND_SHIFT);
> +               packet_log("    Order:%lu (1:AuthFirst 0:EncFirst)\n",
> +                          (cflags & CIPHER_ORDER) >> CIPHER_ORDER_SHIFT);
> +               packet_log("    ICV_IS_512:%lx\n",
> +                          (cflags & ICV_IS_512) >> ICV_IS_512_SHIFT);
> +               cipher_alg = (cflags & CIPHER_ALG) >> CIPHER_ALG_SHIFT;
> +               cipher_mode = (cflags & CIPHER_MODE) >> CIPHER_MODE_SHIFT;
> +               cipher_type = (cflags & CIPHER_TYPE) >> CIPHER_TYPE_SHIFT;
> +               packet_log("    Crypto Alg:%u Mode:%u Type:%u\n",
> +                          cipher_alg, cipher_mode, cipher_type);
> +               hash_alg = (cflags & HASH_ALG) >> HASH_ALG_SHIFT;
> +               hash_mode = (cflags & HASH_MODE) >> HASH_MODE_SHIFT;
> +               hash_type = (cflags & HASH_TYPE) >> HASH_TYPE_SHIFT;
> +               packet_log("    Hash   Alg:%x Mode:%x Type:%x\n",
> +                          hash_alg, hash_mode, hash_type);
> +               packet_log("    UPDT_Offset:%u\n", cflags & UPDT_OFST);
> +
> +               ecf = be32_to_cpu(spuh->sa.ecf);
> +               packet_log("  SCTX[2] 0x%08x\n", ecf);
> +               packet_log("    WriteICV:%lu CheckICV:%lu ICV_SIZE:%u ",
> +                          (ecf & INSERT_ICV) >> INSERT_ICV_SHIFT,
> +                          (ecf & CHECK_ICV) >> CHECK_ICV_SHIFT,
> +                          (ecf & ICV_SIZE) >> ICV_SIZE_SHIFT);
> +               packet_log("BD_SUPPRESS:%lu\n",
> +                          (ecf & BD_SUPPRESS) >> BD_SUPPRESS_SHIFT);
> +               packet_log("    SCTX_IV:%lu ExplicitIV:%lu GenIV:%lu ",
> +                          (ecf & SCTX_IV) >> SCTX_IV_SHIFT,
> +                          (ecf & EXPLICIT_IV) >> EXPLICIT_IV_SHIFT,
> +                          (ecf & GEN_IV) >> GEN_IV_SHIFT);
> +               packet_log("IV_OV_OFST:%lu EXP_IV_SIZE:%u\n",
> +                          (ecf & IV_OFFSET) >> IV_OFFSET_SHIFT,
> +                          ecf & EXP_IV_SIZE);
> +
> +               ptr += sizeof(struct SCTX);
> +
> +               if (hash_alg && hash_mode) {
> +                       char *name = "NONE";
> +
> +                       switch (hash_alg) {
> +                       case HASH_ALG_MD5:
> +                               hash_key_len = 16;
> +                               name = "MD5";
> +                               break;
> +                       case HASH_ALG_SHA1:
> +                               hash_key_len = 20;
> +                               name = "SHA1";
> +                               break;
> +                       case HASH_ALG_SHA224:
> +                               hash_key_len = 28;
> +                               name = "SHA224";
> +                               break;
> +                       case HASH_ALG_SHA256:
> +                               hash_key_len = 32;
> +                               name = "SHA256";
> +                               break;
> +                       case HASH_ALG_SHA384:
> +                               hash_key_len = 48;
> +                               name = "SHA384";
> +                               break;
> +                       case HASH_ALG_SHA512:
> +                               hash_key_len = 64;
> +                               name = "SHA512";
> +                               break;
> +                       case HASH_ALG_AES:
> +                               hash_key_len = 0;
> +                               name = "AES";
> +                               break;
> +                       case HASH_ALG_NONE:
> +                               break;
> +                       }
> +
> +                       packet_log("    Auth Key Type:%s Length:%u Bytes\n",
> +                                  name, hash_key_len);
> +                       packet_dump("    KEY: ", ptr, hash_key_len);
> +                       ptr += hash_key_len;
> +               } else if ((hash_alg == HASH_ALG_AES) &&
> +                          (hash_mode == HASH_MODE_XCBC)) {
> +                       char *name = "NONE";
> +
> +                       switch (cipher_type) {
> +                       case CIPHER_TYPE_AES128:
> +                               hash_key_len = 16;
> +                               name = "AES128-XCBC";
> +                               break;
> +                       case CIPHER_TYPE_AES192:
> +                               hash_key_len = 24;
> +                               name = "AES192-XCBC";
> +                               break;
> +                       case CIPHER_TYPE_AES256:
> +                               hash_key_len = 32;
> +                               name = "AES256-XCBC";
> +                               break;
> +                       }
> +                       packet_log("    Auth Key Type:%s Length:%u Bytes\n",
> +                                  name, hash_key_len);
> +                       packet_dump("    KEY: ", ptr, hash_key_len);
> +                       ptr += hash_key_len;
> +               }
> +
> +               if (hash_alg && (hash_mode == HASH_MODE_NONE) &&
> +                   (hash_type == HASH_TYPE_UPDT)) {
> +                       char *name = "NONE";
> +
> +                       switch (hash_alg) {
> +                       case HASH_ALG_MD5:
> +                               hash_state_len = 16;
> +                               name = "MD5";
> +                               break;
> +                       case HASH_ALG_SHA1:
> +                               hash_state_len = 20;
> +                               name = "SHA1";
> +                               break;
> +                       case HASH_ALG_SHA224:
> +                               hash_state_len = 32;
> +                               name = "SHA224";
> +                               break;
> +                       case HASH_ALG_SHA256:
> +                               hash_state_len = 32;
> +                               name = "SHA256";
> +                               break;
> +                       case HASH_ALG_SHA384:
> +                               hash_state_len = 48;
> +                               name = "SHA384";
> +                               break;
> +                       case HASH_ALG_SHA512:
> +                               hash_state_len = 64;
> +                               name = "SHA512";
> +                               break;
> +                       case HASH_ALG_AES:
> +                               hash_state_len = 0;
> +                               name = "AES";
> +                               break;
> +                       case HASH_ALG_NONE:
> +                               break;
> +                       }
> +
> +                       packet_log("    Auth State Type:%s Length:%u Bytes\n",
> +                                  name, hash_state_len);
> +                       packet_dump("    State: ", ptr, hash_state_len);
> +                       ptr += hash_state_len;
> +               }
> +
> +               if (cipher_alg) {
> +                       char *name = "NONE";
> +
> +                       switch (cipher_alg) {
> +                       case CIPHER_ALG_DES:
> +                               cipher_key_len = 8;
> +                               name = "DES";
> +                               break;
> +                       case CIPHER_ALG_3DES:
> +                               cipher_key_len = 24;
> +                               name = "3DES";
> +                               break;
> +                       case CIPHER_ALG_RC4:
> +                               cipher_key_len = 260;
> +                               name = "ARC4";
> +                               break;
> +                       case CIPHER_ALG_AES:
> +                               switch (cipher_type) {
> +                               case CIPHER_TYPE_AES128:
> +                                       cipher_key_len = 16;
> +                                       name = "AES128";
> +                                       break;
> +                               case CIPHER_TYPE_AES192:
> +                                       cipher_key_len = 24;
> +                                       name = "AES192";
> +                                       break;
> +                               case CIPHER_TYPE_AES256:
> +                                       cipher_key_len = 32;
> +                                       name = "AES256";
> +                                       break;
> +                               }
> +                               break;
> +                       case CIPHER_ALG_NONE:
> +                               break;
> +                       }
> +
> +                       packet_log("    Cipher Key Type:%s Length:%u Bytes\n",
> +                                  name, cipher_key_len);
> +
> +                       /* XTS has two keys */
> +                       if (cipher_mode == CIPHER_MODE_XTS) {
> +                               packet_dump("    KEY2: ", ptr, cipher_key_len);
> +                               ptr += cipher_key_len;
> +                               packet_dump("    KEY1: ", ptr, cipher_key_len);
> +                               ptr += cipher_key_len;
> +
> +                               cipher_key_len *= 2;
> +                       } else {
> +                               packet_dump("    KEY: ", ptr, cipher_key_len);
> +                               ptr += cipher_key_len;
> +                       }
> +
> +                       if (ecf & SCTX_IV) {
> +                               sctx_pl_len = sctx_size * sizeof(u32) -
> +                                       sizeof(struct SCTX);
> +                               iv_len = sctx_pl_len -
> +                                       (hash_key_len + hash_state_len +
> +                                        cipher_key_len);
> +                               packet_log("    IV Length:%u Bytes\n", iv_len);
> +                               packet_dump("    IV: ", ptr, iv_len);
> +                               ptr += iv_len;
> +                       }
> +               }
> +       }
> +
> +       /* ========== Decode BDESC ========== */
> +       if (spuh->mh.flags & MH_BDESC_PRES) {
> +#ifdef DEBUG
> +               struct BDESC_HEADER *bdesc = (struct BDESC_HEADER *)ptr;
> +#endif
> +               packet_log("  BDESC[0] 0x%08x\n", be32_to_cpu(*((u32 *)ptr)));
> +               packet_log("    OffsetMAC:%u LengthMAC:%u\n",
> +                          be16_to_cpu(bdesc->offset_mac),
> +                          be16_to_cpu(bdesc->length_mac));
> +               ptr += sizeof(u32);
> +
> +               packet_log("  BDESC[1] 0x%08x\n", be32_to_cpu(*((u32 *)ptr)));
> +               packet_log("    OffsetCrypto:%u LengthCrypto:%u\n",
> +                          be16_to_cpu(bdesc->offset_crypto),
> +                          be16_to_cpu(bdesc->length_crypto));
> +               ptr += sizeof(u32);
> +
> +               packet_log("  BDESC[2] 0x%08x\n", be32_to_cpu(*((u32 *)ptr)));
> +               packet_log("    OffsetICV:%u OffsetIV:%u\n",
> +                          be16_to_cpu(bdesc->offset_icv),
> +                          be16_to_cpu(bdesc->offset_iv));
> +               ptr += sizeof(u32);
> +       }
> +
> +       /* ========== Decode BD ========== */
> +       if (spuh->mh.flags & MH_BD_PRES) {
> +#ifdef DEBUG
> +               struct BD_HEADER *bd = (struct BD_HEADER *)ptr;
> +#endif
> +               packet_log("  BD[0] 0x%08x\n", be32_to_cpu(*((u32 *)ptr)));
> +               packet_log("    Size:%ubytes PrevLength:%u\n",
> +                          be16_to_cpu(bd->size), be16_to_cpu(bd->prev_length));
> +               ptr += 4;
> +       }
> +
> +       /* Double check sanity */
> +       if (buf + buf_len != ptr) {
> +               packet_log(" Packet parsed incorrectly. ");
> +               packet_log("buf:%p buf_len:%u buf+buf_len:%p ptr:%p\n",
> +                          buf, buf_len, buf + buf_len, ptr);
> +       }
> +
> +       packet_log("\n");
> +}
> +
> +/**
> + * spum_ns2_ctx_max_payload() - Determine the max length of the payload for a
> + * SPU message for a given cipher and hash alg context.
> + * @cipher_alg:                The cipher algorithm
> + * @cipher_mode:       The cipher mode
> + * @blocksize:         The size of a block of data for this algo
> + *
> + * The max payload must be a multiple of the blocksize so that if a request is
> + * too large to fit in a single SPU message, the request can be broken into
> + * max_payload sized chunks. Each chunk must be a multiple of blocksize.
> + *
> + * Return: Max payload length in bytes
> + */
> +u32 spum_ns2_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                            enum spu_cipher_mode cipher_mode,
> +                            unsigned int blocksize)
> +{
> +       u32 max_payload = SPUM_NS2_MAX_PAYLOAD;
> +       u32 excess;
> +
> +       /* In XTS on SPU-M, we'll need to insert tweak before input data */
> +       if (cipher_mode == CIPHER_MODE_XTS)
> +               max_payload -= SPU_XTS_TWEAK_SIZE;
> +
> +       excess = max_payload % blocksize;
> +
> +       return max_payload - excess;
> +}
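
To make the rounding above concrete, here is a minimal userspace sketch. The two constants are placeholders, since the real SPUM_NS2_MAX_PAYLOAD and SPU_XTS_TWEAK_SIZE values live in headers that are not part of this hunk, and max_payload_sketch() is a hypothetical name rather than a driver function:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values, NOT the driver's real limits. */
#define SKETCH_MAX_PAYLOAD    65535u
#define SKETCH_XTS_TWEAK_SIZE 16u

/* Largest payload that fits the limit and is a whole number of blocks. */
static uint32_t max_payload_sketch(int is_xts, unsigned int blocksize)
{
	uint32_t max_payload = SKETCH_MAX_PAYLOAD;

	assert(blocksize != 0);	/* the modulo below would divide by zero */

	/* Leave room for the tweak that is sent ahead of the data in XTS. */
	if (is_xts)
		max_payload -= SKETCH_XTS_TWEAK_SIZE;

	return max_payload - (max_payload % blocksize);
}
```

With these placeholder numbers a 16-byte block size gives 65520 bytes, or 65504 in XTS mode. The assert exists only in the sketch; the quoted functions appear to rely on callers never passing a zero blocksize.
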
> +
> +/**
> + * spum_nsp_ctx_max_payload() - Determine the max length of the payload for a
> + * SPU message for a given cipher and hash alg context.
> + * @cipher_alg:                The cipher algorithm
> + * @cipher_mode:       The cipher mode
> + * @blocksize:         The size of a block of data for this algo
> + *
> + * The max payload must be a multiple of the blocksize so that if a request is
> + * too large to fit in a single SPU message, the request can be broken into
> + * max_payload sized chunks. Each chunk must be a multiple of blocksize.
> + *
> + * Return: Max payload length in bytes
> + */
> +u32 spum_nsp_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                            enum spu_cipher_mode cipher_mode,
> +                            unsigned int blocksize)
> +{
> +       u32 max_payload = SPUM_NSP_MAX_PAYLOAD;
> +       u32 excess;
> +
> +       /* In XTS on SPU-M, we'll need to insert tweak before input data */
> +       if (cipher_mode == CIPHER_MODE_XTS)
> +               max_payload -= SPU_XTS_TWEAK_SIZE;
> +
> +       excess = max_payload % blocksize;
> +
> +       return max_payload - excess;
> +}
> +
> +/** spum_payload_length() - Given a SPU-M message header, extract the payload
> + * length.
> + * @spu_hdr:   Start of SPU header
> + *
> + * Assumes just MH, EMH, BD (no SCTX, BDESC). Works for response frames.
> + *
> + * Return: payload length in bytes
> + */
> +u32 spum_payload_length(u8 *spu_hdr)
> +{
> +       struct BD_HEADER *bd;
> +       u32 pl_len;
> +
> +       /* Find BD header.  skip MH, EMH */
> +       bd = (struct BD_HEADER *)(spu_hdr + 8);
> +       pl_len = be16_to_cpu(bd->size);
> +
> +       return pl_len;
> +}
> +
> +/**
> + * spum_response_hdr_len() - Given the length of the hash key and encryption
> + * key, determine the expected length of a SPU response header.
> + * @auth_key_len:      authentication key length (bytes)
> + * @enc_key_len:       encryption key length (bytes)
> + * @is_hash:           true if response message is for a hash operation
> + *
> + * Return: length of SPU response header (bytes)
> + */
> +u16 spum_response_hdr_len(u16 auth_key_len, u16 enc_key_len, bool is_hash)
> +{
> +       if (is_hash)
> +               return SPU_HASH_RESP_HDR_LEN;
> +       else
> +               return SPU_RESP_HDR_LEN;
> +}
> +
> +/**
> + * spum_hash_pad_len() - Calculate the length of hash padding required to extend
> + * data to a full block size.
> + * @hash_alg:   hash algorithm
> + * @hash_mode:       hash mode
> + * @chunksize:  length of data, in bytes
> + * @hash_block_size:  size of a block of data for hash algorithm
> + *
> + * Reserve space for 1 byte (0x80) start of pad and the total length as u64
> + *
> + * Return:  length of hash pad in bytes
> + */
> +u16 spum_hash_pad_len(enum hash_alg hash_alg, enum hash_mode hash_mode,
> +                     u32 chunksize, u16 hash_block_size)
> +{
> +       unsigned int length_len;
> +       unsigned int used_space_last_block;
> +       int hash_pad_len;
> +
> +       /* AES-XCBC hash requires just padding to next block boundary */
> +       if ((hash_alg == HASH_ALG_AES) && (hash_mode == HASH_MODE_XCBC)) {
> +               used_space_last_block = chunksize % hash_block_size;
> +               hash_pad_len = hash_block_size - used_space_last_block;
> +               if (hash_pad_len >= hash_block_size)
> +                       hash_pad_len -= hash_block_size;
> +               return hash_pad_len;
> +       }
> +
> +       used_space_last_block = chunksize % hash_block_size + 1;
> +       if ((hash_alg == HASH_ALG_SHA384) || (hash_alg == HASH_ALG_SHA512))
> +               length_len = 2 * sizeof(u64);
> +       else
> +               length_len = sizeof(u64);
> +
> +       used_space_last_block += length_len;
> +       hash_pad_len = hash_block_size - used_space_last_block;
> +       if (hash_pad_len < 0)
> +               hash_pad_len += hash_block_size;
> +
> +       hash_pad_len += 1 + length_len;
> +       return hash_pad_len;
> +}
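
The non-XCBC branch above is easier to check with a few numbers. Below is a minimal userspace sketch of the same arithmetic, using signed ints so the "does not fit in the last block" case is explicit. hash_pad_len_sketch() and its parameters are hypothetical names; length_len stands for the 8- or 16-byte length field chosen in the patch:

```c
#include <assert.h>

/*
 * Pad length for MD5/SHA-style padding: one 0x80 byte, zero fill, then a
 * length field of length_len bytes, so that data + pad is a whole number
 * of blocks. Signed arithmetic keeps the wrap-around case readable.
 */
static int hash_pad_len_sketch(unsigned int chunksize, int block_size,
			       int length_len)
{
	int used, pad;

	assert(block_size > 0 && length_len > 0);
	assert(length_len + 1 <= block_size);

	used = (int)(chunksize % (unsigned int)block_size) + 1 + length_len;
	pad = block_size - used;
	if (pad < 0)
		pad += block_size;	/* spill into one extra block */

	return pad + 1 + length_len;
}
```

For a 64-byte block and an 8-byte length field, 55 bytes of data need 9 bytes of pad (one block in total), while 56 bytes need 72 (two blocks). In every case chunksize plus the returned pad is a multiple of the block size.
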
> +
> +/**
> + * spum_gcm_ccm_pad_len() - Determine the required length of GCM or CCM padding.
> + * @cipher_mode:       Algo type
> + * @data_size:         Length of plaintext (bytes)
> + *
> + * Return: Length of padding, in bytes
> + */
> +u32 spum_gcm_ccm_pad_len(enum spu_cipher_mode cipher_mode,
> +                        unsigned int data_size)
> +{
> +       u32 pad_len = 0;
> +       u32 m1 = SPU_GCM_CCM_ALIGN - 1;
> +
> +       if ((cipher_mode == CIPHER_MODE_GCM) ||
> +           (cipher_mode == CIPHER_MODE_CCM))
> +               pad_len = ((data_size + m1) & ~m1) - data_size;
> +
> +       return pad_len;
> +}
> +
> +/**
> + * spum_assoc_resp_len() - Determine the size of the receive buffer required to
> + * catch associated data.
> + * @cipher_mode:       cipher mode
> + * @assoc_len:         length of associated data (bytes)
> + * @iv_len:            length of IV (bytes)
> + * @is_encrypt:                true if encrypting. false if decrypting.
> + *
> + * Return: length of associated data in response message (bytes)
> + */
> +u32 spum_assoc_resp_len(enum spu_cipher_mode cipher_mode,
> +                       unsigned int assoc_len, unsigned int iv_len,
> +                       bool is_encrypt)
> +{
> +       u32 buflen = 0;
> +       u32 pad;
> +
> +       if (assoc_len)
> +               buflen = assoc_len;
> +
> +       if (cipher_mode == CIPHER_MODE_GCM) {
> +               /* AAD needs to be padded in responses too */
> +               pad = spum_gcm_ccm_pad_len(cipher_mode, buflen);
> +               buflen += pad;
> +       }
> +       if (cipher_mode == CIPHER_MODE_CCM) {
> +               /*
> +                * AAD needs to be padded in responses too
> +                * for CCM, len + 2 needs to be 128-bit aligned.
> +                */
> +               pad = spum_gcm_ccm_pad_len(cipher_mode, buflen + 2);
> +               buflen += pad;
> +       }
> +
> +       return buflen;
> +}
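
A small userspace sketch of the two padding helpers above, assuming SPU_GCM_CCM_ALIGN is 16 (the kernel-doc for spum_request_pad() mentions 16-byte alignment, but the define itself is not in this hunk). The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ALIGN 16u	/* assumed; must be a power of two */

/* Bytes needed to round len up to the next multiple of SKETCH_ALIGN. */
static uint32_t align_pad_sketch(uint32_t len)
{
	uint32_t m1 = SKETCH_ALIGN - 1;

	/* The mask trick only works for power-of-two alignments. */
	assert((SKETCH_ALIGN & m1) == 0);

	return ((len + m1) & ~m1) - len;
}

/* CCM response AAD: the pad is computed on len + 2, then added to len. */
static uint32_t ccm_assoc_resp_len_sketch(uint32_t assoc_len)
{
	return assoc_len + align_pad_sketch(assoc_len + 2);
}
```

Note the CCM case: 14 bytes of AAD need no pad because 14 + 2 is already aligned, and an empty AAD still yields a 14-byte buffer, since the pad is derived from the 2 extra bytes.
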
> +
> +/**
> + * spum_aead_ivlen() - Calculate the length of the AEAD IV to be included
> + * in a SPU request after the AAD and before the payload.
> + * @cipher_mode:  cipher mode
> + * @iv_len:       initialization vector length in bytes
> + *
> + * In Linux ~4.2 and later, the assoc_data sg includes the IV. So no need
> + * to include the IV as a separate field in the SPU request msg.
> + *
> + * Return: Length of AEAD IV in bytes
> + */
> +u8 spum_aead_ivlen(enum spu_cipher_mode cipher_mode, u16 iv_len)
> +{
> +       return 0;
> +}
> +
> +/**
> + * spum_hash_type() - Determine the type of hash operation.
> + * @src_sent:  The number of bytes in the current request that have already
> + *             been sent to the SPU to be hashed.
> + *
> + * We do not use HASH_TYPE_FULL for requests that fit in a single SPU message.
> + * Using FULL causes failures (such as when the string to be hashed is empty).
> + * For similar reasons, we never use HASH_TYPE_FIN. Instead, submit messages
> + * as INIT or UPDT and do the hash padding in sw.
> + */
> +enum hash_type spum_hash_type(u32 src_sent)
> +{
> +       return src_sent ? HASH_TYPE_UPDT : HASH_TYPE_INIT;
> +}
> +
> +/**
> + * spum_digest_size() - Determine the size of a hash digest to expect the SPU to
> + * return.
> + * @alg_digest_size: Number of bytes in the final digest for the given algo
> + * @alg:             The hash algorithm
> + * @htype:           Type of hash operation (init, update, full, etc)
> + *
> + * When doing incremental hashing for an algorithm with a truncated hash
> + * (e.g., SHA224), the SPU returns the full digest so that it can be fed back as
> + * a partial result for the next chunk.
> + */
> +u32 spum_digest_size(u32 alg_digest_size, enum hash_alg alg,
> +                    enum hash_type htype)
> +{
> +       u32 digestsize = alg_digest_size;
> +
> +       /* SPU returns complete digest when doing incremental hash and truncated
> +        * hash algo.
> +        */
> +       if ((htype == HASH_TYPE_INIT) || (htype == HASH_TYPE_UPDT)) {
> +               if (alg == HASH_ALG_SHA224)
> +                       digestsize = SHA256_DIGEST_SIZE;
> +               else if (alg == HASH_ALG_SHA384)
> +                       digestsize = SHA512_DIGEST_SIZE;
> +       }
> +       return digestsize;
> +}
> +
> +/**
> + * spum_create_request() - Build a SPU request message header, up to and
> + * including the BD header. Construct the message starting at spu_hdr. Caller
> + * should allocate this buffer in DMA-able memory at least SPU_HEADER_ALLOC_LEN
> + * bytes long.
> + * @spu_hdr: Start of buffer where SPU request header is to be written
> + * @req_opts: SPU request message options
> + * @cipher_parms: Parameters related to cipher algorithm
> + * @hash_parms:   Parameters related to hash algorithm
> + * @aead_parms:   Parameters related to AEAD operation
> + * @data_size:    Length of data to be encrypted or authenticated. If AEAD, does
> + *               not include length of AAD.
> + *
> + * Return: the length of the SPU header in bytes. 0 if an error occurs.
> + */
> +u32 spum_create_request(u8 *spu_hdr,
> +                       struct spu_request_opts *req_opts,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       struct spu_hash_parms *hash_parms,
> +                       struct spu_aead_parms *aead_parms,
> +                       unsigned int data_size)
> +{
> +       struct SPUHEADER *spuh;
> +       struct BDESC_HEADER *bdesc;
> +       struct BD_HEADER *bd;
> +
> +       u8 *ptr;
> +       u32 protocol_bits = 0;
> +       u32 cipher_bits = 0;
> +       u32 ecf_bits = 0;
> +       u8 sctx_words = 0;
> +       unsigned int buf_len = 0;
> +
> +       /* size of the cipher payload */
> +       unsigned int cipher_len = hash_parms->prebuf_len + data_size +
> +                               hash_parms->pad_len;
> +
> +       /* offset of prebuf or data from end of BD header */
> +       unsigned int cipher_offset = aead_parms->assoc_size +
> +               aead_parms->iv_len + aead_parms->aad_pad_len;
> +
> +       /* total size of the DB data (without STAT word padding) */
> +       unsigned int real_db_size = spu_real_db_size(aead_parms->assoc_size,
> +                                                aead_parms->iv_len,
> +                                                hash_parms->prebuf_len,
> +                                                data_size,
> +                                                aead_parms->aad_pad_len,
> +                                                aead_parms->data_pad_len,
> +                                                hash_parms->pad_len);
> +
> +       unsigned int auth_offset = 0;
> +       unsigned int offset_iv = 0;
> +
> +       /* size/offset of the auth payload */
> +       unsigned int auth_len;
> +
> +       auth_len = real_db_size;
> +
> +       if (req_opts->is_aead && req_opts->is_inbound)
> +               cipher_len -= hash_parms->digestsize;
> +
> +       if (req_opts->is_aead && req_opts->is_inbound)
> +               auth_len -= hash_parms->digestsize;
> +
> +       if ((hash_parms->alg == HASH_ALG_AES) &&
> +           (hash_parms->mode == HASH_MODE_XCBC)) {
> +               auth_len -= hash_parms->pad_len;
> +               cipher_len -= hash_parms->pad_len;
> +       }
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log("  in:%u authFirst:%u\n",
> +                req_opts->is_inbound, req_opts->auth_first);
> +       flow_log("  %s. cipher alg:%u mode:%u type %u\n",
> +                spu_alg_name(cipher_parms->alg, cipher_parms->mode),
> +                cipher_parms->alg, cipher_parms->mode, cipher_parms->type);
> +       flow_log("    key: %d\n", cipher_parms->key_len);
> +       flow_dump("    key: ", cipher_parms->key_buf, cipher_parms->key_len);
> +       flow_log("    iv: %d\n", cipher_parms->iv_len);
> +       flow_dump("    iv: ", cipher_parms->iv_buf, cipher_parms->iv_len);
> +       flow_log("  auth alg:%u mode:%u type %u\n",
> +                hash_parms->alg, hash_parms->mode, hash_parms->type);
> +       flow_log("  digestsize: %u\n", hash_parms->digestsize);
> +       flow_log("  authkey: %d\n", hash_parms->key_len);
> +       flow_dump("  authkey: ", hash_parms->key_buf, hash_parms->key_len);
> +       flow_log("  assoc_size:%u\n", aead_parms->assoc_size);
> +       flow_log("  prebuf_len:%u\n", hash_parms->prebuf_len);
> +       flow_log("  data_size:%u\n", data_size);
> +       flow_log("  hash_pad_len:%u\n", hash_parms->pad_len);
> +       flow_log("  real_db_size:%u\n", real_db_size);
> +       flow_log(" auth_offset:%u auth_len:%u cipher_offset:%u cipher_len:%u\n",
> +                auth_offset, auth_len, cipher_offset, cipher_len);
> +       flow_log("  hmac_offset:%u\n", hash_parms->hmac_offset);
> +       flow_log("  aead_iv: %u\n", aead_parms->iv_len);
> +
> +       /* starting out: zero the header (plus some) */
> +       ptr = spu_hdr;
> +       memset(ptr, 0, sizeof(struct SPUHEADER));
> +
> +       /* format master header word */
> +       /* Do not set the next bit even though the datasheet says to */
> +       spuh = (struct SPUHEADER *)ptr;
> +       ptr += sizeof(struct SPUHEADER);
> +       buf_len += sizeof(struct SPUHEADER);
> +
> +       spuh->mh.op_code = SPU_CRYPTO_OPERATION_GENERIC;
> +       spuh->mh.flags |= (MH_SCTX_PRES | MH_BDESC_PRES | MH_BD_PRES);
> +
> +       /* Format sctx word 0 (protocol_bits) */
> +       sctx_words = 3;         /* size in words */
> +
> +       /* Format sctx word 1 (cipher_bits) */
> +       if (req_opts->is_inbound)
> +               cipher_bits |= CIPHER_INBOUND;
> +       if (req_opts->auth_first)
> +               cipher_bits |= CIPHER_ORDER;
> +
> +       /* Set the crypto parameters in the cipher.flags */
> +       cipher_bits |= cipher_parms->alg << CIPHER_ALG_SHIFT;
> +       cipher_bits |= cipher_parms->mode << CIPHER_MODE_SHIFT;
> +       cipher_bits |= cipher_parms->type << CIPHER_TYPE_SHIFT;
> +
> +       /* Set the auth parameters in the cipher.flags */
> +       cipher_bits |= hash_parms->alg << HASH_ALG_SHIFT;
> +       cipher_bits |= hash_parms->mode << HASH_MODE_SHIFT;
> +       cipher_bits |= hash_parms->type << HASH_TYPE_SHIFT;
> +
> +       /*
> +        * Format sctx extensions if required, and update main fields if
> +        * required.
> +        */
> +       if (hash_parms->alg) {
> +               /* Write the authentication key material if present */
> +               if (hash_parms->key_len) {
> +                       memcpy(ptr, hash_parms->key_buf, hash_parms->key_len);
> +                       ptr += hash_parms->key_len;
> +                       buf_len += hash_parms->key_len;
> +                       sctx_words += hash_parms->key_len / 4;
> +               }
> +
> +               if ((cipher_parms->mode == CIPHER_MODE_GCM) ||
> +                   (cipher_parms->mode == CIPHER_MODE_CCM))
> +                       /* unpadded length */
> +                       offset_iv = aead_parms->assoc_size;
> +
> +               /* if GCM/CCM we need to write ICV into the payload */
> +               if (!req_opts->is_inbound) {
> +                       if ((cipher_parms->mode == CIPHER_MODE_GCM) ||
> +                           (cipher_parms->mode == CIPHER_MODE_CCM))
> +                               ecf_bits |= 1 << INSERT_ICV_SHIFT;
> +               } else {
> +                       ecf_bits |= CHECK_ICV;
> +               }
> +
> +               /* Inform the SPU of the ICV size (in words) */
> +               if (hash_parms->digestsize == 64)
> +                       cipher_bits |= ICV_IS_512;
> +               else
> +                       ecf_bits |=
> +                       (hash_parms->digestsize / 4) << ICV_SIZE_SHIFT;
> +       }
> +
> +       if (req_opts->bd_suppress)
> +               ecf_bits |= BD_SUPPRESS;
> +
> +       /* copy the encryption keys in the SAD entry */
> +       if (cipher_parms->alg) {
> +               if (cipher_parms->key_len) {
> +                       memcpy(ptr, cipher_parms->key_buf,
> +                              cipher_parms->key_len);
> +                       ptr += cipher_parms->key_len;
> +                       buf_len += cipher_parms->key_len;
> +                       sctx_words += cipher_parms->key_len / 4;
> +               }
> +
> +               /*
> +                * if encrypting then set IV size, use SCTX IV unless no IV
> +                * given here
> +                */
> +               if (cipher_parms->iv_buf && cipher_parms->iv_len) {
> +                       /* Use SCTX IV */
> +                       ecf_bits |= SCTX_IV;
> +
> +                       /* cipher iv provided so put it in here */
> +                       memcpy(ptr, cipher_parms->iv_buf, cipher_parms->iv_len);
> +
> +                       ptr += cipher_parms->iv_len;
> +                       buf_len += cipher_parms->iv_len;
> +                       sctx_words += cipher_parms->iv_len / 4;
> +               }
> +       }
> +
> +       /*
> +        * RFC4543 (GMAC/ESP) requires data to be sent as part of AAD
> +        * so we need to override the BDESC parameters.
> +        */
> +       if (req_opts->is_rfc4543) {
> +               if (req_opts->is_inbound)
> +                       data_size -= hash_parms->digestsize;
> +               offset_iv = aead_parms->assoc_size + data_size;
> +               cipher_len = 0;
> +               cipher_offset = offset_iv;
> +               auth_len = cipher_offset + aead_parms->data_pad_len;
> +       }
> +
> +       /* write in the total sctx length now that we know it */
> +       protocol_bits |= sctx_words;
> +
> +       /* Endian adjust the SCTX */
> +       spuh->sa.proto_flags = cpu_to_be32(protocol_bits);
> +       spuh->sa.cipher_flags = cpu_to_be32(cipher_bits);
> +       spuh->sa.ecf = cpu_to_be32(ecf_bits);
> +
> +       /* === create the BDESC section === */
> +       bdesc = (struct BDESC_HEADER *)ptr;
> +
> +       bdesc->offset_mac = cpu_to_be16(auth_offset);
> +       bdesc->length_mac = cpu_to_be16(auth_len);
> +       bdesc->offset_crypto = cpu_to_be16(cipher_offset);
> +       bdesc->length_crypto = cpu_to_be16(cipher_len);
> +
> +       /*
> +        * CCM in SPU-M requires that ICV not be in same 32-bit word as data or
> +        * padding.  So account for padding as necessary.
> +        */
> +       if (cipher_parms->mode == CIPHER_MODE_CCM)
> +               auth_len += spum_wordalign_padlen(auth_len);
> +
> +       bdesc->offset_icv = cpu_to_be16(auth_len);
> +       bdesc->offset_iv = cpu_to_be16(offset_iv);
> +
> +       ptr += sizeof(struct BDESC_HEADER);
> +       buf_len += sizeof(struct BDESC_HEADER);
> +
> +       /* === no MFM section === */
> +
> +       /* === create the BD section === */
> +
> +       /* add the BD header */
> +       bd = (struct BD_HEADER *)ptr;
> +       bd->size = cpu_to_be16(real_db_size);
> +       bd->prev_length = 0;
> +
> +       ptr += sizeof(struct BD_HEADER);
> +       buf_len += sizeof(struct BD_HEADER);
> +
> +       packet_dump("  SPU request header: ", spu_hdr, buf_len);
> +
> +       return buf_len;
> +}
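
As a cross-check of the value returned above, the header length and the SCTX word count can be derived from the key and IV lengths alone. The sketch below is userspace only, and the fixed sizes are inferred from this patch (8 bytes for MH + EMH per spum_payload_length(), three SCTX words, three BDESC words, one BD word) rather than taken from the struct definitions, so treat them as assumptions. All names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Sizes inferred from this patch hunk; treat them as assumptions. */
#define SK_MH_EMH_LEN 8u	/* MH + EMH, per spum_payload_length() */
#define SK_SCTX_LEN   12u	/* three fixed SCTX words */
#define SK_BDESC_LEN  12u	/* BDESC[0..2] */
#define SK_BD_LEN     4u	/* BD[0] */

struct sk_hdr_layout {
	uint32_t hdr_len;	/* bytes up to and including the BD header */
	uint32_t sctx_words;	/* value written into SCTX word 0 */
};

static struct sk_hdr_layout sk_layout(uint32_t auth_key_len,
				      uint32_t cipher_key_len,
				      uint32_t iv_len)
{
	struct sk_hdr_layout l;
	uint32_t var = auth_key_len + cipher_key_len + iv_len;

	/* The word count is only exact when every field is 4-byte sized. */
	assert(auth_key_len % 4 == 0);
	assert(cipher_key_len % 4 == 0);
	assert(iv_len % 4 == 0);

	l.sctx_words = SK_SCTX_LEN / 4 + var / 4;
	l.hdr_len = SK_MH_EMH_LEN + SK_SCTX_LEN + var +
		    SK_BDESC_LEN + SK_BD_LEN;
	return l;
}
```

Under these assumptions AES-128 with a 16-byte IV and no hash gives a 68-byte header and 11 SCTX words. The asserts highlight that the "/ 4" divisions in the patch silently truncate if a key or IV length is not a multiple of four.
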
> +
> +/**
> + * spum_cipher_req_init() - Build a SPU request message header, up to and
> + * including the BD header.
> + * @spu_hdr:      Start of SPU request header (MH)
> + * @cipher_parms: Parameters that describe the cipher request
> + *
> + * Construct the message starting at spu_hdr. Caller should allocate this buffer
> + * in DMA-able memory at least SPU_HEADER_ALLOC_LEN bytes long.
> + *
> + * Return: the length of the SPU header in bytes. 0 if an error occurs.
> + */
> +u16 spum_cipher_req_init(u8 *spu_hdr, struct spu_cipher_parms *cipher_parms)
> +{
> +       struct SPUHEADER *spuh;
> +       u32 protocol_bits = 0;
> +       u32 cipher_bits = 0;
> +       u32 ecf_bits = 0;
> +       u8 sctx_words = 0;
> +       u8 *ptr = spu_hdr;
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log("  cipher alg:%u mode:%u type %u\n", cipher_parms->alg,
> +                cipher_parms->mode, cipher_parms->type);
> +       flow_log("  cipher_iv_len: %u\n", cipher_parms->iv_len);
> +       flow_log("    key: %d\n", cipher_parms->key_len);
> +       flow_dump("    key: ", cipher_parms->key_buf, cipher_parms->key_len);
> +
> +       /* starting out: zero the header (plus some) */
> +       memset(spu_hdr, 0, sizeof(struct SPUHEADER));
> +       ptr += sizeof(struct SPUHEADER);
> +
> +       /* format master header word */
> +       /* Do not set the next bit even though the datasheet says to */
> +       spuh = (struct SPUHEADER *)spu_hdr;
> +
> +       spuh->mh.op_code = SPU_CRYPTO_OPERATION_GENERIC;
> +       spuh->mh.flags |= (MH_SCTX_PRES | MH_BDESC_PRES | MH_BD_PRES);
> +
> +       /* Format sctx word 0 (protocol_bits) */
> +       sctx_words = 3;         /* size in words */
> +
> +       /* account for the encryption key and IV in the SCTX length */
> +       if (cipher_parms->alg) {
> +               if (cipher_parms->key_len) {
> +                       ptr += cipher_parms->key_len;
> +                       sctx_words += cipher_parms->key_len / 4;
> +               }
> +
> +               /*
> +                * if encrypting then set IV size, use SCTX IV unless no IV
> +                * given here
> +                */
> +               if (cipher_parms->iv_len) {
> +                       /* Use SCTX IV */
> +                       ecf_bits |= SCTX_IV;
> +                       ptr += cipher_parms->iv_len;
> +                       sctx_words += cipher_parms->iv_len / 4;
> +               }
> +       }
> +
> +       /* Set the crypto parameters in the cipher.flags */
> +       cipher_bits |= cipher_parms->alg << CIPHER_ALG_SHIFT;
> +       cipher_bits |= cipher_parms->mode << CIPHER_MODE_SHIFT;
> +       cipher_bits |= cipher_parms->type << CIPHER_TYPE_SHIFT;
> +
> +       /* copy the encryption keys in the SAD entry */
> +       if (cipher_parms->alg && cipher_parms->key_len)
> +               memcpy(spuh + 1, cipher_parms->key_buf, cipher_parms->key_len);
> +
> +       /* write in the total sctx length now that we know it */
> +       protocol_bits |= sctx_words;
> +
> +       /* Endian adjust the SCTX */
> +       spuh->sa.proto_flags = cpu_to_be32(protocol_bits);
> +
> +       /* Endian adjust the SCTX */
> +       spuh->sa.cipher_flags = cpu_to_be32(cipher_bits);
> +       spuh->sa.ecf = cpu_to_be32(ecf_bits);
> +
> +       packet_dump("  SPU request header: ", spu_hdr,
> +                   sizeof(struct SPUHEADER));
> +
> +       return sizeof(struct SPUHEADER) + cipher_parms->key_len +
> +               cipher_parms->iv_len + sizeof(struct BDESC_HEADER) +
> +               sizeof(struct BD_HEADER);
> +}
> +
> +/**
> + * spum_cipher_req_finish() - Finish building a SPU request message header for a
> + * block cipher request. Assumes much of the header was already filled in at
> + * setkey() time in spum_cipher_req_init().
> + * @spu_hdr:         Start of the request message header (MH field)
> + * @spu_req_hdr_len: Length in bytes of the SPU request header
> + * @is_inbound:      0 encrypt, 1 decrypt
> + * @cipher_parms:    Parameters describing cipher operation to be performed
> + * @update_key:      If true, rewrite the cipher key in SCTX
> + * @data_size:       Length of the data in the BD field
> + *
> + * Assumes much of the header was already filled in at setkey() time in
> + * spum_cipher_req_init().
> + * spum_cipher_req_init() fills in the encryption key. For RC4, when submitting
> + * a request for a non-first chunk, we use the 260-byte SUPDT field from the
> + * previous response as the key. update_key is true for this case. Unused in all
> + * other cases.
> + */
> +void spum_cipher_req_finish(u8 *spu_hdr,
> +                           u16 spu_req_hdr_len,
> +                           unsigned int is_inbound,
> +                           struct spu_cipher_parms *cipher_parms,
> +                           bool update_key,
> +                           unsigned int data_size)
> +{
> +       struct SPUHEADER *spuh;
> +       struct BDESC_HEADER *bdesc;
> +       struct BD_HEADER *bd;
> +       u8 *bdesc_ptr = spu_hdr + spu_req_hdr_len -
> +           (sizeof(struct BD_HEADER) + sizeof(struct BDESC_HEADER));
> +
> +       u32 cipher_bits;
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log(" in: %u\n", is_inbound);
> +       flow_log(" cipher alg: %u, cipher_type: %u\n", cipher_parms->alg,
> +                cipher_parms->type);
> +       if (update_key) {
> +               flow_log(" cipher key len: %u\n", cipher_parms->key_len);
> +               flow_dump("  key: ", cipher_parms->key_buf,
> +                         cipher_parms->key_len);
> +       }
> +
> +       /*
> +        * In XTS mode, API puts "i" parameter (block tweak) in IV.  For
> +        * SPU-M, should be in start of the BD; tx_sg_create() copies it there.
> +        * IV in SPU msg for SPU-M should be 0, since that's the "j" parameter
> +        * (block ctr within larger data unit) - given we can send entire disk
> +        * block (<= 4KB) in 1 SPU msg, don't need to use this parameter.
> +        */
> +       if (cipher_parms->mode == CIPHER_MODE_XTS)
> +               memset(cipher_parms->iv_buf, 0, cipher_parms->iv_len);
> +
> +       flow_log(" iv len: %d\n", cipher_parms->iv_len);
> +       flow_dump("    iv: ", cipher_parms->iv_buf, cipher_parms->iv_len);
> +       flow_log(" data_size: %u\n", data_size);
> +
> +       /* format master header word */
> +       /* Do not set the next bit even though the datasheet says to */
> +       spuh = (struct SPUHEADER *)spu_hdr;
> +
> +       /* cipher_bits was initialized at setkey time */
> +       cipher_bits = be32_to_cpu(spuh->sa.cipher_flags);
> +
> +       /* Format sctx word 1 (cipher_bits) */
> +       if (is_inbound)
> +               cipher_bits |= CIPHER_INBOUND;
> +       else
> +               cipher_bits &= ~CIPHER_INBOUND;
> +
> +       /* update encryption key for RC4 on non-first chunk */
> +       if (update_key) {
> +               cipher_bits |=
> +                       cipher_parms->type << CIPHER_TYPE_SHIFT;
> +               memcpy(spuh + 1, cipher_parms->key_buf, cipher_parms->key_len);
> +       }
> +
> +       if (cipher_parms->alg && cipher_parms->iv_buf && cipher_parms->iv_len)
> +               /* cipher iv provided so put it in here */
> +               memcpy(bdesc_ptr - cipher_parms->iv_len, cipher_parms->iv_buf,
> +                      cipher_parms->iv_len);
> +
> +       spuh->sa.cipher_flags = cpu_to_be32(cipher_bits);
> +
> +       /* === create the BDESC section === */
> +       bdesc = (struct BDESC_HEADER *)bdesc_ptr;
> +       bdesc->offset_mac = 0;
> +       bdesc->length_mac = 0;
> +       bdesc->offset_crypto = 0;
> +
> +       /* XTS mode, data_size needs to include tweak parameter */
> +       if (cipher_parms->mode == CIPHER_MODE_XTS)
> +               bdesc->length_crypto = cpu_to_be16(data_size +
> +                                                 SPU_XTS_TWEAK_SIZE);
> +       else
> +               bdesc->length_crypto = cpu_to_be16(data_size);
> +
> +       bdesc->offset_icv = 0;
> +       bdesc->offset_iv = 0;
> +
> +       /* === no MFM section === */
> +
> +       /* === create the BD section === */
> +       /* add the BD header */
> +       bd = (struct BD_HEADER *)(bdesc_ptr + sizeof(struct BDESC_HEADER));
> +       bd->size = cpu_to_be16(data_size);
> +
> +       /* XTS mode, data_size needs to include tweak parameter */
> +       if (cipher_parms->mode == CIPHER_MODE_XTS)
> +               bd->size = cpu_to_be16(data_size + SPU_XTS_TWEAK_SIZE);
> +       else
> +               bd->size = cpu_to_be16(data_size);
> +
> +       bd->prev_length = 0;
> +
> +       packet_dump("  SPU request header: ", spu_hdr, spu_req_hdr_len);
> +}
> +
> +/**
> + * spum_request_pad() - Create pad bytes at the end of the data.
> + * @pad_start:         Start of buffer where pad bytes are to be written
> + * @gcm_ccm_padding:   length of GCM/CCM padding, in bytes
> + * @hash_pad_len:      Number of pad bytes to extend data to a full block
> + * @auth_alg:          authentication algorithm
> + * @auth_mode:         authentication mode
> + * @total_sent:                length inserted at end of hash pad
> + * @status_padding:    Number of bytes of padding to align STATUS word
> + *
> + * There may be three forms of pad:
> + *  1. GCM/CCM pad - for GCM/CCM mode ciphers, pad to 16-byte alignment
> + *  2. hash pad - pad to a block length, with 0x80 data terminator and
> + *                size at the end
> + *  3. STAT pad - to ensure the STAT field is 4-byte aligned
> + */
> +void spum_request_pad(u8 *pad_start,
> +                     u32 gcm_ccm_padding,
> +                     u32 hash_pad_len,
> +                     enum hash_alg auth_alg,
> +                     enum hash_mode auth_mode,
> +                     unsigned int total_sent, u32 status_padding)
> +{
> +       u8 *ptr = pad_start;
> +
> +       /* fix data alignment for GCM/CCM */
> +       if (gcm_ccm_padding > 0) {
> +               flow_log("  GCM: padding to 16 byte alignment: %u bytes\n",
> +                        gcm_ccm_padding);
> +               memset(ptr, 0, gcm_ccm_padding);
> +               ptr += gcm_ccm_padding;
> +       }
> +
> +       if (hash_pad_len > 0) {
> +               /* clear the padding section */
> +               memset(ptr, 0, hash_pad_len);
> +
> +               if ((auth_alg == HASH_ALG_AES) &&
> +                   (auth_mode == HASH_MODE_XCBC)) {
> +                       /* AES/XCBC just requires padding to be 0s */
> +                       ptr += hash_pad_len;
> +               } else {
> +                       /* terminate the data */
> +                       *ptr = 0x80;
> +                       ptr += (hash_pad_len - sizeof(u64));
> +
> +                       /* add the size at the end as required per alg */
> +                       if (auth_alg == HASH_ALG_MD5)
> +                               *(u64 *)ptr = cpu_to_le64((u64)total_sent * 8);
> +                       else            /* SHA1, SHA2-224, SHA2-256 */
> +                               *(u64 *)ptr = cpu_to_be64((u64)total_sent * 8);
> +                       ptr += sizeof(u64);
> +               }
> +       }
> +
> +       /* pad to a 4-byte alignment for STAT */
> +       if (status_padding > 0) {
> +               flow_log("  STAT: padding to 4 byte alignment: %u bytes\n",
> +                        status_padding);
> +
> +               memset(ptr, 0, status_padding);
> +               ptr += status_padding;
> +       }
> +}
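
To make form 2 (the hash pad) concrete, here is a minimal userspace sketch of the layout: 0x80 terminator, zero fill, then the message length in bits in the last 8 bytes. The name hash_pad_sketch() and the is_md5 flag are mine and not part of the driver, and it assumes the caller has already computed pad_len, as spum_hash_pad_len() appears to do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the hash branch of spum_request_pad():
 * writes 0x80, zero fill, then the message length in bits as a 64-bit
 * value (little endian for MD5, big endian for SHA-1/SHA-2).
 */
static void hash_pad_sketch(uint8_t *pad, uint32_t pad_len,
                            uint64_t total_bytes, int is_md5)
{
        uint64_t bits = total_bytes * 8;
        uint8_t *len_field;
        int i;

        /* room for the terminator byte plus the 8-byte length */
        assert(pad_len >= 1 + sizeof(uint64_t));

        memset(pad, 0, pad_len);
        pad[0] = 0x80;                          /* data terminator */
        len_field = pad + pad_len - sizeof(uint64_t);

        /* store byte by byte so the destination needs no alignment */
        for (i = 0; i < 8; i++) {
                if (is_md5)
                        len_field[i] = (uint8_t)(bits >> (8 * i));
                else
                        len_field[i] = (uint8_t)(bits >> (8 * (7 - i)));
        }
}
```

The byte-wise store is a choice of this sketch; it sidesteps any question about the alignment of ptr at the point of the u64 store in the quoted code.
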
> +
> +/**
> + * spum_xts_tweak_in_payload() - Indicate that SPUM DOES place the XTS tweak
> + * field in the packet payload (rather than using IV)
> + *
> + * Return: 1
> + */
> +u8 spum_xts_tweak_in_payload(void)
> +{
> +       return 1;
> +}
> +
> +/**
> + * spum_tx_status_len() - Return the length of the STATUS field in a SPU
> + * request message.
> + *
> + * Return: Length of STATUS field in bytes.
> + */
> +u8 spum_tx_status_len(void)
> +{
> +       return SPU_TX_STATUS_LEN;
> +}
> +
> +/**
> + * spum_rx_status_len() - Return the length of the STATUS field in a SPU
> + * response message.
> + *
> + * Return: Length of STATUS field in bytes.
> + */
> +u8 spum_rx_status_len(void)
> +{
> +       return SPU_RX_STATUS_LEN;
> +}
> +
> +/**
> + * spum_status_process() - Process the status from a SPU response message.
> + * @statp:  start of STATUS word
> + * Return:
> + *   0 - if status is good and response should be processed
> + *   !0 - status indicates an error and response is invalid
> + */
> +int spum_status_process(u8 *statp)
> +{
> +       u32 status;
> +
> +       status = __be32_to_cpu(*(__be32 *)statp);
> +       flow_log("SPU response STATUS %#010x\n", status);
> +       if (status & SPU_STATUS_ERROR_FLAG) {
> +               pr_err("%s() Warning: Error result from SPU: %#010x\n",
> +                      __func__, status);
> +               if (status & SPU_STATUS_INVALID_ICV)
> +                       return SPU_INVALID_ICV;
> +               return -EBADMSG;
> +       }
> +       return 0;
> +}
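
For reference, the decode above can be sketched in userspace as follows. The two flag values are placeholders I picked for illustration, not the real SPU_STATUS_* definitions, and status_sketch() is a hypothetical name. The sketch keeps the same three-way result as the quoted function: 0, a positive "bad ICV" code, or a negative error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the real SPU-M values */
#define SKETCH_STATUS_ERROR_FLAG   0x00020000u
#define SKETCH_STATUS_INVALID_ICV  0x00000100u

enum sketch_result {
        SKETCH_ERROR = -1,
        SKETCH_OK = 0,
        SKETCH_BAD_ICV = 1
};

/* Read a big-endian 32-bit word without assuming alignment */
static uint32_t load_be32(const uint8_t *p)
{
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static int status_sketch(const uint8_t *statp)
{
        uint32_t status;

        assert(statp != NULL);
        status = load_be32(statp);

        if (!(status & SKETCH_STATUS_ERROR_FLAG))
                return SKETCH_OK;
        if (status & SKETCH_STATUS_INVALID_ICV)
                return SKETCH_BAD_ICV;
        return SKETCH_ERROR;
}
```

Since the positive ICV code and the negative errno share one return value, callers have to test for both rather than only for "< 0".
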
> +
> +/**
> + * spum_ccm_update_iv() - Update the IV as per the requirements for CCM mode.
> + *
> + * @digestsize:                Digest size of this request
> + * @cipher_parms:      (pointer to) cipher parameters, includes IV buf & IV len
> + * @assoclen:          Length of AAD data
> + * @chunksize:         length of input data to be sent in this req
> + * @is_encrypt:                true if this is an output/encrypt operation
> + * @is_esp:            true if this is an ESP / RFC4309 operation
> + *
> + */
> +void spum_ccm_update_iv(unsigned int digestsize,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       unsigned int assoclen,
> +                       unsigned int chunksize,
> +                       bool is_encrypt,
> +                       bool is_esp)
> +{
> +       u8 L;           /* L from CCM algorithm, size of the length field */
> +       u8 mprime;      /* M' from CCM algo, (M - 2) / 2, where M=authsize */
> +       u8 adata;
> +
> +       if (cipher_parms->iv_len != CCM_AES_IV_SIZE) {
> +               pr_err("%s(): Invalid IV len %d for CCM mode, should be %d\n",
> +                      __func__, cipher_parms->iv_len, CCM_AES_IV_SIZE);
> +               return;
> +       }
> +
> +       /*
> +        * IV needs to be formatted as follows:
> +        *
> +        * |          Byte 0               | Bytes 1 - N | Bytes (N+1) - 15 |
> +        * | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Bits 7 - 0  |    Bits 7 - 0    |
> +        * | 0 |Ad?|(M - 2) / 2|   L - 1   |    Nonce    | Plaintext Length |
> +        *
> +        * Ad? = 1 if AAD present, 0 if not present
> +        * M = size of auth field, 8, 12, or 16 bytes (SPU-M) -or-
> +        *                         4, 6, 8, 10, 12, 14, 16 bytes (SPU2)
> +        * L = Size of Plaintext Length field; Nonce size = 15 - L
> +        *
> +        * It appears that the crypto API already expects the L-1 portion
> +        * to be set in the first byte of the IV, which implicitly determines
> +        * the nonce size, and also fills in the nonce.  But the other bits
> +        * in byte 0 as well as the plaintext length need to be filled in.
> +        *
> +        * In rfc4309/esp mode, L is not already in the supplied IV and
> +        * we need to fill it in, as well as move the IV data to be after
> +        * the salt
> +        */
> +       if (is_esp) {
> +               L = CCM_ESP_L_VALUE;    /* RFC4309 has fixed L */
> +       } else {
> +               /* byte 0 holds L' = L - 1, so L = L' + 1 */
> +               L = ((cipher_parms->iv_buf[0] & CCM_B0_L_PRIME) >>
> +                     CCM_B0_L_PRIME_SHIFT) + 1;
> +       }
> +
> +       mprime = (digestsize - 2) >> 1;  /* M' = (M - 2) / 2 */
> +       adata = (assoclen > 0);  /* adata = 1 if any associated data */
> +
> +       cipher_parms->iv_buf[0] = (adata << CCM_B0_ADATA_SHIFT) |
> +                                 (mprime << CCM_B0_M_PRIME_SHIFT) |
> +                                 ((L - 1) << CCM_B0_L_PRIME_SHIFT);
> +
> +       /* Nonce is already filled in by crypto API, and is 15 - L bytes */
> +
> +       /* Don't include digest in plaintext size when decrypting */
> +       if (!is_encrypt)
> +               chunksize -= digestsize;
> +
> +       /* Fill in length of plaintext, formatted to be L bytes long */
> +       format_value_ccm(chunksize, &cipher_parms->iv_buf[15 - L + 1], L);
> +}
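
A small userspace sketch of the B_0 formatting described in the comment may help when checking this by hand. The shifts match the CCM_B0_* definitions in spu.h; ccm_b0_sketch() and its parameters are hypothetical. It assumes the length field is stored big endian in the last L bytes, which is my reading of what format_value_ccm() does.

```c
#include <assert.h>
#include <stdint.h>

#define B0_ADATA_SHIFT    6     /* same shifts as CCM_B0_* in spu.h */
#define B0_M_PRIME_SHIFT  3
#define B0_L_PRIME_SHIFT  0

/*
 * Hypothetical helper: fill byte 0 and the trailing L-byte length field
 * of a 16-byte CCM B_0 block. The nonce in bytes 1..(15 - L) is left
 * untouched, as in the quoted function.
 */
static void ccm_b0_sketch(uint8_t b0[16], unsigned int authsize,
                          unsigned int L, int has_aad, uint32_t msg_len)
{
        unsigned int i;

        assert(L >= 2 && L <= 8);
        assert(authsize >= 4 && authsize <= 16 && (authsize % 2) == 0);

        b0[0] = (uint8_t)(((has_aad ? 1u : 0u) << B0_ADATA_SHIFT) |
                          (((authsize - 2) / 2) << B0_M_PRIME_SHIFT) |
                          ((L - 1) << B0_L_PRIME_SHIFT));

        /* message length, most significant byte first, in the last L bytes */
        for (i = 0; i < L; i++) {
                unsigned int shift = 8 * i;

                b0[15 - i] = shift < 32 ? (uint8_t)(msg_len >> shift) : 0;
        }
}
```

With a 16-byte tag, L = 4 and AAD present, byte 0 comes out as 0x40 | 0x38 | 0x03 = 0x7b.
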
> +
> +/**
> + * spum_wordalign_padlen() - Given the length of a data field, determine the
> + * padding required to align the data following this field on a 4-byte boundary.
> + * @data_size: length of data field in bytes
> + *
> + * Return: length of status field padding, in bytes
> + */
> +u32 spum_wordalign_padlen(u32 data_size)
> +{
> +       return ((data_size + 3) & ~3) - data_size;
> +}
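
The arithmetic here is easy to check in userspace. The sketch below copies the expression from the patch and compares it with an equivalent form (the low two bits of the negated size); the function names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as spum_wordalign_padlen() in the patch */
static uint32_t padlen_roundup(uint32_t data_size)
{
        return ((data_size + 3) & ~3u) - data_size;
}

/* Equivalent form: the low two bits of the negated size */
static uint32_t padlen_negate(uint32_t data_size)
{
        return (0u - data_size) & 3u;
}

/* Check that both forms agree for sizes 0..63 */
static void padlen_selfcheck(void)
{
        uint32_t n;

        for (n = 0; n < 64; n++)
                assert(padlen_roundup(n) == padlen_negate(n));
}
```

Either form returns a value in the range 0..3, so the pad never reaches the SPU_STAT_PAD_MAX of 4 defined in spu.h.
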
> diff --git a/drivers/crypto/bcm/spu.h b/drivers/crypto/bcm/spu.h
> new file mode 100644
> index 0000000..e2eb925
> --- /dev/null
> +++ b/drivers/crypto/bcm/spu.h
> @@ -0,0 +1,288 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +/*
> + * This file contains the definition of SPU messages. There are currently two
> + * SPU message formats: SPU-M and SPU2. The hardware uses different values to
> + * identify the same things in SPU-M vs SPU2. So this file defines values that
> + * are hardware independent. Software can use these values for any version of
> + * SPU hardware. These values are used in APIs in spu.c. Functions internal to
> + * spu.c and spu2.c convert these to hardware-specific values.
> + */
> +
> +#ifndef _SPU_H
> +#define _SPU_H
> +
> +#include <linux/types.h>
> +#include <linux/scatterlist.h>
> +#include <crypto/sha.h>
> +
> +enum spu_cipher_alg {
> +       CIPHER_ALG_NONE = 0x0,
> +       CIPHER_ALG_RC4 = 0x1,
> +       CIPHER_ALG_DES = 0x2,
> +       CIPHER_ALG_3DES = 0x3,
> +       CIPHER_ALG_AES = 0x4,
> +       CIPHER_ALG_LAST = 0x5
> +};
> +
> +enum spu_cipher_mode {
> +       CIPHER_MODE_NONE = 0x0,
> +       CIPHER_MODE_ECB = 0x0,
> +       CIPHER_MODE_CBC = 0x1,
> +       CIPHER_MODE_OFB = 0x2,
> +       CIPHER_MODE_CFB = 0x3,
> +       CIPHER_MODE_CTR = 0x4,
> +       CIPHER_MODE_CCM = 0x5,
> +       CIPHER_MODE_GCM = 0x6,
> +       CIPHER_MODE_XTS = 0x7,
> +       CIPHER_MODE_LAST = 0x8
> +};
> +
> +enum spu_cipher_type {
> +       CIPHER_TYPE_NONE = 0x0,
> +       CIPHER_TYPE_DES = 0x0,
> +       CIPHER_TYPE_3DES = 0x0,
> +       CIPHER_TYPE_INIT = 0x0, /* used for ARC4 */
> +       CIPHER_TYPE_AES128 = 0x0,
> +       CIPHER_TYPE_AES192 = 0x1,
> +       CIPHER_TYPE_UPDT = 0x1, /* used for ARC4 */
> +       CIPHER_TYPE_AES256 = 0x2,
> +};
> +
> +enum hash_alg {
> +       HASH_ALG_NONE = 0x0,
> +       HASH_ALG_MD5 = 0x1,
> +       HASH_ALG_SHA1 = 0x2,
> +       HASH_ALG_SHA224 = 0x3,
> +       HASH_ALG_SHA256 = 0x4,
> +       HASH_ALG_AES = 0x5,
> +       HASH_ALG_SHA384 = 0x6,
> +       HASH_ALG_SHA512 = 0x7,
> +       /* Keep SHA3 algorithms at the end always */
> +       HASH_ALG_SHA3_224 = 0x8,
> +       HASH_ALG_SHA3_256 = 0x9,
> +       HASH_ALG_SHA3_384 = 0xa,
> +       HASH_ALG_SHA3_512 = 0xb,
> +       HASH_ALG_LAST
> +};
> +
> +enum hash_mode {
> +       HASH_MODE_NONE = 0x0,
> +       HASH_MODE_HASH = 0x0,
> +       HASH_MODE_XCBC = 0x0,
> +       HASH_MODE_CMAC = 0x1,
> +       HASH_MODE_CTXT = 0x1,
> +       HASH_MODE_HMAC = 0x2,
> +       HASH_MODE_RABIN = 0x4,
> +       HASH_MODE_FHMAC = 0x6,
> +       HASH_MODE_CCM = 0x5,
> +       HASH_MODE_GCM = 0x6,
> +};
> +
> +enum hash_type {
> +       HASH_TYPE_NONE = 0x0,
> +       HASH_TYPE_FULL = 0x0,
> +       HASH_TYPE_INIT = 0x1,
> +       HASH_TYPE_UPDT = 0x2,
> +       HASH_TYPE_FIN = 0x3,
> +       HASH_TYPE_AES128 = 0x0,
> +       HASH_TYPE_AES192 = 0x1,
> +       HASH_TYPE_AES256 = 0x2
> +};
> +
> +enum aead_type {
> +       AES_CCM,
> +       AES_GCM,
> +       AUTHENC,
> +       AEAD_TYPE_LAST
> +};
> +
> +extern char *hash_alg_name[HASH_ALG_LAST];
> +extern char *aead_alg_name[AEAD_TYPE_LAST];
> +
> +struct spu_request_opts {
> +       bool is_inbound;
> +       bool auth_first;
> +       bool is_aead;
> +       bool is_esp;
> +       bool bd_suppress;
> +       bool is_rfc4543;
> +};
> +
> +struct spu_cipher_parms {
> +       enum spu_cipher_alg  alg;
> +       enum spu_cipher_mode mode;
> +       enum spu_cipher_type type;
> +       u8                  *key_buf;
> +       u16                  key_len;
> +       /* iv_buf and iv_len include salt, if applicable */
> +       u8                  *iv_buf;
> +       u16                  iv_len;
> +};
> +
> +struct spu_hash_parms {
> +       enum hash_alg  alg;
> +       enum hash_mode mode;
> +       enum hash_type type;
> +       u8             digestsize;
> +       u8            *key_buf;
> +       u16            key_len;
> +       u16            prebuf_len;
> +       u16            hmac_offset;
> +       /* length of hash pad. signed, needs to handle roll-overs */
> +       int            pad_len;
> +};
> +
> +struct spu_aead_parms {
> +       u32 assoc_size;
> +       u16 iv_len;      /* length of IV field between assoc data and data */
> +       u8  aad_pad_len; /* For AES GCM/CCM, length of padding after AAD */
> +       u8  data_pad_len;/* For AES GCM/CCM, length of padding after data */
> +       bool return_iv;  /* True if SPU should return an IV */
> +       u32 ret_iv_len;  /* Length in bytes of returned IV */
> +       u32 ret_iv_off;  /* Offset into full IV if partial IV returned */
> +};
> +
> +/************** SPU sizes ***************/
> +
> +#define SPU_RX_STATUS_LEN  4
> +
> +/* Max length of padding for 4-byte alignment of STATUS field */
> +#define SPU_STAT_PAD_MAX  4
> +
> +/* Max length of pad fragment. 4 is for 4-byte alignment of STATUS field */
> +#define SPU_PAD_LEN_MAX (SPU_GCM_CCM_ALIGN + MAX_HASH_BLOCK_SIZE + \
> +                        SPU_STAT_PAD_MAX)
> +
> +/* GCM and CCM require 16-byte alignment */
> +#define SPU_GCM_CCM_ALIGN 16
> +
> +/* Length of SUPDT field in SPU response message for RC4 */
> +#define SPU_SUPDT_LEN 260
> +
> +/* SPU status error codes. These are used as common error codes across all
> + * SPU variants.
> + */
> +#define SPU_INVALID_ICV  1
> +
> +/* Indicates no limit to the length of the payload in a SPU message */
> +#define SPU_MAX_PAYLOAD_INF  0xFFFFFFFF
> +
> +/* Size of XTS tweak ("i" parameter), in bytes */
> +#define SPU_XTS_TWEAK_SIZE 16
> +
> +/* CCM B_0 field definitions, common for SPU-M and SPU2 */
> +#define CCM_B0_ADATA           0x40
> +#define CCM_B0_ADATA_SHIFT        6
> +#define CCM_B0_M_PRIME         0x38
> +#define CCM_B0_M_PRIME_SHIFT      3
> +#define CCM_B0_L_PRIME         0x07
> +#define CCM_B0_L_PRIME_SHIFT      0
> +#define CCM_ESP_L_VALUE                   4
> +
> +/**
> + * spu_req_incl_icv() - Return true if SPU request message should include the
> + * ICV as a separate buffer.
> + * @cipher_mode:  the cipher mode being requested
> + * @is_encrypt:   true if encrypting. false if decrypting.
> + *
> + * Return:  true if ICV to be included as separate buffer
> + */
> +static __always_inline  bool spu_req_incl_icv(enum spu_cipher_mode cipher_mode,
> +                                             bool is_encrypt)
> +{
> +       if ((cipher_mode == CIPHER_MODE_GCM) && !is_encrypt)
> +               return true;
> +       if ((cipher_mode == CIPHER_MODE_CCM) && !is_encrypt)
> +               return true;
> +
> +       return false;
> +}
> +
> +static __always_inline u32 spu_real_db_size(u32 assoc_size,
> +                                           u32 aead_iv_buf_len,
> +                                           u32 prebuf_len,
> +                                           u32 data_size,
> +                                           u32 aad_pad_len,
> +                                           u32 gcm_pad_len,
> +                                           u32 hash_pad_len)
> +{
> +       return assoc_size + aead_iv_buf_len + prebuf_len + data_size +
> +           aad_pad_len + gcm_pad_len + hash_pad_len;
> +}
> +
> +/************** SPU Functions Prototypes **************/
> +
> +void spum_dump_msg_hdr(u8 *buf, unsigned int buf_len);
> +
> +u32 spum_ns2_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                            enum spu_cipher_mode cipher_mode,
> +                            unsigned int blocksize);
> +u32 spum_nsp_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                            enum spu_cipher_mode cipher_mode,
> +                            unsigned int blocksize);
> +u32 spum_payload_length(u8 *spu_hdr);
> +u16 spum_response_hdr_len(u16 auth_key_len, u16 enc_key_len, bool is_hash);
> +u16 spum_hash_pad_len(enum hash_alg hash_alg, enum hash_mode hash_mode,
> +                     u32 chunksize, u16 hash_block_size);
> +u32 spum_gcm_ccm_pad_len(enum spu_cipher_mode cipher_mode,
> +                        unsigned int data_size);
> +u32 spum_assoc_resp_len(enum spu_cipher_mode cipher_mode,
> +                       unsigned int assoc_len, unsigned int iv_len,
> +                       bool is_encrypt);
> +u8 spum_aead_ivlen(enum spu_cipher_mode cipher_mode, u16 iv_len);
> +bool spu_req_incl_icv(enum spu_cipher_mode cipher_mode, bool is_encrypt);
> +enum hash_type spum_hash_type(u32 src_sent);
> +u32 spum_digest_size(u32 alg_digest_size, enum hash_alg alg,
> +                    enum hash_type htype);
> +
> +u32 spum_create_request(u8 *spu_hdr,
> +                       struct spu_request_opts *req_opts,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       struct spu_hash_parms *hash_parms,
> +                       struct spu_aead_parms *aead_parms,
> +                       unsigned int data_size);
> +
> +u16 spum_cipher_req_init(u8 *spu_hdr, struct spu_cipher_parms *cipher_parms);
> +
> +void spum_cipher_req_finish(u8 *spu_hdr,
> +                           u16 spu_req_hdr_len,
> +                           unsigned int is_inbound,
> +                           struct spu_cipher_parms *cipher_parms,
> +                           bool update_key,
> +                           unsigned int data_size);
> +
> +void spum_request_pad(u8 *pad_start,
> +                     u32 gcm_padding,
> +                     u32 hash_pad_len,
> +                     enum hash_alg auth_alg,
> +                     enum hash_mode auth_mode,
> +                     unsigned int total_sent, u32 status_padding);
> +
> +u8 spum_xts_tweak_in_payload(void);
> +u8 spum_tx_status_len(void);
> +u8 spum_rx_status_len(void);
> +int spum_status_process(u8 *statp);
> +
> +void spum_ccm_update_iv(unsigned int digestsize,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       unsigned int assoclen,
> +                       unsigned int chunksize,
> +                       bool is_encrypt,
> +                       bool is_esp);
> +u32 spum_wordalign_padlen(u32 data_size);
> +#endif
> diff --git a/drivers/crypto/bcm/spu2.c b/drivers/crypto/bcm/spu2.c
> new file mode 100644
> index 0000000..d7b44b6
> --- /dev/null
> +++ b/drivers/crypto/bcm/spu2.c
> @@ -0,0 +1,1402 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +/*
> + * This file works with the SPU2 version of the SPU. SPU2 has different message
> + * formats than the previous version of the SPU. All SPU message format
> + * differences should be hidden in the spux.c,h files.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/string.h>
> +
> +#include "util.h"
> +#include "spu.h"
> +#include "spu2.h"
> +
> +#define SPU2_TX_STATUS_LEN  0  /* SPU2 has no STATUS in input packet */
> +
> +/*
> + * Controlled by pkt_stat_cnt field in CRYPTO_SS_SPU0_CORE_SPU2_CONTROL0
> + * register. Defaults to 2.
> + */
> +#define SPU2_RX_STATUS_LEN  2
> +
> +enum spu2_proto_sel {
> +       SPU2_PROTO_RESV = 0,
> +       SPU2_MACSEC_SECTAG8_ECB = 1,
> +       SPU2_MACSEC_SECTAG8_SCB = 2,
> +       SPU2_MACSEC_SECTAG16 = 3,
> +       SPU2_MACSEC_SECTAG16_8_XPN = 4,
> +       SPU2_IPSEC = 5,
> +       SPU2_IPSEC_ESN = 6,
> +       SPU2_TLS_CIPHER = 7,
> +       SPU2_TLS_AEAD = 8,
> +       SPU2_DTLS_CIPHER = 9,
> +       SPU2_DTLS_AEAD = 10
> +};
> +
> +char *spu2_cipher_type_names[] = { "None", "AES128", "AES192", "AES256",
> +       "DES", "3DES"
> +};
> +
> +char *spu2_cipher_mode_names[] = { "ECB", "CBC", "CTR", "CFB", "OFB", "XTS",
> +       "CCM", "GCM"
> +};
> +
> +char *spu2_hash_type_names[] = { "None", "AES128", "AES192", "AES256",
> +       "Reserved", "Reserved", "MD5", "SHA1", "SHA224", "SHA256", "SHA384",
> +       "SHA512", "SHA512/224", "SHA512/256", "SHA3-224", "SHA3-256",
> +       "SHA3-384", "SHA3-512"
> +};
> +
> +char *spu2_hash_mode_names[] = { "CMAC", "CBC-MAC", "XCBC-MAC", "HMAC",
> +       "Rabin", "CCM", "GCM", "Reserved"
> +};
> +
> +static char *spu2_ciph_type_name(enum spu2_cipher_type cipher_type)
> +{
> +       if (cipher_type >= SPU2_CIPHER_TYPE_LAST)
> +               return "Reserved";
> +       return spu2_cipher_type_names[cipher_type];
> +}
> +
> +static char *spu2_ciph_mode_name(enum spu2_cipher_mode cipher_mode)
> +{
> +       if (cipher_mode >= SPU2_CIPHER_MODE_LAST)
> +               return "Reserved";
> +       return spu2_cipher_mode_names[cipher_mode];
> +}
> +
> +static char *spu2_hash_type_name(enum spu2_hash_type hash_type)
> +{
> +       if (hash_type >= SPU2_HASH_TYPE_LAST)
> +               return "Reserved";
> +       return spu2_hash_type_names[hash_type];
> +}
> +
> +static char *spu2_hash_mode_name(enum spu2_hash_mode hash_mode)
> +{
> +       if (hash_mode >= SPU2_HASH_MODE_LAST)
> +               return "Reserved";
> +       return spu2_hash_mode_names[hash_mode];
> +}
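
These four helpers stay safe only while each SPU2_*_LAST value matches the length of its name table. One way to tie the two together is sketched below; SKETCH_MODE_LAST, SKETCH_ARRAY_SIZE and mode_name_sketch() are illustrative names rather than anything in the patch (the kernel has its own ARRAY_SIZE() macro).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Stand-in for SPU2_CIPHER_MODE_LAST; the value 8 is assumed here */
#define SKETCH_MODE_LAST 8

/* Names copied from spu2_cipher_mode_names[] in the patch */
static const char * const mode_names[] = {
        "ECB", "CBC", "CTR", "CFB", "OFB", "XTS", "CCM", "GCM"
};

/* Fail the build if the table and the enum bound drift apart */
static_assert(SKETCH_ARRAY_SIZE(mode_names) == SKETCH_MODE_LAST,
              "mode_names[] must have one entry per mode");

/* Bound the lookup by the table itself */
static const char *mode_name_sketch(unsigned int mode)
{
        if (mode >= SKETCH_ARRAY_SIZE(mode_names))
                return "Reserved";
        return mode_names[mode];
}
```

Declaring the table as const char * const also lets it live in read-only data, which the quoted char * arrays do not.
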
> +
> +/*
> + * Convert from a software cipher mode value to the corresponding value
> + * for SPU2.
> + */
> +static int spu2_cipher_mode_xlate(enum spu_cipher_mode cipher_mode,
> +                                 enum spu2_cipher_mode *spu2_mode)
> +{
> +       switch (cipher_mode) {
> +       case CIPHER_MODE_ECB:
> +               *spu2_mode = SPU2_CIPHER_MODE_ECB;
> +               break;
> +       case CIPHER_MODE_CBC:
> +               *spu2_mode = SPU2_CIPHER_MODE_CBC;
> +               break;
> +       case CIPHER_MODE_OFB:
> +               *spu2_mode = SPU2_CIPHER_MODE_OFB;
> +               break;
> +       case CIPHER_MODE_CFB:
> +               *spu2_mode = SPU2_CIPHER_MODE_CFB;
> +               break;
> +       case CIPHER_MODE_CTR:
> +               *spu2_mode = SPU2_CIPHER_MODE_CTR;
> +               break;
> +       case CIPHER_MODE_CCM:
> +               *spu2_mode = SPU2_CIPHER_MODE_CCM;
> +               break;
> +       case CIPHER_MODE_GCM:
> +               *spu2_mode = SPU2_CIPHER_MODE_GCM;
> +               break;
> +       case CIPHER_MODE_XTS:
> +               *spu2_mode = SPU2_CIPHER_MODE_XTS;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +/**
> + * spu2_cipher_xlate() - Convert a cipher {alg/mode/type} triple to a SPU2
> + * cipher type and mode.
> + * @cipher_alg:  [in]  cipher algorithm value from software enumeration
> + * @cipher_mode: [in]  cipher mode value from software enumeration
> + * @cipher_type: [in]  cipher type value from software enumeration
> + * @spu2_type:   [out] cipher type value used by spu2 hardware
> + * @spu2_mode:   [out] cipher mode value used by spu2 hardware
> + *
> + * Return:  0 if successful
> + */
> +static int spu2_cipher_xlate(enum spu_cipher_alg cipher_alg,
> +                            enum spu_cipher_mode cipher_mode,
> +                            enum spu_cipher_type cipher_type,
> +                            enum spu2_cipher_type *spu2_type,
> +                            enum spu2_cipher_mode *spu2_mode)
> +{
> +       int err;
> +
> +       err = spu2_cipher_mode_xlate(cipher_mode, spu2_mode);
> +       if (err) {
> +               flow_log("Invalid cipher mode %d\n", cipher_mode);
> +               return err;
> +       }
> +
> +       switch (cipher_alg) {
> +       case CIPHER_ALG_NONE:
> +               *spu2_type = SPU2_CIPHER_TYPE_NONE;
> +               break;
> +       case CIPHER_ALG_RC4:
> +               /* SPU2 does not support RC4 */
> +               err = -EINVAL;
> +               *spu2_type = SPU2_CIPHER_TYPE_NONE;
> +               break;
> +       case CIPHER_ALG_DES:
> +               *spu2_type = SPU2_CIPHER_TYPE_DES;
> +               break;
> +       case CIPHER_ALG_3DES:
> +               *spu2_type = SPU2_CIPHER_TYPE_3DES;
> +               break;
> +       case CIPHER_ALG_AES:
> +               switch (cipher_type) {
> +               case CIPHER_TYPE_AES128:
> +                       *spu2_type = SPU2_CIPHER_TYPE_AES128;
> +                       break;
> +               case CIPHER_TYPE_AES192:
> +                       *spu2_type = SPU2_CIPHER_TYPE_AES192;
> +                       break;
> +               case CIPHER_TYPE_AES256:
> +                       *spu2_type = SPU2_CIPHER_TYPE_AES256;
> +                       break;
> +               default:
> +                       err = -EINVAL;
> +               }
> +               break;
> +       case CIPHER_ALG_LAST:
> +       default:
> +               err = -EINVAL;
> +               break;
> +       }
> +
> +       if (err)
> +               flow_log("Invalid cipher alg %d or type %d\n",
> +                        cipher_alg, cipher_type);
> +       return err;
> +}
> +
> +/*
> + * Convert from a software hash mode value to the corresponding value
> + * for SPU2. Note that HASH_MODE_NONE and HASH_MODE_XCBC have the same value.
> + */
> +static int spu2_hash_mode_xlate(enum hash_mode hash_mode,
> +                               enum spu2_hash_mode *spu2_mode)
> +{
> +       switch (hash_mode) {
> +       case HASH_MODE_XCBC:
> +               *spu2_mode = SPU2_HASH_MODE_XCBC_MAC;
> +               break;
> +       case HASH_MODE_CMAC:
> +               *spu2_mode = SPU2_HASH_MODE_CMAC;
> +               break;
> +       case HASH_MODE_HMAC:
> +               *spu2_mode = SPU2_HASH_MODE_HMAC;
> +               break;
> +       case HASH_MODE_CCM:
> +               *spu2_mode = SPU2_HASH_MODE_CCM;
> +               break;
> +       case HASH_MODE_GCM:
> +               *spu2_mode = SPU2_HASH_MODE_GCM;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +/**
> + * spu2_hash_xlate() - Convert a hash {alg/mode/type} triple to a SPU2 hash type
> + * and mode.
> + * @hash_alg:  [in] hash algorithm value from software enumeration
> + * @hash_mode: [in] hash mode value from software enumeration
> + * @hash_type: [in] hash type value from software enumeration
> + * @ciph_type: [in] cipher type value from software enumeration
> + * @spu2_type: [out] hash type value used by SPU2 hardware
> + * @spu2_mode: [out] hash mode value used by SPU2 hardware
> + *
> + * Return:  0 if successful
> + */
> +static int
> +spu2_hash_xlate(enum hash_alg hash_alg, enum hash_mode hash_mode,
> +               enum hash_type hash_type, enum spu_cipher_type ciph_type,
> +               enum spu2_hash_type *spu2_type, enum spu2_hash_mode *spu2_mode)
> +{
> +       int err;
> +
> +       err = spu2_hash_mode_xlate(hash_mode, spu2_mode);
> +       if (err) {
> +               flow_log("Invalid hash mode %d\n", hash_mode);
> +               return err;
> +       }
> +
> +       switch (hash_alg) {
> +       case HASH_ALG_NONE:
> +               *spu2_type = SPU2_HASH_TYPE_NONE;
> +               break;
> +       case HASH_ALG_MD5:
> +               *spu2_type = SPU2_HASH_TYPE_MD5;
> +               break;
> +       case HASH_ALG_SHA1:
> +               *spu2_type = SPU2_HASH_TYPE_SHA1;
> +               break;
> +       case HASH_ALG_SHA224:
> +               *spu2_type = SPU2_HASH_TYPE_SHA224;
> +               break;
> +       case HASH_ALG_SHA256:
> +               *spu2_type = SPU2_HASH_TYPE_SHA256;
> +               break;
> +       case HASH_ALG_SHA384:
> +               *spu2_type = SPU2_HASH_TYPE_SHA384;
> +               break;
> +       case HASH_ALG_SHA512:
> +               *spu2_type = SPU2_HASH_TYPE_SHA512;
> +               break;
> +       case HASH_ALG_AES:
> +               switch (ciph_type) {
> +               case CIPHER_TYPE_AES128:
> +                       *spu2_type = SPU2_HASH_TYPE_AES128;
> +                       break;
> +               case CIPHER_TYPE_AES192:
> +                       *spu2_type = SPU2_HASH_TYPE_AES192;
> +                       break;
> +               case CIPHER_TYPE_AES256:
> +                       *spu2_type = SPU2_HASH_TYPE_AES256;
> +                       break;
> +               default:
> +                       err = -EINVAL;
> +               }
> +               break;
> +       case HASH_ALG_SHA3_224:
> +               *spu2_type = SPU2_HASH_TYPE_SHA3_224;
> +               break;
> +       case HASH_ALG_SHA3_256:
> +               *spu2_type = SPU2_HASH_TYPE_SHA3_256;
> +               break;
> +       case HASH_ALG_SHA3_384:
> +               *spu2_type = SPU2_HASH_TYPE_SHA3_384;
> +               break;
> +       case HASH_ALG_SHA3_512:
> +               *spu2_type = SPU2_HASH_TYPE_SHA3_512;
> +               break;
> +       case HASH_ALG_LAST:
> +       default:
> +               err = -EINVAL;
> +               break;
> +       }
> +
> +       if (err)
> +               flow_log("Invalid hash alg %d or type %d\n",
> +                        hash_alg, hash_type);
> +       return err;
> +}
> +
> +/* Dump FMD ctrl0. The ctrl0 input is in host byte order */
> +static void spu2_dump_fmd_ctrl0(u64 ctrl0)
> +{
> +       enum spu2_cipher_type ciph_type;
> +       enum spu2_cipher_mode ciph_mode;
> +       enum spu2_hash_type hash_type;
> +       enum spu2_hash_mode hash_mode;
> +       char *ciph_name;
> +       char *ciph_mode_name;
> +       char *hash_name;
> +       char *hash_mode_name;
> +       u8 cfb;
> +       u8 proto;
> +
> +       packet_log(" FMD CTRL0 %#16llx\n", ctrl0);
> +       if (ctrl0 & SPU2_CIPH_ENCRYPT_EN)
> +               packet_log("  encrypt\n");
> +       else
> +               packet_log("  decrypt\n");
> +
> +       ciph_type = (ctrl0 & SPU2_CIPH_TYPE) >> SPU2_CIPH_TYPE_SHIFT;
> +       ciph_name = spu2_ciph_type_name(ciph_type);
> +       packet_log("  Cipher type: %s\n", ciph_name);
> +
> +       if (ciph_type != SPU2_CIPHER_TYPE_NONE) {
> +               ciph_mode = (ctrl0 & SPU2_CIPH_MODE) >> SPU2_CIPH_MODE_SHIFT;
> +               ciph_mode_name = spu2_ciph_mode_name(ciph_mode);
> +               packet_log("  Cipher mode: %s\n", ciph_mode_name);
> +       }
> +
> +       cfb = (ctrl0 & SPU2_CFB_MASK) >> SPU2_CFB_MASK_SHIFT;
> +       packet_log("  CFB %#x\n", cfb);
> +
> +       proto = (ctrl0 & SPU2_PROTO_SEL) >> SPU2_PROTO_SEL_SHIFT;
> +       packet_log("  protocol %#x\n", proto);
> +
> +       if (ctrl0 & SPU2_HASH_FIRST)
> +               packet_log("  hash first\n");
> +       else
> +               packet_log("  cipher first\n");
> +
> +       if (ctrl0 & SPU2_CHK_TAG)
> +               packet_log("  check tag\n");
> +
> +       hash_type = (ctrl0 & SPU2_HASH_TYPE) >> SPU2_HASH_TYPE_SHIFT;
> +       hash_name = spu2_hash_type_name(hash_type);
> +       packet_log("  Hash type: %s\n", hash_name);
> +
> +       if (hash_type != SPU2_HASH_TYPE_NONE) {
> +               hash_mode = (ctrl0 & SPU2_HASH_MODE) >> SPU2_HASH_MODE_SHIFT;
> +               hash_mode_name = spu2_hash_mode_name(hash_mode);
> +               packet_log("  Hash mode: %s\n", hash_mode_name);
> +       }
> +
> +       if (ctrl0 & SPU2_CIPH_PAD_EN) {
> +               packet_log("  Cipher pad: %#2llx\n",
> +                          (ctrl0 & SPU2_CIPH_PAD) >> SPU2_CIPH_PAD_SHIFT);
> +       }
> +}
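
The dump routine repeats one pattern throughout: mask the control word, then shift the field down. A generic pair of helpers for that pattern could look like the sketch below. The SK_* layout is invented for illustration and does not reflect the real SPU2_* masks, and field_get()/field_put() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout only, not the real SPU2 FMD ctrl0 bit positions */
#define SK_ENCRYPT_EN        (1ULL << 0)
#define SK_CIPH_TYPE_SHIFT   4
#define SK_CIPH_TYPE         (0xFULL << SK_CIPH_TYPE_SHIFT)
#define SK_CIPH_MODE_SHIFT   8
#define SK_CIPH_MODE         (0xFULL << SK_CIPH_MODE_SHIFT)

/* Extract a field; shift must be the position of the mask's lowest set bit */
static unsigned int field_get(uint64_t word, uint64_t mask, unsigned int shift)
{
        assert(shift < 64 && ((mask >> shift) & 1));
        assert((mask & ((1ULL << shift) - 1)) == 0);
        return (unsigned int)((word & mask) >> shift);
}

/* Replace a field, leaving all other bits of the word unchanged */
static uint64_t field_put(uint64_t word, uint64_t mask, unsigned int shift,
                          unsigned int val)
{
        return (word & ~mask) | (((uint64_t)val << shift) & mask);
}
```

Keeping the mask and shift checks in one place catches a mismatched pair early, which is harder to spot when each field is open-coded.
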
> +
> +/* Dump FMD ctrl1. The ctrl1 input is in host byte order */
> +static void spu2_dump_fmd_ctrl1(u64 ctrl1)
> +{
> +       u8 hash_key_len;
> +       u8 ciph_key_len;
> +       u8 ret_iv_len;
> +       u8 iv_offset;
> +       u8 iv_len;
> +       u8 hash_tag_len;
> +       u8 ret_md;
> +
> +       packet_log(" FMD CTRL1 %#16llx\n", ctrl1);
> +       if (ctrl1 & SPU2_TAG_LOC)
> +               packet_log("  Tag after payload\n");
> +
> +       packet_log("  Msg includes ");
> +       if (ctrl1 & SPU2_HAS_FR_DATA)
> +               packet_log("FD ");
> +       if (ctrl1 & SPU2_HAS_AAD1)
> +               packet_log("AAD1 ");
> +       if (ctrl1 & SPU2_HAS_NAAD)
> +               packet_log("NAAD ");
> +       if (ctrl1 & SPU2_HAS_AAD2)
> +               packet_log("AAD2 ");
> +       if (ctrl1 & SPU2_HAS_ESN)
> +               packet_log("ESN ");
> +       packet_log("\n");
> +
> +       hash_key_len = (ctrl1 & SPU2_HASH_KEY_LEN) >> SPU2_HASH_KEY_LEN_SHIFT;
> +       packet_log("  Hash key len %u\n", hash_key_len);
> +
> +       ciph_key_len = (ctrl1 & SPU2_CIPH_KEY_LEN) >> SPU2_CIPH_KEY_LEN_SHIFT;
> +       packet_log("  Cipher key len %u\n", ciph_key_len);
> +
> +       if (ctrl1 & SPU2_GENIV)
> +               packet_log("  Generate IV\n");
> +
> +       if (ctrl1 & SPU2_HASH_IV)
> +               packet_log("  IV included in hash\n");
> +
> +       if (ctrl1 & SPU2_RET_IV)
> +               packet_log("  Return IV in output before payload\n");
> +
> +       ret_iv_len = (ctrl1 & SPU2_RET_IV_LEN) >> SPU2_RET_IV_LEN_SHIFT;
> +       packet_log("  Length of returned IV %u bytes\n",
> +                  ret_iv_len ? ret_iv_len : 16);
> +
> +       iv_offset = (ctrl1 & SPU2_IV_OFFSET) >> SPU2_IV_OFFSET_SHIFT;
> +       packet_log("  IV offset %u\n", iv_offset);
> +
> +       iv_len = (ctrl1 & SPU2_IV_LEN) >> SPU2_IV_LEN_SHIFT;
> +       packet_log("  Input IV len %u bytes\n", iv_len);
> +
> +       hash_tag_len = (ctrl1 & SPU2_HASH_TAG_LEN) >> SPU2_HASH_TAG_LEN_SHIFT;
> +       packet_log("  Hash tag length %u bytes\n", hash_tag_len);
> +
> +       packet_log("  Return ");
> +       ret_md = (ctrl1 & SPU2_RETURN_MD) >> SPU2_RETURN_MD_SHIFT;
> +       if (ret_md)
> +               packet_log("FMD ");
> +       if (ret_md == SPU2_RET_FMD_OMD)
> +               packet_log("OMD ");
> +       else if (ret_md == SPU2_RET_FMD_OMD_IV)
> +               packet_log("OMD IV ");
> +       if (ctrl1 & SPU2_RETURN_FD)
> +               packet_log("FD ");
> +       if (ctrl1 & SPU2_RETURN_AAD1)
> +               packet_log("AAD1 ");
> +       if (ctrl1 & SPU2_RETURN_NAAD)
> +               packet_log("NAAD ");
> +       if (ctrl1 & SPU2_RETURN_AAD2)
> +               packet_log("AAD2 ");
> +       if (ctrl1 & SPU2_RETURN_PAY)
> +               packet_log("Payload");
> +       packet_log("\n");
> +}
> +
> +/* Dump FMD ctrl2. The ctrl2 input is in host byte order */
> +static void spu2_dump_fmd_ctrl2(u64 ctrl2)
> +{
> +       packet_log(" FMD CTRL2 %#16llx\n", ctrl2);
> +
> +       packet_log("  AAD1 offset %llu length %llu bytes\n",
> +                  ctrl2 & SPU2_AAD1_OFFSET,
> +                  (ctrl2 & SPU2_AAD1_LEN) >> SPU2_AAD1_LEN_SHIFT);
> +       packet_log("  AAD2 offset %llu\n",
> +                  (ctrl2 & SPU2_AAD2_OFFSET) >> SPU2_AAD2_OFFSET_SHIFT);
> +       packet_log("  Payload offset %llu\n",
> +                  (ctrl2 & SPU2_PL_OFFSET) >> SPU2_PL_OFFSET_SHIFT);
> +}
> +
> +/* Dump FMD ctrl3. The ctrl3 input is in host byte order */
> +static void spu2_dump_fmd_ctrl3(u64 ctrl3)
> +{
> +       packet_log(" FMD CTRL3 %#16llx\n", ctrl3);
> +
> +       packet_log("  Payload length %llu bytes\n", ctrl3 & SPU2_PL_LEN);
> +       packet_log("  TLS length %llu bytes\n",
> +                  (ctrl3 & SPU2_TLS_LEN) >> SPU2_TLS_LEN_SHIFT);
> +}
> +
> +static void spu2_dump_fmd(struct SPU2_FMD *fmd)
> +{
> +       spu2_dump_fmd_ctrl0(le64_to_cpu(fmd->ctrl0));
> +       spu2_dump_fmd_ctrl1(le64_to_cpu(fmd->ctrl1));
> +       spu2_dump_fmd_ctrl2(le64_to_cpu(fmd->ctrl2));
> +       spu2_dump_fmd_ctrl3(le64_to_cpu(fmd->ctrl3));
> +}
> +
> +static void spu2_dump_omd(u8 *omd, u16 hash_key_len, u16 ciph_key_len,
> +                         u16 hash_iv_len, u16 ciph_iv_len)
> +{
> +       u8 *ptr = omd;
> +
> +       packet_log(" OMD:\n");
> +
> +       if (hash_key_len) {
> +               packet_log("  Hash Key Length %u bytes\n", hash_key_len);
> +               packet_dump("  KEY: ", ptr, hash_key_len);
> +               ptr += hash_key_len;
> +       }
> +
> +       if (ciph_key_len) {
> +               packet_log("  Cipher Key Length %u bytes\n", ciph_key_len);
> +               packet_dump("  KEY: ", ptr, ciph_key_len);
> +               ptr += ciph_key_len;
> +       }
> +
> +       if (hash_iv_len) {
> +               packet_log("  Hash IV Length %u bytes\n", hash_iv_len);
> +               packet_dump("  hash IV: ", ptr, hash_iv_len);
> +               ptr += hash_iv_len;
> +       }
> +
> +       if (ciph_iv_len) {
> +               packet_log("  Cipher IV Length %u bytes\n", ciph_iv_len);
> +               packet_dump("  cipher IV: ", ptr, ciph_iv_len);
> +       }
> +}
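
For reference, here is a minimal standalone sketch of the OMD walk in spu2_dump_omd(), assuming the sections appear in the order the dump prints them: hash key, cipher key, hash IV, cipher IV. The names omd_offsets and omd_layout are hypothetical and not part of the patch. The only point is that each section advances the cursor by its own length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not from the patch: offset of each OMD section */
struct omd_offsets {
	size_t hash_key;
	size_t ciph_key;
	size_t hash_iv;
	size_t ciph_iv;
	size_t total;
};

/* Walk the assumed OMD layout, advancing by the length of each section */
static struct omd_offsets omd_layout(uint16_t hash_key_len,
				     uint16_t ciph_key_len,
				     uint16_t hash_iv_len,
				     uint16_t ciph_iv_len)
{
	struct omd_offsets o;
	size_t pos = 0;

	o.hash_key = pos;
	pos += hash_key_len;
	o.ciph_key = pos;
	pos += ciph_key_len;
	o.hash_iv = pos;
	pos += hash_iv_len;	/* advance by the hash IV length, not a key length */
	o.ciph_iv = pos;
	pos += ciph_iv_len;
	o.total = pos;

	/* The total must match the omd_len sum used in the sanity check */
	assert(o.total == (size_t)hash_key_len + ciph_key_len +
			  hash_iv_len + ciph_iv_len);
	return o;
}
```

With a 20-byte hash key, a 16-byte cipher key, an 8-byte hash IV and a 16-byte cipher IV, the cipher IV would start at offset 44 and the OMD would be 60 bytes long.
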
> +
> +/* Dump a SPU2 header for debug */
> +void spu2_dump_msg_hdr(u8 *buf, unsigned int buf_len)
> +{
> +       struct SPU2_FMD *fmd = (struct SPU2_FMD *)buf;
> +       u8 *omd;
> +       u64 ctrl1;
> +       u16 hash_key_len;
> +       u16 ciph_key_len;
> +       u16 hash_iv_len;
> +       u16 ciph_iv_len;
> +       u16 omd_len;
> +
> +       packet_log("\n");
> +       packet_log("SPU2 message header %p len: %u\n", buf, buf_len);
> +
> +       spu2_dump_fmd(fmd);
> +       omd = (u8 *)(fmd + 1);
> +
> +       ctrl1 = le64_to_cpu(fmd->ctrl1);
> +       hash_key_len = (ctrl1 & SPU2_HASH_KEY_LEN) >> SPU2_HASH_KEY_LEN_SHIFT;
> +       ciph_key_len = (ctrl1 & SPU2_CIPH_KEY_LEN) >> SPU2_CIPH_KEY_LEN_SHIFT;
> +       hash_iv_len = 0;
> +       ciph_iv_len = (ctrl1 & SPU2_IV_LEN) >> SPU2_IV_LEN_SHIFT;
> +       spu2_dump_omd(omd, hash_key_len, ciph_key_len, hash_iv_len,
> +                     ciph_iv_len);
> +
> +       /* Double check sanity */
> +       omd_len = hash_key_len + ciph_key_len + hash_iv_len + ciph_iv_len;
> +       if (FMD_SIZE + omd_len != buf_len) {
> +               packet_log
> +                   (" Packet parsed incorrectly. buf_len %u, sum of MD %zu\n",
> +                    buf_len, FMD_SIZE + omd_len);
> +       }
> +       packet_log("\n");
> +}
> +
> +/**
> + * spu2_fmd_init() - At setkey time, initialize the fixed meta data for
> + * subsequent ablkcipher requests for this context.
> + * @fmd:               Start of FMD field to be written
> + * @spu2_type:         Cipher algorithm
> + * @spu2_mode:         Cipher mode
> + * @cipher_key_len:    Length of cipher key, in bytes
> + * @cipher_iv_len:     Length of cipher initialization vector, in bytes
> + *
> + * Return:  0 (success)
> + */
> +static int spu2_fmd_init(struct SPU2_FMD *fmd,
> +                        enum spu2_cipher_type spu2_type,
> +                        enum spu2_cipher_mode spu2_mode,
> +                        u32 cipher_key_len, u32 cipher_iv_len)
> +{
> +       u64 ctrl0;
> +       u64 ctrl1;
> +       u64 ctrl2;
> +       u64 ctrl3;
> +       u32 aad1_offset;
> +       u32 aad2_offset;
> +       u16 aad1_len = 0;
> +       u64 payload_offset;
> +
> +       ctrl0 = (spu2_type << SPU2_CIPH_TYPE_SHIFT) |
> +           (spu2_mode << SPU2_CIPH_MODE_SHIFT);
> +
> +       ctrl1 = (cipher_key_len << SPU2_CIPH_KEY_LEN_SHIFT) |
> +           ((u64)cipher_iv_len << SPU2_IV_LEN_SHIFT) |
> +           ((u64)SPU2_RET_FMD_ONLY << SPU2_RETURN_MD_SHIFT) | SPU2_RETURN_PAY;
> +
> +       /*
> +        * AAD1 offset is from start of FD. FD length is always 0 for this
> +        * driver. So AAD1_offset is always 0.
> +        */
> +       aad1_offset = 0;
> +       aad2_offset = aad1_offset;
> +       payload_offset = 0;
> +       ctrl2 = aad1_offset |
> +           (aad1_len << SPU2_AAD1_LEN_SHIFT) |
> +           (aad2_offset << SPU2_AAD2_OFFSET_SHIFT) |
> +           (payload_offset << SPU2_PL_OFFSET_SHIFT);
> +
> +       ctrl3 = 0;
> +
> +       fmd->ctrl0 = cpu_to_le64(ctrl0);
> +       fmd->ctrl1 = cpu_to_le64(ctrl1);
> +       fmd->ctrl2 = cpu_to_le64(ctrl2);
> +       fmd->ctrl3 = cpu_to_le64(ctrl3);
> +
> +       return 0;
> +}
> +
> +/**
> + * spu2_fmd_ctrl0_write() - Write ctrl0 field in fixed metadata (FMD) field of
> + * SPU request packet.
> + * @fmd:            Start of FMD field to be written
> + * @is_inbound:     true if decrypting. false if encrypting.
> + * @auth_first:     true if alg authenticates before encrypting
> + * @protocol:       protocol selector
> + * @cipher_type:    cipher algorithm
> + * @cipher_mode:    cipher mode
> + * @auth_type:      authentication type
> + * @auth_mode:      authentication mode
> + */
> +static void spu2_fmd_ctrl0_write(struct SPU2_FMD *fmd,
> +                                bool is_inbound, bool auth_first,
> +                                enum spu2_proto_sel protocol,
> +                                enum spu2_cipher_type cipher_type,
> +                                enum spu2_cipher_mode cipher_mode,
> +                                enum spu2_hash_type auth_type,
> +                                enum spu2_hash_mode auth_mode)
> +{
> +       u64 ctrl0 = 0;
> +
> +       if ((cipher_type != SPU2_CIPHER_TYPE_NONE) && !is_inbound)
> +               ctrl0 |= SPU2_CIPH_ENCRYPT_EN;
> +
> +       ctrl0 |= ((u64)cipher_type << SPU2_CIPH_TYPE_SHIFT) |
> +           ((u64)cipher_mode << SPU2_CIPH_MODE_SHIFT);
> +
> +       if (protocol)
> +               ctrl0 |= (u64)protocol << SPU2_PROTO_SEL_SHIFT;
> +
> +       if (auth_first)
> +               ctrl0 |= SPU2_HASH_FIRST;
> +
> +       if (is_inbound && (auth_type != SPU2_HASH_TYPE_NONE))
> +               ctrl0 |= SPU2_CHK_TAG;
> +
> +       ctrl0 |= (((u64)auth_type << SPU2_HASH_TYPE_SHIFT) |
> +                 ((u64)auth_mode << SPU2_HASH_MODE_SHIFT));
> +
> +       fmd->ctrl0 = cpu_to_le64(ctrl0);
> +}
> +
> +/**
> + * spu2_fmd_ctrl1_write() - Write ctrl1 field in fixed metadata (FMD) field of
> + * SPU request packet.
> + * @fmd:            Start of FMD field to be written
> + * @is_inbound:     true if decrypting. false if encrypting.
> + * @assoc_size:     Length of additional associated data, in bytes
> + * @auth_key_len:   Length of authentication key, in bytes
> + * @cipher_key_len: Length of cipher key, in bytes
> + * @gen_iv:         If true, hw generates IV and returns in response
> + * @hash_iv:        IV participates in hash. Used for IPSEC and TLS.
> + * @return_iv:      Return IV in output packet before payload
> + * @ret_iv_len:     Length of IV returned from SPU, in bytes
> + * @ret_iv_offset:  Offset into full IV of start of returned IV
> + * @cipher_iv_len:  Length of input cipher IV, in bytes
> + * @digest_size:    Length of digest (aka, hash tag or ICV), in bytes
> + * @return_payload: Return payload in SPU response
> + * @return_md:      Return metadata in SPU response
> + *
> + * Packet can have AAD2 w/o AAD1. For algorithms currently supported,
> + * associated data goes in AAD2.
> + */
> +static void spu2_fmd_ctrl1_write(struct SPU2_FMD *fmd, bool is_inbound,
> +                                u64 assoc_size,
> +                                u64 auth_key_len, u64 cipher_key_len,
> +                                bool gen_iv, bool hash_iv, bool return_iv,
> +                                u64 ret_iv_len, u64 ret_iv_offset,
> +                                u64 cipher_iv_len, u64 digest_size,
> +                                bool return_payload, bool return_md)
> +{
> +       u64 ctrl1 = 0;
> +
> +       if (is_inbound && digest_size)
> +               ctrl1 |= SPU2_TAG_LOC;
> +
> +       if (assoc_size) {
> +               ctrl1 |= SPU2_HAS_AAD2;
> +               ctrl1 |= SPU2_RETURN_AAD2;  /* need aad2 for gcm aes esp */
> +       }
> +
> +       if (auth_key_len)
> +               ctrl1 |= ((auth_key_len << SPU2_HASH_KEY_LEN_SHIFT) &
> +                         SPU2_HASH_KEY_LEN);
> +
> +       if (cipher_key_len)
> +               ctrl1 |= ((cipher_key_len << SPU2_CIPH_KEY_LEN_SHIFT) &
> +                         SPU2_CIPH_KEY_LEN);
> +
> +       if (gen_iv)
> +               ctrl1 |= SPU2_GENIV;
> +
> +       if (hash_iv)
> +               ctrl1 |= SPU2_HASH_IV;
> +
> +       if (return_iv) {
> +               ctrl1 |= SPU2_RET_IV;
> +               ctrl1 |= ret_iv_len << SPU2_RET_IV_LEN_SHIFT;
> +               ctrl1 |= ret_iv_offset << SPU2_IV_OFFSET_SHIFT;
> +       }
> +
> +       ctrl1 |= ((cipher_iv_len << SPU2_IV_LEN_SHIFT) & SPU2_IV_LEN);
> +
> +       if (digest_size)
> +               ctrl1 |= ((digest_size << SPU2_HASH_TAG_LEN_SHIFT) &
> +                         SPU2_HASH_TAG_LEN);
> +
> +       /* Let's ask for the output pkt to include FMD, but don't need to
> +        * get keys and IVs back in OMD.
> +        */
> +       if (return_md)
> +               ctrl1 |= ((u64)SPU2_RET_FMD_ONLY << SPU2_RETURN_MD_SHIFT);
> +       else
> +               ctrl1 |= ((u64)SPU2_RET_NO_MD << SPU2_RETURN_MD_SHIFT);
> +
> +       /* Crypto API does not get assoc data back. So no need for AAD2. */
> +
> +       if (return_payload)
> +               ctrl1 |= SPU2_RETURN_PAY;
> +
> +       fmd->ctrl1 = cpu_to_le64(ctrl1);
> +}
> +
> +/**
> + * spu2_fmd_ctrl2_write() - Set the ctrl2 field in the fixed metadata field of
> + * SPU2 header.
> + * @fmd:            Start of FMD field to be written
> + * @cipher_offset:  Number of bytes from Start of Packet (end of FD field) where
> + *                  data to be encrypted or decrypted begins
> + * @auth_key_len:   Length of authentication key, in bytes
> + * @auth_iv_len:    Length of authentication initialization vector, in bytes
> + * @cipher_key_len: Length of cipher key, in bytes
> + * @cipher_iv_len:  Length of cipher IV, in bytes
> + */
> +static void spu2_fmd_ctrl2_write(struct SPU2_FMD *fmd, u64 cipher_offset,
> +                                u64 auth_key_len, u64 auth_iv_len,
> +                                u64 cipher_key_len, u64 cipher_iv_len)
> +{
> +       u64 ctrl2;
> +       u64 aad1_offset;
> +       u64 aad2_offset;
> +       u16 aad1_len = 0;
> +       u64 payload_offset;
> +
> +       /* AAD1 offset is from start of FD. FD length always 0. */
> +       aad1_offset = 0;
> +
> +       aad2_offset = aad1_offset;
> +       payload_offset = cipher_offset;
> +       ctrl2 = aad1_offset |
> +           (aad1_len << SPU2_AAD1_LEN_SHIFT) |
> +           (aad2_offset << SPU2_AAD2_OFFSET_SHIFT) |
> +           (payload_offset << SPU2_PL_OFFSET_SHIFT);
> +
> +       fmd->ctrl2 = cpu_to_le64(ctrl2);
> +}
> +
> +/**
> + * spu2_fmd_ctrl3_write() - Set the ctrl3 field in FMD
> + * @fmd:          Fixed meta data. First field in SPU2 msg header.
> + * @payload_len:  Length of payload, in bytes
> + */
> +static void spu2_fmd_ctrl3_write(struct SPU2_FMD *fmd, u64 payload_len)
> +{
> +       u64 ctrl3;
> +
> +       ctrl3 = payload_len & SPU2_PL_LEN;
> +
> +       fmd->ctrl3 = cpu_to_le64(ctrl3);
> +}
> +
> +/**
> + * spu2_ctx_max_payload() - Determine the maximum length of the payload for a
> + * SPU message for a given cipher and hash alg context.
> + * @cipher_alg:        The cipher algorithm
> + * @cipher_mode:       The cipher mode
> + * @blocksize:         The size of a block of data for this algo
> + *
> + * For SPU2, the hardware generally ignores the PayloadLen field in ctrl3 of
> + * FMD and just keeps computing until it receives a DMA descriptor with the EOF
> + * flag set. So we consider the max payload to be infinite. AES CCM is an
> + * exception.
> + *
> + * Return: Max payload length in bytes
> + */
> +u32 spu2_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                        enum spu_cipher_mode cipher_mode,
> +                        unsigned int blocksize)
> +{
> +       if ((cipher_alg == CIPHER_ALG_AES) &&
> +           (cipher_mode == CIPHER_MODE_CCM)) {
> +               u32 excess = SPU2_MAX_PAYLOAD % blocksize;
> +
> +               return SPU2_MAX_PAYLOAD - excess;
> +       } else {
> +               return SPU_MAX_PAYLOAD_INF;
> +       }
> +}
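
The AES CCM branch above rounds the maximum payload down to a whole number of cipher blocks. A small standalone sketch of that arithmetic follows. EXAMPLE_MAX_PAYLOAD and round_down_to_block are hypothetical names with an illustrative value; the real SPU2_MAX_PAYLOAD is defined elsewhere in the patch and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit for illustration only, not the driver's value */
#define EXAMPLE_MAX_PAYLOAD 65535u

/* Largest multiple of blocksize that does not exceed max_payload */
static uint32_t round_down_to_block(uint32_t max_payload, uint32_t blocksize)
{
	assert(blocksize != 0);	/* a zero blocksize would divide by zero */
	return max_payload - (max_payload % blocksize);
}
```

Under these assumptions a 16-byte block size gives 65520 bytes. Note that the function in the patch would divide by zero if a caller ever passed blocksize == 0, so it may be worth confirming that no caller can do that.
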
> +
> +/**
> + * spu2_payload_length() - Given a SPU2 message header, extract the payload
> + * length.
> + * @spu_hdr:  Start of SPU message header (FMD)
> + *
> + * Return: payload length, in bytes
> + */
> +u32 spu2_payload_length(u8 *spu_hdr)
> +{
> +       struct SPU2_FMD *fmd = (struct SPU2_FMD *)spu_hdr;
> +       u32 pl_len;
> +       u64 ctrl3;
> +
> +       ctrl3 = le64_to_cpu(fmd->ctrl3);
> +       pl_len = ctrl3 & SPU2_PL_LEN;
> +
> +       return pl_len;
> +}
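
spu2_fmd_ctrl3_write() and spu2_payload_length() are two halves of one round trip: mask the length, store it little-endian, then read it back and mask again. The sketch below shows that round trip in portable C with explicit byte stores standing in for cpu_to_le64() and le64_to_cpu(). EXAMPLE_PL_LEN_MASK is a hypothetical 32-bit mask chosen for illustration; the actual width of SPU2_PL_LEN is not visible in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field mask, for illustration only */
#define EXAMPLE_PL_LEN_MASK 0xFFFFFFFFull

/* Store a 64-bit value in little-endian byte order */
static void put_le64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/* Load a 64-bit value from little-endian byte order */
static uint64_t get_le64(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)in[i] << (8 * i);
	return v;
}

/* Write the payload length field, dropping bits outside the mask */
static void example_write_payload_len(uint8_t ctrl3[8], uint64_t payload_len)
{
	put_le64(ctrl3, payload_len & EXAMPLE_PL_LEN_MASK);
}

/* Read the payload length field back in host byte order */
static uint32_t example_read_payload_len(const uint8_t ctrl3[8])
{
	return (uint32_t)(get_le64(ctrl3) & EXAMPLE_PL_LEN_MASK);
}
```

Any length bits above the mask are silently dropped on the write side, which is also what the `payload_len & SPU2_PL_LEN` in the patch does.
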
> +
> +/**
> + * spu2_response_hdr_len() - Determine the expected length of a SPU response
> + * header.
> + * @auth_key_len:  Length of authentication key, in bytes
> + * @enc_key_len:   Length of encryption key, in bytes
> + * @is_hash:       true if the request is a hash operation (unused for SPU2)
> + *
> + * For SPU2, includes just FMD. OMD is never requested.
> + *
> + * Return: Length of FMD, in bytes
> + */
> +u16 spu2_response_hdr_len(u16 auth_key_len, u16 enc_key_len, bool is_hash)
> +{
> +       return FMD_SIZE;
> +}
> +
> +/**
> + * spu2_hash_pad_len() - Calculate the length of hash padding required to extend
> + * data to a full block size.
> + * @hash_alg:        hash algorithm
> + * @hash_mode:       hash mode
> + * @chunksize:       length of data, in bytes
> + * @hash_block_size: size of a hash block, in bytes
> + *
> + * SPU2 hardware does all hash padding
> + *
> + * Return:  length of hash pad in bytes
> + */
> +u16 spu2_hash_pad_len(enum hash_alg hash_alg, enum hash_mode hash_mode,
> +                     u32 chunksize, u16 hash_block_size)
> +{
> +       return 0;
> +}
> +
> +/**
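> + * spu2_gcm_ccm_pad_len() - Determine the length of GCM/CCM padding for either
> + * the AAD field or the data.
> + * @cipher_mode:  cipher mode
> + * @data_size:    length of the AAD or data to be padded, in bytes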
> + *
> + * Return:  0. Unlike SPU-M, SPU2 hardware does any GCM/CCM padding required.
> + */
> +u32 spu2_gcm_ccm_pad_len(enum spu_cipher_mode cipher_mode,
> +                        unsigned int data_size)
> +{
> +       return 0;
> +}
> +
> +/**
> + * spu2_assoc_resp_len() - Determine the size of the AAD2 buffer needed to catch
> + * associated data in a SPU2 output packet.
> + * @cipher_mode:   cipher mode
> + * @assoc_len:     length of additional associated data, in bytes
> + * @iv_len:        length of initialization vector, in bytes
> + * @is_encrypt:    true if encrypting. false if decrypt.
> + *
> + * Return: Length of buffer to catch associated data in response
> + */
> +u32 spu2_assoc_resp_len(enum spu_cipher_mode cipher_mode,
> +                       unsigned int assoc_len, unsigned int iv_len,
> +                       bool is_encrypt)
> +{
> +       u32 resp_len = assoc_len;
> +
> +       if (is_encrypt)
> +               /* gcm aes esp has to write 8-byte IV in response */
> +               resp_len += iv_len;
> +       return resp_len;
> +}
> +
> +/**
> + * spu2_aead_ivlen() - Calculate the length of the AEAD IV to be included
> + * in a SPU request after the AAD and before the payload.
> + * @cipher_mode:  cipher mode
> + * @iv_len:       initialization vector length in bytes
> + *
> + * For SPU2, AEAD IV is included in OMD and does not need to be repeated
> + * prior to the payload.
> + *
> + * Return: Length of AEAD IV in bytes
> + */
> +u8 spu2_aead_ivlen(enum spu_cipher_mode cipher_mode, u16 iv_len)
> +{
> +       return 0;
> +}
> +
> +/**
> + * spu2_hash_type() - Determine the type of hash operation.
> + * @src_sent:  The number of bytes in the current request that have already
> + *             been sent to the SPU to be hashed.
> + *
> + * SPU2 always does a FULL hash operation
> + */
> +enum hash_type spu2_hash_type(u32 src_sent)
> +{
> +       return HASH_TYPE_FULL;
> +}
> +
> +/**
> + * spu2_digest_size() - Determine the size of a hash digest to expect the SPU to
> + * return.
> + * @alg_digest_size: Number of bytes in the final digest for the given algo
> + * @alg:             The hash algorithm
> + * @htype:           Type of hash operation (init, update, full, etc)
> + *
> + * Return: the digest size in bytes. For SPU2 this is always alg_digest_size.
> + */
> +u32 spu2_digest_size(u32 alg_digest_size, enum hash_alg alg,
> +                    enum hash_type htype)
> +{
> +       return alg_digest_size;
> +}
> +
> +/**
> + * spu2_create_request() - Build a SPU2 request message header, including FMD
> + * and OMD.
> + * @spu_hdr: Start of buffer where SPU request header is to be written
> + * @req_opts: SPU request message options
> + * @cipher_parms: Parameters related to cipher algorithm
> + * @hash_parms:   Parameters related to hash algorithm
> + * @aead_parms:   Parameters related to AEAD operation
> + * @data_size:    Length of data to be encrypted or authenticated. If AEAD, does
> + *               not include length of AAD.
> + *
> + * Construct the message starting at spu_hdr. Caller should allocate this buffer
> + * in DMA-able memory at least SPU_HEADER_ALLOC_LEN bytes long.
> + *
> + * Return: the length of the SPU header in bytes. 0 if an error occurs.
> + */
> +u32 spu2_create_request(u8 *spu_hdr,
> +                       struct spu_request_opts *req_opts,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       struct spu_hash_parms *hash_parms,
> +                       struct spu_aead_parms *aead_parms,
> +                       unsigned int data_size)
> +{
> +       struct SPU2_FMD *fmd;
> +       u8 *ptr;
> +       unsigned int buf_len;
> +       int err;
> +       enum spu2_cipher_type spu2_ciph_type = SPU2_CIPHER_TYPE_NONE;
> +       enum spu2_cipher_mode spu2_ciph_mode;
> +       enum spu2_hash_type spu2_auth_type = SPU2_HASH_TYPE_NONE;
> +       enum spu2_hash_mode spu2_auth_mode;
> +       bool return_md = true;
> +       enum spu2_proto_sel proto = SPU2_PROTO_RESV;
> +
> +       /* size of the payload */
> +       unsigned int payload_len =
> +           hash_parms->prebuf_len + data_size + hash_parms->pad_len -
> +           ((req_opts->is_aead && req_opts->is_inbound) ?
> +            hash_parms->digestsize : 0);
> +
> +       /* offset of prebuf or data from start of AAD2 */
> +       unsigned int cipher_offset = aead_parms->assoc_size +
> +                       aead_parms->aad_pad_len + aead_parms->iv_len;
> +
> +#ifdef DEBUG
> +       /* total size of the data following OMD (without STAT word padding) */
> +       unsigned int real_db_size = spu_real_db_size(aead_parms->assoc_size,
> +                                                aead_parms->iv_len,
> +                                                hash_parms->prebuf_len,
> +                                                data_size,
> +                                                aead_parms->aad_pad_len,
> +                                                aead_parms->data_pad_len,
> +                                                hash_parms->pad_len);
> +#endif
> +       unsigned int assoc_size = aead_parms->assoc_size;
> +
> +       if (req_opts->is_aead &&
> +           (cipher_parms->alg == CIPHER_ALG_AES) &&
> +           (cipher_parms->mode == CIPHER_MODE_GCM))
> +               /*
> +                * On SPU 2, aes gcm cipher first on encrypt, auth first on
> +                * decrypt
> +                */
> +               req_opts->auth_first = req_opts->is_inbound;
> +
> +       /* and do opposite for ccm (auth 1st on encrypt) */
> +       if (req_opts->is_aead &&
> +           (cipher_parms->alg == CIPHER_ALG_AES) &&
> +           (cipher_parms->mode == CIPHER_MODE_CCM))
> +               req_opts->auth_first = !req_opts->is_inbound;
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log("  in:%u authFirst:%u\n",
> +                req_opts->is_inbound, req_opts->auth_first);
> +       flow_log("  cipher alg:%u mode:%u type %u\n", cipher_parms->alg,
> +                cipher_parms->mode, cipher_parms->type);
> +       flow_log("  is_esp: %s\n", req_opts->is_esp ? "yes" : "no");
> +       flow_log("    key: %d\n", cipher_parms->key_len);
> +       flow_dump("    key: ", cipher_parms->key_buf, cipher_parms->key_len);
> +       flow_log("    iv: %d\n", cipher_parms->iv_len);
> +       flow_dump("    iv: ", cipher_parms->iv_buf, cipher_parms->iv_len);
> +       flow_log("  auth alg:%u mode:%u type %u\n",
> +                hash_parms->alg, hash_parms->mode, hash_parms->type);
> +       flow_log("  digestsize: %u\n", hash_parms->digestsize);
> +       flow_log("  authkey: %d\n", hash_parms->key_len);
> +       flow_dump("  authkey: ", hash_parms->key_buf, hash_parms->key_len);
> +       flow_log("  assoc_size:%u\n", assoc_size);
> +       flow_log("  prebuf_len:%u\n", hash_parms->prebuf_len);
> +       flow_log("  data_size:%u\n", data_size);
> +       flow_log("  hash_pad_len:%u\n", hash_parms->pad_len);
> +       flow_log("  real_db_size:%u\n", real_db_size);
> +       flow_log("  cipher_offset:%u payload_len:%u\n",
> +                cipher_offset, payload_len);
> +       flow_log("  hmac_offset:%u\n", hash_parms->hmac_offset);
> +       flow_log("  aead_iv: %u\n", aead_parms->iv_len);
> +
> +       /* Convert to spu2 values for cipher alg, hash alg */
> +       err = spu2_cipher_xlate(cipher_parms->alg, cipher_parms->mode,
> +                               cipher_parms->type,
> +                               &spu2_ciph_type, &spu2_ciph_mode);
> +
> +       /* If we are doing GCM hashing only - either via rfc4543 transform
> +        * or because we happen to do GCM with AAD only and no payload - we
> +        * need to configure hardware to use hash key rather than cipher key
> +        * and put data into payload.  This is because unlike SPU-M, running
> +        * GCM cipher with 0 size payload is not permitted.
> +        */
> +       if ((req_opts->is_rfc4543) ||
> +           ((spu2_ciph_mode == SPU2_CIPHER_MODE_GCM) &&
> +           (payload_len == 0))) {
> +               /* Use hashing (only) and set up hash key */
> +               spu2_ciph_type = SPU2_CIPHER_TYPE_NONE;
> +               hash_parms->key_len = cipher_parms->key_len;
> +               memcpy(hash_parms->key_buf, cipher_parms->key_buf,
> +                      cipher_parms->key_len);
> +               cipher_parms->key_len = 0;
> +
> +               if (req_opts->is_rfc4543)
> +                       payload_len += assoc_size;
> +               else
> +                       payload_len = assoc_size;
> +               cipher_offset = 0;
> +               assoc_size = 0;
> +       }
> +
> +       if (err)
> +               return 0;
> +
> +       flow_log("spu2 cipher type %s, cipher mode %s\n",
> +                spu2_ciph_type_name(spu2_ciph_type),
> +                spu2_ciph_mode_name(spu2_ciph_mode));
> +
> +       err = spu2_hash_xlate(hash_parms->alg, hash_parms->mode,
> +                             hash_parms->type,
> +                             cipher_parms->type,
> +                             &spu2_auth_type, &spu2_auth_mode);
> +       if (err)
> +               return 0;
> +
> +       flow_log("spu2 hash type %s, hash mode %s\n",
> +                spu2_hash_type_name(spu2_auth_type),
> +                spu2_hash_mode_name(spu2_auth_mode));
> +
> +       fmd = (struct SPU2_FMD *)spu_hdr;
> +
> +       spu2_fmd_ctrl0_write(fmd, req_opts->is_inbound, req_opts->auth_first,
> +                            proto, spu2_ciph_type, spu2_ciph_mode,
> +                            spu2_auth_type, spu2_auth_mode);
> +
> +       spu2_fmd_ctrl1_write(fmd, req_opts->is_inbound, assoc_size,
> +                            hash_parms->key_len, cipher_parms->key_len,
> +                            false, false,
> +                            aead_parms->return_iv, aead_parms->ret_iv_len,
> +                            aead_parms->ret_iv_off,
> +                            cipher_parms->iv_len, hash_parms->digestsize,
> +                            !req_opts->bd_suppress, return_md);
> +
> +       spu2_fmd_ctrl2_write(fmd, cipher_offset, hash_parms->key_len, 0,
> +                            cipher_parms->key_len, cipher_parms->iv_len);
> +
> +       spu2_fmd_ctrl3_write(fmd, payload_len);
> +
> +       ptr = (u8 *)(fmd + 1);
> +       buf_len = sizeof(struct SPU2_FMD);
> +
> +       /* Write OMD */
> +       if (hash_parms->key_len) {
> +               memcpy(ptr, hash_parms->key_buf, hash_parms->key_len);
> +               ptr += hash_parms->key_len;
> +               buf_len += hash_parms->key_len;
> +       }
> +       if (cipher_parms->key_len) {
> +               memcpy(ptr, cipher_parms->key_buf, cipher_parms->key_len);
> +               ptr += cipher_parms->key_len;
> +               buf_len += cipher_parms->key_len;
> +       }
> +       if (cipher_parms->iv_len) {
> +               memcpy(ptr, cipher_parms->iv_buf, cipher_parms->iv_len);
> +               ptr += cipher_parms->iv_len;
> +               buf_len += cipher_parms->iv_len;
> +       }
> +
> +       packet_dump("  SPU request header: ", spu_hdr, buf_len);
> +
> +       return buf_len;
> +}
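
The GCM hash-only case in spu2_create_request() is easy to misread, so here is a standalone sketch of just the length bookkeeping as I understand it from the comment and the code. The struct example_lens and both function names are hypothetical; key handling and the FMD writes are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the three values the special case rewrites */
struct example_lens {
	unsigned int payload_len;
	unsigned int cipher_offset;
	unsigned int assoc_size;
};

/* True when the request has to run as hash-only: rfc4543, or GCM with no payload */
static bool gcm_hash_only(bool is_rfc4543, bool is_gcm, unsigned int payload_len)
{
	return is_rfc4543 || (is_gcm && payload_len == 0);
}

/* Move the associated data into the payload and clear the AAD bookkeeping */
static struct example_lens fold_aad_into_payload(struct example_lens in,
						 bool is_rfc4543)
{
	struct example_lens out = in;

	if (is_rfc4543)
		out.payload_len += in.assoc_size;	/* AAD plus existing payload */
	else
		out.payload_len = in.assoc_size;	/* AAD only, payload was empty */
	out.cipher_offset = 0;
	out.assoc_size = 0;
	return out;
}
```

For plain GCM with 16 bytes of AAD and no payload, the result is a 16-byte payload with no AAD. For rfc4543 with 16 bytes of AAD and a 32-byte payload, the payload grows to 48 bytes.
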
> +
> +/**
> + * spu2_cipher_req_init() - Build an ablkcipher SPU2 request message header,
> + * including FMD and OMD.
> + * @spu_hdr:       Location of start of SPU request (FMD field)
> + * @cipher_parms:  Parameters describing cipher request
> + *
> + * Called at setkey time to initialize a msg header that can be reused for all
> + * subsequent ablkcipher requests. Construct the message starting at spu_hdr.
> + * Caller should allocate this buffer in DMA-able memory at least
> + * SPU_HEADER_ALLOC_LEN bytes long.
> + *
> + * Return: the total length of the SPU header (FMD and OMD) in bytes. 0 if an
> + * error occurs.
> + */
> +u16 spu2_cipher_req_init(u8 *spu_hdr, struct spu_cipher_parms *cipher_parms)
> +{
> +       struct SPU2_FMD *fmd;
> +       u8 *omd;
> +       enum spu2_cipher_type spu2_type = SPU2_CIPHER_TYPE_NONE;
> +       enum spu2_cipher_mode spu2_mode;
> +       int err;
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log("  cipher alg:%u mode:%u type %u\n", cipher_parms->alg,
> +                cipher_parms->mode, cipher_parms->type);
> +       flow_log("  cipher_iv_len: %u\n", cipher_parms->iv_len);
> +       flow_log("    key: %d\n", cipher_parms->key_len);
> +       flow_dump("    key: ", cipher_parms->key_buf, cipher_parms->key_len);
> +
> +       /* Convert to spu2 values */
> +       err = spu2_cipher_xlate(cipher_parms->alg, cipher_parms->mode,
> +                               cipher_parms->type, &spu2_type, &spu2_mode);
> +       if (err)
> +               return 0;
> +
> +       flow_log("spu2 cipher type %s, cipher mode %s\n",
> +                spu2_ciph_type_name(spu2_type),
> +                spu2_ciph_mode_name(spu2_mode));
> +
> +       /* Construct the FMD header */
> +       fmd = (struct SPU2_FMD *)spu_hdr;
> +       err = spu2_fmd_init(fmd, spu2_type, spu2_mode, cipher_parms->key_len,
> +                           cipher_parms->iv_len);
> +       if (err)
> +               return 0;
> +
> +       /* Write cipher key to OMD */
> +       omd = (u8 *)(fmd + 1);
> +       if (cipher_parms->key_buf && cipher_parms->key_len)
> +               memcpy(omd, cipher_parms->key_buf, cipher_parms->key_len);
> +
> +       packet_dump("  SPU request header: ", spu_hdr,
> +                   FMD_SIZE + cipher_parms->key_len + cipher_parms->iv_len);
> +
> +       return FMD_SIZE + cipher_parms->key_len + cipher_parms->iv_len;
> +}
> +
> +/**
> + * spu2_cipher_req_finish() - Finish building a SPU request message header for a
> + * block cipher request.
> + * @spu_hdr:         Start of the request message header (MH field)
> + * @spu_req_hdr_len: Length in bytes of the SPU request header
> + * @is_inbound:      0 encrypt, 1 decrypt
> + * @cipher_parms:    Parameters describing cipher operation to be performed
> + * @update_key:      If true, rewrite the cipher key in SCTX
> + * @data_size:       Length of the data in the BD field
> + *
> + * Assumes much of the header was already filled in at setkey() time in
> + * spu2_cipher_req_init().
> + * spu2_cipher_req_init() fills in the encryption key. For RC4, when submitting a
> + * request for a non-first chunk, we use the 260-byte SUPDT field from the
> + * previous response as the key. update_key is true for this case. Unused in all
> + * other cases.
> + */
> +void spu2_cipher_req_finish(u8 *spu_hdr,
> +                           u16 spu_req_hdr_len,
> +                           unsigned int is_inbound,
> +                           struct spu_cipher_parms *cipher_parms,
> +                           bool update_key,
> +                           unsigned int data_size)
> +{
> +       struct SPU2_FMD *fmd;
> +       u8 *omd;                /* start of optional metadata */
> +       u64 ctrl0;
> +       u64 ctrl3;
> +
> +       flow_log("%s()\n", __func__);
> +       flow_log(" in: %u\n", is_inbound);
> +       flow_log(" cipher alg: %u, cipher_type: %u\n", cipher_parms->alg,
> +                cipher_parms->type);
> +       if (update_key) {
> +               flow_log(" cipher key len: %u\n", cipher_parms->key_len);
> +               flow_dump("  key: ", cipher_parms->key_buf,
> +                         cipher_parms->key_len);
> +       }
> +       flow_log(" iv len: %d\n", cipher_parms->iv_len);
> +       flow_dump("    iv: ", cipher_parms->iv_buf, cipher_parms->iv_len);
> +       flow_log(" data_size: %u\n", data_size);
> +
> +       fmd = (struct SPU2_FMD *)spu_hdr;
> +       omd = (u8 *)(fmd + 1);
> +
> +       /*
> +        * FMD ctrl0 was initialized at setkey time. Update it to indicate
> +        * whether we are encrypting or decrypting.
> +        */
> +       ctrl0 = le64_to_cpu(fmd->ctrl0);
> +       if (is_inbound)
> +               ctrl0 &= ~SPU2_CIPH_ENCRYPT_EN; /* decrypt */
> +       else
> +               ctrl0 |= SPU2_CIPH_ENCRYPT_EN;  /* encrypt */
> +       fmd->ctrl0 = cpu_to_le64(ctrl0);
> +
> +       if (cipher_parms->alg && cipher_parms->iv_buf && cipher_parms->iv_len) {
> +               /* cipher iv provided so put it in here */
> +               memcpy(omd + cipher_parms->key_len, cipher_parms->iv_buf,
> +                      cipher_parms->iv_len);
> +       }
> +
> +       ctrl3 = le64_to_cpu(fmd->ctrl3);
> +       data_size &= SPU2_PL_LEN;
> +       ctrl3 |= data_size;
> +       fmd->ctrl3 = cpu_to_le64(ctrl3);
> +
> +       packet_dump("  SPU request header: ", spu_hdr, spu_req_hdr_len);
> +}
> +
> +/**
> + * spu2_request_pad() - Create pad bytes at the end of the data.
> + * @pad_start:      Start of buffer where pad bytes are to be written
> + * @gcm_padding:    Length of GCM padding, in bytes
> + * @hash_pad_len:   Number of bytes of padding to extend data to a full block
> + * @auth_alg:       Authentication algorithm
> + * @auth_mode:      Authentication mode
> + * @total_sent:     Length inserted at end of hash pad
> + * @status_padding: Number of bytes of padding to align STATUS word
> + *
> + * There may be three forms of pad:
> + *  1. GCM pad - for GCM mode ciphers, pad to 16-byte alignment
> + *  2. hash pad - pad to a block length, with 0x80 data terminator and
> + *                size at the end
> + *  3. STAT pad - to ensure the STAT field is 4-byte aligned
> + */
> +void spu2_request_pad(u8 *pad_start, u32 gcm_padding, u32 hash_pad_len,
> +                     enum hash_alg auth_alg, enum hash_mode auth_mode,
> +                     unsigned int total_sent, u32 status_padding)
> +{
> +       u8 *ptr = pad_start;
> +
> +       /* fix data alignment for GCM */
> +       if (gcm_padding > 0) {
> +               flow_log("  GCM: padding to 16 byte alignment: %u bytes\n",
> +                        gcm_padding);
> +               memset(ptr, 0, gcm_padding);
> +               ptr += gcm_padding;
> +       }
> +
> +       if (hash_pad_len > 0) {
> +               /* clear the padding section */
> +               memset(ptr, 0, hash_pad_len);
> +
> +               /* terminate the data */
> +               *ptr = 0x80;
> +               ptr += (hash_pad_len - sizeof(u64));
> +
> +               /* add the size at the end as required per alg */
> +               if (auth_alg == HASH_ALG_MD5)
> +                       *(u64 *)ptr = cpu_to_le64((u64)total_sent * 8);
> +               else            /* SHA1, SHA2-224, SHA2-256 */
> +                       *(u64 *)ptr = cpu_to_be64((u64)total_sent * 8);
> +               ptr += sizeof(u64);
> +       }
> +
> +       /* pad to a 4-byte alignment for STAT */
> +       if (status_padding > 0) {
> +               flow_log("  STAT: padding to 4 byte alignment: %u bytes\n",
> +                        status_padding);
> +
> +               memset(ptr, 0, status_padding);
> +               ptr += status_padding;
> +       }
> +}
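
For readers following the hash pad case above, here is a minimal user-space sketch of the same layout: zero fill, a 0x80 terminator in the first pad byte, and the message length in bits in the last eight bytes (little-endian for MD5, big-endian for SHA-1/SHA-2). The name sketch_hash_pad() and its little_endian flag are hypothetical and only for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the driver: write an MD-style hash pad
 * of pad_len bytes at buf. total_bytes is the message length in bytes; the
 * 8-byte trailer holds that length in bits.
 */
static void sketch_hash_pad(uint8_t *buf, size_t pad_len,
			    uint64_t total_bytes, int little_endian)
{
	uint64_t bits = total_bytes * 8;
	uint8_t *len_pos;
	int i;

	/* need room for the 0x80 terminator plus the 8-byte length */
	assert(pad_len >= 1 + sizeof(uint64_t));

	memset(buf, 0, pad_len);
	buf[0] = 0x80;			/* data terminator */
	len_pos = buf + pad_len - sizeof(uint64_t);

	for (i = 0; i < 8; i++) {
		uint8_t byte = (uint8_t)(bits >> (8 * i));

		if (little_endian)	/* MD5 */
			len_pos[i] = byte;
		else			/* SHA-1, SHA-2 */
			len_pos[7 - i] = byte;
	}
}
```

Unlike the quoted function, the sketch asserts that the pad is large enough to hold the terminator and the length field; the driver relies on its caller to guarantee that.
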
> +
> +/**
> + * spu2_xts_tweak_in_payload() - Indicate that SPU2 does NOT place the XTS
> + * tweak field in the packet payload (it uses IV instead)
> + *
> + * Return: 0
> + */
> +u8 spu2_xts_tweak_in_payload(void)
> +{
> +       return 0;
> +}
> +
> +/**
> + * spu2_tx_status_len() - Return the length of the STATUS field in a SPU
> + * request message.
> + *
> + * Return: Length of STATUS field in bytes.
> + */
> +u8 spu2_tx_status_len(void)
> +{
> +       return SPU2_TX_STATUS_LEN;
> +}
> +
> +/**
> + * spu2_rx_status_len() - Return the length of the STATUS field in a SPU
> + * response message.
> + *
> + * Return: Length of STATUS field in bytes.
> + */
> +u8 spu2_rx_status_len(void)
> +{
> +       return SPU2_RX_STATUS_LEN;
> +}
> +
> +/**
> + * spu2_status_process() - Process the status from a SPU response message.
> + * @statp:  start of STATUS word
> + *
> + * Return:  0 - if status is good and response should be processed
> + *         !0 - status indicates an error and response is invalid
> + */
> +int spu2_status_process(u8 *statp)
> +{
> +       /* SPU2 status is 2 bytes by default - SPU2_RX_STATUS_LEN */
> +       u16 status = le16_to_cpu(*(__le16 *)statp);
> +
> +       if (status == 0)
> +               return 0;
> +
> +       flow_log("rx status is %#x\n", status);
> +       if (status == SPU2_INVALID_ICV)
> +               return SPU_INVALID_ICV;
> +
> +       return -EBADMSG;
> +}
> +
> +/**
> + * spu2_ccm_update_iv() - Update the IV as per the requirements for CCM mode.
> + *
> + * @digestsize:                Digest size of this request
> + * @cipher_parms:      (pointer to) cipher parameters, includes IV buf & IV len
> + * @assoclen:          Length of AAD data
> + * @chunksize:         length of input data to be sent in this req
> + * @is_encrypt:                true if this is an output/encrypt operation
> + * @is_esp:            true if this is an ESP / RFC4309 operation
> + *
> + */
> +void spu2_ccm_update_iv(unsigned int digestsize,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       unsigned int assoclen, unsigned int chunksize,
> +                       bool is_encrypt, bool is_esp)
> +{
> +       int L;  /* size of length field, in bytes */
> +
> +       /*
> +        * In RFC4309 mode, L is fixed at 4 bytes; otherwise, IV from
> +        * testmgr contains (L-1) in bottom 3 bits of first byte,
> +        * per RFC 3610.
> +        */
> +       if (is_esp)
> +               L = CCM_ESP_L_VALUE;
> +       else
> +               L = ((cipher_parms->iv_buf[0] & CCM_B0_L_PRIME) >>
> +                     CCM_B0_L_PRIME_SHIFT) + 1;
> +
> +       /* SPU2 doesn't want these length bytes nor the first byte... */
> +       cipher_parms->iv_len -= (1 + L);
> +       memmove(cipher_parms->iv_buf, &cipher_parms->iv_buf[1],
> +               cipher_parms->iv_len);
> +}
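
The IV rewrite above can be tried in user space with the sketch below. It assumes that CCM_B0_L_PRIME is the low three bits of the first IV byte with a shift of 0, which matches the RFC 3610 flags layout and the comment in the patch, and that CCM_ESP_L_VALUE is 4 as the comment states. Neither macro is visible in this part of the patch, so treat both values as assumptions. sketch_ccm_strip_iv() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: drop the leading flags byte and the trailing L
 * length bytes from a CCM IV, leaving only the nonce at the start of iv.
 * Returns the new IV length.
 */
static unsigned int sketch_ccm_strip_iv(uint8_t *iv, unsigned int iv_len,
					int is_esp)
{
	/* assumed: L' = L - 1 sits in bits 2..0 of byte 0; ESP fixes L at 4 */
	unsigned int L = is_esp ? 4 : (unsigned int)(iv[0] & 0x7) + 1;

	assert(iv_len >= 1 + L);

	iv_len -= 1 + L;
	memmove(iv, iv + 1, iv_len);	/* regions overlap, so memmove */
	return iv_len;
}
```

With a 16-byte IV and L = 4 this leaves an 11-byte nonce, which is the shape the quoted comment says SPU2 expects.
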
> +
> +/**
> + * spu2_wordalign_padlen() - SPU2 does not require padding.
> + * @data_size: length of data field in bytes
> + *
> + * Return: length of status field padding, in bytes (always 0 on SPU2)
> + */
> +u32 spu2_wordalign_padlen(u32 data_size)
> +{
> +       return 0;
> +}
> diff --git a/drivers/crypto/bcm/spu2.h b/drivers/crypto/bcm/spu2.h
> new file mode 100644
> index 0000000..ab1f599
> --- /dev/null
> +++ b/drivers/crypto/bcm/spu2.h
> @@ -0,0 +1,228 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +/*
> + * This file contains SPU message definitions specific to SPU2.
> + */
> +
> +#ifndef _SPU2_H
> +#define _SPU2_H
> +
> +enum spu2_cipher_type {
> +       SPU2_CIPHER_TYPE_NONE = 0x0,
> +       SPU2_CIPHER_TYPE_AES128 = 0x1,
> +       SPU2_CIPHER_TYPE_AES192 = 0x2,
> +       SPU2_CIPHER_TYPE_AES256 = 0x3,
> +       SPU2_CIPHER_TYPE_DES = 0x4,
> +       SPU2_CIPHER_TYPE_3DES = 0x5,
> +       SPU2_CIPHER_TYPE_LAST
> +};
> +
> +enum spu2_cipher_mode {
> +       SPU2_CIPHER_MODE_ECB = 0x0,
> +       SPU2_CIPHER_MODE_CBC = 0x1,
> +       SPU2_CIPHER_MODE_CTR = 0x2,
> +       SPU2_CIPHER_MODE_CFB = 0x3,
> +       SPU2_CIPHER_MODE_OFB = 0x4,
> +       SPU2_CIPHER_MODE_XTS = 0x5,
> +       SPU2_CIPHER_MODE_CCM = 0x6,
> +       SPU2_CIPHER_MODE_GCM = 0x7,
> +       SPU2_CIPHER_MODE_LAST
> +};
> +
> +enum spu2_hash_type {
> +       SPU2_HASH_TYPE_NONE = 0x0,
> +       SPU2_HASH_TYPE_AES128 = 0x1,
> +       SPU2_HASH_TYPE_AES192 = 0x2,
> +       SPU2_HASH_TYPE_AES256 = 0x3,
> +       SPU2_HASH_TYPE_MD5 = 0x6,
> +       SPU2_HASH_TYPE_SHA1 = 0x7,
> +       SPU2_HASH_TYPE_SHA224 = 0x8,
> +       SPU2_HASH_TYPE_SHA256 = 0x9,
> +       SPU2_HASH_TYPE_SHA384 = 0xa,
> +       SPU2_HASH_TYPE_SHA512 = 0xb,
> +       SPU2_HASH_TYPE_SHA512_224 = 0xc,
> +       SPU2_HASH_TYPE_SHA512_256 = 0xd,
> +       SPU2_HASH_TYPE_SHA3_224 = 0xe,
> +       SPU2_HASH_TYPE_SHA3_256 = 0xf,
> +       SPU2_HASH_TYPE_SHA3_384 = 0x10,
> +       SPU2_HASH_TYPE_SHA3_512 = 0x11,
> +       SPU2_HASH_TYPE_LAST
> +};
> +
> +enum spu2_hash_mode {
> +       SPU2_HASH_MODE_CMAC = 0x0,
> +       SPU2_HASH_MODE_CBC_MAC = 0x1,
> +       SPU2_HASH_MODE_XCBC_MAC = 0x2,
> +       SPU2_HASH_MODE_HMAC = 0x3,
> +       SPU2_HASH_MODE_RABIN = 0x4,
> +       SPU2_HASH_MODE_CCM = 0x5,
> +       SPU2_HASH_MODE_GCM = 0x6,
> +       SPU2_HASH_MODE_RESERVED = 0x7,
> +       SPU2_HASH_MODE_LAST
> +};
> +
> +enum spu2_ret_md_opts {
> +       SPU2_RET_NO_MD = 0,     /* return no metadata */
> +       SPU2_RET_FMD_OMD = 1,   /* return both FMD and OMD */
> +       SPU2_RET_FMD_ONLY = 2,  /* return only FMD */
> +       SPU2_RET_FMD_OMD_IV = 3,        /* return FMD and OMD with just IVs */
> +};
> +
> +/* Fixed Metadata format */
> +struct SPU2_FMD {
> +       u64 ctrl0;
> +       u64 ctrl1;
> +       u64 ctrl2;
> +       u64 ctrl3;
> +};
> +
> +#define FMD_SIZE  sizeof(struct SPU2_FMD)
> +
> +/* Fixed part of request message header length in bytes. Just FMD. */
> +#define SPU2_REQ_FIXED_LEN FMD_SIZE
> +#define SPU2_HEADER_ALLOC_LEN (SPU2_REQ_FIXED_LEN + \
> +                               2 * MAX_KEY_SIZE + 2 * MAX_IV_SIZE)
> +
> +/* FMD ctrl0 field masks */
> +#define SPU2_CIPH_ENCRYPT_EN            0x1 /* 0: decrypt, 1: encrypt */
> +#define SPU2_CIPH_TYPE                 0xF0 /* one of spu2_cipher_type */
> +#define SPU2_CIPH_TYPE_SHIFT              4
> +#define SPU2_CIPH_MODE                0xF00 /* one of spu2_cipher_mode */
> +#define SPU2_CIPH_MODE_SHIFT              8
> +#define SPU2_CFB_MASK                0x7000 /* cipher feedback mask */
> +#define SPU2_CFB_MASK_SHIFT              12
> +#define SPU2_PROTO_SEL             0xF00000 /* MACsec, IPsec, TLS... */
> +#define SPU2_PROTO_SEL_SHIFT             20
> +#define SPU2_HASH_FIRST           0x1000000 /* 1: hash input is input pkt
> +                                            * data
> +                                            */
> +#define SPU2_CHK_TAG              0x2000000 /* 1: check digest provided */
> +#define SPU2_HASH_TYPE          0x1F0000000 /* one of spu2_hash_type */
> +#define SPU2_HASH_TYPE_SHIFT             28
> +#define SPU2_HASH_MODE         0xF000000000 /* one of spu2_hash_mode */
> +#define SPU2_HASH_MODE_SHIFT             36
> +#define SPU2_CIPH_PAD_EN     0x100000000000 /* 1: Add pad to end of payload for
> +                                            *    enc
> +                                            */
> +#define SPU2_CIPH_PAD      0xFF000000000000 /* cipher pad value */
> +#define SPU2_CIPH_PAD_SHIFT              48
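
As an illustration of how the ctrl0 mask and shift pairs above are meant to be combined, here is a small sketch. The three masks and two shifts are copied from this header (with an explicit ULL suffix); sketch_set_field() and sketch_get_field() are hypothetical helpers, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the ctrl0 definitions quoted above. */
#define SPU2_CIPH_ENCRYPT_EN	0x1ULL
#define SPU2_CIPH_TYPE		0xF0ULL
#define SPU2_CIPH_TYPE_SHIFT	4
#define SPU2_CIPH_MODE		0xF00ULL
#define SPU2_CIPH_MODE_SHIFT	8

/* Hypothetical helper: replace one field of a 64-bit control word. */
static uint64_t sketch_set_field(uint64_t word, uint64_t mask,
				 unsigned int shift, uint64_t val)
{
	return (word & ~mask) | ((val << shift) & mask);
}

/* Hypothetical helper: read one field back out. */
static uint64_t sketch_get_field(uint64_t word, uint64_t mask,
				 unsigned int shift)
{
	return (word & mask) >> shift;
}
```

Setting type 0x3 (AES256) and mode 0x7 (GCM) and then OR-ing in the encrypt bit gives a ctrl0 of 0x731 in CPU byte order; the driver converts it with cpu_to_le64() before storing it in the FMD.
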
> +
> +/* FMD ctrl1 field masks */
> +#define SPU2_TAG_LOC                    0x1 /* 1: end of payload, 0: undef */
> +#define SPU2_HAS_FR_DATA                0x2 /* 1: msg has frame data */
> +#define SPU2_HAS_AAD1                   0x4 /* 1: msg has AAD1 field */
> +#define SPU2_HAS_NAAD                   0x8 /* 1: msg has NAAD field */
> +#define SPU2_HAS_AAD2                  0x10 /* 1: msg has AAD2 field */
> +#define SPU2_HAS_ESN                   0x20 /* 1: msg has ESN field */
> +#define SPU2_HASH_KEY_LEN            0xFF00 /* len of hash key in bytes.
> +                                            * HMAC only.
> +                                            */
> +#define SPU2_HASH_KEY_LEN_SHIFT           8
> +#define SPU2_CIPH_KEY_LEN         0xFF00000 /* len of cipher key in bytes */
> +#define SPU2_CIPH_KEY_LEN_SHIFT          20
> +#define SPU2_GENIV               0x10000000 /* 1: hw generates IV */
> +#define SPU2_HASH_IV             0x20000000 /* 1: IV incl in hash */
> +#define SPU2_RET_IV              0x40000000 /* 1: return IV in output msg
> +                                            *    before payload
> +                                            */
> +#define SPU2_RET_IV_LEN         0xF00000000 /* length in bytes of IV returned.
> +                                            * 0 = 16 bytes
> +                                            */
> +#define SPU2_RET_IV_LEN_SHIFT            32
> +#define SPU2_IV_OFFSET         0xF000000000 /* gen IV offset */
> +#define SPU2_IV_OFFSET_SHIFT             36
> +#define SPU2_IV_LEN          0x1F0000000000 /* length of input IV in bytes */
> +#define SPU2_IV_LEN_SHIFT                40
> +#define SPU2_HASH_TAG_LEN  0x7F000000000000 /* hash tag length in bytes */
> +#define SPU2_HASH_TAG_LEN_SHIFT          48
> +#define SPU2_RETURN_MD    0x300000000000000 /* return metadata */
> +#define SPU2_RETURN_MD_SHIFT             56
> +#define SPU2_RETURN_FD    0x400000000000000
> +#define SPU2_RETURN_AAD1  0x800000000000000
> +#define SPU2_RETURN_NAAD 0x1000000000000000
> +#define SPU2_RETURN_AAD2 0x2000000000000000
> +#define SPU2_RETURN_PAY  0x4000000000000000 /* return payload */
> +
> +/* FMD ctrl2 field masks */
> +#define SPU2_AAD1_OFFSET              0xFFF /* byte offset of AAD1 field */
> +#define SPU2_AAD1_LEN               0xFF000 /* length of AAD1 in bytes */
> +#define SPU2_AAD1_LEN_SHIFT              12
> +#define SPU2_AAD2_OFFSET         0xFFF00000 /* byte offset of AAD2 field */
> +#define SPU2_AAD2_OFFSET_SHIFT           20
> +#define SPU2_PL_OFFSET   0xFFFFFFFF00000000 /* payload offset from AAD2 */
> +#define SPU2_PL_OFFSET_SHIFT             32
> +
> +/* FMD ctrl3 field masks */
> +#define SPU2_PL_LEN              0xFFFFFFFF /* payload length in bytes */
> +#define SPU2_TLS_LEN         0xFFFF00000000 /* TLS encrypt: cipher len
> +                                            * TLS decrypt: compressed len
> +                                            */
> +#define SPU2_TLS_LEN_SHIFT               32
> +
> +/*
> + * Max value that can be represented in the Payload Length field of the
> + * ctrl3 word of FMD.
> + */
> +#define SPU2_MAX_PAYLOAD  SPU2_PL_LEN
> +
> +/* Error values returned in STATUS field of response messages */
> +#define SPU2_INVALID_ICV  1
> +
> +void spu2_dump_msg_hdr(u8 *buf, unsigned int buf_len);
> +u32 spu2_ctx_max_payload(enum spu_cipher_alg cipher_alg,
> +                        enum spu_cipher_mode cipher_mode,
> +                        unsigned int blocksize);
> +u32 spu2_payload_length(u8 *spu_hdr);
> +u16 spu2_response_hdr_len(u16 auth_key_len, u16 enc_key_len, bool is_hash);
> +u16 spu2_hash_pad_len(enum hash_alg hash_alg, enum hash_mode hash_mode,
> +                     u32 chunksize, u16 hash_block_size);
> +u32 spu2_gcm_ccm_pad_len(enum spu_cipher_mode cipher_mode,
> +                        unsigned int data_size);
> +u32 spu2_assoc_resp_len(enum spu_cipher_mode cipher_mode,
> +                       unsigned int assoc_len, unsigned int iv_len,
> +                       bool is_encrypt);
> +u8 spu2_aead_ivlen(enum spu_cipher_mode cipher_mode,
> +                  u16 iv_len);
> +enum hash_type spu2_hash_type(u32 src_sent);
> +u32 spu2_digest_size(u32 alg_digest_size, enum hash_alg alg,
> +                    enum hash_type htype);
> +u32 spu2_create_request(u8 *spu_hdr,
> +                       struct spu_request_opts *req_opts,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       struct spu_hash_parms *hash_parms,
> +                       struct spu_aead_parms *aead_parms,
> +                       unsigned int data_size);
> +u16 spu2_cipher_req_init(u8 *spu_hdr, struct spu_cipher_parms *cipher_parms);
> +void spu2_cipher_req_finish(u8 *spu_hdr,
> +                           u16 spu_req_hdr_len,
> +                           unsigned int is_inbound,
> +                           struct spu_cipher_parms *cipher_parms,
> +                           bool update_key,
> +                           unsigned int data_size);
> +void spu2_request_pad(u8 *pad_start, u32 gcm_padding, u32 hash_pad_len,
> +                     enum hash_alg auth_alg, enum hash_mode auth_mode,
> +                     unsigned int total_sent, u32 status_padding);
> +u8 spu2_xts_tweak_in_payload(void);
> +u8 spu2_tx_status_len(void);
> +u8 spu2_rx_status_len(void);
> +int spu2_status_process(u8 *statp);
> +void spu2_ccm_update_iv(unsigned int digestsize,
> +                       struct spu_cipher_parms *cipher_parms,
> +                       unsigned int assoclen, unsigned int chunksize,
> +                       bool is_encrypt, bool is_esp);
> +u32 spu2_wordalign_padlen(u32 data_size);
> +#endif
> diff --git a/drivers/crypto/bcm/spum.h b/drivers/crypto/bcm/spum.h
> new file mode 100644
> index 0000000..d0a5b58
> --- /dev/null
> +++ b/drivers/crypto/bcm/spum.h
> @@ -0,0 +1,174 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +/*
> + * This file contains SPU message definitions specific to SPU-M.
> + */
> +
> +#ifndef _SPUM_H_
> +#define _SPUM_H_
> +
> +#define SPU_CRYPTO_OPERATION_GENERIC   0x1
> +
> +/* Length of STATUS field in tx and rx packets */
> +#define SPU_TX_STATUS_LEN  4
> +
> +/* SPU-M error codes */
> +#define SPU_STATUS_MASK                 0x0000FF00
> +#define SPU_STATUS_SUCCESS              0x00000000
> +#define SPU_STATUS_INVALID_ICV          0x00000100
> +
> +#define SPU_STATUS_ERROR_FLAG           0x00020000
> +
> +/* Request message. MH + EMH + BDESC + BD header */
> +#define SPU_REQ_FIXED_LEN 24
> +
> +/*
> + * Max length of a SPU message header. Used to allocate a buffer where
> + * the SPU message header is constructed. Can be used for either a SPU-M
> + * header or a SPU2 header.
> + * For SPU-M, sum of the following:
> + *    MH - 4 bytes
> + *    EMH - 4
> + *    SCTX - 3 +
> + *      max auth key len - 64
> + *      max cipher key len - 264 (RC4)
> + *      max IV len - 16
> + *    BDESC - 12
> + *    BD header - 4
> + * Total:  371
> + *
> + * For SPU2, FMD_SIZE (32) plus lengths of hash and cipher keys,
> + * hash and cipher IVs. If SPU2 does not support RC4, then
> + */
> +#define SPU_HEADER_ALLOC_LEN  (SPU_REQ_FIXED_LEN + MAX_KEY_SIZE + \
> +                               MAX_KEY_SIZE + MAX_IV_SIZE)
> +
> +/*
> + * Response message header length. Normally MH, EMH, BD header, but when
> + * BD_SUPPRESS is used for hash requests, there is no BD header.
> + */
> +#define SPU_RESP_HDR_LEN 12
> +#define SPU_HASH_RESP_HDR_LEN 8
> +
> +/*
> + * Max value that can be represented in the Payload Length field of the BD
> + * header. This is a 16-bit field.
> + */
> +#define SPUM_NS2_MAX_PAYLOAD  (BIT(16) - 1)
> +
> +/*
> + * NSP SPU is limited to ~9KB because of FA2 FIFO size limitations;
> + * Set MAX_PAYLOAD to 8k to allow for addition of header, digest, etc.
> + * and stay within limitation.
> + */
> +
> +#define SPUM_NSP_MAX_PAYLOAD   8192
> +
> +/* Buffer Descriptor Header [BDESC]. SPU in big-endian mode. */
> +struct BDESC_HEADER {
> +       u16 offset_mac;         /* word 0 [31-16] */
> +       u16 length_mac;         /* word 0 [15-0]  */
> +       u16 offset_crypto;      /* word 1 [31-16] */
> +       u16 length_crypto;      /* word 1 [15-0]  */
> +       u16 offset_icv;         /* word 2 [31-16] */
> +       u16 offset_iv;          /* word 2 [15-0]  */
> +};
> +
> +/* Buffer Data Header [BD]. SPU in big-endian mode. */
> +struct BD_HEADER {
> +       u16 size;
> +       u16 prev_length;
> +};
> +
> +/* Command Context Header. SPU-M in big endian mode. */
> +struct MHEADER {
> +       u8 flags;       /* [31:24] */
> +       u8 op_code;     /* [23:16] */
> +       u16 reserved;   /* [15:0] */
> +};
> +
> +/* MH header flags bits */
> +#define MH_SUPDT_PRES   BIT(0)
> +#define MH_HASH_PRES    BIT(2)
> +#define MH_BD_PRES      BIT(3)
> +#define MH_MFM_PRES     BIT(4)
> +#define MH_BDESC_PRES   BIT(5)
> +#define MH_SCTX_PRES   BIT(7)
> +
> +/* SCTX word 0 bit offsets and fields masks */
> +#define SCTX_SIZE               0x000000FF
> +
> +/* SCTX word 1 bit shifts and field masks */
> +#define  UPDT_OFST              0x000000FF   /* offset of SCTX updateable fld */
> +#define  HASH_TYPE              0x00000300   /* hash alg operation type */
> +#define  HASH_TYPE_SHIFT                 8
> +#define  HASH_MODE              0x00001C00   /* one of spu2_hash_mode */
> +#define  HASH_MODE_SHIFT                10
> +#define  HASH_ALG               0x0000E000   /* hash algorithm */
> +#define  HASH_ALG_SHIFT                 13
> +#define  CIPHER_TYPE            0x00030000   /* encryption operation type */
> +#define  CIPHER_TYPE_SHIFT              16
> +#define  CIPHER_MODE            0x001C0000   /* encryption mode */
> +#define  CIPHER_MODE_SHIFT              18
> +#define  CIPHER_ALG             0x00E00000   /* encryption algo */
> +#define  CIPHER_ALG_SHIFT               21
> +#define  ICV_IS_512                BIT(27)
> +#define  ICV_IS_512_SHIFT              27
> +#define  CIPHER_ORDER               BIT(30)
> +#define  CIPHER_ORDER_SHIFT             30
> +#define  CIPHER_INBOUND             BIT(31)
> +#define  CIPHER_INBOUND_SHIFT           31
> +
> +/* SCTX word 2 bit shifts and field masks */
> +#define  EXP_IV_SIZE                   0x7
> +#define  IV_OFFSET                   BIT(3)
> +#define  IV_OFFSET_SHIFT                 3
> +#define  GEN_IV                      BIT(5)
> +#define  GEN_IV_SHIFT                    5
> +#define  EXPLICIT_IV                 BIT(6)
> +#define  EXPLICIT_IV_SHIFT               6
> +#define  SCTX_IV                     BIT(7)
> +#define  SCTX_IV_SHIFT                   7
> +#define  ICV_SIZE                   0x0F00
> +#define  ICV_SIZE_SHIFT                  8
> +#define  CHECK_ICV                  BIT(12)
> +#define  CHECK_ICV_SHIFT                12
> +#define  INSERT_ICV                 BIT(13)
> +#define  INSERT_ICV_SHIFT               13
> +#define  BD_SUPPRESS                BIT(19)
> +#define  BD_SUPPRESS_SHIFT              19
> +
> +/* Generic Mode Security Context Structure [SCTX] */
> +struct SCTX {
> +/* word 0: protocol flags */
> +       u32 proto_flags;
> +
> +/* word 1: cipher flags */
> +       u32 cipher_flags;
> +
> +/* word 2: Extended cipher flags */
> +       u32 ecf;
> +
> +};
> +
> +struct SPUHEADER {
> +       struct MHEADER mh;
> +       u32 emh;
> +       struct SCTX sa;
> +};
> +
> +#endif /* _SPUM_H_ */
> diff --git a/drivers/crypto/bcm/util.c b/drivers/crypto/bcm/util.c
> new file mode 100644
> index 0000000..dca540f
> --- /dev/null
> +++ b/drivers/crypto/bcm/util.c
> @@ -0,0 +1,584 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +#include <linux/debugfs.h>
> +
> +#include "cipher.h"
> +#include "util.h"
> +
> +/* offset of SPU_OFIFO_CTRL register */
> +#define SPU_OFIFO_CTRL      0x40
> +#define SPU_FIFO_WATERMARK  0x1FF
> +
> +/**
> + * spu_sg_at_offset() - Find the scatterlist entry at a given distance from the
> + * start of a scatterlist.
> + * @sg:         [in]  Start of a scatterlist
> + * @skip:       [in]  Distance from the start of the scatterlist, in bytes
> + * @sge:        [out] Scatterlist entry at skip bytes from start
> + * @sge_offset: [out] Number of bytes from start of sge buffer to get to
> + *                    requested distance.
> + *
> + * Return: 0 if entry found at requested distance
> + *         < 0 otherwise
> + */
> +int spu_sg_at_offset(struct scatterlist *sg, unsigned int skip,
> +                    struct scatterlist **sge, unsigned int *sge_offset)
> +{
> +       /* byte index from start of sg to the end of the previous entry */
> +       unsigned int index = 0;
> +       /* byte index from start of sg to the end of the current entry */
> +       unsigned int next_index;
> +
> +       next_index = sg->length;
> +       while (next_index <= skip) {
> +               sg = sg_next(sg);
> +               index = next_index;
> +               if (!sg)
> +                       return -EINVAL;
> +               next_index += sg->length;
> +       }
> +
> +       *sge_offset = skip - index;
> +       *sge = sg;
> +       return 0;
> +}
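
The walk in spu_sg_at_offset() is easier to check against a plain array. In the sketch below an array of segment lengths stands in for the scatterlist; sketch_seg_at_offset() and its parameters are hypothetical. As in the quoted code, an offset that falls exactly on a segment boundary resolves to the start of the next segment, and an offset at or past the end of the list is an error.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the scatterlist walk: lens[] holds the length
 * of each segment. On success returns 0 and reports which segment contains
 * byte `skip` and the offset inside that segment; returns -1 otherwise.
 */
static int sketch_seg_at_offset(const unsigned int *lens, size_t nsegs,
				unsigned int skip,
				size_t *seg, unsigned int *seg_off)
{
	unsigned int index = 0;	/* bytes before the current segment */
	size_t i;

	for (i = 0; i < nsegs; i++) {
		unsigned int next_index = index + lens[i];

		if (skip < next_index) {
			*seg = i;
			*seg_off = skip - index;
			return 0;
		}
		index = next_index;
	}
	return -1;
}
```
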
> +
> +/* Copy len bytes of sg data, starting at offset skip, to a dest buffer */
> +void sg_copy_part_to_buf(struct scatterlist *src, u8 *dest,
> +                        unsigned int len, unsigned int skip)
> +{
> +       size_t copied;
> +       unsigned int nents = sg_nents(src);
> +
> +       copied = sg_pcopy_to_buffer(src, nents, dest, len, skip);
> +       if (copied != len) {
> +               flow_log("%s copied %u bytes of %u requested. ",
> +                        __func__, (u32)copied, len);
> +               flow_log("sg with %u entries and skip %u\n", nents, skip);
> +       }
> +}
> +
> +/*
> + * Copy data into a scatterlist starting at a specified offset in the
> + * scatterlist. Specifically, copy len bytes of data in the buffer src
> + * into the scatterlist dest, starting skip bytes into the scatterlist.
> + */
> +void sg_copy_part_from_buf(struct scatterlist *dest, u8 *src,
> +                          unsigned int len, unsigned int skip)
> +{
> +       size_t copied;
> +       unsigned int nents = sg_nents(dest);
> +
> +       copied = sg_pcopy_from_buffer(dest, nents, src, len, skip);
> +       if (copied != len) {
> +               flow_log("%s copied %u bytes of %u requested. ",
> +                        __func__, (u32)copied, len);
> +               flow_log("sg with %u entries and skip %u\n", nents, skip);
> +       }
> +}
> +
> +/**
> + * spu_sg_count() - Determine number of elements in scatterlist to provide a
> + * specified number of bytes.
> + * @sg_list:  scatterlist to examine
> + * @skip:     index of starting point
> + * @nbytes:   consider elements of scatterlist until reaching this number of
> + *           bytes
> + *
> + * Return: the number of sg entries contributing to nbytes of data
> + */
> +int spu_sg_count(struct scatterlist *sg_list, unsigned int skip, int nbytes)
> +{
> +       struct scatterlist *sg;
> +       int sg_nents = 0;
> +       unsigned int offset;
> +
> +       if (!sg_list)
> +               return 0;
> +
> +       if (spu_sg_at_offset(sg_list, skip, &sg, &offset) < 0)
> +               return 0;
> +
> +       while (sg && (nbytes > 0)) {
> +               sg_nents++;
> +               nbytes -= (sg->length - offset);
> +               offset = 0;
> +               sg = sg_next(sg);
> +       }
> +       return sg_nents;
> +}
> +
> +/**
> + * spu_msg_sg_add() - Copy scatterlist entries from one sg to another, up to a
> + * given length.
> + * @to_sg:       scatterlist to copy to
> + * @from_sg:     scatterlist to copy from
> + * @from_skip:   number of bytes to skip in from_sg. Non-zero when previous
> + *              request included part of the buffer in entry in from_sg.
> + *              Assumes from_skip < from_sg->length.
> + * @from_nents:  number of entries in from_sg
> + * @length:      number of bytes to copy. may reach this limit before exhausting
> + *              from_sg.
> + *
> + * Copies the entries themselves, not the data in the entries. Assumes to_sg has
> + * enough entries. Does not limit the size of an individual buffer in to_sg.
> + *
> + * to_sg, from_sg, from_skip are all updated to end of copy
> + *
> + * Return: Number of bytes copied
> + */
> +u32 spu_msg_sg_add(struct scatterlist **to_sg,
> +                  struct scatterlist **from_sg, u32 *from_skip,
> +                  u8 from_nents, u32 length)
> +{
> +       struct scatterlist *sg; /* an entry in from_sg */
> +       struct scatterlist *to = *to_sg;
> +       struct scatterlist *from = *from_sg;
> +       u32 skip = *from_skip;
> +       u32 offset;
> +       int i;
> +       u32 entry_len = 0;
> +       u32 frag_len = 0;       /* length of entry added to to_sg */
> +       u32 copied = 0;         /* number of bytes copied so far */
> +
> +       if (length == 0)
> +               return 0;
> +
> +       for_each_sg(from, sg, from_nents, i) {
> +               /* number of bytes in this from entry not yet used */
> +               entry_len = sg->length - skip;
> +               frag_len = min(entry_len, length - copied);
> +               offset = sg->offset + skip;
> +               if (frag_len)
> +                       sg_set_page(to++, sg_page(sg), frag_len, offset);
> +               copied += frag_len;
> +               if (copied == entry_len) {
> +                       /* used up all of from entry */
> +                       skip = 0;       /* start at beginning of next entry */
> +               }
> +               if (copied == length)
> +                       break;
> +       }
> +       *to_sg = to;
> +       *from_sg = sg;
> +       if (frag_len < entry_len)
> +               *from_skip = skip + frag_len;
> +       else
> +               *from_skip = 0;
> +
> +       return copied;
> +}
> +
> +void add_to_ctr(u8 *ctr_pos, unsigned int increment)
> +{
> +       __be64 *high_be = (__be64 *)ctr_pos;
> +       __be64 *low_be = high_be + 1;
> +       u64 orig_low = __be64_to_cpu(*low_be);
> +       u64 new_low = orig_low + (u64)increment;
> +
> +       *low_be = __cpu_to_be64(new_low);
> +       if (new_low < orig_low)
> +               /* there was a carry from the low 8 bytes */
> +               *high_be = __cpu_to_be64(__be64_to_cpu(*high_be) + 1);
> +}
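
add_to_ctr() has no comment, so here is a user-space sketch of what it appears to do: treat the 16-byte counter block as two big-endian 64-bit halves, add the increment to the low half, and carry into the high half on wraparound. The helpers load_be64(), store_be64() and sketch_add_to_ctr() are hypothetical; the patch uses __be64 pointers and the kernel byte-order macros instead of bytewise access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 64-bit big-endian value byte by byte. */
static uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical helper: write a 64-bit value in big-endian byte order. */
static void store_be64(uint8_t *p, uint64_t v)
{
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (uint8_t)v;
		v >>= 8;
	}
}

/* Add increment to a 128-bit big-endian counter held in ctr[0..15]. */
static void sketch_add_to_ctr(uint8_t ctr[16], unsigned int increment)
{
	uint64_t orig_low = load_be64(ctr + 8);
	uint64_t new_low = orig_low + increment;

	store_be64(ctr + 8, new_low);
	if (new_low < orig_low)	/* unsigned wraparound means a carry */
		store_be64(ctr, load_be64(ctr) + 1);
}
```

Bytewise access sidesteps any alignment requirement on the counter pointer, which the pointer cast in the quoted version implicitly relies on.
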
> +
> +struct sdesc {
> +       struct shash_desc shash;
> +       char ctx[];
> +};
> +
> +/* do a synchronous decrypt operation */
> +int do_decrypt(char *alg_name,
> +              void *key_ptr, unsigned int key_len,
> +              void *iv_ptr, void *src_ptr, void *dst_ptr,
> +              unsigned int block_len)
> +{
> +       struct scatterlist sg_in[1], sg_out[1];
> +       struct crypto_blkcipher *tfm =
> +           crypto_alloc_blkcipher(alg_name, 0, CRYPTO_ALG_ASYNC);
> +       struct blkcipher_desc desc = {.tfm = tfm, .flags = 0 };
> +       int ret = 0;
> +       void *iv;
> +       int ivsize;
> +
> +       flow_log("%s() name:%s block_len:%u\n", __func__, alg_name, block_len);
> +
> +       if (IS_ERR(tfm))
> +               return PTR_ERR(tfm);
> +
> +       crypto_blkcipher_setkey((void *)tfm, key_ptr, key_len);
> +
> +       sg_init_table(sg_in, 1);
> +       sg_set_buf(sg_in, src_ptr, block_len);
> +
> +       sg_init_table(sg_out, 1);
> +       sg_set_buf(sg_out, dst_ptr, block_len);
> +
> +       iv = crypto_blkcipher_crt(tfm)->iv;
> +       ivsize = crypto_blkcipher_ivsize(tfm);
> +       memcpy(iv, iv_ptr, ivsize);
> +
> +       ret = crypto_blkcipher_decrypt(&desc, sg_out, sg_in, block_len);
> +       crypto_free_blkcipher(tfm);
> +
> +       if (ret < 0)
> +               pr_err("aes_decrypt failed %d\n", ret);
> +
> +       return ret;
> +}
> +
> +/**
> + * do_shash() - Do a synchronous hash operation in software
> + * @name:       The name of the hash algorithm
> + * @result:     Buffer where digest is to be written
> + * @data1:      First part of data to hash. May be NULL.
> + * @data1_len:  Length of data1, in bytes
> + * @data2:      Second part of data to hash. May be NULL.
> + * @data2_len:  Length of data2, in bytes
> + * @key:       Key (if keyed hash)
> + * @key_len:   Length of key, in bytes (or 0 if non-keyed hash)
> + *
> + * Note that the crypto API will not select this driver's own transform because
> + * this driver only registers asynchronous algos.
> + *
> + * Return: 0 if hash successfully stored in result
> + *         < 0 otherwise
> + */
> +int do_shash(unsigned char *name, unsigned char *result,
> +            const u8 *data1, unsigned int data1_len,
> +            const u8 *data2, unsigned int data2_len,
> +            const u8 *key, unsigned int key_len)
> +{
> +       int rc;
> +       unsigned int size;
> +       struct crypto_shash *hash;
> +       struct sdesc *sdesc;
> +
> +       hash = crypto_alloc_shash(name, 0, 0);
> +       if (IS_ERR(hash)) {
> +               rc = PTR_ERR(hash);
> +               pr_err("%s: Crypto %s allocation error %d", __func__, name, rc);
> +               return rc;
> +       }
> +
> +       size = sizeof(struct shash_desc) + crypto_shash_descsize(hash);
> +       sdesc = kmalloc(size, GFP_KERNEL);
> +       if (!sdesc) {
> +               rc = -ENOMEM;
> +               pr_err("%s: Memory allocation failure", __func__);
> +               goto do_shash_err;
> +       }
> +       sdesc->shash.tfm = hash;
> +       sdesc->shash.flags = 0x0;
> +
> +       if (key_len > 0) {
> +               rc = crypto_shash_setkey(hash, key, key_len);
> +               if (rc) {
> +                       pr_err("%s: Could not setkey %s shash", __func__, name);
> +                       goto do_shash_err;
> +               }
> +       }
> +
> +       rc = crypto_shash_init(&sdesc->shash);
> +       if (rc) {
> +               pr_err("%s: Could not init %s shash", __func__, name);
> +               goto do_shash_err;
> +       }
> +       rc = crypto_shash_update(&sdesc->shash, data1, data1_len);
> +       if (rc) {
> +               pr_err("%s: Could not update1", __func__);
> +               goto do_shash_err;
> +       }
> +       if (data2 && data2_len) {
> +               rc = crypto_shash_update(&sdesc->shash, data2, data2_len);
> +               if (rc) {
> +                       pr_err("%s: Could not update2", __func__);
> +                       goto do_shash_err;
> +               }
> +       }
> +       rc = crypto_shash_final(&sdesc->shash, result);
> +       if (rc)
> +               pr_err("%s: Could not generate %s hash", __func__, name);
> +
> +do_shash_err:
> +       crypto_free_shash(hash);
> +       kfree(sdesc);
> +
> +       return rc;
> +}
> +
> +/* Dump len bytes of a scatterlist starting at skip bytes into the sg */
> +void __dump_sg(struct scatterlist *sg, unsigned int skip, unsigned int len)
> +{
> +       u8 dbuf[16];
> +       unsigned int idx = skip;
> +       unsigned int num_out = 0;       /* number of bytes dumped so far */
> +       unsigned int count;
> +
> +       if (packet_debug_logging) {
> +               while (num_out < len) {
> +                       count = (len - num_out > 16) ? 16 : len - num_out;
> +                       sg_copy_part_to_buf(sg, dbuf, count, idx);
> +                       num_out += count;
> +                       print_hex_dump(KERN_ALERT, "  sg: ", DUMP_PREFIX_NONE,
> +                                      4, 1, dbuf, count, false);
> +                       idx += 16;
> +               }
> +       }
> +       if (debug_logging_sleep)
> +               msleep(debug_logging_sleep);
> +}
> +
> +/* Returns the name for a given cipher alg/mode */
> +char *spu_alg_name(enum spu_cipher_alg alg, enum spu_cipher_mode mode)
> +{
> +       switch (alg) {
> +       case CIPHER_ALG_RC4:
> +               return "rc4";
> +       case CIPHER_ALG_AES:
> +               switch (mode) {
> +               case CIPHER_MODE_CBC:
> +                       return "cbc(aes)";
> +               case CIPHER_MODE_ECB:
> +                       return "ecb(aes)";
> +               case CIPHER_MODE_OFB:
> +                       return "ofb(aes)";
> +               case CIPHER_MODE_CFB:
> +                       return "cfb(aes)";
> +               case CIPHER_MODE_CTR:
> +                       return "ctr(aes)";
> +               case CIPHER_MODE_XTS:
> +                       return "xts(aes)";
> +               case CIPHER_MODE_GCM:
> +                       return "gcm(aes)";
> +               default:
> +                       return "aes";
> +               }
> +               break;
> +       case CIPHER_ALG_DES:
> +               switch (mode) {
> +               case CIPHER_MODE_CBC:
> +                       return "cbc(des)";
> +               case CIPHER_MODE_ECB:
> +                       return "ecb(des)";
> +               case CIPHER_MODE_CTR:
> +                       return "ctr(des)";
> +               default:
> +                       return "des";
> +               }
> +               break;
> +       case CIPHER_ALG_3DES:
> +               switch (mode) {
> +               case CIPHER_MODE_CBC:
> +                       return "cbc(des3_ede)";
> +               case CIPHER_MODE_ECB:
> +                       return "ecb(des3_ede)";
> +               case CIPHER_MODE_CTR:
> +                       return "ctr(des3_ede)";
> +               default:
> +                       return "3des";
> +               }
> +               break;
> +       default:
> +               return "other";
> +       }
> +}
> +
> +static ssize_t spu_debugfs_read(struct file *filp, char __user *ubuf,
> +                               size_t count, loff_t *offp)
> +{
> +       struct device_private *ipriv;
> +       char *buf;
> +       ssize_t ret, out_offset, out_count;
> +       int i;
> +       u32 fifo_len;
> +       u32 spu_ofifo_ctrl;
> +       u32 alg;
> +       u32 mode;
> +       u32 op_cnt;
> +
> +       out_count = 2048;
> +
> +       buf = kmalloc(out_count, GFP_KERNEL);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       ipriv = filp->private_data;
> +       out_offset = 0;
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Number of SPUs.........%u\n",
> +                              ipriv->spu.num_spu);
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Number of channels.....%u\n",
> +                              ipriv->spu.num_chan);
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Current sessions.......%u\n",
> +                              atomic_read(&ipriv->session_count));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Session count..........%u\n",
> +                              atomic_read(&ipriv->stream_count));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Cipher setkey..........%u\n",
> +                              atomic_read(&ipriv->setkey_cnt[SPU_OP_CIPHER]));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Cipher Ops.............%u\n",
> +                              atomic_read(&ipriv->op_counts[SPU_OP_CIPHER]));
> +       for (alg = 0; alg < CIPHER_ALG_LAST; alg++) {
> +               for (mode = 0; mode < CIPHER_MODE_LAST; mode++) {
> +                       op_cnt = atomic_read(&ipriv->cipher_cnt[alg][mode]);
> +                       if (op_cnt) {
> +                               out_offset += snprintf(buf + out_offset,
> +                                                      out_count - out_offset,
> +                              "  %-13s%11u\n",
> +                              spu_alg_name(alg, mode), op_cnt);
> +                       }
> +               }
> +       }
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Hash Ops...............%u\n",
> +                              atomic_read(&ipriv->op_counts[SPU_OP_HASH]));
> +       for (alg = 0; alg < HASH_ALG_LAST; alg++) {
> +               op_cnt = atomic_read(&ipriv->hash_cnt[alg]);
> +               if (op_cnt) {
> +                       out_offset += snprintf(buf + out_offset,
> +                                              out_count - out_offset,
> +                      "  %-13s%11u\n",
> +                      hash_alg_name[alg], op_cnt);
> +               }
> +       }
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "HMAC setkey............%u\n",
> +                              atomic_read(&ipriv->setkey_cnt[SPU_OP_HMAC]));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "HMAC Ops...............%u\n",
> +                              atomic_read(&ipriv->op_counts[SPU_OP_HMAC]));
> +       for (alg = 0; alg < HASH_ALG_LAST; alg++) {
> +               op_cnt = atomic_read(&ipriv->hmac_cnt[alg]);
> +               if (op_cnt) {
> +                       out_offset += snprintf(buf + out_offset,
> +                                              out_count - out_offset,
> +                      "  %-13s%11u\n",
> +                      hash_alg_name[alg], op_cnt);
> +               }
> +       }
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "AEAD setkey............%u\n",
> +                              atomic_read(&ipriv->setkey_cnt[SPU_OP_AEAD]));
> +
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "AEAD Ops...............%u\n",
> +                              atomic_read(&ipriv->op_counts[SPU_OP_AEAD]));
> +       for (alg = 0; alg < AEAD_TYPE_LAST; alg++) {
> +               op_cnt = atomic_read(&ipriv->aead_cnt[alg]);
> +               if (op_cnt) {
> +                       out_offset += snprintf(buf + out_offset,
> +                                              out_count - out_offset,
> +                      "  %-13s%11u\n",
> +                      aead_alg_name[alg], op_cnt);
> +               }
> +       }
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Bytes of req data......%llu\n",
> +                              (u64)atomic64_read(&ipriv->bytes_out));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Bytes of resp data.....%llu\n",
> +                              (u64)atomic64_read(&ipriv->bytes_in));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Mailbox full...........%u\n",
> +                              atomic_read(&ipriv->mb_no_spc));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Mailbox send failures..%u\n",
> +                              atomic_read(&ipriv->mb_send_fail));
> +       out_offset += snprintf(buf + out_offset, out_count - out_offset,
> +                              "Check ICV errors.......%u\n",
> +                              atomic_read(&ipriv->bad_icv));
> +       if (ipriv->spu.spu_type == SPU_TYPE_SPUM)
> +               for (i = 0; i < ipriv->spu.num_spu; i++) {
> +                       spu_ofifo_ctrl = ioread32(ipriv->spu.reg_vbase[i] +
> +                                                 SPU_OFIFO_CTRL);
> +                       fifo_len = spu_ofifo_ctrl & SPU_FIFO_WATERMARK;
> +                       out_offset += snprintf(buf + out_offset,
> +                                              out_count - out_offset,
> +                                      "SPU %d output FIFO high water.....%u\n",
> +                                      i, fifo_len);
> +               }
> +
> +       if (out_offset > out_count)
> +               out_offset = out_count;
> +
> +       ret = simple_read_from_buffer(ubuf, count, offp, buf, out_offset);
> +       kfree(buf);
> +       return ret;
> +}
> +
> +static const struct file_operations spu_debugfs_stats = {
> +       .owner = THIS_MODULE,
> +       .open = simple_open,
> +       .read = spu_debugfs_read,
> +};
> +
> +/*
> + * Create the debug FS directories. If the top-level directory has not yet
> + * been created, create it now. Create a stats file in this directory for
> + * a SPU.
> + */
> +void spu_setup_debugfs(void)
> +{
> +       if (!debugfs_initialized())
> +               return;
> +
> +       if (!iproc_priv.debugfs_dir)
> +               iproc_priv.debugfs_dir = debugfs_create_dir(KBUILD_MODNAME,
> +                                                           NULL);
> +
> +       if (!iproc_priv.debugfs_stats)
> +               /* Create file with permissions S_IRUSR */
> +               debugfs_create_file("stats", 0400, iproc_priv.debugfs_dir,
> +                                   &iproc_priv, &spu_debugfs_stats);
> +}
> +
> +void spu_free_debugfs(void)
> +{
> +       debugfs_remove_recursive(iproc_priv.debugfs_dir);
> +       iproc_priv.debugfs_dir = NULL;
> +}
> +
> +/**
> + * format_value_ccm() - Format a value into a buffer, using a specified number
> + *                     of bytes (i.e. maybe writing value X into a 4 byte
> + *                     buffer, or maybe into a 12 byte buffer), as per the
> + *                     SPU CCM spec.
> + *
> + * @val:               value to write (up to max of unsigned int)
> + * @buf:               (pointer to) buffer to write the value
> + * @len:               number of bytes to use (0 to 255)
> + *
> + */
> +void format_value_ccm(unsigned int val, u8 *buf, u8 len)
> +{
> +       int i;
> +
> +       /* First clear full output buffer */
> +       memset(buf, 0, len);
> +
> +       /* Then, starting from right side, fill in with data */
> +       for (i = 0; i < len; i++) {
> +               buf[len - i - 1] = (val >> (8 * i)) & 0xff;
> +               if (i >= 3)
> +                       break;  /* Only handle up to 32 bits of 'val' */
> +       }
> +}
> diff --git a/drivers/crypto/bcm/util.h b/drivers/crypto/bcm/util.h
> new file mode 100644
> index 0000000..b858c45
> --- /dev/null
> +++ b/drivers/crypto/bcm/util.h
> @@ -0,0 +1,117 @@
> +/*
> + * Copyright 2016 Broadcom
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License, version 2, as
> + * published by the Free Software Foundation (the "GPL").
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License version 2 (GPLv2) for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * version 2 (GPLv2) along with this source code.
> + */
> +
> +#ifndef _UTIL_H
> +#define _UTIL_H
> +
> +#include <linux/kernel.h>
> +#include <linux/delay.h>
> +
> +#include "spu.h"
> +
> +extern int flow_debug_logging;
> +extern int packet_debug_logging;
> +extern int debug_logging_sleep;
> +
> +#ifdef DEBUG
> +#define flow_log(...)                  \
> +       do {                                  \
> +               if (flow_debug_logging) {               \
> +                       printk(__VA_ARGS__);              \
> +                       if (debug_logging_sleep)              \
> +                               msleep(debug_logging_sleep);    \
> +               }                                       \
> +       } while (0)
> +#define flow_dump(msg, var, var_len)      \
> +       do {                                     \
> +               if (flow_debug_logging) {                  \
> +                       print_hex_dump(KERN_ALERT, msg, DUMP_PREFIX_NONE,  \
> +                                       16, 1, var, var_len, false); \
> +                               if (debug_logging_sleep)               \
> +                                       msleep(debug_logging_sleep);   \
> +               }                                    \
> +       } while (0)
> +
> +#define packet_log(...)               \
> +       do {                                \
> +               if (packet_debug_logging) {       \
> +                       printk(__VA_ARGS__);            \
> +                       if (debug_logging_sleep)        \
> +                               msleep(debug_logging_sleep);  \
> +               }                                 \
> +       } while (0)
> +#define packet_dump(msg, var, var_len)   \
> +       do {                                   \
> +               if (packet_debug_logging) {          \
> +                       print_hex_dump(KERN_ALERT, msg, DUMP_PREFIX_NONE,  \
> +                                       16, 1, var, var_len, false); \
> +                       if (debug_logging_sleep)           \
> +                               msleep(debug_logging_sleep);     \
> +               }                                    \
> +       } while (0)
> +
> +void __dump_sg(struct scatterlist *sg, unsigned int skip, unsigned int len);
> +
> +#define dump_sg(sg, skip, len)     __dump_sg(sg, skip, len)
> +
> +#else /* !DEBUG */
> +
> +#define flow_log(...) do {} while (0)
> +#define flow_dump(msg, var, var_len) do {} while (0)
> +#define packet_log(...) do {} while (0)
> +#define packet_dump(msg, var, var_len) do {} while (0)
> +
> +#define dump_sg(sg, skip, len) do {} while (0)
> +
> +#endif /* DEBUG */
> +
> +int spu_sg_at_offset(struct scatterlist *sg, unsigned int skip,
> +                    struct scatterlist **sge, unsigned int *sge_offset);
> +
> +/* Copy sg data, from skip, length len, to dest */
> +void sg_copy_part_to_buf(struct scatterlist *src, u8 *dest,
> +                        unsigned int len, unsigned int skip);
> +/* Copy src into scatterlist from offset, length len */
> +void sg_copy_part_from_buf(struct scatterlist *dest, u8 *src,
> +                          unsigned int len, unsigned int skip);
> +
> +int spu_sg_count(struct scatterlist *sg_list, unsigned int skip, int nbytes);
> +u32 spu_msg_sg_add(struct scatterlist **to_sg,
> +                  struct scatterlist **from_sg, u32 *skip,
> +                  u8 from_nents, u32 tot_len);
> +
> +void add_to_ctr(u8 *ctr_pos, unsigned int increment);
> +
> +/* do a synchronous decrypt operation */
> +int do_decrypt(char *alg_name,
> +              void *key_ptr, unsigned int key_len,
> +              void *iv_ptr, void *src_ptr, void *dst_ptr,
> +              unsigned int block_len);
> +
> +/* produce a message digest from data of length n bytes */
> +int do_shash(unsigned char *name, unsigned char *result,
> +            const u8 *data1, unsigned int data1_len,
> +            const u8 *data2, unsigned int data2_len,
> +            const u8 *key, unsigned int key_len);
> +
> +char *spu_alg_name(enum spu_cipher_alg alg, enum spu_cipher_mode mode);
> +
> +void spu_setup_debugfs(void);
> +void spu_free_debugfs(void);
> +void spu_free_debugfs_stats(void);
> +void format_value_ccm(unsigned int val, u8 *buf, u8 len);
> +
> +#endif
> --
> 2.1.0
>

^ permalink raw reply

* [PATCH v2 6/6] crypto: arm/crc32 - accelerated support based on x86 SSE implementation
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

This is a combination of the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.

The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.
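
As a rough illustration of the split described above, here is a user-space
sketch (not code from this patch). The helper name pmull_bytes() and its
have_pmull argument are hypothetical; the constants mirror the 64-byte
minimum and 16-byte granularity stated in the text.

```c
#include <stddef.h>

#define PMULL_MIN_LEN	64u	/* smallest buffer handed to the PMULL routine */
#define PMULL_BLOCK	16u	/* PMULL routine consumes multiples of this */

/*
 * Hypothetical helper: number of leading bytes that would be handed to
 * the PMULL/NEON routine. Whatever is left over (len minus the return
 * value) would be processed by the CRC32 instructions instead.
 */
static size_t pmull_bytes(size_t len, int have_pmull)
{
	if (!have_pmull || len < PMULL_MIN_LEN)
		return 0;

	/* round down to a multiple of 16 bytes */
	return len & ~(size_t)(PMULL_BLOCK - 1);
}
```

With PMULL available, a 100-byte buffer would send 96 bytes to the PMULL
routine and the trailing 4 bytes to the CRC32 instructions, while a 15-byte
buffer would go entirely to the CRC32 instructions.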

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/Kconfig         |   5 +
 arch/arm/crypto/Makefile        |   2 +
 arch/arm/crypto/crc32-ce-core.S | 306 ++++++++++++++++++++
 arch/arm/crypto/crc32-ce-glue.c | 195 +++++++++++++
 4 files changed, 508 insertions(+)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index fce801fa52a1..de7bb20815bf 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -125,4 +125,9 @@ config CRYPTO_CRCT10DIF_ARM_CE
 	depends on KERNEL_MODE_NEON && CRC_T10DIF
 	select CRYPTO_HASH
 
+config CRYPTO_CRC32_ARM_CE
+	tristate "CRC32(C) digest algorithm using CRC and/or PMULL instructions"
+	depends on KERNEL_MODE_NEON && CRC32
+	select CRYPTO_HASH
+
 endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index fc77265014b7..b578a1820ab1 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -14,6 +14,7 @@ ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM_CE) += crct10dif-arm-ce.o
+ce-obj-$(CONFIG_CRYPTO_CRC32_ARM_CE) += crc32-arm-ce.o
 
 ifneq ($(ce-obj-y)$(ce-obj-m),)
 ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
@@ -38,6 +39,7 @@ sha2-arm-ce-y	:= sha2-ce-core.o sha2-ce-glue.o
 aes-arm-ce-y	:= aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y	:= crct10dif-ce-core.o crct10dif-ce-glue.o
+crc32-arm-ce-y	:= crc32-ce-core.o crc32-ce-glue.o
 
 quiet_cmd_perl = PERL    $@
       cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/crc32-ce-core.S b/arch/arm/crypto/crc32-ce-core.S
new file mode 100644
index 000000000000..70e0c8042880
--- /dev/null
+++ b/arch/arm/crypto/crc32-ce-core.S
@@ -0,0 +1,306 @@
+/*
+ * Accelerated CRC32(C) using ARM CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please  visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Using hardware provided PCLMULQDQ instruction to accelerate the CRC32
+ * calculation.
+ * CRC32 polynomial:0x04c11db7(BE)/0xEDB88320(LE)
+ * PCLMULQDQ is a new instruction in Intel SSE4.2, the reference can be found
+ * at:
+ * http://www.intel.com/products/processor/manuals/
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+ * Volume 2B: Instruction Set Reference, N-Z
+ *
+ * Authors:   Gregory Prestas <Gregory_Prestas@us.xyratex.com>
+ *	      Alexander Boyko <Alexander_Boyko@xyratex.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.text
+	.align		6
+	.arch		armv8-a
+	.arch_extension	crc
+	.fpu		crypto-neon-fp-armv8
+
+.Lcrc32_constants:
+	/*
+	 * [(x4*128+32 mod P(x) << 32)]' << 1   = 0x154442bd4
+	 * #define CONSTANT_R1  0x154442bd4LL
+	 *
+	 * [(x4*128-32 mod P(x) << 32)]' << 1   = 0x1c6e41596
+	 * #define CONSTANT_R2  0x1c6e41596LL
+	 */
+	.quad		0x0000000154442bd4
+	.quad		0x00000001c6e41596
+
+	/*
+	 * [(x128+32 mod P(x) << 32)]'   << 1   = 0x1751997d0
+	 * #define CONSTANT_R3  0x1751997d0LL
+	 *
+	 * [(x128-32 mod P(x) << 32)]'   << 1   = 0x0ccaa009e
+	 * #define CONSTANT_R4  0x0ccaa009eLL
+	 */
+	.quad		0x00000001751997d0
+	.quad		0x00000000ccaa009e
+
+	/*
+	 * [(x64 mod P(x) << 32)]'       << 1   = 0x163cd6124
+	 * #define CONSTANT_R5  0x163cd6124LL
+	 */
+	.quad		0x0000000163cd6124
+	.quad		0x00000000FFFFFFFF
+
+	/*
+	 * #define CRCPOLY_TRUE_LE_FULL 0x1DB710641LL
+	 *
+	 * Barrett Reduction constant (u64`) = u` = (x**64 / P(x))`
+	 *                                                      = 0x1F7011641LL
+	 * #define CONSTANT_RU  0x1F7011641LL
+	 */
+	.quad		0x00000001DB710641
+	.quad		0x00000001F7011641
+
+.Lcrc32c_constants:
+	.quad		0x00000000740eef02
+	.quad		0x000000009e4addf8
+	.quad		0x00000000f20c0dfe
+	.quad		0x000000014cd00bd6
+	.quad		0x00000000dd45aab8
+	.quad		0x00000000FFFFFFFF
+	.quad		0x0000000105ec76f0
+	.quad		0x00000000dea713f1
+
+	dCONSTANTl	.req	d0
+	dCONSTANTh	.req	d1
+	qCONSTANT	.req	q0
+
+	BUF		.req	r0
+	LEN		.req	r1
+	CRC		.req	r2
+
+	qzr		.req	q9
+
+	/**
+	 * Calculate crc32
+	 * BUF - buffer
+	 * LEN - sizeof buffer (multiple of 16 bytes), LEN should be > 63
+	 * CRC - initial crc32
+	 * return crc32 in r0
+	 * uint crc32_pmull_le(unsigned char const *buffer,
+	 *                     size_t len, uint crc32)
+	 */
+ENTRY(crc32_pmull_le)
+	adr		r3, .Lcrc32_constants
+	b		0f
+
+ENTRY(crc32c_pmull_le)
+	adr		r3, .Lcrc32c_constants
+
+0:	bic		LEN, LEN, #15
+	vld1.8		{q1-q2}, [BUF]!
+	vld1.8		{q3-q4}, [BUF]!
+	vmov.i8		qzr, #0
+	vmov.i8		qCONSTANT, #0
+	vmov		dCONSTANTl[0], CRC
+	veor.8		d2, d2, dCONSTANTl
+	sub		LEN, LEN, #0x40
+	cmp		LEN, #0x40
+	blt		less_64
+
+	vld1.64		{qCONSTANT}, [r3]
+
+loop_64:		/* 64 bytes Full cache line folding */
+	sub		LEN, LEN, #0x40
+
+	vmull.p64	q5, d3, dCONSTANTh
+	vmull.p64	q6, d5, dCONSTANTh
+	vmull.p64	q7, d7, dCONSTANTh
+	vmull.p64	q8, d9, dCONSTANTh
+
+	vmull.p64	q1, d2, dCONSTANTl
+	vmull.p64	q2, d4, dCONSTANTl
+	vmull.p64	q3, d6, dCONSTANTl
+	vmull.p64	q4, d8, dCONSTANTl
+
+	veor.8		q1, q1, q5
+	vld1.8		{q5}, [BUF]!
+	veor.8		q2, q2, q6
+	vld1.8		{q6}, [BUF]!
+	veor.8		q3, q3, q7
+	vld1.8		{q7}, [BUF]!
+	veor.8		q4, q4, q8
+	vld1.8		{q8}, [BUF]!
+
+	veor.8		q1, q1, q5
+	veor.8		q2, q2, q6
+	veor.8		q3, q3, q7
+	veor.8		q4, q4, q8
+
+	cmp		LEN, #0x40
+	bge		loop_64
+
+less_64:		/* Folding cache line into 128bit */
+	vldr		dCONSTANTl, [r3, #16]
+	vldr		dCONSTANTh, [r3, #24]
+
+	vmull.p64	q5, d3, dCONSTANTh
+	vmull.p64	q1, d2, dCONSTANTl
+	veor.8		q1, q1, q5
+	veor.8		q1, q1, q2
+
+	vmull.p64	q5, d3, dCONSTANTh
+	vmull.p64	q1, d2, dCONSTANTl
+	veor.8		q1, q1, q5
+	veor.8		q1, q1, q3
+
+	vmull.p64	q5, d3, dCONSTANTh
+	vmull.p64	q1, d2, dCONSTANTl
+	veor.8		q1, q1, q5
+	veor.8		q1, q1, q4
+
+	teq		LEN, #0
+	beq		fold_64
+
+loop_16:		/* Folding rest buffer into 128bit */
+	subs		LEN, LEN, #0x10
+
+	vld1.8		{q2}, [BUF]!
+	vmull.p64	q5, d3, dCONSTANTh
+	vmull.p64	q1, d2, dCONSTANTl
+	veor.8		q1, q1, q5
+	veor.8		q1, q1, q2
+
+	bne		loop_16
+
+fold_64:
+	/* perform the last 64 bit fold, also adds 32 zeroes
+	 * to the input stream */
+	vmull.p64	q2, d2, dCONSTANTh
+	vext.8		q1, q1, qzr, #8
+	veor.8		q1, q1, q2
+
+	/* final 32-bit fold */
+	vldr		dCONSTANTl, [r3, #32]
+	vldr		d6, [r3, #40]
+	vmov.i8		d7, #0
+
+	vext.8		q2, q1, qzr, #4
+	vand.8		d2, d2, d6
+	vmull.p64	q1, d2, dCONSTANTl
+	veor.8		q1, q1, q2
+
+	/* Finish up with the bit-reversed barrett reduction 64 ==> 32 bits */
+	vldr		dCONSTANTl, [r3, #48]
+	vldr		dCONSTANTh, [r3, #56]
+
+	vand.8		q2, q1, q3
+	vext.8		q2, qzr, q2, #8
+	vmull.p64	q2, d5, dCONSTANTh
+	vand.8		q2, q2, q3
+	vmull.p64	q2, d4, dCONSTANTl
+	veor.8		q1, q1, q2
+	vmov		r0, s5
+
+	bx		lr
+ENDPROC(crc32_pmull_le)
+ENDPROC(crc32c_pmull_le)
+
+	.macro		__crc32, c
+	subs		ip, r2, #8
+	bmi		.Ltail\c
+
+	tst		r1, #3
+	bne		.Lunaligned\c
+
+	teq		ip, #0
+.Laligned8\c:
+	ldrd		r2, r3, [r1], #8
+ARM_BE8(rev		r2, r2		)
+ARM_BE8(rev		r3, r3		)
+	crc32\c\()w	r0, r0, r2
+	crc32\c\()w	r0, r0, r3
+	bxeq		lr
+	subs		ip, ip, #8
+	bpl		.Laligned8\c
+
+.Ltail\c:
+	tst		ip, #4
+	beq		2f
+	ldr		r3, [r1], #4
+ARM_BE8(rev		r3, r3		)
+	crc32\c\()w	r0, r0, r3
+
+2:	tst		ip, #2
+	beq		1f
+	ldrh		r3, [r1], #2
+ARM_BE8(rev16		r3, r3		)
+	crc32\c\()h	r0, r0, r3
+
+1:	tst		ip, #1
+	bxeq		lr
+	ldrb		r3, [r1]
+	crc32\c\()b	r0, r0, r3
+	bx		lr
+
+.Lunaligned\c:
+	tst		r1, #1
+	beq		2f
+	ldrb		r3, [r1], #1
+	subs		r2, r2, #1
+	crc32\c\()b	r0, r0, r3
+
+	tst		r1, #2
+	beq		0f
+2:	ldrh		r3, [r1], #2
+	subs		r2, r2, #2
+ARM_BE8(rev16		r3, r3		)
+	crc32\c\()h	r0, r0, r3
+
+0:	subs		ip, r2, #8
+	bpl		.Laligned8\c
+	b		.Ltail\c
+	.endm
+
+	.align		5
+ENTRY(crc32_armv8_le)
+	__crc32
+ENDPROC(crc32_armv8_le)
+
+	.align		5
+ENTRY(crc32c_armv8_le)
+	__crc32		c
+ENDPROC(crc32c_armv8_le)
diff --git a/arch/arm/crypto/crc32-ce-glue.c b/arch/arm/crypto/crc32-ce-glue.c
new file mode 100644
index 000000000000..f78cf1689669
--- /dev/null
+++ b/arch/arm/crypto/crc32-ce-glue.c
@@ -0,0 +1,195 @@
+/*
+ * Accelerated CRC32(C) using ARM CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <asm/unaligned.h>
+
+#define PMULL_MIN_LEN		64L	/* minimum size of buffer
+					 * for crc32_pmull_le */
+#define SCALE_F			16L	/* size of NEON register */
+
+asmlinkage u32 crc32_pmull_le(const u8 buf[], u32 len, u32 init_crc);
+asmlinkage u32 crc32_armv8_le(u32 init_crc, const u8 buf[], u32 len);
+
+asmlinkage u32 crc32c_pmull_le(const u8 buf[], u32 len, u32 init_crc);
+asmlinkage u32 crc32c_armv8_le(u32 init_crc, const u8 buf[], u32 len);
+
+static int crc32_pmull_cra_init(struct crypto_tfm *tfm)
+{
+	u32 *key = crypto_tfm_ctx(tfm);
+
+	*key = 0;
+	return 0;
+}
+
+static int crc32c_pmull_cra_init(struct crypto_tfm *tfm)
+{
+	u32 *key = crypto_tfm_ctx(tfm);
+
+	*key = ~0;
+	return 0;
+}
+
+static int crc32_pmull_setkey(struct crypto_shash *hash, const u8 *key,
+			      unsigned int keylen)
+{
+	u32 *mctx = crypto_shash_ctx(hash);
+
+	if (keylen != sizeof(u32)) {
+		crypto_shash_set_flags(hash, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+	*mctx = le32_to_cpup((__le32 *)key);
+	return 0;
+}
+
+static int crc32_pmull_init(struct shash_desc *desc)
+{
+	u32 *mctx = crypto_shash_ctx(desc->tfm);
+	u32 *crc = shash_desc_ctx(desc);
+
+	*crc = *mctx;
+	return 0;
+}
+
+static int crc32_pmull_update(struct shash_desc *desc, const u8 *data,
+			      unsigned int length)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	if (length >= PMULL_MIN_LEN && may_use_simd() &&
+	    (elf_hwcap2 & HWCAP2_PMULL)) {
+		unsigned int l = round_down(length, SCALE_F);
+
+		kernel_neon_begin();
+		*crc = crc32_pmull_le(data, l, *crc);
+		kernel_neon_end();
+
+		data += l;
+		length -= l;
+	}
+
+	if (length > 0) {
+		if (elf_hwcap2 & HWCAP2_CRC32)
+			*crc = crc32_armv8_le(*crc, data, length);
+		else
+			*crc = crc32_le(*crc, data, length);
+	}
+
+	return 0;
+}
+
+static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data,
+			       unsigned int length)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	if (length >= PMULL_MIN_LEN && may_use_simd() &&
+	    (elf_hwcap2 & HWCAP2_PMULL)) {
+		unsigned int l = round_down(length, SCALE_F);
+
+		kernel_neon_begin();
+		*crc = crc32c_pmull_le(data, l, *crc);
+		kernel_neon_end();
+
+		data += l;
+		length -= l;
+	}
+
+	if (length > 0) {
+		if (elf_hwcap2 & HWCAP2_CRC32)
+			*crc = crc32c_armv8_le(*crc, data, length);
+		else
+			*crc = __crc32c_le(*crc, data, length);
+	}
+
+	return 0;
+}
+
+static int crc32_pmull_final(struct shash_desc *desc, u8 *out)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	put_unaligned_le32(*crc, out);
+	return 0;
+}
+
+static int crc32c_pmull_final(struct shash_desc *desc, u8 *out)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	put_unaligned_le32(~*crc, out);
+	return 0;
+}
+
+static struct shash_alg crc32_pmull_algs[] = { {
+	.setkey			= crc32_pmull_setkey,
+	.init			= crc32_pmull_init,
+	.update			= crc32_pmull_update,
+	.final			= crc32_pmull_final,
+	.descsize		= sizeof(u32),
+	.digestsize		= sizeof(u32),
+
+	.base.cra_ctxsize	= sizeof(u32),
+	.base.cra_init		= crc32_pmull_cra_init,
+	.base.cra_name		= "crc32",
+	.base.cra_driver_name	= "crc32-arm-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= 1,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.setkey			= crc32_pmull_setkey,
+	.init			= crc32_pmull_init,
+	.update			= crc32c_pmull_update,
+	.final			= crc32c_pmull_final,
+	.descsize		= sizeof(u32),
+	.digestsize		= sizeof(u32),
+
+	.base.cra_ctxsize	= sizeof(u32),
+	.base.cra_init		= crc32c_pmull_cra_init,
+	.base.cra_name		= "crc32c",
+	.base.cra_driver_name	= "crc32c-arm-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= 1,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init crc32_pmull_mod_init(void)
+{
+	if (!(elf_hwcap2 & (HWCAP2_PMULL|HWCAP2_CRC32)))
+		return -ENODEV;
+
+	return crypto_register_shashes(crc32_pmull_algs,
+				       ARRAY_SIZE(crc32_pmull_algs));
+}
+
+static void __exit crc32_pmull_mod_exit(void)
+{
+	crypto_unregister_shashes(crc32_pmull_algs,
+				  ARRAY_SIZE(crc32_pmull_algs));
+}
+
+module_init(crc32_pmull_mod_init);
+module_exit(crc32_pmull_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("crc32");
+MODULE_ALIAS_CRYPTO("crc32c");
-- 
2.7.4

* [PATCH v2 5/6] crypto: arm64/crc32 - accelerated support based on x86 SSE implementation
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

This is a combination of the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.
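
As a reference point for the two variants, the following is a minimal
bit-at-a-time sketch in portable C. It is not code from this patch: the
helper name crc32_bitwise_le is hypothetical, and the CRC32C polynomial
0x82F63B78 is the usual reflected Castagnoli value rather than something
quoted in this series. Like the glue code, it applies no inversion of its
own, so the caller supplies the seed and any final complement.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected (LSB-first) generator polynomials. */
#define CRC32_POLY_LE	0xEDB88320u	/* CRC32, as cited in crc32-ce-core.S */
#define CRC32C_POLY_LE	0x82F63B78u	/* CRC32C (Castagnoli), assumed here */

/*
 * Hypothetical reference helper: processes one bit at a time and does
 * not invert the CRC on entry or exit.
 */
static uint32_t crc32_bitwise_le(uint32_t crc, const unsigned char *buf,
				 size_t len, uint32_t poly)
{
	while (len--) {
		crc ^= *buf++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? poly : 0);
	}
	return crc;
}
```

Seeding with ~0 and complementing the result gives the conventional
check values for the ASCII string "123456789".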

The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.
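
The length split described above can be sketched as follows. This is an
illustration only: pmull_bulk_len is a hypothetical helper, while the two
constants mirror PMULL_MIN_LEN and SCALE_F from the glue code. It returns
how many leading bytes would go to the PMULL routine; the rest is left for
the CRC32 instructions or the generic code.

```c
#include <stddef.h>

#define PMULL_MIN_LEN	64u	/* smallest buffer the PMULL routine accepts */
#define SCALE_F		16u	/* PMULL routine works in 16-byte units */

/*
 * Hypothetical helper: number of leading bytes handed to the PMULL
 * routine. Returns 0 when the buffer is too short for it.
 */
static size_t pmull_bulk_len(size_t len)
{
	if (len < PMULL_MIN_LEN)
		return 0;
	return len - (len % SCALE_F);	/* round down to a multiple of 16 */
}
```

For a 100-byte update this yields 96 bytes for PMULL and a 4-byte tail;
a 63-byte update bypasses PMULL entirely.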

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig         |   6 +
 arch/arm64/crypto/Makefile        |   3 +
 arch/arm64/crypto/crc32-ce-core.S | 266 ++++++++++++++++++++
 arch/arm64/crypto/crc32-ce-glue.c | 188 ++++++++++++++
 4 files changed, 463 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d773c0659202..21835deb1ab9 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -28,6 +28,11 @@ config CRYPTO_CRCT10DIF_ARM64_CE
 	depends on KERNEL_MODE_NEON && CRC_T10DIF
 	select CRYPTO_HASH
 
+config CRYPTO_CRC32_ARM64_CE
+	tristate "CRC32 and CRC32C digest algorithms using PMULL instructions"
+	depends on KERNEL_MODE_NEON && CRC32
+	select CRYPTO_HASH
+
 config CRYPTO_AES_ARM64_CE
 	tristate "AES core cipher using ARMv8 Crypto Extensions"
 	depends on ARM64 && KERNEL_MODE_NEON
@@ -58,4 +63,5 @@ config CRYPTO_CRC32_ARM64
 	tristate "CRC32 and CRC32C using optional ARMv8 instructions"
 	depends on ARM64
 	select CRYPTO_HASH
+
 endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index 36fd3eb4201b..144387805a46 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -20,6 +20,9 @@ ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
 obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o
 crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
 
+obj-$(CONFIG_CRYPTO_CRC32_ARM64_CE) += crc32-ce.o
+crc32-ce-y:= crc32-ce-core.o crc32-ce-glue.o
+
 obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
 CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto
 
diff --git a/arch/arm64/crypto/crc32-ce-core.S b/arch/arm64/crypto/crc32-ce-core.S
new file mode 100644
index 000000000000..18f5a8442276
--- /dev/null
+++ b/arch/arm64/crypto/crc32-ce-core.S
@@ -0,0 +1,266 @@
+/*
+ * Accelerated CRC32(C) using arm64 CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please  visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Using hardware provided PCLMULQDQ instruction to accelerate the CRC32
+ * calculation.
+ * CRC32 polynomial:0x04c11db7(BE)/0xEDB88320(LE)
+ * PCLMULQDQ is a new instruction in Intel SSE4.2, the reference can be found
+ * at:
+ * http://www.intel.com/products/processor/manuals/
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+ * Volume 2B: Instruction Set Reference, N-Z
+ *
+ * Authors:   Gregory Prestas <Gregory_Prestas@us.xyratex.com>
+ *	      Alexander Boyko <Alexander_Boyko@xyratex.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.text
+	.align		6
+	.cpu		generic+crypto+crc
+
+.Lcrc32_constants:
+	/*
+	 * [x4*128+32 mod P(x) << 32)]'  << 1   = 0x154442bd4
+	 * #define CONSTANT_R1  0x154442bd4LL
+	 *
+	 * [(x4*128-32 mod P(x) << 32)]' << 1   = 0x1c6e41596
+	 * #define CONSTANT_R2  0x1c6e41596LL
+	 */
+	.octa		0x00000001c6e415960000000154442bd4
+
+	/*
+	 * [(x128+32 mod P(x) << 32)]'   << 1   = 0x1751997d0
+	 * #define CONSTANT_R3  0x1751997d0LL
+	 *
+	 * [(x128-32 mod P(x) << 32)]'   << 1   = 0x0ccaa009e
+	 * #define CONSTANT_R4  0x0ccaa009eLL
+	 */
+	.octa		0x00000000ccaa009e00000001751997d0
+
+	/*
+	 * [(x64 mod P(x) << 32)]'       << 1   = 0x163cd6124
+	 * #define CONSTANT_R5  0x163cd6124LL
+	 */
+	.quad		0x0000000163cd6124
+	.quad		0x00000000FFFFFFFF
+
+	/*
+	 * #define CRCPOLY_TRUE_LE_FULL 0x1DB710641LL
+	 *
+	 * Barrett Reduction constant (u64`) = u` = (x**64 / P(x))`
+	 *                                                      = 0x1F7011641LL
+	 * #define CONSTANT_RU  0x1F7011641LL
+	 */
+	.octa		0x00000001F701164100000001DB710641
+
+.Lcrc32c_constants:
+	.octa		0x000000009e4addf800000000740eef02
+	.octa		0x000000014cd00bd600000000f20c0dfe
+	.quad		0x00000000dd45aab8
+	.quad		0x00000000FFFFFFFF
+	.octa		0x00000000dea713f10000000105ec76f0
+
+	vCONSTANT	.req	v0
+	dCONSTANT	.req	d0
+	qCONSTANT	.req	q0
+
+	BUF		.req	x0
+	LEN		.req	x1
+	CRC		.req	x2
+
+	vzr		.req	v9
+
+	/**
+	 * Calculate crc32
+	 * BUF - buffer
+	 * LEN - sizeof buffer (multiple of 16 bytes), LEN should be > 63
+	 * CRC - initial crc32
+	 * return w0 crc32
+	 * uint crc32_pmull_le(unsigned char const *buffer,
+	 *                     size_t len, uint crc32)
+	 */
+ENTRY(crc32_pmull_le)
+	adr		x3, .Lcrc32_constants
+	b		0f
+
+ENTRY(crc32c_pmull_le)
+	adr		x3, .Lcrc32c_constants
+
+0:	bic		LEN, LEN, #15
+	ld1		{v1.16b-v4.16b}, [BUF], #0x40
+	movi		vzr.16b, #0
+	fmov		dCONSTANT, CRC
+	eor		v1.16b, v1.16b, vCONSTANT.16b
+	sub		LEN, LEN, #0x40
+	cmp		LEN, #0x40
+	b.lt		less_64
+
+	ldr		qCONSTANT, [x3]
+
+loop_64:		/* 64 bytes Full cache line folding */
+	sub		LEN, LEN, #0x40
+
+	pmull2		v5.1q, v1.2d, vCONSTANT.2d
+	pmull2		v6.1q, v2.2d, vCONSTANT.2d
+	pmull2		v7.1q, v3.2d, vCONSTANT.2d
+	pmull2		v8.1q, v4.2d, vCONSTANT.2d
+
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	pmull		v2.1q, v2.1d, vCONSTANT.1d
+	pmull		v3.1q, v3.1d, vCONSTANT.1d
+	pmull		v4.1q, v4.1d, vCONSTANT.1d
+
+	eor		v1.16b, v1.16b, v5.16b
+	ld1		{v5.16b}, [BUF], #0x10
+	eor		v2.16b, v2.16b, v6.16b
+	ld1		{v6.16b}, [BUF], #0x10
+	eor		v3.16b, v3.16b, v7.16b
+	ld1		{v7.16b}, [BUF], #0x10
+	eor		v4.16b, v4.16b, v8.16b
+	ld1		{v8.16b}, [BUF], #0x10
+
+	eor		v1.16b, v1.16b, v5.16b
+	eor		v2.16b, v2.16b, v6.16b
+	eor		v3.16b, v3.16b, v7.16b
+	eor		v4.16b, v4.16b, v8.16b
+
+	cmp		LEN, #0x40
+	b.ge		loop_64
+
+less_64:		/* Folding cache line into 128bit */
+	ldr		qCONSTANT, [x3, #16]
+
+	pmull2		v5.1q, v1.2d, vCONSTANT.2d
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v5.16b
+	eor		v1.16b, v1.16b, v2.16b
+
+	pmull2		v5.1q, v1.2d, vCONSTANT.2d
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v5.16b
+	eor		v1.16b, v1.16b, v3.16b
+
+	pmull2		v5.1q, v1.2d, vCONSTANT.2d
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v5.16b
+	eor		v1.16b, v1.16b, v4.16b
+
+	cbz		LEN, fold_64
+
+loop_16:		/* Folding rest buffer into 128bit */
+	subs		LEN, LEN, #0x10
+
+	ld1		{v2.16b}, [BUF], #0x10
+	pmull2		v5.1q, v1.2d, vCONSTANT.2d
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v5.16b
+	eor		v1.16b, v1.16b, v2.16b
+
+	b.ne		loop_16
+
+fold_64:
+	/* perform the last 64 bit fold, also adds 32 zeroes
+	 * to the input stream */
+	ext		v2.16b, v1.16b, v1.16b, #8
+	pmull2		v2.1q, v2.2d, vCONSTANT.2d
+	ext		v1.16b, v1.16b, vzr.16b, #8
+	eor		v1.16b, v1.16b, v2.16b
+
+	/* final 32-bit fold */
+	ldr		dCONSTANT, [x3, #32]
+	ldr		d3, [x3, #40]
+
+	ext		v2.16b, v1.16b, vzr.16b, #4
+	and		v1.16b, v1.16b, v3.16b
+	pmull		v1.1q, v1.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v2.16b
+
+	/* Finish up with the bit-reversed barrett reduction 64 ==> 32 bits */
+	ldr		qCONSTANT, [x3, #48]
+
+	and		v2.16b, v1.16b, v3.16b
+	ext		v2.16b, vzr.16b, v2.16b, #8
+	pmull2		v2.1q, v2.2d, vCONSTANT.2d
+	and		v2.16b, v2.16b, v3.16b
+	pmull		v2.1q, v2.1d, vCONSTANT.1d
+	eor		v1.16b, v1.16b, v2.16b
+	mov		w0, v1.s[1]
+
+	ret
+ENDPROC(crc32_pmull_le)
+ENDPROC(crc32c_pmull_le)
+
+	.macro		__crc32, c
+0:	subs		x2, x2, #16
+	b.mi		8f
+	ldp		x3, x4, [x1], #16
+CPU_BE(	rev		x3, x3		)
+CPU_BE(	rev		x4, x4		)
+	crc32\c\()x	w0, w0, x3
+	crc32\c\()x	w0, w0, x4
+	b.ne		0b
+	ret
+
+8:	tbz		x2, #3, 4f
+	ldr		x3, [x1], #8
+CPU_BE(	rev		x3, x3		)
+	crc32\c\()x	w0, w0, x3
+4:	tbz		x2, #2, 2f
+	ldr		w3, [x1], #4
+CPU_BE(	rev		w3, w3		)
+	crc32\c\()w	w0, w0, w3
+2:	tbz		x2, #1, 1f
+	ldrh		w3, [x1], #2
+CPU_BE(	rev16		w3, w3		)
+	crc32\c\()h	w0, w0, w3
+1:	tbz		x2, #0, 0f
+	ldrb		w3, [x1]
+	crc32\c\()b	w0, w0, w3
+0:	ret
+	.endm
+
+	.align		5
+ENTRY(crc32_armv8_le)
+	__crc32
+ENDPROC(crc32_armv8_le)
+
+	.align		5
+ENTRY(crc32c_armv8_le)
+	__crc32		c
+ENDPROC(crc32c_armv8_le)
diff --git a/arch/arm64/crypto/crc32-ce-glue.c b/arch/arm64/crypto/crc32-ce-glue.c
new file mode 100644
index 000000000000..c3c7e5848e2a
--- /dev/null
+++ b/arch/arm64/crypto/crc32-ce-glue.c
@@ -0,0 +1,188 @@
+/*
+ * Accelerated CRC32(C) using arm64 NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/unaligned.h>
+
+#define PMULL_MIN_LEN		64L	/* minimum size of buffer
+					 * for crc32_pmull_le */
+#define SCALE_F			16L	/* size of NEON register */
+
+asmlinkage u32 crc32_pmull_le(const u8 buf[], u64 len, u32 init_crc);
+asmlinkage u32 crc32_armv8_le(u32 init_crc, const u8 buf[], u64 len);
+
+asmlinkage u32 crc32c_pmull_le(const u8 buf[], u64 len, u32 init_crc);
+asmlinkage u32 crc32c_armv8_le(u32 init_crc, const u8 buf[], u64 len);
+
+static int crc32_pmull_cra_init(struct crypto_tfm *tfm)
+{
+	u32 *key = crypto_tfm_ctx(tfm);
+
+	*key = 0;
+	return 0;
+}
+
+static int crc32c_pmull_cra_init(struct crypto_tfm *tfm)
+{
+	u32 *key = crypto_tfm_ctx(tfm);
+
+	*key = ~0;
+	return 0;
+}
+
+static int crc32_pmull_setkey(struct crypto_shash *hash, const u8 *key,
+			      unsigned int keylen)
+{
+	u32 *mctx = crypto_shash_ctx(hash);
+
+	if (keylen != sizeof(u32)) {
+		crypto_shash_set_flags(hash, CRYPTO_TFM_RES_BAD_KEY_LEN);
+		return -EINVAL;
+	}
+	*mctx = le32_to_cpup((__le32 *)key);
+	return 0;
+}
+
+static int crc32_pmull_init(struct shash_desc *desc)
+{
+	u32 *mctx = crypto_shash_ctx(desc->tfm);
+	u32 *crc = shash_desc_ctx(desc);
+
+	*crc = *mctx;
+	return 0;
+}
+
+static int crc32_pmull_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int length)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	if (length >= PMULL_MIN_LEN) {
+		unsigned int l = round_down(length, SCALE_F);
+
+		kernel_neon_begin_partial(10);
+		*crc = crc32_pmull_le(data, l, *crc);
+		kernel_neon_end();
+
+		data += l;
+		length -= l;
+	}
+
+	if (length > 0) {
+		if (elf_hwcap & HWCAP_CRC32)
+			*crc = crc32_armv8_le(*crc, data, length);
+		else
+			*crc = crc32_le(*crc, data, length);
+	}
+
+	return 0;
+}
+
+static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int length)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	if (length >= PMULL_MIN_LEN) {
+		unsigned int l = round_down(length, SCALE_F);
+
+		kernel_neon_begin_partial(10);
+		*crc = crc32c_pmull_le(data, l, *crc);
+		kernel_neon_end();
+
+		data += l;
+		length -= l;
+	}
+
+	if (length > 0) {
+		if (elf_hwcap & HWCAP_CRC32)
+			*crc = crc32c_armv8_le(*crc, data, length);
+		else
+			*crc = __crc32c_le(*crc, data, length);
+	}
+
+	return 0;
+}
+
+static int crc32_pmull_final(struct shash_desc *desc, u8 *out)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	put_unaligned_le32(*crc, out);
+	return 0;
+}
+
+static int crc32c_pmull_final(struct shash_desc *desc, u8 *out)
+{
+	u32 *crc = shash_desc_ctx(desc);
+
+	put_unaligned_le32(~*crc, out);
+	return 0;
+}
+
+static struct shash_alg crc32_pmull_algs[] = { {
+	.setkey			= crc32_pmull_setkey,
+	.init			= crc32_pmull_init,
+	.update			= crc32_pmull_update,
+	.final			= crc32_pmull_final,
+	.descsize		= sizeof(u32),
+	.digestsize		= sizeof(u32),
+
+	.base.cra_ctxsize	= sizeof(u32),
+	.base.cra_init		= crc32_pmull_cra_init,
+	.base.cra_name		= "crc32",
+	.base.cra_driver_name	= "crc32-arm64-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= 1,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.setkey			= crc32_pmull_setkey,
+	.init			= crc32_pmull_init,
+	.update			= crc32c_pmull_update,
+	.final			= crc32c_pmull_final,
+	.descsize		= sizeof(u32),
+	.digestsize		= sizeof(u32),
+
+	.base.cra_ctxsize	= sizeof(u32),
+	.base.cra_init		= crc32c_pmull_cra_init,
+	.base.cra_name		= "crc32c",
+	.base.cra_driver_name	= "crc32c-arm64-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= 1,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init crc32_pmull_mod_init(void)
+{
+	return crypto_register_shashes(crc32_pmull_algs,
+				       ARRAY_SIZE(crc32_pmull_algs));
+}
+
+static void __exit crc32_pmull_mod_exit(void)
+{
+	crypto_unregister_shashes(crc32_pmull_algs,
+				  ARRAY_SIZE(crc32_pmull_algs));
+}
+
+module_cpu_feature_match(PMULL, crc32_pmull_mod_init);
+module_exit(crc32_pmull_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
-- 
2.7.4

* [PATCH v2 4/6] crypto: arm/crct10dif - port x86 SSE implementation to ARM
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on multiples of 16 bytes. The residual data is handled
by the generic C implementation.
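
For readers who want a reference to compare against, here is a minimal
bit-at-a-time CRC-T10DIF sketch in portable C. It is not the kernel's
generic implementation: the name crc_t10dif_bitwise is hypothetical, and
only the polynomial 0x8bb7 is taken from the assembly comments in this
patch. The CRC is MSB-first with no inversion, so a partial result can be
passed back in as the seed for the next chunk, which is what makes the
16-byte bulk plus residual split possible.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC_T10DIF_POLY	0x8BB7u	/* polynomial named in crct10dif-ce-core.S */

/*
 * Hypothetical reference helper: MSB-first, one bit at a time,
 * no initial or final inversion.
 */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const unsigned char *buf,
				   size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*buf++ << 8);
		for (int i = 0; i < 8; i++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ CRC_T10DIF_POLY);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}
```

Feeding the first 16 bytes of a 20-byte buffer and then the remaining 4,
with the intermediate value as the seed, gives the same result as one
call over all 20 bytes.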

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/Kconfig             |   5 +
 arch/arm/crypto/Makefile            |   2 +
 arch/arm/crypto/crct10dif-ce-core.S | 349 ++++++++++++++++++++
 arch/arm/crypto/crct10dif-ce-glue.c |  95 ++++++
 4 files changed, 451 insertions(+)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 27ed1b1cd1d7..fce801fa52a1 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -120,4 +120,9 @@ config CRYPTO_GHASH_ARM_CE
 	  that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
 	  that is part of the ARMv8 Crypto Extensions
 
+config CRYPTO_CRCT10DIF_ARM_CE
+	tristate "CRCT10DIF digest algorithm using PMULL instructions"
+	depends on KERNEL_MODE_NEON && CRC_T10DIF
+	select CRYPTO_HASH
+
 endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index fc5150702b64..fc77265014b7 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -13,6 +13,7 @@ ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
+ce-obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM_CE) += crct10dif-arm-ce.o
 
 ifneq ($(ce-obj-y)$(ce-obj-m),)
 ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
@@ -36,6 +37,7 @@ sha1-arm-ce-y	:= sha1-ce-core.o sha1-ce-glue.o
 sha2-arm-ce-y	:= sha2-ce-core.o sha2-ce-glue.o
 aes-arm-ce-y	:= aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y	:= ghash-ce-core.o ghash-ce-glue.o
+crct10dif-arm-ce-y	:= crct10dif-ce-core.o crct10dif-ce-glue.o
 
 quiet_cmd_perl = PERL    $@
       cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/crct10dif-ce-core.S b/arch/arm/crypto/crct10dif-ce-core.S
new file mode 100644
index 000000000000..ae2adb54e905
--- /dev/null
+++ b/arch/arm/crypto/crct10dif-ce-core.S
@@ -0,0 +1,349 @@
+//
+// Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions
+//
+// Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+//
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License version 2 as
+// published by the Free Software Foundation.
+//
+
+//
+// Implement fast CRC-T10DIF computation with SSE and PCLMULQDQ instructions
+//
+// Copyright (c) 2013, Intel Corporation
+//
+// Authors:
+//     Erdinc Ozturk <erdinc.ozturk@intel.com>
+//     Vinodh Gopal <vinodh.gopal@intel.com>
+//     James Guilford <james.guilford@intel.com>
+//     Tim Chen <tim.c.chen@linux.intel.com>
+//
+// This software is available to you under a choice of one of two
+// licenses.  You may choose to be licensed under the terms of the GNU
+// General Public License (GPL) Version 2, available from the file
+// COPYING in the main directory of this source tree, or the
+// OpenIB.org BSD license below:
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are
+// met:
+//
+// * Redistributions of source code must retain the above copyright
+//   notice, this list of conditions and the following disclaimer.
+//
+// * Redistributions in binary form must reproduce the above copyright
+//   notice, this list of conditions and the following disclaimer in the
+//   documentation and/or other materials provided with the
+//   distribution.
+//
+// * Neither the name of the Intel Corporation nor the names of its
+//   contributors may be used to endorse or promote products derived from
+//   this software without specific prior written permission.
+//
+//
+// THIS SOFTWARE IS PROVIDED BY INTEL CORPORATION ""AS IS"" AND ANY
+// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL CORPORATION OR
+// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+//
+//       Function API:
+//       UINT16 crc_t10dif_pcl(
+//               UINT16 init_crc, //initial CRC value, 16 bits
+//               const unsigned char *buf, //buffer pointer to calculate CRC on
+//               UINT64 len //buffer length in bytes (64-bit data)
+//       );
+//
+//       Reference paper titled "Fast CRC Computation for Generic
+//	Polynomials Using PCLMULQDQ Instruction"
+//       URL: http://www.intel.com/content/dam/www/public/us/en/documents
+//  /white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
+//
+//
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+#ifdef CONFIG_CPU_ENDIAN_BE8
+#define CPU_LE(code...)
+#else
+#define CPU_LE(code...)		code
+#endif
+
+	.text
+	.fpu		crypto-neon-fp-armv8
+
+	arg1_low32	.req	r0
+	arg2		.req	r1
+	arg3		.req	r2
+
+	qzr		.req	q13
+
+	q0l		.req	d0
+	q0h		.req	d1
+	q1l		.req	d2
+	q1h		.req	d3
+	q2l		.req	d4
+	q2h		.req	d5
+	q3l		.req	d6
+	q3h		.req	d7
+	q4l		.req	d8
+	q4h		.req	d9
+	q5l		.req	d10
+	q5h		.req	d11
+	q6l		.req	d12
+	q6h		.req	d13
+	q7l		.req	d14
+	q7h		.req	d15
+
+ENTRY(crc_t10dif_pmull)
+	push		{r4, lr}
+
+	vmov.i8		qzr, #0			// init zero register
+
+	// adjust the 16-bit initial_crc value, scale it to 32 bits
+	lsl		arg1_low32, arg1_low32, #16
+
+	// check if smaller than 256
+	cmp		arg3, #256
+
+	// for sizes less than 256, we can't fold 128B at a time...
+	blt		_less_than_128
+
+	// load the initial crc value
+	// crc value does not need to be byte-reflected, but it needs
+	// to be moved to the high part of the register.
+	// because data will be byte-reflected and will align with
+	// initial crc at correct place.
+	vmov		s0, arg1_low32		// initial crc
+	vext.8		q10, qzr, q0, #4
+
+	// receive the initial 64B data, xor the initial crc value
+	vld1.64		{q0-q1}, [arg2]!
+	vld1.64		{q2-q3}, [arg2]!
+	vld1.64		{q4-q5}, [arg2]!
+	vld1.64		{q6-q7}, [arg2]!
+CPU_LE(	vrev64.8	q0, q0			)
+CPU_LE(	vrev64.8	q1, q1			)
+CPU_LE(	vrev64.8	q2, q2			)
+CPU_LE(	vrev64.8	q3, q3			)
+CPU_LE(	vrev64.8	q4, q4			)
+CPU_LE(	vrev64.8	q5, q5			)
+CPU_LE(	vrev64.8	q6, q6			)
+CPU_LE(	vrev64.8	q7, q7			)
+
+	vswp		d0, d1
+	vswp		d2, d3
+	vswp		d4, d5
+	vswp		d6, d7
+	vswp		d8, d9
+	vswp		d10, d11
+	vswp		d12, d13
+	vswp		d14, d15
+
+	// XOR the initial_crc value
+	veor.8		q0, q0, q10
+
+	adr		ip, rk3
+	vld1.64		{q10}, [ip]	// q10 has rk3 and rk4
+					// type of pmull instruction
+					// will determine which constant to use
+
+	//
+	// we subtract 256 instead of 128 to save one instruction from the loop
+	//
+	sub		arg3, arg3, #256
+
+	// at this section of the code, there is 128*x+y (0<=y<128) bytes of
+	// buffer. The _fold_64_B_loop will fold 128B at a time
+	// until we have 128+y Bytes of buffer
+
+
+	// fold 128B at a time. This section of the code folds 8 vector
+	// registers in parallel
+_fold_64_B_loop:
+
+	.macro		fold64, reg1, reg2
+	vld1.64		{q11-q12}, [arg2]!
+
+	vmull.p64	q8, \reg1\()h, d21
+	vmull.p64	\reg1, \reg1\()l, d20
+	vmull.p64	q9, \reg2\()h, d21
+	vmull.p64	\reg2, \reg2\()l, d20
+
+CPU_LE(	vrev64.8	q11, q11		)
+CPU_LE(	vrev64.8	q12, q12		)
+	vswp		d22, d23
+	vswp		d24, d25
+
+	veor.8		\reg1, \reg1, q8
+	veor.8		\reg2, \reg2, q9
+	veor.8		\reg1, \reg1, q11
+	veor.8		\reg2, \reg2, q12
+	.endm
+
+	fold64		q0, q1
+	fold64		q2, q3
+	fold64		q4, q5
+	fold64		q6, q7
+
+	subs		arg3, arg3, #128
+
+	// check if there is another 128B in the buffer to be able to fold
+	bge		_fold_64_B_loop
+
+	// at this point, the buffer pointer is pointing at the last y Bytes
+	// of the buffer; the 128B of folded data is in 8 of the vector
+	// registers: q0-q7
+
+	// fold the 8 vector registers to 1 vector register with different
+	// constants
+
+	adr		ip, rk9
+	vld1.64		{q10}, [ip]!
+
+	.macro		fold16, reg, rk
+	vmull.p64	q8, \reg\()l, d20
+	vmull.p64	\reg, \reg\()h, d21
+	.ifnb		\rk
+	vld1.64		{q10}, [ip]!
+	.endif
+	veor.8		q7, q7, q8
+	veor.8		q7, q7, \reg
+	.endm
+
+	fold16		q0, rk11
+	fold16		q1, rk13
+	fold16		q2, rk15
+	fold16		q3, rk17
+	fold16		q4, rk19
+	fold16		q5, rk1
+	fold16		q6
+
+	// instead of 128, we add 112 (128-16) to the loop counter to save 1
+	// instruction from the loop; instead of a cmp instruction, we use
+	// the negative flag with the blt instruction
+	adds		arg3, arg3, #(128-16)
+	blt		_final_reduction_for_128
+
+	// now we have 16+y bytes left to reduce. 16 bytes are in register q7
+	// and the rest is in memory. We can fold 16 bytes at a time if y>=16;
+	// continue folding 16B at a time
+
+_16B_reduction_loop:
+	vmull.p64	q8, d14, d20
+	vmull.p64	q7, d15, d21
+	veor.8		q7, q7, q8
+
+	vld1.64		{q0}, [arg2]!
+CPU_LE(	vrev64.8	q0, q0		)
+	vswp		d0, d1
+	veor.8		q7, q7, q0
+	subs		arg3, arg3, #16
+
+	// instead of a cmp instruction, we utilize the flags with the
+	// bge instruction, equivalent of: cmp arg3, 16-16
+	// check if there is any more 16B in the buffer to be able to fold
+	bge		_16B_reduction_loop
+
+_final_reduction_for_128:
+	// compute crc of a 128-bit value
+	vldr		d20, rk5
+	vldr		d21, rk6		// rk5 and rk6 in q10
+
+	// 64b fold
+	vext.8		q0, qzr, q7, #8
+	vmull.p64	q7, d15, d20
+	veor.8		q7, q7, q0
+
+	// 32b fold
+	vext.8		q0, q7, qzr, #12
+	vmov		s31, s3
+	vmull.p64	q0, d0, d21
+	veor.8		q7, q0, q7
+
+	// barrett reduction
+_barrett:
+	vldr		d20, rk7
+	vldr		d21, rk8
+
+	vmull.p64	q0, d15, d20
+	vext.8		q0, qzr, q0, #12
+	vmull.p64	q0, d1, d21
+	vext.8		q0, qzr, q0, #12
+	veor.8		q7, q7, q0
+	vmov		r0, s29
+
+_cleanup:
+	// scale the result back to 16 bits
+	lsr		r0, r0, #16
+	pop		{r4, pc}
+
+_less_than_128:
+	teq		arg3, #0
+	beq		_cleanup
+
+	vmov.i8		q0, #0
+	vmov		s3, arg1_low32		// get the initial crc value
+
+	vld1.64		{q7}, [arg2]!
+CPU_LE(	vrev64.8	q7, q7		)
+	vswp		d14, d15
+	veor.8		q7, q7, q0
+
+	// check if there is enough buffer to be able to fold 16B at a time
+	cmp		arg3, #32
+	blt		_final_reduction_for_128
+
+	// now if there is, load the constants
+	vldr		d20, rk1
+	vldr		d21, rk2		// rk1 and rk2 in q10
+
+	// update the counter. subtract 32 instead of 16 to save one
+	// instruction from the loop
+	sub		arg3, arg3, #32
+
+	b		_16B_reduction_loop
+ENDPROC(crc_t10dif_pmull)
+
+// precomputed constants
+// these constants are precomputed from the poly:
+// 0x8bb70000 (0x8bb7 scaled to 32 bits)
+	.align		4
+// Q = 0x18BB70000
+// rk1 = 2^(32*3) mod Q << 32
+// rk2 = 2^(32*5) mod Q << 32
+// rk3 = 2^(32*15) mod Q << 32
+// rk4 = 2^(32*17) mod Q << 32
+// rk5 = 2^(32*3) mod Q << 32
+// rk6 = 2^(32*2) mod Q << 32
+// rk7 = floor(2^64/Q)
+// rk8 = Q
+
+rk3:	.quad		0x9d9d000000000000
+rk4:	.quad		0x7cf5000000000000
+rk5:	.quad		0x2d56000000000000
+rk6:	.quad		0x1368000000000000
+rk7:	.quad		0x00000001f65a57f8
+rk8:	.quad		0x000000018bb70000
+rk9:	.quad		0xceae000000000000
+rk10:	.quad		0xbfd6000000000000
+rk11:	.quad		0x1e16000000000000
+rk12:	.quad		0x713c000000000000
+rk13:	.quad		0xf7f9000000000000
+rk14:	.quad		0x80a6000000000000
+rk15:	.quad		0x044c000000000000
+rk16:	.quad		0xe658000000000000
+rk17:	.quad		0xad18000000000000
+rk18:	.quad		0xa497000000000000
+rk19:	.quad		0x6ee3000000000000
+rk20:	.quad		0xe7b5000000000000
+rk1:	.quad		0x2d56000000000000
+rk2:	.quad		0x06df000000000000
diff --git a/arch/arm/crypto/crct10dif-ce-glue.c b/arch/arm/crypto/crct10dif-ce-glue.c
new file mode 100644
index 000000000000..8225422c34a7
--- /dev/null
+++ b/arch/arm/crypto/crct10dif-ce-glue.c
@@ -0,0 +1,95 @@
+/*
+ * Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc-t10dif.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+#define CRC_T10DIF_PMULL_CHUNK_SIZE	16U
+
+asmlinkage u16 crc_t10dif_pmull(u16 init_crc, const u8 buf[], u32 len);
+
+static int crct10dif_init(struct shash_desc *desc)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	*crc = 0;
+	return 0;
+}
+
+static int crct10dif_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int length)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	if (may_use_simd() && length >= CRC_T10DIF_PMULL_CHUNK_SIZE) {
+		unsigned int l = length & ~(CRC_T10DIF_PMULL_CHUNK_SIZE - 1);
+
+		kernel_neon_begin();
+		*crc = crc_t10dif_pmull(*crc, data, l);
+		kernel_neon_end();
+
+		length -= l;
+		data += l;
+	}
+	if (length > 0)
+		*crc = crc_t10dif_generic(*crc, data, length);
+
+	return 0;
+}
+
+static int crct10dif_final(struct shash_desc *desc, u8 *out)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	*(u16 *)out = *crc;
+	return 0;
+}
+
+static struct shash_alg crc_t10dif_alg = {
+	.digestsize		= CRC_T10DIF_DIGEST_SIZE,
+	.init			= crct10dif_init,
+	.update			= crct10dif_update,
+	.final			= crct10dif_final,
+	.descsize		= CRC_T10DIF_DIGEST_SIZE,
+
+	.base.cra_name		= "crct10dif",
+	.base.cra_driver_name	= "crct10dif-arm-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= CRC_T10DIF_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+};
+
+static int __init crc_t10dif_mod_init(void)
+{
+	if (!(elf_hwcap2 & HWCAP2_PMULL))
+		return -ENODEV;
+
+	return crypto_register_shash(&crc_t10dif_alg);
+}
+
+static void __exit crc_t10dif_mod_exit(void)
+{
+	crypto_unregister_shash(&crc_t10dif_alg);
+}
+
+module_init(crc_t10dif_mod_init);
+module_exit(crc_t10dif_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("crct10dif");
-- 
2.7.4


* [PATCH v2 3/6] crypto: arm64/crct10dif - port x86 SSE implementation to arm64
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on multiples of 16 bytes. The residual data is handled
by the generic C implementation.
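
As a rough user-space illustration of that split (not the kernel code
itself), the sketch below uses a plain bitwise CRC-T10DIF routine as a
stand-in for both the PMULL routine and the generic C fallback. The names
crc_t10dif_bitwise(), crc_t10dif_aligned() and crc_t10dif_split() are made
up for this example. It only shows that handing the 16-byte-aligned head
to one routine and the residue to another matches a single pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 16U	/* granularity of the accelerated routine */

/* Bit-at-a-time CRC-T10DIF: polynomial 0x8bb7, MSB first, no final XOR. */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const uint8_t *buf, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*buf++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8bb7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* Stand-in for the accelerated routine: multiples of 16 bytes only. */
static uint16_t crc_t10dif_aligned(uint16_t crc, const uint8_t *buf, size_t len)
{
	assert(len % CHUNK_SIZE == 0);
	return crc_t10dif_bitwise(crc, buf, len);
}

/* Head goes to the "accelerated" routine, the residue to the fallback. */
static uint16_t crc_t10dif_split(uint16_t crc, const uint8_t *buf, size_t len)
{
	size_t head = len & ~(size_t)(CHUNK_SIZE - 1);

	if (head)
		crc = crc_t10dif_aligned(crc, buf, head);
	if (len - head)
		crc = crc_t10dif_bitwise(crc, buf + head, len - head);
	return crc;
}
```

For a 79-byte buffer, for instance, the head is 64 bytes and the residue
is 15 bytes.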

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig             |   5 +
 arch/arm64/crypto/Makefile            |   3 +
 arch/arm64/crypto/crct10dif-ce-core.S | 317 ++++++++++++++++++++
 arch/arm64/crypto/crct10dif-ce-glue.c |  91 ++++++
 4 files changed, 416 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9887e1..d773c0659202 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -23,6 +23,11 @@ config CRYPTO_GHASH_ARM64_CE
 	depends on ARM64 && KERNEL_MODE_NEON
 	select CRYPTO_HASH
 
+config CRYPTO_CRCT10DIF_ARM64_CE
+	tristate "CRCT10DIF digest algorithm using PMULL instructions"
+	depends on KERNEL_MODE_NEON && CRC_T10DIF
+	select CRYPTO_HASH
+
 config CRYPTO_AES_ARM64_CE
 	tristate "AES core cipher using ARMv8 Crypto Extensions"
 	depends on ARM64 && KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index abb79b3cfcfe..36fd3eb4201b 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -17,6 +17,9 @@ sha2-ce-y := sha2-ce-glue.o sha2-ce-core.o
 obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
 ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
 
+obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o
+crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
+
 obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
 CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto
 
diff --git a/arch/arm64/crypto/crct10dif-ce-core.S b/arch/arm64/crypto/crct10dif-ce-core.S
new file mode 100644
index 000000000000..641685effebd
--- /dev/null
+++ b/arch/arm64/crypto/crct10dif-ce-core.S
@@ -0,0 +1,317 @@
+//
+// Accelerated CRC-T10DIF using arm64 NEON and Crypto Extensions instructions
+//
+// Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+//
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License version 2 as
+// published by the Free Software Foundation.
+//
+
+//
+// Implement fast CRC-T10DIF computation with SSE and PCLMULQDQ instructions
+//
+// Copyright (c) 2013, Intel Corporation
+//
+// Authors:
+//     Erdinc Ozturk <erdinc.ozturk@intel.com>
+//     Vinodh Gopal <vinodh.gopal@intel.com>
+//     James Guilford <james.guilford@intel.com>
+//     Tim Chen <tim.c.chen@linux.intel.com>
+//
+// This software is available to you under a choice of one of two
+// licenses.  You may choose to be licensed under the terms of the GNU
+// General Public License (GPL) Version 2, available from the file
+// COPYING in the main directory of this source tree, or the
+// OpenIB.org BSD license below:
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are
+// met:
+//
+// * Redistributions of source code must retain the above copyright
+//   notice, this list of conditions and the following disclaimer.
+//
+// * Redistributions in binary form must reproduce the above copyright
+//   notice, this list of conditions and the following disclaimer in the
+//   documentation and/or other materials provided with the
+//   distribution.
+//
+// * Neither the name of the Intel Corporation nor the names of its
+//   contributors may be used to endorse or promote products derived from
+//   this software without specific prior written permission.
+//
+//
+// THIS SOFTWARE IS PROVIDED BY INTEL CORPORATION ""AS IS"" AND ANY
+// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL CORPORATION OR
+// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+//
+//       Function API:
+//       UINT16 crc_t10dif_pcl(
+//               UINT16 init_crc, //initial CRC value, 16 bits
+//               const unsigned char *buf, //buffer pointer to calculate CRC on
+//               UINT64 len //buffer length in bytes (64-bit data)
+//       );
+//
+//       Reference paper titled "Fast CRC Computation for Generic
+//	Polynomials Using PCLMULQDQ Instruction"
+//       URL: http://www.intel.com/content/dam/www/public/us/en/documents
+//  /white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
+//
+//
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.text
+	.cpu		generic+crypto
+
+	arg1_low32	.req	w0
+	arg2		.req	x1
+	arg3		.req	x2
+
+	vzr		.req	v13
+
+ENTRY(crc_t10dif_pmull)
+	stp		x29, x30, [sp, #-16]!
+	mov		x29, sp
+
+	movi		vzr.16b, #0		// init zero register
+
+	// adjust the 16-bit initial_crc value, scale it to 32 bits
+	lsl		arg1_low32, arg1_low32, #16
+
+	// check if smaller than 256
+	cmp		arg3, #256
+
+	// for sizes less than 256, we can't fold 128B at a time...
+	b.lt		_less_than_128
+
+	// load the initial crc value
+	// crc value does not need to be byte-reflected, but it needs
+	// to be moved to the high part of the register.
+	// because data will be byte-reflected and will align with
+	// initial crc at correct place.
+	movi		v10.16b, #0
+	mov		v10.s[3], arg1_low32		// initial crc
+
+	// receive the initial 128B data, xor the initial crc value
+	ldp		q0, q1, [arg2]
+	ldp		q2, q3, [arg2, #0x20]
+	ldp		q4, q5, [arg2, #0x40]
+	ldp		q6, q7, [arg2, #0x60]
+	add		arg2, arg2, #0x80
+
+CPU_LE(	rev64		v0.16b, v0.16b			)
+CPU_LE(	rev64		v1.16b, v1.16b			)
+CPU_LE(	rev64		v2.16b, v2.16b			)
+CPU_LE(	rev64		v3.16b, v3.16b			)
+CPU_LE(	rev64		v4.16b, v4.16b			)
+CPU_LE(	rev64		v5.16b, v5.16b			)
+CPU_LE(	rev64		v6.16b, v6.16b			)
+CPU_LE(	rev64		v7.16b, v7.16b			)
+
+CPU_LE(	ext		v0.16b, v0.16b, v0.16b, #8	)
+CPU_LE(	ext		v1.16b, v1.16b, v1.16b, #8	)
+CPU_LE(	ext		v2.16b, v2.16b, v2.16b, #8	)
+CPU_LE(	ext		v3.16b, v3.16b, v3.16b, #8	)
+CPU_LE(	ext		v4.16b, v4.16b, v4.16b, #8	)
+CPU_LE(	ext		v5.16b, v5.16b, v5.16b, #8	)
+CPU_LE(	ext		v6.16b, v6.16b, v6.16b, #8	)
+CPU_LE(	ext		v7.16b, v7.16b, v7.16b, #8	)
+
+	// XOR the initial_crc value
+	eor		v0.16b, v0.16b, v10.16b
+
+	ldr		q10, rk3	// v10 has rk3 and rk4
+					// type of pmull instruction
+					// will determine which constant to use
+
+	//
+	// we subtract 256 instead of 128 to save one instruction from the loop
+	//
+	sub		arg3, arg3, #256
+
+	// at this section of the code, there is 128*x+y (0<=y<128) bytes of
+	// buffer. The _fold_64_B_loop will fold 128B at a time
+	// until we have 128+y bytes of buffer
+
+
+	// fold 128B at a time. This section of the code folds 8 vector
+	// registers in parallel
+_fold_64_B_loop:
+
+	.macro		fold64, reg1, reg2
+	ldp		q11, q12, [arg2], #0x20
+
+	pmull2		v8.1q, \reg1\().2d, v10.2d
+	pmull		\reg1\().1q, \reg1\().1d, v10.1d
+
+CPU_LE(	rev64		v11.16b, v11.16b		)
+CPU_LE(	rev64		v12.16b, v12.16b		)
+
+	pmull2		v9.1q, \reg2\().2d, v10.2d
+	pmull		\reg2\().1q, \reg2\().1d, v10.1d
+
+CPU_LE(	ext		v11.16b, v11.16b, v11.16b, #8	)
+CPU_LE(	ext		v12.16b, v12.16b, v12.16b, #8	)
+
+	eor		\reg1\().16b, \reg1\().16b, v8.16b
+	eor		\reg2\().16b, \reg2\().16b, v9.16b
+	eor		\reg1\().16b, \reg1\().16b, v11.16b
+	eor		\reg2\().16b, \reg2\().16b, v12.16b
+	.endm
+
+	fold64		v0, v1
+	fold64		v2, v3
+	fold64		v4, v5
+	fold64		v6, v7
+
+	subs		arg3, arg3, #128
+
+	// check if there is another 128B in the buffer to be able to fold
+	b.ge		_fold_64_B_loop
+
+	// at this point, the buffer pointer is pointing at the last y bytes
+	// of the buffer; the 128B of folded data is in 8 of the vector
+	// registers: v0 through v7
+
+	// fold the 8 vector registers to 1 vector register with different
+	// constants
+
+	ldr		q10, rk9
+
+	.macro		fold16, reg, rk
+	pmull		v8.1q, \reg\().1d, v10.1d
+	pmull2		\reg\().1q, \reg\().2d, v10.2d
+	.ifnb		\rk
+	ldr		q10, \rk
+	.endif
+	eor		v7.16b, v7.16b, v8.16b
+	eor		v7.16b, v7.16b, \reg\().16b
+	.endm
+
+	fold16		v0, rk11
+	fold16		v1, rk13
+	fold16		v2, rk15
+	fold16		v3, rk17
+	fold16		v4, rk19
+	fold16		v5, rk1
+	fold16		v6
+
+	// instead of 128, we add 112 to the loop counter to save 1 instruction
+	// from the loop; instead of a cmp instruction, we use the negative
+	// flag with the b.lt instruction
+	adds		arg3, arg3, #(128-16)
+	b.lt		_final_reduction_for_128
+
+	// now we have 16+y bytes left to reduce. 16 Bytes is in register v7
+	// and the rest is in memory. We can fold 16 bytes at a time if y>=16
+	// continue folding 16B at a time
+
+_16B_reduction_loop:
+	pmull		v8.1q, v7.1d, v10.1d
+	pmull2		v7.1q, v7.2d, v10.2d
+	eor		v7.16b, v7.16b, v8.16b
+
+	ldr		q0, [arg2], #16
+CPU_LE(	rev64		v0.16b, v0.16b			)
+CPU_LE(	ext		v0.16b, v0.16b, v0.16b, #8	)
+	eor		v7.16b, v7.16b, v0.16b
+	subs		arg3, arg3, #16
+
+	// instead of a cmp instruction, we utilize the flags with the
+	// b.ge instruction equivalent of: cmp arg3, 16-16
+	// check if there is any more 16B in the buffer to be able to fold
+	b.ge		_16B_reduction_loop
+
+_final_reduction_for_128:
+	// compute crc of a 128-bit value
+	ldr		q10, rk5		// rk5 and rk6 in v10
+
+	// 64b fold
+	ext		v0.16b, vzr.16b, v7.16b, #8
+	mov		v7.d[0], v7.d[1]
+	pmull		v7.1q, v7.1d, v10.1d
+	eor		v7.16b, v7.16b, v0.16b
+
+	// 32b fold
+	ext		v0.16b, v7.16b, vzr.16b, #4
+	mov		v7.s[3], vzr.s[0]
+	pmull2		v0.1q, v0.2d, v10.2d
+	eor		v7.16b, v7.16b, v0.16b
+
+	// barrett reduction
+_barrett:
+	ldr		q10, rk7
+	mov		v0.d[0], v7.d[1]
+
+	pmull		v0.1q, v0.1d, v10.1d
+	ext		v0.16b, vzr.16b, v0.16b, #12
+	pmull2		v0.1q, v0.2d, v10.2d
+	ext		v0.16b, vzr.16b, v0.16b, #12
+	eor		v7.16b, v7.16b, v0.16b
+	mov		w0, v7.s[1]
+
+_cleanup:
+	// scale the result back to 16 bits
+	lsr		x0, x0, #16
+	ldp		x29, x30, [sp], #16
+	ret
+
+_less_than_128:
+	cbz		arg3, _cleanup
+
+	movi		v0.16b, #0
+	mov		v0.s[3], arg1_low32	// get the initial crc value
+
+	ldr		q7, [arg2], #0x10
+CPU_LE(	rev64		v7.16b, v7.16b			)
+CPU_LE(	ext		v7.16b, v7.16b, v7.16b, #8	)
+	eor		v7.16b, v7.16b, v0.16b	// xor the initial crc value
+
+	// check if there is enough buffer to be able to fold 16B at a time
+	cmp		arg3, #32
+	b.lt		_final_reduction_for_128
+
+	// now if there is, load the constants
+	ldr		q10, rk1		// rk1 and rk2 in v10
+
+	// update the counter. subtract 32 instead of 16 to save one
+	// instruction from the loop
+	sub		arg3, arg3, #32
+	b		_16B_reduction_loop
+ENDPROC(crc_t10dif_pmull)
+
+// precomputed constants
+// these constants are precomputed from the poly:
+// 0x8bb70000 (0x8bb7 scaled to 32 bits)
+	.align		4
+// Q = 0x18BB70000
+// rk1 = 2^(32*3) mod Q << 32
+// rk2 = 2^(32*5) mod Q << 32
+// rk3 = 2^(32*15) mod Q << 32
+// rk4 = 2^(32*17) mod Q << 32
+// rk5 = 2^(32*3) mod Q << 32
+// rk6 = 2^(32*2) mod Q << 32
+// rk7 = floor(2^64/Q)
+// rk8 = Q
+
+rk1:	.octa		0x06df0000000000002d56000000000000
+rk3:	.octa		0x7cf50000000000009d9d000000000000
+rk5:	.octa		0x13680000000000002d56000000000000
+rk7:	.octa		0x000000018bb7000000000001f65a57f8
+rk9:	.octa		0xbfd6000000000000ceae000000000000
+rk11:	.octa		0x713c0000000000001e16000000000000
+rk13:	.octa		0x80a6000000000000f7f9000000000000
+rk15:	.octa		0xe658000000000000044c000000000000
+rk17:	.octa		0xa497000000000000ad18000000000000
+rk19:	.octa		0xe7b50000000000006ee3000000000000
diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
new file mode 100644
index 000000000000..735678884194
--- /dev/null
+++ b/arch/arm64/crypto/crct10dif-ce-glue.c
@@ -0,0 +1,91 @@
+/*
+ * Accelerated CRC-T10DIF using arm64 NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/crc-t10dif.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/neon.h>
+
+#define CRC_T10DIF_PMULL_CHUNK_SIZE	16U
+
+asmlinkage u16 crc_t10dif_pmull(u16 init_crc, const u8 buf[], u64 len);
+
+static int crct10dif_init(struct shash_desc *desc)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	*crc = 0;
+	return 0;
+}
+
+static int crct10dif_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int length)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE) {
+		unsigned int l = length & ~(CRC_T10DIF_PMULL_CHUNK_SIZE - 1);
+
+		kernel_neon_begin_partial(14);
+		*crc = crc_t10dif_pmull(*crc, data, l);
+		kernel_neon_end();
+
+		data += l;
+	}
+	if (length % CRC_T10DIF_PMULL_CHUNK_SIZE)
+		*crc = crc_t10dif_generic(*crc, data,
+					  length % CRC_T10DIF_PMULL_CHUNK_SIZE);
+
+	return 0;
+}
+
+static int crct10dif_final(struct shash_desc *desc, u8 *out)
+{
+	u16 *crc = shash_desc_ctx(desc);
+
+	*(u16 *)out = *crc;
+	return 0;
+}
+
+static struct shash_alg crc_t10dif_alg = {
+	.digestsize		= CRC_T10DIF_DIGEST_SIZE,
+	.init			= crct10dif_init,
+	.update			= crct10dif_update,
+	.final			= crct10dif_final,
+	.descsize		= CRC_T10DIF_DIGEST_SIZE,
+
+	.base.cra_name		= "crct10dif",
+	.base.cra_driver_name	= "crct10dif-arm64-ce",
+	.base.cra_priority	= 200,
+	.base.cra_blocksize	= CRC_T10DIF_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+};
+
+static int __init crc_t10dif_mod_init(void)
+{
+	return crypto_register_shash(&crc_t10dif_alg);
+}
+
+static void __exit crc_t10dif_mod_exit(void)
+{
+	crypto_unregister_shash(&crc_t10dif_alg);
+}
+
+module_cpu_feature_match(PMULL, crc_t10dif_mod_init);
+module_exit(crc_t10dif_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
-- 
2.7.4


* [PATCH v2 2/6] crypto: testmgr - add/enhance test cases for CRC-T10DIF
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

The existing test cases only exercise a small slice of the various
possible code paths through the x86 SSE/PCLMULQDQ implementation,
and the upcoming ports of it for arm64. So add one that exceeds 256
bytes in size, and convert another to a chunked test.
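
One property the chunked vectors have to satisfy is that the tap sizes add
up to psize. A minimal sketch of that check follows; struct chunked_vec is
a hypothetical, trimmed-down stand-in for the relevant hash_testvec fields
and taps_cover_plaintext() is a name invented here, so treat it as an
illustration rather than testmgr code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the psize/np/tap fields of a test vector. */
struct chunked_vec {
	size_t psize;		/* total plaintext size in bytes */
	size_t np;		/* number of chunks */
	unsigned char tap[8];	/* size of each chunk, at most 255 */
};

/* Return 1 if the chunk sizes add up to the plaintext size, else 0. */
static int taps_cover_plaintext(const struct chunked_vec *v)
{
	size_t total = 0;

	assert(v->np <= sizeof(v->tap));
	for (size_t i = 0; i < v->np; i++)
		total += v->tap[i];
	return total == v->psize;
}
```

With the values from this patch, { 63, 16 } sums to 79,
{ 1, 2, 28, 7, 6, 5, 4, 3 } to 56 and { 1, 255, 57, 6 } to 319.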

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 crypto/testmgr.h | 70 ++++++++++++--------
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index e64a4ef9d8ca..b7cd41b25a2a 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -1334,36 +1334,50 @@ static struct hash_testvec rmd320_tv_template[] = {
 	}
 };
 
-#define CRCT10DIF_TEST_VECTORS	3
+#define CRCT10DIF_TEST_VECTORS	ARRAY_SIZE(crct10dif_tv_template)
 static struct hash_testvec crct10dif_tv_template[] = {
 	{
-		.plaintext = "abc",
-		.psize  = 3,
-#ifdef __LITTLE_ENDIAN
-		.digest = "\x3b\x44",
-#else
-		.digest = "\x44\x3b",
-#endif
-	}, {
-		.plaintext = "1234567890123456789012345678901234567890"
-			     "123456789012345678901234567890123456789",
-		.psize	= 79,
-#ifdef __LITTLE_ENDIAN
-		.digest	= "\x70\x4b",
-#else
-		.digest	= "\x4b\x70",
-#endif
-	}, {
-		.plaintext =
-		"abcddddddddddddddddddddddddddddddddddddddddddddddddddddd",
-		.psize  = 56,
-#ifdef __LITTLE_ENDIAN
-		.digest = "\xe3\x9c",
-#else
-		.digest = "\x9c\xe3",
-#endif
-		.np     = 2,
-		.tap    = { 28, 28 }
+		.plaintext	= "abc",
+		.psize		= 3,
+		.digest		= (u8 *)(u16 []){ 0x443b },
+	}, {
+		.plaintext 	= "1234567890123456789012345678901234567890"
+				  "123456789012345678901234567890123456789",
+		.psize		= 79,
+		.digest 	= (u8 *)(u16 []){ 0x4b70 },
+		.np		= 2,
+		.tap		= { 63, 16 },
+	}, {
+		.plaintext	= "abcdddddddddddddddddddddddddddddddddddddddd"
+				  "ddddddddddddd",
+		.psize		= 56,
+		.digest		= (u8 *)(u16 []){ 0x9ce3 },
+		.np		= 8,
+		.tap		= { 1, 2, 28, 7, 6, 5, 4, 3 },
+	}, {
+		.plaintext 	= "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "123456789012345678901234567890123456789",
+		.psize		= 319,
+		.digest		= (u8 *)(u16 []){ 0x44c6 },
+	}, {
+		.plaintext 	= "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "1234567890123456789012345678901234567890"
+				  "123456789012345678901234567890123456789",
+		.psize		= 319,
+		.digest		= (u8 *)(u16 []){ 0x44c6 },
+		.np		= 4,
+		.tap		= { 1, 255, 57, 6 },
 	}
 };
 
-- 
2.7.4


* [PATCH v2 1/6] crypto: testmgr - avoid overlap in chunked tests
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480852447-25082-1-git-send-email-ard.biesheuvel@linaro.org>

The IDXn offsets are chosen such that tap values (which may go up to
255) end up overlapping in the xbuf allocation. In particular, IDX1
and IDX3 are too close together, so update IDX3 to avoid this issue.
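
The overlap can be sketched with a simple interval test. This assumes both
offsets index into the same buffer and that both chunks use the same
length, which is a simplification. chunks_overlap() and MAX_TAP_LEN are
names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TAP_LEN 255	/* largest chunk size a test vector may request */

/* Return 1 if [a, a + len) and [b, b + len) share at least one byte. */
static int chunks_overlap(size_t a, size_t b, size_t len)
{
	assert(len > 0);
	return a < b + len && b < a + len;
}
```

Under those assumptions, offsets 32 and 1 collide for a 255-byte chunk,
while offsets 32 and 511 do not.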

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 crypto/testmgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index ded50b67c757..670893bcf361 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -63,7 +63,7 @@ int alg_test(const char *driver, const char *alg, u32 type, u32 mask)
  */
 #define IDX1		32
 #define IDX2		32400
-#define IDX3		1
+#define IDX3		511
 #define IDX4		8193
 #define IDX5		22222
 #define IDX6		17101
-- 
2.7.4


* [PATCH v2 0/6] crypto: ARM/arm64 CRC-T10DIF/CRC32/CRC32C roundup
From: Ard Biesheuvel @ 2016-12-04 11:54 UTC (permalink / raw)
  To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel

This v2 combines the CRC-T10DIF and CRC32 implementations for both ARM and
arm64 that I sent out a couple of weeks ago, and adds support to the latter
for CRC32C.

Ard Biesheuvel (6):
  crypto: testmgr - avoid overlap in chunked tests
  crypto: testmgr - add/enhance test cases for CRC-T10DIF
  crypto: arm64/crct10dif - port x86 SSE implementation to arm64
  crypto: arm/crct10dif - port x86 SSE implementation to ARM
  crypto: arm64/crc32 - accelerated support based on x86 SSE
    implementation
  crypto: arm/crc32 - accelerated support based on x86 SSE
    implementation

 arch/arm/crypto/Kconfig               |  10 +
 arch/arm/crypto/Makefile              |   4 +
 arch/arm/crypto/crc32-ce-core.S       | 306 +++++++++++++++++
 arch/arm/crypto/crc32-ce-glue.c       | 195 +++++++++++
 arch/arm/crypto/crct10dif-ce-core.S   | 349 ++++++++++++++++++++
 arch/arm/crypto/crct10dif-ce-glue.c   |  95 ++++++
 arch/arm64/crypto/Kconfig             |  11 +
 arch/arm64/crypto/Makefile            |   6 +
 arch/arm64/crypto/crc32-ce-core.S     | 266 +++++++++++++++
 arch/arm64/crypto/crc32-ce-glue.c     | 188 +++++++++++
 arch/arm64/crypto/crct10dif-ce-core.S | 317 ++++++++++++++++++
 arch/arm64/crypto/crct10dif-ce-glue.c |  91 +++++
 crypto/testmgr.c                      |   2 +-
 crypto/testmgr.h                      |  70 ++--
 14 files changed, 1881 insertions(+), 29 deletions(-)
 create mode 100644 arch/arm/crypto/crc32-ce-core.S
 create mode 100644 arch/arm/crypto/crc32-ce-glue.c
 create mode 100644 arch/arm/crypto/crct10dif-ce-core.S
 create mode 100644 arch/arm/crypto/crct10dif-ce-glue.c
 create mode 100644 arch/arm64/crypto/crc32-ce-core.S
 create mode 100644 arch/arm64/crypto/crc32-ce-glue.c
 create mode 100644 arch/arm64/crypto/crct10dif-ce-core.S
 create mode 100644 arch/arm64/crypto/crct10dif-ce-glue.c

-- 
2.7.4


* Re: [PATCH v5 1/1] crypto: add virtio-crypto driver
From: kbuild test robot @ 2016-12-04  2:39 UTC (permalink / raw)
  To: Gonglei
  Cc: xuquan8, weidong.huang, mst, qemu-devel, wanzongshun,
	virtualization, jianjay.zhou, arei.gonglei, virtio-dev, herbert,
	hanweidong, Gonglei, longpeng2, kbuild-all, luonengjun, stefanha,
	claudio.fontana, linux-kernel, linux-crypto, davem, wu.wubin
In-Reply-To: <1480595945-63656-2-git-send-email-arei.gonglei@huawei.com>

[-- Attachment #1: Type: text/plain, Size: 5483 bytes --]

Hi Gonglei,

[auto build test ERROR on cryptodev/master]
[also build test ERROR on v4.9-rc7 next-20161202]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Gonglei/crypto-add-virtio-crypto-driver/20161202-190424
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   In file included from arch/sparc/include/asm/topology.h:4:0,
                    from include/linux/topology.h:35,
                    from include/linux/gfp.h:8,
                    from include/linux/kmod.h:22,
                    from include/linux/module.h:13,
                    from drivers/crypto/virtio/virtio_crypto_mgr.c:21:
   drivers/crypto/virtio/virtio_crypto_common.h: In function 'virtio_crypto_get_current_node':
>> arch/sparc/include/asm/topology_64.h:44:44: error: implicit declaration of function 'cpu_data' [-Werror=implicit-function-declaration]
    #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
                                               ^
   drivers/crypto/virtio/virtio_crypto_common.h:116:9: note: in expansion of macro 'topology_physical_package_id'
     return topology_physical_package_id(smp_processor_id());
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> arch/sparc/include/asm/topology_64.h:44:57: error: request for member 'proc_id' in something not a structure or union
    #define topology_physical_package_id(cpu) (cpu_data(cpu).proc_id)
                                                            ^
   drivers/crypto/virtio/virtio_crypto_common.h:116:9: note: in expansion of macro 'topology_physical_package_id'
     return topology_physical_package_id(smp_processor_id());
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/cpu_data +44 arch/sparc/include/asm/topology_64.h

f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  28  
9d079337 arch/sparc/include/asm/topology_64.h David Miller        2009-01-11  29  #define cpumask_of_pcibus(bus)	\
9d079337 arch/sparc/include/asm/topology_64.h David Miller        2009-01-11  30  	(pcibus_to_node(bus) == -1 ? \
e9b37512 arch/sparc/include/asm/topology_64.h Rusty Russell       2009-03-16  31  	 cpu_all_mask : \
9d079337 arch/sparc/include/asm/topology_64.h David Miller        2009-01-11  32  	 cpumask_of_node(pcibus_to_node(bus)))
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  33  
52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta         2015-11-02  34  int __node_distance(int, int);
52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta         2015-11-02  35  #define node_distance(a, b) __node_distance(a, b)
52708d69 arch/sparc/include/asm/topology_64.h Nitin Gupta         2015-11-02  36  
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  37  #else /* CONFIG_NUMA */
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  38  
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  39  #include <asm-generic/topology.h>
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  40  
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  41  #endif /* !(CONFIG_NUMA) */
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  42  
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  43  #ifdef CONFIG_SMP
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17 @44  #define topology_physical_package_id(cpu)	(cpu_data(cpu).proc_id)
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  45  #define topology_core_id(cpu)			(cpu_data(cpu).core_id)
acc455cf arch/sparc/include/asm/topology_64.h chris hyser         2015-04-22  46  #define topology_core_cpumask(cpu)		(&cpu_core_sib_map[cpu])
06931e62 arch/sparc/include/asm/topology_64.h Bartosz Golaszewski 2015-05-26  47  #define topology_sibling_cpumask(cpu)		(&per_cpu(cpu_sibling_map, cpu))
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  48  #endif /* CONFIG_SMP */
f5e706ad include/asm-sparc/topology_64.h      Sam Ravnborg        2008-07-17  49  
3905c54f arch/sparc/include/asm/topology_64.h Stephen Rothwell    2011-04-12  50  extern cpumask_t cpu_core_map[NR_CPUS];
acc455cf arch/sparc/include/asm/topology_64.h chris hyser         2015-04-22  51  extern cpumask_t cpu_core_sib_map[NR_CPUS];
3905c54f arch/sparc/include/asm/topology_64.h Stephen Rothwell    2011-04-12  52  static inline const struct cpumask *cpu_coregroup_mask(int cpu)

:::::: The code at line 44 was first introduced by commit
:::::: f5e706ad886b6a5eb59637830110b09ccebf01c5 sparc: join the remaining header files

:::::: TO: Sam Ravnborg <sam@ravnborg.org>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 47988 bytes --]



* [PATCH 1/1] crypto: asymmetric_keys: set error code on failure
From: Pan Bian @ 2016-12-03 14:57 UTC (permalink / raw)
  To: David Howells, Herbert Xu, David S. Miller, keyrings,
	linux-crypto
  Cc: linux-kernel, Pan Bian

From: Pan Bian <bianpan2016@163.com>

Function public_key_verify_signature() returns the variable ret on its
error paths. When the call to kmalloc() fails, ret is still 0 and is
not set to an errno before returning, so the failure is reported as
success. This patch sets ret to -ENOMEM in that case.
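
The pattern can be sketched in user space as follows. This is only an
illustration of setting the error code before the goto. verify_sketch(),
alloc_fn and failing_alloc() are hypothetical names, and malloc()/free()
stand in for the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Mirrors the shape of the fixed error path: ret is set before the goto. */
static int verify_sketch(alloc_fn alloc, size_t outlen)
{
	int ret = 0;	/* earlier steps succeeded, so ret is 0 here */
	void *output;

	assert(alloc != NULL);
	output = alloc(outlen);
	if (!output) {
		ret = -ENOMEM;
		goto out;
	}

	/* ... the real work would go here and may update ret ... */
	free(output);
out:
	return ret;
}
```

Without the assignment, the failing-allocator case would return 0.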

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188891

Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 crypto/asymmetric_keys/public_key.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index fd76b5f..1dc65ba 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -123,8 +123,10 @@ int public_key_verify_signature(const struct public_key *pkey,
 
 	outlen = crypto_akcipher_maxsize(tfm);
 	output = kmalloc(outlen, GFP_KERNEL);
-	if (!output)
+	if (!output) {
+		ret = -ENOMEM;
 		goto error_free_req;
+	}
 
 	sg_init_one(&sig_sg, sig->s, sig->s_size);
 	sg_init_one(&digest_sg, output, outlen);
-- 
1.9.1


* Re: Crash in crypto mcryptd
From: Tim Chen @ 2016-12-03  0:16 UTC (permalink / raw)
  To: Mikulas Patocka, Herbert Xu, David S. Miller
  Cc: linux-crypto, dm-devel, Milan Broz
In-Reply-To: <alpine.LRH.2.02.1612011819540.27565@file01.intranet.prod.int.rdu2.redhat.com>

On Thu, 2016-12-01 at 19:00 -0500, Mikulas Patocka wrote:
> Hi
> 
> There is a bug in mcryptd initialization.
> 
> This is a test module that tries various hash algorithms. When you load 
> the module with "insmod test.ko 'alg=mcryptd(md5)'", the machine crashes.
> 
> Mikulas
> 
> 

Mikulas,

Can you try out the patch that I've sent out in a separate mail?

Thanks.

Tim


* [PATCH] crypto/mcryptd: Check mcryptd algorithm compatibility
From: Tim Chen @ 2016-12-03  0:15 UTC (permalink / raw)
  To: Mikulas Patocka, Herbert Xu, David S. Miller
  Cc: Tim Chen, megha.dey, linux-crypto, dm-devel, Milan Broz,
	Eric Biggers, stable

Algorithms not compatible with mcryptd could be spawned by mcryptd
with a direct crypto_alloc_tfm invocation using a "mcryptd(alg)"
name construct.  This causes mcryptd to crash the kernel if
"alg" is incompatible and not intended to be used with mcryptd.

A flag CRYPTO_ALG_MCRYPT is being added to mcryptd compatible
algorithms' cra_flags. The compatibility is checked when mcryptd spawns
off an algorithm.
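
In isolation, the check amounts to a single bit test. The sketch below is
a user-space illustration only. struct alg_sketch and
mcryptd_check_compat() are invented for the example, and the flag values
are the ones this patch uses in include/linux/crypto.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as they appear in include/linux/crypto.h with this patch. */
#define CRYPTO_ALG_INTERNAL	0x00002000
#define CRYPTO_ALG_MCRYPT	0x00004000

/* Hypothetical stand-in for the one crypto_alg field that matters here. */
struct alg_sketch {
	uint32_t cra_flags;
};

/* Return 0 if the algorithm opted in to mcryptd, -EINVAL otherwise. */
static int mcryptd_check_compat(const struct alg_sketch *alg)
{
	assert(alg != NULL);
	if (!(alg->cra_flags & CRYPTO_ALG_MCRYPT))
		return -EINVAL;
	return 0;
}
```

An algorithm such as md5, which never sets the flag, would be rejected
with -EINVAL instead of being wrapped.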

Link: http://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2
Cc: stable@vger.kernel.org
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/crypto/sha1-mb/sha1_mb.c     | 3 ++-
 arch/x86/crypto/sha256-mb/sha256_mb.c | 3 ++-
 arch/x86/crypto/sha512-mb/sha512_mb.c | 3 ++-
 crypto/mcryptd.c                      | 6 ++++++
 include/linux/crypto.h                | 6 ++++++
 5 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/crypto/sha1-mb/sha1_mb.c b/arch/x86/crypto/sha1-mb/sha1_mb.c
index acf9fdf..475959db 100644
--- a/arch/x86/crypto/sha1-mb/sha1_mb.c
+++ b/arch/x86/crypto/sha1-mb/sha1_mb.c
@@ -770,7 +770,8 @@ static struct ahash_alg sha1_mb_areq_alg = {
 			 */
 			.cra_flags	= CRYPTO_ALG_TYPE_AHASH |
 						CRYPTO_ALG_ASYNC |
-						CRYPTO_ALG_INTERNAL,
+						CRYPTO_ALG_INTERNAL |
+						CRYPTO_ALG_MCRYPT,
 			.cra_blocksize	= SHA1_BLOCK_SIZE,
 			.cra_module	= THIS_MODULE,
 			.cra_list	= LIST_HEAD_INIT
diff --git a/arch/x86/crypto/sha256-mb/sha256_mb.c b/arch/x86/crypto/sha256-mb/sha256_mb.c
index 7926a22..f33b592 100644
--- a/arch/x86/crypto/sha256-mb/sha256_mb.c
+++ b/arch/x86/crypto/sha256-mb/sha256_mb.c
@@ -768,7 +768,8 @@ static struct ahash_alg sha256_mb_areq_alg = {
 			 */
 			.cra_flags	= CRYPTO_ALG_TYPE_AHASH |
 						CRYPTO_ALG_ASYNC |
-						CRYPTO_ALG_INTERNAL,
+						CRYPTO_ALG_INTERNAL |
+						CRYPTO_ALG_MCRYPT,
 			.cra_blocksize	= SHA256_BLOCK_SIZE,
 			.cra_module	= THIS_MODULE,
 			.cra_list	= LIST_HEAD_INIT
diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c b/arch/x86/crypto/sha512-mb/sha512_mb.c
index 9c1bb6d..13aa2e6 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb.c
+++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
@@ -783,7 +783,8 @@ static struct ahash_alg sha512_mb_areq_alg = {
 			 */
 			.cra_flags	= CRYPTO_ALG_TYPE_AHASH |
 						CRYPTO_ALG_ASYNC |
-						CRYPTO_ALG_INTERNAL,
+						CRYPTO_ALG_INTERNAL |
+						CRYPTO_ALG_MCRYPT,
 			.cra_blocksize	= SHA512_BLOCK_SIZE,
 			.cra_module	= THIS_MODULE,
 			.cra_list	= LIST_HEAD_INIT
diff --git a/crypto/mcryptd.c b/crypto/mcryptd.c
index 94ee44a..5c40e13 100644
--- a/crypto/mcryptd.c
+++ b/crypto/mcryptd.c
@@ -500,6 +500,12 @@ static int mcryptd_create_hash(struct crypto_template *tmpl, struct rtattr **tb,
 
 	alg = &halg->base;
 	pr_debug("crypto: mcryptd hash alg: %s\n", alg->cra_name);
+
+	if (!(alg->cra_flags & CRYPTO_ALG_MCRYPT)) {
+		err = -EINVAL;
+		goto out_put_alg;
+	}
+
 	inst = mcryptd_alloc_instance(alg, ahash_instance_headroom(),
 					sizeof(*ctx));
 	err = PTR_ERR(inst);
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 167aea2..e47d5a8 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -106,6 +106,12 @@
 #define CRYPTO_ALG_INTERNAL		0x00002000
 
 /*
+ * Mark cipher as compatible with mcryptd
+ * for multi-buffer processing
+ */
+#define CRYPTO_ALG_MCRYPT		0x00004000
+
+/*
  * Transform masks and values (for crt_flags).
  */
 #define CRYPTO_TFM_REQ_MASK		0x000fff00
-- 
2.5.5
