Linux cryptographic layer development
 help / color / mirror / Atom feed
* bug in blkcipher_walk code
From: Stephan Mueller @ 2016-11-18 11:31 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto

Hi Herbert,

Once in a while I seem to trigger a bug in the blkcipher_walk code which I 
cannot track down. This bug happens sporadically where I assume that it has 
something to do with the memory management in the slow path of blkcipher_walk.

I am using the CTR DRBG code that in turn uses the ctr-aes-aesni 
implementation. The bug only appears when I want to obtain a random number 
that is less than the CTR AES block size. In my particular case, I want 4 
bytes from the DRBG.

The bug happens in arch/x86/crypto/aesni-intel_glue.c:ctr_crypt_final() at the 
line:

	memcpy(dst, keystream, nbytes);

The bug looks like the following:

[   12.328676] BUG: unable to handle kernel paging request at ffffa17ae418b988
[   12.328680] IP: [<ffffffff82060eea>] ctr_crypt+0x19a/0x1c0
[   12.328681] PGD 66fed067
[   12.328681] PUD 0
[   12.328681]
[   12.328683] Oops: 0002 [#1] SMP
[   12.328692] Modules linked in: bridge(+) stp llc ebtable_nat ip6table_raw 
ip6table_security ip6table_mangle iptable_raw iptable_security iptable_mangle 
ebtable_filter ebtables ip6table_filter ip6_tables crct10dif_pclmul 
crc32_pclmul ghash_clmulni_intel pcspkr i2c_piix4 virtio_net virtio_balloon 
acpi_cpufreq sch_fq_codel virtio_console virtio_blk virtio_pci virtio_ring 
serio_raw crc32c_intel virtio
[   12.328693] CPU: 0 PID: 521 Comm: modprobe Not tainted 4.9.0-rc1+ #253
[   12.328694] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.9.1-1.fc24 04/01/2014
[   12.328694] task: ffffa17ab8453fc0 task.stack: ffffbdafc0744000
[   12.328696] RIP: 0010:[<ffffffff82060eea>]  [<ffffffff82060eea>] ctr_crypt
+0x19a/0x1c0
[   12.328696] RSP: 0018:ffffbdafc0747a60  EFLAGS: 00010002
[   12.328697] RAX: 0000000032e455a6 RBX: 0000000000000004 RCX: 
0000000000000002
[   12.328697] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 
0000000000000086
[   12.328698] RBP: ffffbdafc0747b28 R08: ffffa17abc16e900 R09: 
0000000000000019
[   12.328698] R10: ffffa17a764f68b0 R11: 000000000002e918 R12: 
ffffbdafc0747b38
[   12.328698] R13: ffffa17a764f6840 R14: ffffa17ae418b988 R15: 
ffffbdafc0747a70
[   12.328699] FS:  00007f55f57a6700(0000) GS:ffffa17abfc00000(0000) knlGS:
0000000000000000
[   12.328700] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.328700] CR2: ffffa17ae418b988 CR3: 0000000079b26000 CR4: 
00000000003406f0
[   12.328703] Stack:
[   12.328705]  ffffa17abc16e900 ffffa17ab845fd80 2ae7e40732e455a6 
3a224612a8f9841d
[   12.328706]  fffffb4e81e117c0 ffffa17ab845fd80 fffffb4e829062c0 
ffffa17ae418b988
[   12.328707]  ffffbdafc0747ba8 ffffffff00000d80 ffffffff00000004 
ffffbdafc0747bc8
[   12.328708] Call Trace:
[   12.328712]  [<ffffffff823e5fd3>] __ablk_encrypt+0x43/0x50
[   12.328714]  [<ffffffff823e6012>] ablk_encrypt+0x32/0xc0
[   12.328716]  [<ffffffff823c4f2e>] skcipher_encrypt_ablkcipher+0x5e/0x60
[   12.328717]  [<ffffffff823dbb80>] drbg_kcapi_sym_ctr+0xb0/0x130
[   12.328719]  [<ffffffff823de153>] drbg_ctr_generate+0x53/0x80


Now, the interesting part is the following: the original memory pointer that 
shall be processed by the DRBG is in my example ffffffffc018b988 -- this 
pointer is used until the DRBG invokes crypto_skcipher_encrypt. However, when 
I print out the buffer pointer that is used as dst in the memcpy of 
ctr_crypt_final, I see ffffa17ae418b988 -- i.e. the buffer that causes paging 
failure.

During tracing the blkcipher_walk code I see that the slow code path is used 
when the request size is smaller than the block size. That slow code path 
allocates new memory that will be used for the dst pointer in ctr_crypt_final.

May I ask you for checking whether the allocation and the memory pointer logic 
has an issue that would cause a paging failure?

Ciao
Stephan

^ permalink raw reply

* [PATCH] crypto: CTR DRBG - advance output buffer pointer
From: Stephan Mueller @ 2016-11-18 11:27 UTC (permalink / raw)
  To: herbert; +Cc: linux-crypto

The CTR DRBG segments the number of random bytes to be generated into
128 byte blocks. The current code misses the advancement of the output
buffer pointer when the requestor asks for more than 128 bytes of data.
In this case, the next 128 byte block of random numbers is copied to
the beginning of the output buffer again. This implies that only the
first 128 bytes of the output buffer would ever be filled.

The patch adds the advancement of the buffer pointer to fill the entire
buffer.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
 crypto/drbg.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index fb33f7d..9a95b61 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -1766,6 +1766,7 @@ static int drbg_kcapi_sym_ctr(struct drbg_state *drbg,
 		init_completion(&drbg->ctr_completion);
 
 		outlen -= cryptlen;
+		outbuf += cryptlen;
 	}
 
 	return 0;
-- 
2.7.4

^ permalink raw reply related

* Re: [patch] s390/crypto: unlock on error in prng_tdes_read()
From: Martin Schwidefsky @ 2016-11-18 12:12 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Herbert Xu, Harald Freudenberger, David S. Miller, Heiko Carstens,
	linux-crypto, linux-s390, kernel-janitors
In-Reply-To: <20161118105451.GA26523@mwanda>

On Fri, 18 Nov 2016 14:11:00 +0300
Dan Carpenter <dan.carpenter@oracle.com> wrote:

> We added some new locking but forgot to unlock on error.
> 
> Fixes: 57127645d79d ("s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> 
> diff --git a/arch/s390/crypto/prng.c b/arch/s390/crypto/prng.c
> index 9cc050f..1113389 100644
> --- a/arch/s390/crypto/prng.c
> +++ b/arch/s390/crypto/prng.c
> @@ -507,8 +507,10 @@ static ssize_t prng_tdes_read(struct file *file, char __user *ubuf,
>  		prng_data->prngws.byte_counter += n;
>  		prng_data->prngws.reseed_counter += n;
> 
> -		if (copy_to_user(ubuf, prng_data->buf, chunk))
> -			return -EFAULT;
> +		if (copy_to_user(ubuf, prng_data->buf, chunk)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> 
>  		nbytes -= chunk;
>  		ret += chunk;
> 

Nice spotting, I will add this to my fixes tree. Thank you..

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply

* [PATCH 1/3] drivers: crypto: Add Support for Octeon-tx CPT Engine
From: gcherianv @ 2016-11-18 15:00 UTC (permalink / raw)
  To: linux-kernel, linux-crypto; +Cc: davem, herbert, George Cherian
In-Reply-To: <1479481209-11475-1-git-send-email-gcherianv@gmail.com>

From: George Cherian <george.cherian@cavium.com>

Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
found in Octeon-tx series of SoC's. CPT is the Cryptographic Acceleration
Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and
asymmetric engines (AEs).

Signed-off-by: George Cherian <george.cherian@cavium.com>
---
 drivers/crypto/cavium/cpt/Kconfig        |  22 +
 drivers/crypto/cavium/cpt/Makefile       |   2 +
 drivers/crypto/cavium/cpt/cpt.h          |  90 +++
 drivers/crypto/cavium/cpt/cpt_common.h   | 377 +++++++++++++
 drivers/crypto/cavium/cpt/cpt_hw_types.h | 940 +++++++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cpt_main.c     | 891 +++++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cpt_pf_mbox.c  | 174 ++++++
 7 files changed, 2496 insertions(+)
 create mode 100644 drivers/crypto/cavium/cpt/Kconfig
 create mode 100644 drivers/crypto/cavium/cpt/Makefile
 create mode 100644 drivers/crypto/cavium/cpt/cpt.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_main.c
 create mode 100644 drivers/crypto/cavium/cpt/cpt_pf_mbox.c

diff --git a/drivers/crypto/cavium/cpt/Kconfig b/drivers/crypto/cavium/cpt/Kconfig
new file mode 100644
index 0000000..8fe3f44
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/Kconfig
@@ -0,0 +1,22 @@
+#
+# Cavium crypto device configuration
+#
+
+config CRYPTO_DEV_CPT
+	tristate
+	select HW_RANDOM_OCTEON
+	select CRYPTO_AES
+	select CRYPTO_DES
+	select CRYPTO_BLKCIPHER
+	select FW_LOADER
+
+config OCTEONTX_CPT_PF
+	tristate "Octeon-tx CPT Physical function driver"
+	depends on ARCH_THUNDER
+	select CRYPTO_DEV_CPT
+	help
+	  Support for Cavium CPT block found in octeon-tx series of
+	  processors.
+
+	  To compile this as a module, choose M here: the module will be
+	  called cptpf.
diff --git a/drivers/crypto/cavium/cpt/Makefile b/drivers/crypto/cavium/cpt/Makefile
new file mode 100644
index 0000000..bf758e2
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_OCTEONTX_CPT_PF) += cptpf.o
+cptpf-objs := cpt_main.o cpt_pf_mbox.o
diff --git a/drivers/crypto/cavium/cpt/cpt.h b/drivers/crypto/cavium/cpt/cpt.h
new file mode 100644
index 0000000..63d12da
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt.h
@@ -0,0 +1,90 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPT_H
+#define __CPT_H
+
+#include "cpt_common.h"
+
+#define BASE_PROC_DIR	"cavium"
+
+#define PF  0
+#define VF  1
+
+struct cpt_device;
+
+struct microcode {
+	uint8_t  is_mc_valid;
+	uint8_t  is_ae;
+	uint8_t  group;
+	uint32_t code_size;
+	void    *code;
+	uint8_t  num_cores;
+	uint64_t core_mask_low; /* Used as long as num # cores are <= 64 */
+	uint64_t core_mask_hi;  /* Unused for now */
+	uint8_t  version[32];
+
+	/* Base info */
+	dma_addr_t dma;
+	dma_addr_t phys_base;
+	void *base;
+};
+
+#define VF_STATE_DOWN	(0)
+#define VF_STATE_UP	(1)
+
+struct cpt_vf_info {
+	uint8_t state;
+	uint8_t priority;
+	uint32_t qlen;
+	union cpt_chipid_vfid id;
+};
+
+/**
+ * cpt device structure
+ */
+struct cpt_device {
+	uint32_t chip_id; /**< CPT Device ID */
+	uint16_t core_freq; /**< CPT Device Frequency */
+	uint16_t flags;	/**< Flags to hold device status bits */
+	uint8_t idx; /**< Device Index (0...MAX_CPT_DEVICES) */
+	uint8_t num_vf_en; /**< Number of VFs enabled (0...CPT_MAX_VF_NUM) */
+
+	struct cpt_vf_info vfinfo[CPT_MAX_VF_NUM]; /* Per VF info */
+	uint8_t next_mc_idx; /**< next microcode index */
+	uint8_t next_group;
+
+	uint8_t max_se_cores;
+	uint8_t max_ae_cores;
+	uint8_t avail_se_cores;
+	uint8_t avail_ae_cores;
+
+	void __iomem *reg_base; /* Register start address */
+
+	/* MSI-X */
+	bool msix_enabled;
+	uint8_t	num_vec;
+	struct msix_entry msix_entries[CPT_PF_MSIX_VECTORS];
+	bool irq_allocated[CPT_PF_MSIX_VECTORS];
+
+	bool mbx_lock[CPT_MAX_VF_NUM]; /* Mailbox locks per VF */
+
+	struct pci_dev *pdev; /**< pci device handle */
+	void *proc; /**< proc dir */
+	struct microcode mcode[CPT_MAX_CORE_GROUPS];
+};
+
+struct cpt_device_list {
+	/* device list lock */
+	spinlock_t lock;
+	uint32_t nr_device;
+	struct cpt_device *device_ptr[MAX_CPT_DEVICES];
+};
+
+void cpt_mbox_intr_handler(struct cpt_device *cpt, int mbx);
+#endif /* __CPT_H */
diff --git a/drivers/crypto/cavium/cpt/cpt_common.h b/drivers/crypto/cavium/cpt/cpt_common.h
new file mode 100644
index 0000000..351ed4a
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_common.h
@@ -0,0 +1,377 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPT_COMMON_H
+#define __CPT_COMMON_H
+
+#include <asm/byteorder.h>
+#include <linux/uaccess.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/pci.h>
+#include <linux/cpumask.h>
+#include <linux/string.h>
+#include <linux/pci_regs.h>
+#include <linux/delay.h>
+#include <linux/printk.h>
+#include <linux/sched.h>
+#include <linux/completion.h>
+#include <asm/arch_timer.h>
+#include <linux/types.h>
+
+#include "cpt_hw_types.h"
+
+/* configuration space offsets */
+#ifndef PCI_VENDOR_ID
+#define PCI_VENDOR_ID 0x00 /* 16 bits */
+#endif
+#ifndef PCI_DEVICE_ID
+#define PCI_DEVICE_ID 0x02 /* 16 bits */
+#endif
+#ifndef PCI_REVISION_ID
+#define PCI_REVISION_ID 0x08 /* Revision ID */
+#endif
+#ifndef PCI_CAPABILITY_LIST
+#define PCI_CAPABILITY_LIST 0x34 /* first capability list entry */
+#endif
+
+/* Device ID */
+#define PCI_VENDOR_ID_CAVIUM 0x177d
+#define CPT_81XX_PCI_PF_DEVICE_ID 0xa040
+#define CPT_81XX_PCI_VF_DEVICE_ID 0xa041
+
+#define PASS_1_0 0x0
+
+/* CPT Models ((Device ID<<16)|Revision ID) */
+/* CPT models */
+#define CPT_81XX_PASS1_0 ((CPT_81XX_PCI_PF_DEVICE_ID << 8) | PASS_1_0)
+#define CPTVF_81XX_PASS1_0 ((CPT_81XX_PCI_VF_DEVICE_ID << 8) | PASS_1_0)
+
+#define PF 0
+#define VF 1
+
+#define DEFAULT_DEVICE_QUEUES CPT_NUM_QS_PER_VF
+
+#define SUCCESS	(0)
+#define FAIL	(1)
+
+#ifndef ROUNDUP4
+#define ROUNDUP4(val) (((val) + 3) & 0xfffffffc)
+#endif
+
+#ifndef ROUNDUP8
+#define ROUNDUP8(val) (((val) + 7) & 0xfffffff8)
+#endif
+
+#ifndef ROUNDUP16
+#define ROUNDUP16(val) (((val) + 15) & 0xfffffff0)
+#endif
+
+#define ERR_ADDR_LEN 8
+
+#define CPT_MBOX_MSG_TIMEOUT 2000
+#define VF_STATE_DOWN (0)
+#define VF_STATE_UP (1)
+
+/**< flags to indicate the features supported */
+#define CPT_FLAG_DMA_64BIT (uint16_t)(1 << 0)
+#define CPT_FLAG_MSIX_ENABLED (uint16_t)(1 << 1)
+#define CPT_FLAG_SRIOV_ENABLED (uint16_t)(1 << 2)
+#define CPT_FLAG_VF_DRIVER (uint16_t)(1 << 3)
+#define CPT_FLAG_DEVICE_READY (uint16_t)(1 << 4)
+
+#define cpt_msix_enabled(cpt) ((cpt)->flags & CPT_FLAG_MSIX_ENABLED)
+#define cpt_sriov_enabled(cpt) ((cpt)->flags & CPT_FLAG_SRIOV_ENABLED)
+#define cpt_vf_driver(cpt) ((cpt)->flags & CPT_FLAG_VF_DRIVER)
+#define cpt_pf_driver(cpt) (!((cpt)->flags & CPT_FLAG_VF_DRIVER))
+#define cpt_device_ready(cpt) ((cpt)->flags & CPT_FLAG_DEVICE_READY)
+
+#define MAX_CPT_DEVICES	2
+
+/* Default command queue length */
+#define DEFAULT_CMD_QLEN 2046
+#define DEFAULT_CMD_QCHUNK_SIZE 1023
+
+/* Max command queue length allowed. This is to restrict host memory usage */
+#define MAX_CMD_QLEN 16000
+
+/* Completion Interrupt threshold */
+#define COMPLETION_INTR_THOLD 1
+
+/* Default command timeout in seconds */
+#define DEFAULT_COMMAND_TIMEOUT 4
+
+/* Default Mailbox ACK timeout */
+#define DEFAULT_MBOX_ACK_TIMEOUT 4
+
+#define CPT_MBOX_MSG_TYPE_REQ 0
+#define CPT_MBOX_MSG_TYPE_ACK 1
+#define CPT_MBOX_MSG_TYPE_NACK 2
+#define CPT_MBOX_MSG_TYPE_NOP 3
+
+#define CPT_COUNT_THOLD 1
+#define CPT_TIMER_THOLD	0xFFFF
+#define CPT_DBELL_THOLD	1
+
+/*
+ * CPT Registers map for 81xx
+ */
+
+/* PF registers */
+#define CPTX_PF_CONSTANTS(a) (0x0ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_RESET(a) (0x100ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_DIAG(a) (0x120ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_BIST_STATUS(a) (0x160ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_CTL(a) (0x200ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_FLIP(a) (0x210ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_INT(a) (0x220ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_INT_W1S(a) (0x230ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_ENA_W1S(a)	(0x240ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECC0_ENA_W1C(a)	(0x250ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_MBOX_INTX(a, b)	\
+	(0x400ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_MBOX_INT_W1SX(a, b) \
+	(0x420ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_MBOX_ENA_W1CX(a, b) \
+	(0x440ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_MBOX_ENA_W1SX(a, b) \
+	(0x460ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_EXEC_INT(a) (0x500ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_INT_W1S(a)	(0x520ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_ENA_W1C(a)	(0x540ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_ENA_W1S(a)	(0x560ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_GX_EN(a, b) \
+	(0x600ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x7))
+#define CPTX_PF_EXEC_INFO(a) (0x700ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_BUSY(a) (0x800ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_INFO0(a) (0x900ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_INFO1(a) (0x910ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_INST_REQ_PC(a) (0x10000ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_INST_LATENCY_PC(a) \
+	(0x10020ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_RD_REQ_PC(a) (0x10040ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_RD_LATENCY_PC(a) (0x10060ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_RD_UC_PC(a) (0x10080ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ACTIVE_CYCLES_PC(a) \
+	(0x10100ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_CTL(a) (0x4000000ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_STATUS(a) (0x4000008ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_CLK(a) (0x4000010ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_DBG_CTL(a) (0x4000018ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_DBG_DATA(a)	(0x4000020ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_BIST_STATUS(a) \
+	(0x4000028ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_REQ_TIMER(a) (0x4000030ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_MEM_CTL(a) (0x4000038ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_PERF_CTL(a)	(0x4001000ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_DBG_CNTX(a, b) \
+	(0x4001100ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0xf))
+#define CPTX_PF_EXE_PERF_EVENT_CNT(a) \
+	(0x4001180ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXE_EPCI_INBX_CNT(a, b) \
+	(0x4001200ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_EXE_EPCI_OUTBX_CNT(a, b) \
+	(0x4001240ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+#define CPTX_PF_ENGX_UCODE_BASE(a, b) \
+	(0x4002000ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x3f))
+#define CPTX_PF_QX_CTL(a, b) \
+	(0x8000000ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_PF_QX_GMCTL(a, b) \
+	(0x8000020ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_PF_QX_CTL2(a, b) \
+	(0x8000100ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_PF_VFX_MBOXX(a, b, c) \
+	(0x8001000ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf) + 0x100ll * ((c) & 0x1))
+#define CPTX_PF_MSIX_VECX_ADDR(a, b) \
+	(0x0ll + 0x1000000000ll * ((a) & 0x1) + 0x10ll * ((b) & 0x3))
+#define CPTX_PF_MSIX_VECX_CTL(a, b) \
+	(0x8ll + 0x1000000000ll * ((a) & 0x1) + 0x10ll * ((b) & 0x3))
+#define CPTX_PF_MSIX_PBAX(a, b)	\
+	(0xf0000ll + 0x1000000000ll * ((a) & 0x1) + 8ll * ((b) & 0x0))
+
+/* VF registers */
+#define CPTX_VQX_CTL(a, b) \
+	(0x100ll + 0x1000000000ll * ((a) & 0x0) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_SADDR(a, b) \
+	(0x200ll + 0x1000000000ll * ((a) & 0x0) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_WAIT(a, b) \
+	(0x400ll + 0x1000000000ll * ((a) & 0x0) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_INPROG(a, b) \
+	(0x410ll + 0x1000000000ll * ((a) & 0x0) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE(a, b) \
+	(0x420ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_ACK(a, b) \
+	(0x440ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_INT_W1S(a, b) \
+	(0x460ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_INT_W1C(a, b) \
+	(0x468ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_ENA_W1S(a, b) \
+	(0x470ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DONE_ENA_W1C(a, b) \
+	(0x478ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_MISC_INT(a, b)	\
+	(0x500ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_MISC_INT_W1S(a, b) \
+	(0x508ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_MISC_ENA_W1S(a, b) \
+	(0x510ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_MISC_ENA_W1C(a, b) \
+	(0x518ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VQX_DOORBELL(a, b)	\
+	(0x600ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf))
+#define CPTX_VFX_PF_MBOXX(a, b, c) \
+	(0x1000ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf) + 8ll * ((c) & 0x1))
+#define CPTX_VFX_MSIX_VECX_ADDR(a, b, c) \
+	(0x0ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf) + 0x10ll * ((c) & 0x1))
+#define CPTX_VFX_MSIX_VECX_CTL(a, b, c) \
+	(0x8ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf) + 0x10ll * ((c) & 0x1))
+#define CPTX_VFX_MSIX_PBAX(a, b, c) \
+	(0xf0000ll + 0x1000000000ll * ((a) & 0x1) + 0x100000ll * ((b) & 0xf) + 8ll * ((c) & 0x0))
+
+/* Future extensions */
+#define CPTX_BRIDGE_BP_TEST(a) (0x1c0ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_CQM_CORE_OBS0(a) (0x1a0ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_CQM_CORE_OBS1(a) (0x1a8ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_NCBI_OBS(a) (0x190ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_BP_TEST(a) (0x180ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_ECO(a) (0x140ll + 0x1000000000ll * ((a) & 0x1))
+
+/*###### PCIE EP-Mode Configuration Registers #########*/
+#define PCIEEP0_CFG000 (0x0)
+#define PCIEEP0_CFG002 (0x8)
+#define PCIEEP0_CFG011 (0x2C)
+#define PCIEEP0_CFG020 (0x50)
+#define PCIEEP0_CFG025 (0x64)
+#define PCIEEP0_CFG030 (0x78)
+#define PCIEEP0_CFG044 (0xB0)
+#define PCIEEP0_CFG045 (0xB4)
+#define PCIEEP0_CFG082 (0x148)
+#define PCIEEP0_CFG095 (0x17C)
+#define PCIEEP0_CFG096 (0x180)
+#define PCIEEP0_CFG097 (0x184)
+#define PCIEEP0_CFG103 (0x19C)
+#define PCIEEP0_CFG460 (0x730)
+#define PCIEEP0_CFG461 (0x734)
+#define PCIEEP0_CFG462 (0x738)
+
+/*#######  PCIe EP-Mode SR-IOV Configuration Registers  #####*/
+#define PCIEEPVF0_CFG000 (0x0)
+#define PCIEEPVF0_CFG002 (0x8)
+#define PCIEEPVF0_CFG011 (0x2C)
+#define PCIEEPVF0_CFG030 (0x78)
+#define PCIEEPVF0_CFG044 (0xB0)
+
+enum vftype {
+	AE_TYPES = 1,
+	SE_TYPES = 2,
+	BAD_CPT_TYPES,
+};
+
+static inline int32_t count_set_bits(uint64_t mask)
+{
+	int32_t count = 0;
+
+	while (mask) {
+		if (mask & 1ULL)
+			count++;
+		mask = mask >> 1;
+	}
+
+	return count;
+}
+
+static const uint8_t cpt_device_name[] = "CPT81XX";
+static const uint8_t cptvf_device_name[] = "CPT81XX-VF";
+static const uint8_t cpt_device_file[] = "cpt";
+static const uint8_t cptvf_device_file[] = "cptvf";
+
+static const uint8_t cpt_driver_name[] = "CPT Driver";
+static const uint8_t cpt_driver_class[] = "crypto";
+static const uint8_t cptvf_driver_class[] = "cryptovf";
+
+/* Max CPT devices supported */
+enum cpt_mbox_opcode {
+	CPT_MSG_VF_CFG = 1,
+	CPT_MSG_VF_UP,
+	CPT_MSG_VF_DOWN,
+	CPT_MSG_CHIPID_VFID,
+	CPT_MSG_READY,
+	CPT_MSG_QLEN,
+	CPT_MSG_QBIND_GRP,
+	CPT_MSG_VQ_PRIORITY,
+	CPT_MSG_VF_QUERY_HEALTH,
+};
+
+union cpt_chipid_vfid {
+	uint16_t u16;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		uint16_t chip_id:8;
+		uint16_t vfid:8;
+#else
+		uint16_t vfid:8;
+		uint16_t chip_id:8;
+#endif
+	} s;
+};
+
+/* CPT mailbox structure */
+struct cpt_mbox {
+	uint64_t msg; /* Message type MBOX[0] */
+	uint64_t data;/* Data         MBOX[1] */
+};
+
+/* The Cryptographic Acceleration Unit can *only* be found in SoCs
+ * containing the ThunderX ARM64 CPU implementation.  All accesses to the device
+ * registers on this platform are implicitly strongly ordered with respect
+ * to memory accesses. So writeq_relaxed() and readq_relaxed() are safe to use
+ * with no memory barriers in this driver.  The readq()/writeq() functions add
+ * explicit ordering operation which in this case are redundant, and only
+ * add overhead.
+ */
+/* Register read/write APIs */
+static inline void cpt_write_csr64(uint8_t __iomem *hw_addr, uint64_t offset,
+				   uint64_t val)
+{
+	uint8_t __iomem *base = ACCESS_ONCE(hw_addr);
+
+	writeq_relaxed(val, base + offset);
+}
+
+static inline uint64_t cpt_read_csr64(uint8_t __iomem *hw_addr, uint64_t offset)
+{
+	uint8_t __iomem *base = ACCESS_ONCE(hw_addr);
+
+	return readq_relaxed(base + offset);
+}
+
+static inline void byte_swap_64(uint64_t *data)
+{
+	uint64_t val = 0ULL;
+	uint8_t *a, *b;
+
+	a = (uint8_t *)data;
+	b = (uint8_t *)&val;
+	b[0] = a[7];
+	b[1] = a[6];
+	b[2] = a[5];
+	b[3] = a[4];
+	b[4] = a[3];
+	b[5] = a[2];
+	b[6] = a[1];
+	b[7] = a[0];
+	*data = val;
+}
+
+static inline void byte_swap_16(uint16_t *data)
+{
+	uint16_t val = *data;
+	*data = (val >> 8) | (val << 8);
+}
+#endif /* __CPT_COMMON_H */
diff --git a/drivers/crypto/cavium/cpt/cpt_hw_types.h b/drivers/crypto/cavium/cpt/cpt_hw_types.h
new file mode 100644
index 0000000..a6def18
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_hw_types.h
@@ -0,0 +1,940 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPT_HW_TYPES_H
+#define __CPT_HW_TYPES_H
+
+#include "cpt_common.h"
+
+#define NR_CLUSTER (4)
+#define CSR_DELAY (30)
+
+#define CPT_NUM_QS_PER_VF (1)
+#define CPT_INST_SIZE (64)
+#define CPT_VQ_CHUNK_ALIGN (128) /**< 128 byte align */
+#define CPT_NEXT_CHUNK_PTR_SIZE (8)
+#define CPT_INST_CHUNK_MAX_SIZE (1023)
+
+#define CPT_MAX_CORE_GROUPS (8)
+#define CPT_MAX_SE_CORES (10)
+#define CPT_MAX_AE_CORES (6)
+#define CPT_MAX_TOTAL_CORES (CPT_MAX_SE_CORES + CPT_MAX_AE_CORES)
+#define CPT_MAX_VF_NUM (16)
+#define CPT_MAX_VQ_NUM (16)
+#define CPT_PF_VF_MAILBOX_SIZE (2)
+
+/* MSI-X interrupts */
+#define	CPT_PF_MSIX_VECTORS (3)
+#define	CPT_VF_MSIX_VECTORS (2)
+
+/* Configuration and Status registers are in BAR 0 */
+#define CPT_CSR_BAR 0
+#define CPT_MSIX_BAR 4
+
+/**
+ * Enumeration cpt_bar_e
+ *
+ * CPT Base Address Register Enumeration
+ * Enumerates the base address registers.
+ */
+#define CPT_BAR_E_CPTX_PF_BAR0(a) (0x872000000000ll + 0x1000000000ll * (a))
+#define CPT_BAR_E_CPTX_PF_BAR4(a) (0x872010000000ll + 0x1000000000ll * (a))
+#define CPT_BAR_E_CPTX_VFX_BAR0(a, b) \
+	(0x872020000000ll + 0x1000000000ll * (a) + 0x100000ll * (b))
+#define CPT_BAR_E_CPTX_VFX_BAR4(a, b) \
+	(0x872030000000ll + 0x1000000000ll * (a) + 0x100000ll * (b))
+
+/**
+ * Enumeration cpt_comp_e
+ *
+ * CPT Completion Enumeration
+ * Enumerates the values of CPT_RES_S[COMPCODE].
+ */
+enum cpt_comp_e {
+	CPT_COMP_E_NOTDONE = 0x00,
+	CPT_COMP_E_GOOD = 0x01,
+	CPT_COMP_E_FAULT = 0x02,
+	CPT_COMP_E_SWERR = 0x03,
+	CPT_COMP_E_LAST_ENTRY = 0xFF
+};
+
+/**
+ * Enumeration cpt_engine_err_type_e
+ *
+ * CPT Engine Error Code Enumeration
+ * Enumerates the values of CPT_RES_S[COMPCODE].
+ */
+enum cpt_engine_err_type_e {
+	CPT_ENGINE_ERR_TYPE_E_NOERR = 0x00,
+	CPT_ENGINE_ERR_TYPE_E_RF = 0x01,
+	CPT_ENGINE_ERR_TYPE_E_UC = 0x02,
+	CPT_ENGINE_ERR_TYPE_E_WD = 0x04,
+	CPT_ENGINE_ERR_TYPE_E_GE = 0x08,
+	CPT_ENGINE_ERR_TYPE_E_BUS = 0x20,
+	CPT_ENGINE_ERR_TYPE_E_LAST = 0xFF
+};
+
+/**
+ * Enumeration cpt_eop_e
+ *
+ * CPT EOP (EPCI Opcodes) Enumeration
+ * Opcodes on the epci bus.
+ */
+enum cpt_eop_e {
+	CPT_EOP_E_DMA_RD_LDT = 0x01,
+	CPT_EOP_E_DMA_RD_LDI = 0x02,
+	CPT_EOP_E_DMA_RD_LDY = 0x06,
+	CPT_EOP_E_DMA_RD_LDD = 0x08,
+	CPT_EOP_E_DMA_RD_LDE = 0x0b,
+	CPT_EOP_E_DMA_RD_LDWB = 0x0d,
+	CPT_EOP_E_DMA_WR_STY = 0x0e,
+	CPT_EOP_E_DMA_WR_STT = 0x11,
+	CPT_EOP_E_DMA_WR_STP = 0x12,
+	CPT_EOP_E_ATM_FAA64 = 0x3b,
+	CPT_EOP_E_RANDOM1_REQ = 0x61,
+	CPT_EOP_E_RANDOM_REQ = 0x60,
+	CPT_EOP_E_ERR_REQUEST = 0xfb,
+	CPT_EOP_E_UCODE_REQ = 0xfc,
+	CPT_EOP_E_MEMB = 0xfd,
+	CPT_EOP_E_NEW_WORK_REQ = 0xff,
+};
+
+/**
+ * Enumeration cpt_pf_int_vec_e
+ *
+ * CPT PF MSI-X Vector Enumeration
+ * Enumerates the MSI-X interrupt vectors.
+ */
+enum cpt_pf_int_vec_e {
+	CPT_PF_INT_VEC_E_ECC0 = 0x00,
+	CPT_PF_INT_VEC_E_EXEC = 0x01
+};
+
+#define CPT_PF_INT_VEC_E_MBOXX(a) (0x02 + (a))
+
+/**
+ * Enumeration cpt_rams_e
+ *
+ * CPT RAM Field Enumeration
+ * Enumerates the relative bit positions within CPT()_PF_ECC0_CTL[CDIS].
+ */
+enum cpt_rams_e {
+	CPT_RAMS_E_NCBI_DATFIF = 0x00,
+	CPT_RAMS_E_NCBO_MEM0 = 0x01,
+	CPT_RAMS_E_CQM_CTLMEM = 0x02,
+	CPT_RAMS_E_CQM_BPTR = 0x03,
+	CPT_RAMS_E_CQM_GMID = 0x04,
+	CPT_RAMS_E_CQM_INSTFIF0 = 0x05,
+	CPT_RAMS_E_CQM_INSTFIF1 = 0x06,
+	CPT_RAMS_E_CQM_INSTFIF2 = 0x07,
+	CPT_RAMS_E_CQM_INSTFIF3 = 0x08,
+	CPT_RAMS_E_CQM_INSTFIF4 = 0x09,
+	CPT_RAMS_E_CQM_INSTFIF5 = 0x0a,
+	CPT_RAMS_E_CQM_INSTFIF6 = 0x0b,
+	CPT_RAMS_E_CQM_INSTFIF7 = 0x0c,
+	CPT_RAMS_E_CQM_DONE_CNT = 0x0d,
+	CPT_RAMS_E_CQM_DONE_TIMER = 0x0e,
+	CPT_RAMS_E_COMP_FIFO = 0x0f,
+	CPT_RAMS_E_MBOX_MEM = 0x10,
+	CPT_RAMS_E_FPA_MEM = 0x11,
+	CPT_RAMS_E_CDEI_UCODE = 0x12,
+	CPT_RAMS_E_COMP_ARRAY0 = 0x13,
+	CPT_RAMS_E_COMP_ARRAY1 = 0x14,
+	CPT_RAMS_E_CSR_VMEM = 0x15,
+	CPT_RAMS_E_RSP_MAP = 0x16,
+	CPT_RAMS_E_RSP_INST = 0x17,
+	CPT_RAMS_E_RSP_NCBO = 0x18,
+	CPT_RAMS_E_RSP_RNM = 0x19,
+	CPT_RAMS_E_CDEI_FIFO0 = 0x1a,
+	CPT_RAMS_E_CDEI_FIFO1 = 0x1b,
+	CPT_RAMS_E_EPCO_FIFO0 = 0x1c,
+	CPT_RAMS_E_EPCO_FIFO1 = 0x1d,
+	CPT_RAMS_E_LAST_ENTRY = 0xff
+};
+
+/**
+ * Enumeration cpt_vf_int_vec_e
+ *
+ * CPT VF MSI-X Vector Enumeration
+ * Enumerates the MSI-X interrupt vectors.
+ */
+enum cpt_vf_int_vec_e {
+	CPT_VF_INT_VEC_E_MISC = 0x00,
+	CPT_VF_INT_VEC_E_DONE = 0x01
+};
+
+#define CPT_VF_INTR_MBOX_MASK BIT(0)
+#define CPT_VF_INTR_DOVF_MASK BIT(1)
+#define CPT_VF_INTR_IRDE_MASK BIT(2)
+#define CPT_VF_INTR_NWRP_MASK BIT(3)
+#define CPT_VF_INTR_SERR_MASK BIT(4)
+
+/**
+ * Structure cpt_inst_s
+ *
+ * CPT Instruction Structure
+ * This structure specifies the instruction layout. Instructions are
+ * stored in memory as little-endian unless CPT()_PF_Q()_CTL[INST_BE] is set.
+ * cpt_inst_s_s
+ * Word 0
+ * doneint:1 Done interrupt.
+ *	0 = No interrupts related to this instruction.
+ *	1 = When the instruction completes, CPT()_VQ()_DONE[DONE] will be
+ *	incremented,and based on the rules described there an interrupt may
+ *	occur.
+ * Word 1
+ * res_addr:64 [127: 64] Result IOVA.
+ *	If nonzero, specifies where to write CPT_RES_S.
+ *	If zero, no result structure will be written.
+ *	Address must be 16-byte aligned.
+ *	Bits <63:49> are ignored by hardware; software should use a
+ *	sign-extended bit <48> for forward compatibility.
+ * Word 2
+ *  grp:10 [171:162] If [WQ_PTR] is nonzero, the SSO guest-group to use when
+ *	CPT submits work SSO.
+ *	For the SSO to not discard the add-work request, FPA_PF_MAP() must map
+ *	[GRP] and CPT()_PF_Q()_GMCTL[GMID] as valid.
+ *  tt:2 [161:160] If [WQ_PTR] is nonzero, the SSO tag type to use when CPT
+ *	submits work to SSO
+ *  tag:32 [159:128] If [WQ_PTR] is nonzero, the SSO tag to use when CPT
+ *	submits work to SSO.
+ * Word 3
+ *  wq_ptr:64 [255:192] If [WQ_PTR] is nonzero, it is a pointer to a
+ *	work-queue entry that CPT submits work to SSO after all context,
+ *	output data, and result write operations are visible to other
+ *	CNXXXX units and the cores. Bits <2:0> must be zero.
+ *	Bits <63:49> are ignored by hardware; software should
+ *	use a sign-extended bit <48> for forward compatibility.
+ *	Internal:
+ *	Bits <63:49>, <2:0> are ignored by hardware, treated as always 0x0.
+ * Word 4
+ *  ei0:64; [319:256] Engine instruction word 0. Passed to the AE/SE.
+ * Word 5
+ *  ei1:64; [383:320] Engine instruction word 1. Passed to the AE/SE.
+ * Word 6
+ *  ei2:64; [447:384] Engine instruction word 1. Passed to the AE/SE.
+ * Word 7
+ *  ei3:64; [511:448] Engine instruction word 1. Passed to the AE/SE.
+ *
+ */
+union cpt_inst_s {
+	uint64_t u[8];
+	struct cpt_inst_s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_17_63:47;
+		uint64_t doneint:1;
+		uint64_t reserved_0_1:16;
+#else /* Word 0 - Little Endian */
+		uint64_t reserved_0_15:16;
+		uint64_t doneint:1;
+		uint64_t reserved_17_63:47;
+#endif /* Word 0 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 1 - Big Endian */
+		uint64_t res_addr:64;
+#else /* Word 1 - Little Endian */
+		uint64_t res_addr:64;
+#endif /* Word 1 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 2 - Big Endian */
+		uint64_t reserved_172_19:20;
+		uint64_t grp:10;
+		uint64_t tt:2;
+		uint64_t tag:32;
+#else /* Word 2 - Little Endian */
+		uint64_t tag:32;
+		uint64_t tt:2;
+		uint64_t grp:10;
+		uint64_t reserved_172_191:20;
+#endif /* Word 2 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 3 - Big Endian */
+		uint64_t wq_ptr:64;
+#else /* Word 3 - Little Endian */
+		uint64_t wq_ptr:64;
+#endif /* Word 3 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 4 - Big Endian */
+		uint64_t ei0:64;
+#else /* Word 4 - Little Endian */
+		uint64_t ei0:64;
+#endif /* Word 4 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 5 - Big Endian */
+		uint64_t ei1:64;
+#else /* Word 5 - Little Endian */
+		uint64_t ei1:64;
+#endif /* Word 5 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 6 - Big Endian */
+		uint64_t ei2:64;
+#else /* Word 6 - Little Endian */
+		uint64_t ei2:64;
+#endif /* Word 6 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 7 - Big Endian */
+		uint64_t ei3:64;
+#else /* Word 7 - Little Endian */
+		uint64_t ei3:64;
+#endif /* Word 7 - End */
+	} s;
+};
+
+/**
+ * Structure cpt_res_s
+ *
+ * CPT Result Structure
+ * The CPT coprocessor writes the result structure after it completes a
+ * CPT_INST_S instruction. The result structure is exactly 16 bytes, and
+ * each instruction completion produces exactly one result structure.
+ *
+ * This structure is stored in memory as little-endian unless
+ * CPT()_PF_Q()_CTL[INST_BE] is set.
+ * cpt_res_s_s
+ * Word 0
+ *  doneint:1 [16:16] Done interrupt. This bit is copied from the
+ *	corresponding instruction's CPT_INST_S[DONEINT].
+ *  compcode:8 [7:0] Indicates completion/error status of the CPT coprocessor
+ *	for the	associated instruction, as enumerated by CPT_COMP_E.
+ *	Core software may write the memory location containing [COMPCODE] to
+ *	0x0 before ringing the doorbell, and then poll for completion by
+ *	checking for a nonzero value.
+ *	Once the core observes a nonzero [COMPCODE] value in this case,the CPT
+ *	coprocessor will have also completed L2/DRAM write operations.
+ * Word 1
+ *  reserved
+ *
+ */
+union cpt_res_s {
+	uint64_t u[2];
+	struct cpt_res_s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_17_63:47;
+		uint64_t doneint:1;
+		uint64_t reserved_8_15:8;
+		uint64_t compcode:8;
+#else /* Word 0 - Little Endian */
+		uint64_t compcode:8;
+		uint64_t reserved_8_15:8;
+		uint64_t doneint:1;
+		uint64_t reserved_17_63:47;
+#endif /* Word 0 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 1 - Big Endian */
+		uint64_t reserved_64_127:64;
+#else /* Word 1 - Little Endian */
+		uint64_t reserved_64_127:64;
+#endif /* Word 1 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_bist_status
+ *
+ * CPT PF Control Bist Status Register
+ * This register has the BIST status of memories. Each bit is the BIST result
+ * of an individual memory (per bit, 0 = pass and 1 = fail).
+ * cptx_pf_bist_status_s
+ * Word0
+ *  bstatus [29:0](RO/H) BIST status. One bit per memory, enumerated by
+ *	CPT_RAMS_E.
+ */
+union cptx_pf_bist_status {
+	uint64_t u;
+	struct cptx_pf_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_30_63:34;
+		uint64_t bstatus:30;
+#else /* Word 0 - Little Endian */
+		uint64_t bstatus:30;
+		uint64_t reserved_30_63:34;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_constants
+ *
+ * CPT PF Constants Register
+ * This register contains implementation-related parameters of CPT in CNXXXX.
+ * cptx_pf_constants_s
+ * Word 0
+ *  reserved_40_63:24 [63:40] Reserved.
+ *  epcis:8 [39:32](RO) Number of EPCI busses.
+ *  grps:8 [31:24](RO) Number of engine groups implemented.
+ *  ae:8 [23:16](RO/H) Number of AEs. In CNXXXX, for CPT0 returns 0x0,
+ *	for CPT1 returns 0x18, or less if there are fuse-disables.
+ *  se:8 [15:8](RO/H) Number of SEs. In CNXXXX, for CPT0 returns 0x30,
+ *	or less if there are fuse-disables, for CPT1 returns 0x0.
+ *  vq:8 [7:0](RO) Number of VQs.
+ * cptx_pf_constants_cn81xx
+ * Word 0
+ *  reserved_40_63:24 [63:40] Reserved
+ *  epcis:8 [39:32](RO) Number of EPCI busses.
+ *  grps:8 [31:24](RO) Number of engine groups implemented.
+ *  ae:8 [23:16](RO/H) Number of AEs. In CNXXXX, returns 0x6 or less
+ *	if there are fuse-disables.
+ *  se:8 [15: 8](RO/H) Number of SEs. In CNXXXX, returns 0xA, or less
+ *	if there are fuse-disables.
+ *  vq:8 [7:0](RO) Number of VQs.
+ *
+ */
+union cptx_pf_constants {
+	uint64_t u;
+	struct cptx_pf_constants_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_40_63:24;
+		uint64_t epcis:8;
+		uint64_t grps:8;
+		uint64_t ae:8;
+		uint64_t se:8;
+		uint64_t vq:8;
+#else /* Word 0 - Little Endian */
+		uint64_t vq:8;
+		uint64_t se:8;
+		uint64_t ae:8;
+		uint64_t grps:8;
+		uint64_t epcis:8;
+		uint64_t reserved_40_63:24;
+#endif /* Word 0 - End */
+	} s;
+	struct cptx_pf_constants_cn81xx {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_40_63:24;
+		uint64_t epcis:8;
+		uint64_t grps:8;
+		uint64_t ae:8;
+		uint64_t se:8;
+		uint64_t vq:8;
+#else /* Word 0 - Little Endian */
+		uint64_t vq:8;
+		uint64_t se:8;
+		uint64_t ae:8;
+		uint64_t grps:8;
+		uint64_t epcis:8;
+		uint64_t reserved_40_63:24;
+#endif /* Word 0 - End */
+	} cn81xx;
+};
+
+/**
+ * Register (NCB) cpt#_pf_exe_bist_status
+ *
+ * CPT PF Engine Bist Status Register
+ * This register has the BIST status of each engine.  Each bit is the
+ * BIST result of an individual engine (per bit, 0 = pass and 1 = fail).
+ * cptx_pf_exe_bist_status_s
+ * Word0
+ *  reserved_48_63:16 [63:48] reserved
+ *  bstatus:48 [47:0](RO/H) BIST status. One bit per engine.
+ *
+ */
+union cptx_pf_exe_bist_status {
+	uint64_t u;
+	struct cptx_pf_exe_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_48_63:16;
+		uint64_t bstatus:48
+#else /* Word 0 - Little Endian */
+		uint64_t bstatus:48;
+		uint64_t reserved_48_63:16;
+#endif /* Word 0 - End */
+	} s;
+	struct cptx_pf_exe_bist_status_cn81xx {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_16_63:48;
+		uint64_t bstatus:16;
+#else /* Word 0 - Little Endian */
+		uint64_t bstatus:16;
+		uint64_t reserved_16_63:48;
+#endif /* Word 0 - End */
+	} cn81xx;
+};
+
+/**
+ * Register (NCB) cpt#_pf_exe_ctl
+ *
+ * CPT PF Engine Control Register
+ * This register enables the engines.
+ * cptx_pf_exe_ctl_s
+ * Word0
+ *  enable:64 [63:0](R/W) Individual enables for each of the engines.
+ */
+union cptx_pf_exe_ctl {
+	uint64_t u;
+	struct cptx_pf_exe_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t enable:64;
+#else /* Word 0 - Little Endian */
+		uint64_t enable:64;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_q#_ctl
+ *
+ * CPT Queue Control Register
+ * This register configures queues. This register should be changed only
+ * when quiescent (see CPT()_VQ()_INPROG[INFLIGHT]).
+ * cptx_pf_qx_ctl_s
+ * Word0
+ *  reserved_60_63:4 [63:60] reserved.
+ *  aura:12; [59:48](R/W) Guest-aura for returning this queue's
+ *	instruction-chunk buffers to FPA. Only used when [INST_FREE] is set.
+ *	For the FPA to not discard the request, FPA_PF_MAP() must map
+ *	[AURA] and CPT()_PF_Q()_GMCTL[GMID] as valid.
+ *  reserved_45_47:3 [47:45] reserved.
+ *  size:13 [44:32](R/W) Command-buffer size, in number of 64-bit words per
+ *	command buffer segment. Must be 8*n + 1, where n is the number of
+ *	instructions per buffer segment.
+ *  reserved_11_31:21 [31:11] Reserved.
+ *  cont_err:1 [10:10](R/W) Continue on error.
+ *	0 = When CPT()_VQ()_MISC_INT[NWRP], CPT()_VQ()_MISC_INT[IRDE] or
+ *	CPT()_VQ()_MISC_INT[DOVF] are set by hardware or software via
+ *	CPT()_VQ()_MISC_INT_W1S, then CPT()_VQ()_CTL[ENA] is cleared.  Due to
+ *	pipelining, additional instructions may have been processed between the
+ *	instruction causing the error and the next instruction in the disabled
+ *	queue (the instruction at CPT()_VQ()_SADDR).
+ *	1 = Ignore errors and continue processing instructions.
+ *	For diagnostic use only.
+ *  inst_free:1 [9:9](R/W) Instruction FPA free. When set, when CPT reaches the
+ *	end of an instruction chunk, that chunk will be freed to the FPA.
+ *  inst_be:1 [8:8](R/W) Instruction big-endian control. When set, instructions,
+ *	instruction next chunk pointers, and result structures are stored in
+ *	big-endian format in memory.
+ *  iqb_ldwb:1 [7:7](R/W) Instruction load don't write back.
+ *	0 = The hardware issues NCB transient load (LDT) towards the cache,
+ *	which if the line hits and is is dirty will cause the line to be
+ *	written back before being replaced.
+ *	1 = The hardware issues NCB LDWB read-and-invalidate command towards
+ *	the cache when fetching the last word of instructions; as a result the
+ *	line will not be written back when replaced.  This improves
+ *	performance, but software must not read the instructions after they are
+ *	posted to the hardware.	Reads that do not consume the last word of a
+ *	cache line always use LDI.
+ *  reserved_4_6:3 [6:4] Reserved.
+ *  grp:3; [3:1](R/W) Engine group.
+ *  pri:1; [0:0](R/W) Queue priority.
+ *	1 = This queue has higher priority. Round-robin between higher
+ *	priority queues.
+ *	0 = This queue has lower priority. Round-robin between lower
+ *	priority queues.
+ */
+union cptx_pf_qx_ctl {
+	uint64_t u;
+	struct cptx_pf_qx_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_60_63:4;
+		uint64_t aura:12;
+		uint64_t reserved_45_47:3;
+		uint64_t size:13;
+		uint64_t reserved_11_31:21;
+		uint64_t cont_err:1;
+		uint64_t inst_free:1;
+		uint64_t inst_be:1;
+		uint64_t iqb_ldwb:1;
+		uint64_t reserved_4_6:3;
+		uint64_t grp:3;
+		uint64_t pri:1;
+#else /* Word 0 - Little Endian */
+		uint64_t pri:1;
+		uint64_t grp:3;
+		uint64_t reserved_4_6:3;
+		uint64_t iqb_ldwb:1;
+		uint64_t inst_be:1;
+		uint64_t inst_free:1;
+		uint64_t cont_err:1;
+		uint64_t reserved_11_31:21;
+		uint64_t size:13;
+		uint64_t reserved_45_47:3;
+		uint64_t aura:12;
+		uint64_t reserved_60_63:4;
+#endif /* Word 0 - End */
+	} s;
+    /* struct cptx_pf_qx_ctl_s cn; */
+};
+
+/**
+ * Register (NCB) cpt#_pf_g#_en
+ *
+ * CPT PF Group Control Register
+ * This register configures engine groups.
+ * cptx_pf_gx_en_s
+ * Word0
+ *  en: 64; [63:0](R/W/H) Engine group enable. One bit corresponds to each
+ *	engine, with the bit set to indicate this engine can service this group.
+ *	Bits corresponding to unimplemented engines read as zero, i.e. only bit
+ *	numbers	less than CPT()_PF_CONSTANTS[AE] + CPT()_PF_CONSTANTS[SE] are
+ *	writable. AE engine bits follow SE engine bits.
+ *	E.g. if CPT()_PF_CONSTANTS[AE] = 0x1, and CPT()_PF_CONSTANTS[SE] = 0x2,
+ *	then bits <2:0> are read/writable with bit <2> corresponding to AE<0>,
+ *	and bit <1> to SE<1>, and bit<0> to SE<0>. Before disabling an engine,
+ *	the corresponding bit in each group must be cleared. CPT()_PF_EXEC_BUSY
+ *	can then be polled to determing when the engine becomes	idle.
+ *	At the point, the engine can be disabled.
+ */
+union cptx_pf_gx_en {
+	uint64_t u;
+	struct cptx_pf_gx_en_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t en:64;
+#else /* Word 0 - Little Endian */
+		uint64_t en:64;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_saddr
+ *
+ * CPT Queue Starting Buffer Address Registers
+ * These registers set the instruction buffer starting address.
+ * cptx_vqx_saddr_s
+ * Word0
+ *  reserved_49_63:15 [63:49] Reserved.
+ *  ptr:43 [48:6](R/W/H) Instruction buffer IOVA <48:6> (64-byte aligned).
+ *	When written, it is the initial buffer starting address; when read,
+ *	it is the next read pointer to be requested from L2C. The PTR field
+ *	is overwritten with the next pointer each time that the command buffer
+ *	segment is exhausted. New commands will then be read from the newly
+ *	specified command buffer pointer.
+ *  reserved_0_5:6 [5:0] Reserved.
+ *
+ */
+union cptx_vqx_saddr {
+	uint64_t u;
+	struct cptx_vqx_saddr_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_49_63:15;
+		uint64_t ptr:43
+		uint64_t reserved_0_5:6;
+#else /* Word 0 - Little Endian */
+		uint64_t reserved_0_5:6;
+		uint64_t ptr:43;
+		uint64_t reserved_49_63:15;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_misc_ena_w1s
+ *
+ * CPT Queue Misc Interrupt Enable Set Register
+ * This register sets interrupt enable bits.
+ * cptx_vqx_misc_ena_w1s_s
+ * Word0
+ * reserved_5_63:59 [63:5] Reserved.
+ * swerr:1 [4:4](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[SWERR].
+ * nwrp:1 [3:3](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[NWRP].
+ * irde:1 [2:2](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[IRDE].
+ * dovf:1 [1:1](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[DOVF].
+ * mbox:1 [0:0](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[MBOX].
+ *
+ */
+union cptx_vqx_misc_ena_w1s {
+	uint64_t u;
+	struct cptx_vqx_misc_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_5_63:59;
+		uint64_t swerr:1;
+		uint64_t nwrp:1;
+		uint64_t irde:1;
+		uint64_t dovf:1;
+		uint64_t mbox:1;
+#else /* Word 0 - Little Endian */
+		uint64_t mbox:1;
+		uint64_t dovf:1;
+		uint64_t irde:1;
+		uint64_t nwrp:1;
+		uint64_t swerr:1;
+		uint64_t reserved_5_63:59;
+#endif /* Word 0 - End */
+	} s;
+	struct cptx_vqx_misc_ena_w1s_cn81xx {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_5_63:59;
+		uint64_t swerr:1;
+		uint64_t nwrp:1;
+		uint64_t irde:1;
+		uint64_t dovf:1;
+		uint64_t mbox:1;
+#else /* Word 0 - Little Endian */
+		uint64_t mbox:1;
+		uint64_t dovf:1;
+		uint64_t irde:1;
+		uint64_t nwrp:1;
+		uint64_t swerr:1;
+		uint64_t reserved_5_63:59;
+#endif /* Word 0 - End */
+	} cn81xx;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_doorbell
+ *
+ * CPT Queue Doorbell Registers
+ * Doorbells for the CPT instruction queues.
+ * cptx_vqx_doorbell_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  dbell_cnt:20 [19:0](R/W/H) Number of instruction queue 64-bit words to add
+ *	to the CPT instruction doorbell count. Readback value is the the
+ *	current number of pending doorbell requests. If counter overflows
+ *	CPT()_VQ()_MISC_INT[DBELL_DOVF] is set. To reset the count back to
+ *	zero, write one to clear CPT()_VQ()_MISC_INT_ENA_W1C[DBELL_DOVF],
+ *	then write a value of 2^20 minus the read [DBELL_CNT], then write one
+ *	to CPT()_VQ()_MISC_INT_W1C[DBELL_DOVF] and
+ *	CPT()_VQ()_MISC_INT_ENA_W1S[DBELL_DOVF]. Must be a multiple of 8.
+ *	All CPT instructions are 8 words and require a doorbell count of
+ *	multiple of 8.
+ */
+union cptx_vqx_doorbell {
+	uint64_t u;
+	struct cptx_vqx_doorbell_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_20_63:44;
+		uint64_t dbell_cnt:20;
+#else /* Word 0 - Little Endian */
+		uint64_t dbell_cnt:20;
+		uint64_t reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_inprog
+ *
+ * CPT Queue In Progress Count Registers
+ * These registers contain the per-queue instruction in flight registers.
+ * cptx_vqx_inprog_s
+ * Word0
+ *  reserved_8_63:56 [63:8] Reserved.
+ *  inflight:8 [7:0](RO/H) Inflight count. Counts the number of instructions
+ *	for the VF for which CPT is fetching, executing or responding to
+ *	instructions. However this does not include any interrupts that are
+ *	awaiting software handling (CPT()_VQ()_DONE[DONE] != 0x0).
+ *	A queue may not be reconfigured until:
+ *	1. CPT()_VQ()_CTL[ENA] is cleared by software.
+ *	2. [INFLIGHT] is polled until equals to zero.
+ */
+union cptx_vqx_inprog {
+	uint64_t u;
+	struct cptx_vqx_inprog_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_8_63:56;
+		uint64_t inflight:8;
+#else /* Word 0 - Little Endian */
+		uint64_t inflight:8;
+		uint64_t reserved_8_63:56;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_misc_int
+ *
+ * CPT Queue Misc Interrupt Register
+ * These registers contain the per-queue miscellaneous interrupts.
+ * cptx_vqx_misc_int_s
+ * Word 0
+ *  reserved_5_63:59 [63:5] Reserved.
+ *  swerr:1 [4:4](R/W1C/H) Software error from engines.
+ *  nwrp:1  [3:3](R/W1C/H) NCB result write response error.
+ *  irde:1  [2:2](R/W1C/H) Instruction NCB read response error.
+ *  dovf:1 [1:1](R/W1C/H) Doorbell overflow.
+ *  mbox:1 [0:0](R/W1C/H) PF to VF mailbox interrupt. Set when
+ *	CPT()_VF()_PF_MBOX(0) is written.
+ *
+ */
+union cptx_vqx_misc_int {
+	uint64_t u;
+	struct cptx_vqx_misc_int_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_5_63:59;
+		uint64_t swerr:1;
+		uint64_t nwrp:1;
+		uint64_t irde:1;
+		uint64_t dovf:1;
+		uint64_t mbox:1;
+#else /* Word 0 - Little Endian */
+		uint64_t mbox:1;
+		uint64_t dovf:1;
+		uint64_t irde:1;
+		uint64_t nwrp:1;
+		uint64_t swerr:1;
+		uint64_t reserved_5_63:59;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_ack
+ *
+ * CPT Queue Done Count Ack Registers
+ * This register is written by software to acknowledge interrupts.
+ * cptx_vqx_done_ack_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  done_ack:20 [19:0](R/W/H) Number of decrements to CPT()_VQ()_DONE[DONE].
+ *	Reads CPT()_VQ()_DONE[DONE]. Written by software to acknowledge
+ *	interrupts. If CPT()_VQ()_DONE[DONE] is still nonzero the interrupt
+ *	will be re-sent if the conditions described in CPT()_VQ()_DONE[DONE]
+ *	are satisfied.
+ *
+ */
+union cptx_vqx_done_ack {
+	uint64_t u;
+	struct cptx_vqx_done_ack_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_20_63:44;
+		uint64_t done_ack:20;
+#else /* Word 0 - Little Endian */
+		uint64_t done_ack:20;
+		uint64_t reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done
+ *
+ * CPT Queue Done Count Registers
+ * These registers contain the per-queue instruction done count.
+ * cptx_vqx_done_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  done:20 [19:0](R/W/H) Done count. When CPT_INST_S[DONEINT] set and that
+ *	instruction completes, CPT()_VQ()_DONE[DONE] is incremented when the
+ *	instruction finishes. Write to this field are for diagnostic use only;
+ *	instead software writes CPT()_VQ()_DONE_ACK with the number of
+ *	decrements for this field.
+ *	Interrupts are sent as follows:
+ *	* When CPT()_VQ()_DONE[DONE] = 0, then no results are pending, the
+ *	interrupt coalescing timer is held to zero, and an interrupt is not
+ *	sent.
+ *	* When CPT()_VQ()_DONE[DONE] != 0, then the interrupt coalescing timer
+ *	counts. If the counter is >= CPT()_VQ()_DONE_WAIT[TIME_WAIT]*1024, or
+ *	CPT()_VQ()_DONE[DONE] >= CPT()_VQ()_DONE_WAIT[NUM_WAIT], i.e. enough
+ *	time has passed or enough results have arrived, then the interrupt is
+ *	sent.
+ *	* When CPT()_VQ()_DONE_ACK is written (or CPT()_VQ()_DONE is written
+ *	but this is not typical), the interrupt coalescing timer restarts.
+ *	Note after decrementing this interrupt equation is recomputed,
+ *	for example if CPT()_VQ()_DONE[DONE] >= CPT()_VQ()_DONE_WAIT[NUM_WAIT]
+ *	and because the timer is zero, the interrupt will be resent immediately.
+ *	(This covers the race case between software acknowledging an interrupt
+ *	and a result returning.)
+ *	* When CPT()_VQ()_DONE_ENA_W1S[DONE] = 0, interrupts are not sent,
+ *	but the counting described above still occurs.
+ *	Since CPT instructions complete out-of-order, if software is using
+ *	completion interrupts the suggested scheme is to request a DONEINT on
+ *	each request, and when an interrupt arrives perform a "greedy" scan for
+ *	completions; even if a later command is acknowledged first this will
+ *	not result in missing a completion.
+ *	Software is responsible for making sure [DONE] does not overflow;
+ *	for example by insuring there are not more than 2^20-1 instructions in
+ *	flight that may request interrupts.
+ *
+ */
+union cptx_vqx_done {
+	uint64_t u;
+	struct cptx_vqx_done_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_20_63:44;
+		uint64_t done:20;
+#else /* Word 0 - Little Endian */
+		uint64_t done:20;
+		uint64_t reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_wait
+ *
+ * CPT Queue Done Interrupt Coalescing Wait Registers
+ * Specifies the per queue interrupt coalescing settings.
+ * cptx_vqx_done_wait_s
+ * Word0
+ *  reserved_48_63:16 [63:48] Reserved.
+ *  time_wait:16; [47:32](R/W) Time hold-off. When CPT()_VQ()_DONE[DONE] = 0
+ *	or CPT()_VQ()_DONE_ACK is written a timer is cleared. When the timer
+ *	reaches [TIME_WAIT]*1024 then interrupt coalescing ends.
+ *	see CPT()_VQ()_DONE[DONE]. If 0x0, time coalescing is disabled.
+ *  reserved_20_31:12 [31:20] Reserved.
+ *  num_wait:20 [19:0](R/W) Number of messages hold-off.
+ *	When CPT()_VQ()_DONE[DONE] >= [NUM_WAIT] then interrupt coalescing ends
+ *	see CPT()_VQ()_DONE[DONE]. If 0x0, same behavior as 0x1.
+ *
+ */
+union cptx_vqx_done_wait {
+	uint64_t u;
+	struct cptx_vqx_done_wait_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_48_63:16;
+		uint64_t time_wait:16;
+		uint64_t reserved_20_31:12;
+		uint64_t num_wait:20;
+#else /* Word 0 - Little Endian */
+		uint64_t num_wait:20;
+		uint64_t reserved_20_31:12;
+		uint64_t time_wait:16;
+		uint64_t reserved_48_63:16;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_ena_w1s
+ *
+ * CPT Queue Done Interrupt Enable Set Registers
+ * Write 1 to these registers will enable the DONEINT interrupt for the queue.
+ * cptx_vqx_done_ena_w1s_s
+ * Word0
+ *  reserved_1_63:63 [63:1] Reserved.
+ *  done:1 [0:0](R/W1S/H) Write 1 will enable DONEINT for this queue.
+ *	Write 0 has no effect. Read will return the enable bit.
+ */
+union cptx_vqx_done_ena_w1s {
+	uint64_t u;
+	struct cptx_vqx_done_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_1_63:63;
+		uint64_t done:1;
+#else /* Word 0 - Little Endian */
+		uint64_t done:1;
+		uint64_t reserved_1_63:63;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_ctl
+ *
+ * CPT VF Queue Control Registers
+ * This register configures queues. This register should be changed (other than
+ * clearing [ENA]) only when quiescent (see CPT()_VQ()_INPROG[INFLIGHT]).
+ * cptx_vqx_ctl_s
+ * Word0
+ *  reserved_1_63:63 [63:1] Reserved.
+ *  ena:1 [0:0](R/W/H) Enables the logical instruction queue.
+ *	See also CPT()_PF_Q()_CTL[CONT_ERR] and	CPT()_VQ()_INPROG[INFLIGHT].
+ *	1 = Queue is enabled.
+ *	0 = Queue is disabled.
+ */
+union cptx_vqx_ctl {
+	uint64_t u;
+	struct cptx_vqx_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		uint64_t reserved_1_63:63;
+		uint64_t ena:1;
+#else /* Word 0 - Little Endian */
+		uint64_t ena:1;
+		uint64_t reserved_1_63:63;
+#endif /* Word 0 - End */
+	} s;
+};
+#endif /*__CPT_HW_TYPES_H*/
diff --git a/drivers/crypto/cavium/cpt/cpt_main.c b/drivers/crypto/cavium/cpt/cpt_main.c
new file mode 100644
index 0000000..f5a89f9
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_main.c
@@ -0,0 +1,891 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/version.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/printk.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/firmware.h>
+#include <linux/pci.h>
+
+#include "cpt.h"
+
+#define DRV_NAME	"thunder-cpt"
+#define DRV_VERSION	"1.0"
+
+/* Global list for holding all cpt_device pointers */
+struct cpt_device_list cpt_dev_list;
+
+static uint32_t num_vfs = 1; /* Default 1 VF enabled */
+module_param(num_vfs, uint, 0);
+MODULE_PARM_DESC(num_vfs, "Number of VFs to enable(1-16)");
+
+static inline void cpt_init_device_list(struct cpt_device_list *cpt_list)
+{
+	cpt_list->nr_device = 0;
+	spin_lock_init(&cpt_list->lock);
+
+	memset(cpt_list->device_ptr, 0, (sizeof(void *) * MAX_CPT_DEVICES));
+}
+
+static inline int32_t cpt_get_device_number(struct cpt_device_list *cpt_list,
+					    void *dev)
+{
+	struct cpt_device *cpt = (struct cpt_device *)dev;
+	int32_t i = 0;
+
+	spin_lock(&cpt_list->lock);
+
+	for (i = 0; i < MAX_CPT_DEVICES; i++) {
+		if (cpt_list->device_ptr[i] == cpt) {
+			spin_unlock(&cpt_list->lock);
+			return i;
+		}
+	}
+	spin_unlock(&cpt_list->lock);
+
+	return -1;
+}
+
+static inline int32_t cpt_add_device(struct cpt_device_list *cpt_list,
+				     struct cpt_device *cpt)
+{
+	/* lock the global device list */
+	spin_lock(&cpt_list->lock);
+
+	if (cpt_list->nr_device > MAX_CPT_DEVICES) {
+		/* unlock the global device list */
+		spin_unlock(&cpt_list->lock);
+		return -ENOMEM;
+	}
+
+	cpt->idx = cpt_list->nr_device;
+
+	cpt_list->device_ptr[cpt_list->nr_device] = cpt;
+	cpt_list->nr_device++;
+
+	/* unlock the global device list */
+	spin_unlock(&cpt_list->lock);
+
+	return 0;
+}
+
+static inline void cpt_remove_device(struct cpt_device_list *cpt_list,
+				     struct cpt_device *cpt)
+{
+	int32_t i = 0;
+
+	/* lock the global device list */
+	spin_lock(&cpt_list->lock);
+
+	while (i < MAX_CPT_DEVICES) {
+		if (cpt_list->device_ptr[i] == cpt) {
+			cpt_list->device_ptr[i] = NULL;
+			cpt_list->nr_device--;
+			break;
+		}
+		i++;
+	}
+
+	/* unlock the global device list */
+	spin_unlock(&cpt_list->lock);
+}
+
+struct cpt_device *cpt_get_device(struct cpt_device_list *cpt_list,
+				  int32_t dev_no)
+{
+	if (dev_no >= cpt_list->nr_device)
+		return NULL;
+
+	return cpt_list->device_ptr[dev_no];
+}
+
+int32_t nr_cpt_devices(struct cpt_device_list *cpt_list)
+{
+	return cpt_list->nr_device;
+}
+
+static uint64_t get_mask_from_value(int32_t value)
+{
+	uint64_t mask = 0ULL;
+	int32_t i;
+
+	for (i = 0; i < value; i++)
+		mask |= ((uint64_t)1 << i);
+
+	return mask;
+}
+
+/*
+ * Disable cores specified by coremask
+ */
+static void cpt_disable_cores(struct cpt_device *cpt, uint64_t coremask,
+			      uint8_t type, uint8_t grp)
+{
+	union cptx_pf_exe_ctl pf_exe_ctl;
+	uint32_t timeout = 0xFFFFFFFF;
+	uint64_t grpmask = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	/* Disengage the cores from groups */
+	grpmask = cpt_read_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp),
+			(grpmask & ~coremask));
+	udelay(CSR_DELAY);
+	grp = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXEC_BUSY(0));
+	while (grp & coremask) {
+		dev_err(dev, "Cores still busy %llx", coremask);
+		grp = cpt_read_csr64(cpt->reg_base,
+				     CPTX_PF_EXEC_BUSY(0));
+		if (timeout--)
+			break;
+	}
+
+	/* Disable the cores */
+	pf_exe_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0),
+			(pf_exe_ctl.u & ~coremask));
+	udelay(CSR_DELAY);
+}
+
+/*
+ * Enable cores specified by coremask
+ */
+static void cpt_enable_cores(struct cpt_device *cpt, uint64_t coremask,
+			     uint8_t type)
+{
+	union cptx_pf_exe_ctl pf_exe_ctl;
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	pf_exe_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0),
+			(pf_exe_ctl.u | coremask));
+	udelay(CSR_DELAY);
+}
+
+static void cpt_configure_group(struct cpt_device *cpt, uint8_t grp,
+				uint64_t coremask, uint8_t type)
+{
+	union cptx_pf_gx_en pf_gx_en = {0};
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	pf_gx_en.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp),
+			(pf_gx_en.u | coremask));
+	udelay(CSR_DELAY);
+}
+
+static void cpt_disable_mbox_interrupts(struct cpt_device *cpt)
+{
+	/* Clear mbox(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_ENA_W1CX(0, 0), ~0ull);
+}
+
+static void cpt_disable_ecc_interrupts(struct cpt_device *cpt)
+{
+	/* Clear ecc(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_ECC0_ENA_W1C(0), ~0ull);
+}
+
+static void cpt_disable_exec_interrupts(struct cpt_device *cpt)
+{
+	/* Clear exec interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXEC_ENA_W1C(0), ~0ull);
+}
+
+static void cpt_disable_all_interrupts(struct cpt_device *cpt)
+{
+	cpt_disable_mbox_interrupts(cpt);
+	cpt_disable_ecc_interrupts(cpt);
+	cpt_disable_exec_interrupts(cpt);
+}
+
+static void cpt_enable_mbox_interrupts(struct cpt_device *cpt)
+{
+	/* Set mbox(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_ENA_W1SX(0, 0), ~0ull);
+}
+
+static void cpt_enable_ecc_interrupts(struct cpt_device *cpt)
+{
+	/* Set ecc(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_ECC0_ENA_W1S(0), ~0ull);
+}
+
+static void cpt_enable_exec_interrupts(struct cpt_device *cpt)
+{
+	/* Set exec interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXEC_ENA_W1S(0), ~0ull);
+}
+
+static void cpt_enable_all_interrupts(struct cpt_device *cpt)
+{
+	cpt_enable_mbox_interrupts(cpt);
+	cpt_enable_ecc_interrupts(cpt);
+	cpt_enable_exec_interrupts(cpt);
+}
+
+static int32_t cpt_load_microcode(struct cpt_device *cpt,
+				  struct microcode *mcode)
+{
+	int32_t ret = 0, core = 0, shift = 0;
+	uint32_t total_cores = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (!mcode || !mcode->code) {
+		dev_err(dev, "Either the mcode is null or data is NULL\n");
+		return 1;
+	}
+
+	if (mcode->code_size == 0) {
+		dev_err(dev, "microcode size is 0\n");
+		return 1;
+	}
+
+	/* Assumes 0-9 are SE cores for UCODE_BASE registers and
+	 * AE core bases follow
+	 */
+	if (mcode->is_ae) {
+		core = CPT_MAX_SE_CORES; /* start couting from 10 */
+		total_cores = CPT_MAX_TOTAL_CORES; /* upto 15 */
+	} else {
+		core = 0; /* start couting from 0 */
+		total_cores = CPT_MAX_SE_CORES; /* upto 9 */
+	}
+
+	/* Point to microcode for each core of the group */
+	for (; core < total_cores ; core++, shift++) {
+		if (mcode->core_mask_low & (1 << shift)) {
+			cpt_write_csr64(cpt->reg_base,
+					CPTX_PF_ENGX_UCODE_BASE(0, core),
+					(uint64_t)mcode->phys_base);
+		}
+	}
+	return ret;
+}
+
+static int32_t do_cpt_init(struct cpt_device *cpt, struct microcode *mcode)
+{
+	int32_t ret = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Make device not ready */
+	cpt->flags &= ~CPT_FLAG_DEVICE_READY;
+	/* Disable All PF interrupts */
+	cpt_disable_all_interrupts(cpt);
+	/* Calculate mcode group and coremasks */
+	if (mcode->is_ae) {
+		if (mcode->num_cores > cpt->avail_ae_cores) {
+			dev_err(dev, "Requested for more cores than available AE cores\n");
+			ret = -1;
+			goto cpt_init_fail;
+		}
+
+		if (cpt->next_group >= CPT_MAX_CORE_GROUPS) {
+			dev_err(dev, "Can't load, all eight microcode groups in use");
+			return -ENFILE;
+		}
+
+		mcode->group = cpt->next_group;
+		/* Convert requested cores to mask */
+		mcode->core_mask_low = get_mask_from_value(mcode->num_cores);
+		mcode->core_mask_low <<= (cpt->max_ae_cores -
+					  cpt->avail_ae_cores);
+		/* Deduct the available ae cores */
+		cpt->avail_ae_cores -= mcode->num_cores;
+		cpt_disable_cores(cpt, mcode->core_mask_low, AE_TYPES,
+				  mcode->group);
+		/* Load microcode for AE engines */
+		if (cpt_load_microcode(cpt, mcode)) {
+			dev_err(dev, "Microcode load Failed for %s\n",
+				mcode->version);
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		cpt->next_group++;
+		/* Configure group mask for the mcode */
+		cpt_configure_group(cpt, mcode->group, mcode->core_mask_low,
+				    AE_TYPES);
+		/* Enable AE cores for the group mask */
+		cpt_enable_cores(cpt, mcode->core_mask_low, AE_TYPES);
+	} else {
+		if (mcode->num_cores > cpt->avail_se_cores) {
+			dev_err(dev, "Requested for more cores than available SE cores\n");
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		if (cpt->next_group >= CPT_MAX_CORE_GROUPS) {
+			dev_err(dev, "Can't load, all eight microcode groups in use");
+			return -ENFILE;
+		}
+
+		mcode->group = cpt->next_group;
+		/* Covert requested cores to mask */
+		mcode->core_mask_low = get_mask_from_value(mcode->num_cores);
+		mcode->core_mask_low <<= (cpt->max_se_cores -
+					  cpt->avail_se_cores);
+		/* Deduct the available se cores */
+		cpt->avail_se_cores -= mcode->num_cores;
+		cpt_disable_cores(cpt, mcode->core_mask_low, SE_TYPES,
+				  mcode->group);
+		/* Load microcode for SE engines */
+		if (cpt_load_microcode(cpt, mcode)) {
+			dev_err(dev, "Microcode load Failed for %s\n",
+				mcode->version);
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		cpt->next_group++;
+		/* Configure group mask for the mcode */
+		cpt_configure_group(cpt, mcode->group, mcode->core_mask_low,
+				    SE_TYPES);
+		/* Enable SE cores for the group mask */
+		cpt_enable_cores(cpt, mcode->core_mask_low, SE_TYPES);
+	}
+
+	/* Enabled PF mailbox interrupts */
+	cpt_enable_mbox_interrupts(cpt);
+	cpt->flags |= CPT_FLAG_DEVICE_READY;
+
+	return ret;
+
+cpt_init_fail:
+	/* Enabled PF mailbox interrupts */
+	cpt_enable_mbox_interrupts(cpt);
+	/* Reset coremask values */
+	/* TODO: Revisit this failure case for more loads case */
+	cpt->avail_ae_cores = cpt->max_ae_cores;
+	cpt->avail_se_cores = cpt->max_se_cores;
+
+	return ret;
+}
+
+struct ucode_header {
+	uint8_t version[32];
+	uint32_t code_length;
+	uint32_t data_length;
+	uint64_t sram_address;
+};
+
+static int32_t cpt_ucode_load_fw(struct cpt_device *cpt, const uint8_t *fw,
+				 bool is_ae)
+{
+	const struct firmware *fw_entry;
+	struct device *dev = &cpt->pdev->dev;
+	struct ucode_header *ucode;
+	struct microcode *mcode;
+	int j, ret = 0;
+
+	ret = request_firmware(&fw_entry, fw, dev);
+	if (ret)
+		return ret;
+
+	mcode = &cpt->mcode[cpt->next_mc_idx];
+	ucode = (struct ucode_header *)fw_entry->data;
+	memcpy(mcode->version, (uint8_t *)fw_entry->data, 32);
+	mcode->code_size = ntohl(ucode->code_length) * 2;
+	mcode->is_ae = is_ae;
+	mcode->core_mask_low  = 0ULL;
+	mcode->core_mask_hi   = 0ULL;
+	mcode->num_cores = is_ae ? 6 : 10;
+
+	/*  Allocate DMAable space */
+	mcode->code = dma_zalloc_coherent(&cpt->pdev->dev, mcode->code_size,
+					  &mcode->dma, GFP_KERNEL);
+	if (!mcode->code) {
+		dev_err(dev, "Unable to allocate space for microcode");
+		return -ENOMEM;
+	}
+	/* Align memory address for 'align_bytes' */
+	/* Neglect Bits 6:0 and 49:63: Align for 128-bytes */
+	mcode->phys_base = ALIGN((uint64_t)mcode->dma, 128);
+	mcode->base = mcode->code + (mcode->phys_base - mcode->dma);
+	memcpy((void *)mcode->base, (void *)(fw_entry->data + 48),
+	       mcode->code_size);
+
+	/* Byte swap 64-bit */
+	for (j = 0; j < (mcode->code_size / 8); j++)
+		byte_swap_64(&((uint64_t *)mcode->base)[j]);
+	/*  MC needs 16-bit swap */
+	for (j = 0; j < (mcode->code_size / 2); j++)
+		byte_swap_16(&((uint16_t *)mcode->base)[j]);
+
+	dev_dbg(dev, "mcode->code_size = %u\n", mcode->code_size);
+	dev_dbg(dev, "mcode->is_ae       = %u\n", mcode->is_ae);
+	dev_dbg(dev, "mcode->num_cores   = %u\n", mcode->num_cores);
+	dev_dbg(dev, "mcode->code = %llx\n", (uint64_t)mcode->code);
+	dev_dbg(dev, "mcode->phys_base = %llx\n", mcode->phys_base);
+	dev_dbg(dev, "mcode->base = %llx\n", (uint64_t)mcode->base);
+	dev_dbg(dev, "mcode->is_mc_valid = %u\n", mcode->is_mc_valid);
+
+	ret = do_cpt_init(cpt, mcode);
+	if (ret) {
+		dev_err(dev, "do_cpt_init failed with ret: %d\n", ret);
+		return ret;
+	}
+
+	dev_dbg(dev, "Microcode Loaded\n");
+	mcode->is_mc_valid = 1;
+	cpt->next_mc_idx++;
+	dev_dbg(dev, "mcode->is_mc_valid = %u\n", mcode->is_mc_valid);
+	release_firmware(fw_entry);
+
+	return ret;
+}
+
+static int32_t cpt_ucode_load(struct cpt_device *cpt)
+{
+	int32_t ret = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	ret = cpt_ucode_load_fw(cpt, "cpt8x-mc-ae.out", true);
+	if (ret) {
+		dev_err(dev, "ae:cpt_ucode_load failed with ret: %d\n", ret);
+		return ret;
+	}
+	ret = cpt_ucode_load_fw(cpt, "cpt8x-mc-se.out", false);
+	if (ret) {
+		dev_err(dev, "se:cpt_ucode_load failed with ret: %d\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+
+uint16_t active_cpt_devmask(struct cpt_device_list *cpt_list)
+{
+	struct cpt_device *cpt;
+	uint16_t mask = 0;
+	int32_t i = 0;
+
+	while (i < MAX_CPT_DEVICES) {
+		cpt = cpt_list->device_ptr[i];
+		if (cpt && cpt_device_ready(cpt))
+			mask |= (1 << i);
+		i++;
+	}
+
+	return mask;
+}
+
+static int32_t cpt_enable_msix(struct cpt_device *cpt)
+{
+	int32_t i, ret;
+
+	cpt->num_vec = CPT_PF_MSIX_VECTORS;
+
+	for (i = 0; i < cpt->num_vec; i++)
+		cpt->msix_entries[i].entry = i;
+
+	ret = pci_enable_msix(cpt->pdev, cpt->msix_entries, cpt->num_vec);
+	if (ret) {
+		dev_err(&cpt->pdev->dev, "Request for #%d msix vectors failed\n",
+			cpt->num_vec);
+		return ret;
+	}
+
+	cpt->msix_enabled = 1;
+	return 0;
+}
+
+static irqreturn_t cpt_mbx0_intr_handler (int32_t irq, void *cpt_irq)
+{
+	struct cpt_device *cpt = (struct cpt_device *)cpt_irq;
+
+	cpt_mbox_intr_handler(cpt, 0);
+
+	return IRQ_HANDLED;
+}
+
+static void cpt_disable_msix(struct cpt_device *cpt)
+{
+	if (cpt->msix_enabled) {
+		pci_disable_msix(cpt->pdev);
+		cpt->msix_enabled = 0;
+		cpt->num_vec = 0;
+	}
+}
+
+static void cpt_free_all_interrupts(struct cpt_device *cpt)
+{
+	int32_t irq;
+
+	for (irq = 0; irq < cpt->num_vec; irq++) {
+		if (cpt->irq_allocated[irq])
+			free_irq(cpt->msix_entries[irq].vector, cpt);
+		cpt->irq_allocated[irq] = false;
+	}
+}
+
+static void cpt_reset(struct cpt_device *cpt)
+{
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_RESET(0), 1);
+}
+
+static void cpt_find_max_enabled_cores(struct cpt_device *cpt)
+{
+	union cptx_pf_constants pf_cnsts = {0};
+
+	pf_cnsts.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_CONSTANTS(0));
+	cpt->max_se_cores = pf_cnsts.s.se;
+	cpt->max_ae_cores = pf_cnsts.s.ae;
+}
+
+static uint32_t cpt_check_bist_status(struct cpt_device *cpt)
+{
+	union cptx_pf_bist_status bist_sts = {0};
+
+	bist_sts.u = cpt_read_csr64(cpt->reg_base,
+				    CPTX_PF_BIST_STATUS(0));
+
+	return bist_sts.u;
+}
+
+static uint64_t cpt_check_exe_bist_status(struct cpt_device *cpt)
+{
+	union cptx_pf_exe_bist_status bist_sts = {0};
+
+	bist_sts.u = cpt_read_csr64(cpt->reg_base,
+				    CPTX_PF_EXE_BIST_STATUS(0));
+
+	return bist_sts.u;
+}
+
+static void cpt_disable_all_cores(struct cpt_device *cpt)
+{
+	uint32_t grp, timeout = 0xFFFFFFFF;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Disengage the cores from groups */
+	for (grp = 0; grp < CPT_MAX_CORE_GROUPS; grp++) {
+		cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp), 0);
+		udelay(CSR_DELAY);
+	}
+
+	grp = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXEC_BUSY(0));
+	while (grp) {
+		dev_err(dev, "Cores still busy");
+		grp = cpt_read_csr64(cpt->reg_base,
+				     CPTX_PF_EXEC_BUSY(0));
+		if (timeout--)
+			break;
+	}
+	/* Disable the cores */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0), 0);
+}
+
+/**
+ * Ensure all cores are disenganed from all groups by
+ * calling cpt_disable_all_cores() before calling this
+ * function.
+ */
+static void cpt_unload_microcode(struct cpt_device *cpt)
+{
+	uint32_t grp = 0, core;
+
+	/* Free microcode bases and reset group masks */
+	for (grp = 0; grp < CPT_MAX_CORE_GROUPS; grp++) {
+		struct microcode *mcode = &cpt->mcode[grp];
+
+		if (cpt->mcode[grp].code)
+			dma_free_coherent(&cpt->pdev->dev, mcode->code_size,
+					  mcode->code, mcode->dma);
+		mcode->code = NULL;
+		mcode->base = NULL;
+	}
+	/* Clear UCODE_BASE registers for all engines */
+	for (core = 0; core < CPT_MAX_TOTAL_CORES; core++)
+		cpt_write_csr64(cpt->reg_base,
+				CPTX_PF_ENGX_UCODE_BASE(0, core), 0ull);
+}
+
+static int32_t cpt_device_init(struct cpt_device *cpt)
+{
+	uint16_t device_id;
+	uint8_t rev_id;
+	uint64_t bist;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Reset the PF when probed first */
+	cpt_reset(cpt);
+	mdelay((100));
+
+	pci_read_config_word(cpt->pdev, PCI_DEVICE_ID, &device_id);
+	pci_read_config_byte(cpt->pdev, PCI_REVISION_ID, &rev_id);
+	cpt->chip_id = (device_id << 8) | rev_id;
+	dev_dbg(dev, "CPT Chip ID: 0x%0x ", cpt->chip_id);
+
+	/*Check BIST status*/
+	bist = (uint64_t)cpt_check_bist_status(cpt);
+	if (bist) {
+		dev_err(dev, "RAM BIST failed with code 0x%llx", bist);
+		return -ENODEV;
+	}
+
+	bist = cpt_check_exe_bist_status(cpt);
+	if (bist) {
+		dev_err(dev, "Engine BIST failed with code 0x%llx", bist);
+	return -ENODEV;
+	}
+
+	/*Get CLK frequency*/
+	/*Get max enabled cores */
+	cpt_find_max_enabled_cores(cpt);
+	/*Disable all cores*/
+	cpt_disable_all_cores(cpt);
+	/*Reset device parameters*/
+	cpt->next_mc_idx   = 0;
+	cpt->next_group = 0;
+	cpt->avail_se_cores = cpt->max_se_cores;
+	cpt->avail_ae_cores = cpt->max_ae_cores;
+	/* PF is ready */
+	cpt->flags |= CPT_FLAG_DEVICE_READY;
+
+	return 0;
+}
+
+static int32_t cpt_register_interrupts(struct cpt_device *cpt)
+{
+	int32_t ret;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Enable MSI-X */
+	ret = cpt_enable_msix(cpt);
+	if (ret)
+		return ret;
+
+	/* Register mailbox interrupt handlers */
+	ret = request_irq(cpt->msix_entries[CPT_PF_INT_VEC_E_MBOXX(0)].vector,
+			  cpt_mbx0_intr_handler, 0, "CPT Mbox0", cpt);
+	if (ret)
+		goto fail;
+
+	cpt->irq_allocated[CPT_PF_INT_VEC_E_MBOXX(0)] = true;
+
+	/* Enable mailbox interrupt */
+	cpt_enable_mbox_interrupts(cpt);
+	return 0;
+
+fail:
+	dev_err(dev, "Request irq failed\n");
+	cpt_free_all_interrupts(cpt);
+	return ret;
+}
+
+static void cpt_unregister_interrupts(struct cpt_device *cpt)
+{
+	cpt_free_all_interrupts(cpt);
+	cpt_disable_msix(cpt);
+}
+
+static int32_t cpt_sriov_init(struct cpt_device *cpt, int32_t num_vfs)
+{
+	int32_t pos = 0;
+	int32_t err;
+	uint16_t total_vf_cnt;
+	struct pci_dev *pdev = cpt->pdev;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+	if (!pos) {
+		dev_err(&pdev->dev, "SRIOV capability is not found in PCIe config space\n");
+		return -ENODEV;
+	}
+
+	cpt->num_vf_en = num_vfs; /* User requested VFs */
+	pci_read_config_word(pdev, (pos + PCI_SRIOV_TOTAL_VF), &total_vf_cnt);
+	if (total_vf_cnt < cpt->num_vf_en)
+		cpt->num_vf_en = total_vf_cnt;
+
+	if (!total_vf_cnt)
+		return 0;
+
+	/*Enabled the available VFs */
+	err = pci_enable_sriov(pdev, cpt->num_vf_en);
+	if (err) {
+		dev_err(&pdev->dev, "SRIOV enable failed, num VF is %d\n",
+			cpt->num_vf_en);
+		cpt->num_vf_en = 0;
+		return err;
+	}
+
+	/* TODO: Optionally enable static VQ priorities feature */
+
+	dev_info(&pdev->dev, "SRIOV enabled, number of VF available %d\n",
+		 cpt->num_vf_en);
+
+	cpt->flags |= CPT_FLAG_SRIOV_ENABLED;
+
+	return 0;
+}
+
+static int32_t cpt_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	struct cpt_device *cpt;
+	int32_t    err;
+
+	cpt = devm_kzalloc(dev, sizeof(struct cpt_device), GFP_KERNEL);
+	if (!cpt)
+		return -ENOMEM;
+
+	pci_set_drvdata(pdev, cpt);
+	cpt->pdev = pdev;
+	err = pci_enable_device(pdev);
+	if (err) {
+		dev_err(dev, "Failed to enable PCI device\n");
+		pci_set_drvdata(pdev, NULL);
+		return err;
+	}
+
+	err = pci_request_regions(pdev, DRV_NAME);
+	if (err) {
+		dev_err(dev, "PCI request regions failed 0x%x\n", err);
+		goto cpt_err_disable_device;
+	}
+
+	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get usable DMA configuration\n");
+		goto cpt_err_release_regions;
+	}
+
+	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		goto cpt_err_release_regions;
+	}
+
+	/* MAP PF's configuration registers */
+	cpt->reg_base = pcim_iomap(pdev, CPT_CSR_BAR, 0);
+	if (!cpt->reg_base) {
+		dev_err(dev, "Cannot map config register space, aborting\n");
+		err = -ENOMEM;
+		goto cpt_err_release_regions;
+	}
+
+	/* CPT device HW initialization */
+	cpt_device_init(cpt);
+
+	/* Register interrupts */
+	err = cpt_register_interrupts(cpt);
+	if (err)
+		goto cpt_err_release_regions;
+
+	err = cpt_ucode_load(cpt);
+	if (err)
+		goto cpt_err_unregister_interrupts;
+
+	/* Configure SRIOV */
+	err = cpt_sriov_init(cpt, num_vfs);
+	if (err)
+		goto cpt_err_unregister_interrupts;
+
+	/* Add device to global device list */
+	cpt_add_device(&cpt_dev_list, cpt);
+
+	return 0;
+
+cpt_err_unregister_interrupts:
+	cpt_unregister_interrupts(cpt);
+cpt_err_release_regions:
+	pci_release_regions(pdev);
+cpt_err_disable_device:
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+	return err;
+}
+
+static void cpt_remove(struct pci_dev *pdev)
+{
+	struct cpt_device *cpt = pci_get_drvdata(pdev);
+
+	/* Disengage SE and AE cores from all groups*/
+	cpt_disable_all_cores(cpt);
+	/* Unload microcodes */
+	cpt_unload_microcode(cpt);
+	cpt_unregister_interrupts(cpt);
+	pci_disable_sriov(pdev);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+}
+
+static void cpt_shutdown(struct pci_dev *pdev)
+{
+	struct cpt_device *cpt = pci_get_drvdata(pdev);
+
+	if (!cpt)
+		return;
+
+	dev_info(&pdev->dev, "Shutdown device %x:%x.\n",
+		 (uint32_t)pdev->vendor, (uint32_t)pdev->device);
+
+	cpt_unregister_interrupts(cpt);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+	kzfree(cpt);
+}
+
+/* Supported devices */
+static const struct pci_device_id cpt_id_table[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, CPT_81XX_PCI_PF_DEVICE_ID) },
+	{ 0, }  /* end of table */
+};
+
+static struct pci_driver cpt_pci_driver = {
+	.name = DRV_NAME,
+	.id_table = cpt_id_table,
+	.probe = cpt_probe,
+	.remove = cpt_remove,
+	.shutdown = cpt_shutdown,
+};
+
+static int32_t __init cpt_init_module(void)
+{
+	int32_t ret = -1;
+
+	pr_info("%s, ver %s\n", DRV_NAME, DRV_VERSION);
+
+	if (num_vfs > 16) {
+		pr_warn("Invalid vf count %d, Resetting it to 1(default)\n",
+			num_vfs);
+		num_vfs = 1;
+	}
+
+	cpt_init_device_list(&cpt_dev_list);
+	ret = pci_register_driver(&cpt_pci_driver);
+	if (ret)
+		pr_err("pci_register_driver() failed");
+
+	return ret;
+}
+
+static void __exit cpt_cleanup_module(void)
+{
+	pci_unregister_driver(&cpt_pci_driver);
+}
+
+module_init(cpt_init_module);
+module_exit(cpt_cleanup_module);
+
+MODULE_AUTHOR("George Cherian <george.cherian@cavium.com>, Murthy Nidadavolu");
+MODULE_DESCRIPTION("Cavium Thunder CPT Physical Function Driver");
+MODULE_LICENSE("GPL v2");
+MODULE_VERSION(DRV_VERSION);
+MODULE_DEVICE_TABLE(pci, cpt_id_table);
diff --git a/drivers/crypto/cavium/cpt/cpt_pf_mbox.c b/drivers/crypto/cavium/cpt/cpt_pf_mbox.c
new file mode 100644
index 0000000..7ed2d9c
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_pf_mbox.c
@@ -0,0 +1,174 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+
+#include "cpt.h"
+
+static void cpt_send_msg_to_vf(struct cpt_device *cpt, int vf,
+			       struct cpt_mbox *mbx)
+{
+	/* Writing mbox(0) causes interrupt */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 1),
+			mbx->data);
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 0), mbx->msg);
+}
+
+/* ACKs VF's mailbox message
+ * @vf: VF to which ACK to be sent
+ */
+static void cpt_mbox_send_ack(struct cpt_device *cpt, int vf,
+			      struct cpt_mbox *mbx)
+{
+	mbx->data = 0ull;
+	mbx->msg = CPT_MBOX_MSG_TYPE_ACK;
+	cpt_send_msg_to_vf(cpt, vf, mbx);
+}
+
+static void cpt_clear_mbox_intr(struct cpt_device *cpt, uint32_t vf)
+{
+	/* W1C for the VF */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_INTX(0, 0), (1 << vf));
+}
+
+/*
+ *  Configure QLEN/Chunk sizes for VF
+ */
+static void cpt_cfg_qlen_for_vf(struct cpt_device *cpt, int vf, uint32_t size)
+{
+	union cptx_pf_qx_ctl pf_qx_ctl;
+
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf));
+	pf_qx_ctl.s.size = size;
+	pf_qx_ctl.s.cont_err = true;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf), pf_qx_ctl.u);
+}
+
+/*
+ * Configure VQ priority
+ */
+static void cpt_cfg_vq_priority(struct cpt_device *cpt, int vf, uint32_t pri)
+{
+	union cptx_pf_qx_ctl pf_qx_ctl;
+
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf));
+	pf_qx_ctl.s.pri = pri;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf), pf_qx_ctl.u);
+}
+
+static uint8_t cpt_bind_vq_to_grp(struct cpt_device *cpt, uint8_t q,
+				  uint8_t grp)
+{
+	struct microcode *mcode = cpt->mcode;
+	union cptx_pf_qx_ctl pf_qx_ctl;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (q >= CPT_MAX_VQ_NUM) {
+		dev_err(dev, "Queues are more than cores in the group");
+		return -EINVAL;
+	}
+	if (grp >= CPT_MAX_CORE_GROUPS) {
+		dev_err(dev, "Request group is more than possible groups");
+		return -EINVAL;
+	}
+	if (grp >= cpt->next_mc_idx) {
+		dev_err(dev, "Request group is higher than available functional groups");
+		return -EINVAL;
+	}
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, q));
+	pf_qx_ctl.s.grp = mcode[grp].group;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, q), pf_qx_ctl.u);
+	dev_dbg(dev, "VF %d TYPE %s", q, (mcode[grp].is_ae ? "AE" : "SE"));
+
+	return mcode[grp].is_ae ? AE_TYPES : SE_TYPES;
+}
+
+/* Interrupt handler to handle mailbox messages from VFs */
+static void cpt_handle_mbox_intr(struct cpt_device *cpt, int vf)
+{
+	struct cpt_vf_info *vfx = &cpt->vfinfo[vf];
+	struct cpt_mbox mbx = {};
+	union cpt_chipid_vfid chipid_vfid;
+	uint8_t vftype;
+	struct device *dev = &cpt->pdev->dev;
+	/* Take mbox lock */
+	cpt->mbx_lock[vf] = true;
+	/*
+	 * MBOX[0] contains msg
+	 * MBOX[1] contains data
+	 */
+	mbx.msg  = cpt_read_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 0));
+	mbx.data = cpt_read_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 1));
+	dev_dbg(dev, "%s: Mailbox msg 0x%llx from VF%d", __func__, mbx.msg, vf);
+	switch (mbx.msg) {
+	case CPT_MSG_VF_UP:
+		vfx->state = VF_STATE_UP;
+		try_module_get(THIS_MODULE);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_READY:
+		chipid_vfid.u16 = 0;
+		chipid_vfid.s.chip_id = cpt->chip_id;
+		chipid_vfid.s.vfid = vf;
+		mbx.msg  = CPT_MSG_READY;
+		mbx.data = chipid_vfid.u16;
+		cpt_send_msg_to_vf(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_VF_DOWN:
+		/* First msg in VF teardown sequence */
+		vfx->state = VF_STATE_DOWN;
+		module_put(THIS_MODULE);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_QLEN:
+		vfx->qlen = mbx.data;
+		cpt_cfg_qlen_for_vf(cpt, vf, vfx->qlen);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_QBIND_GRP:
+		vftype = cpt_bind_vq_to_grp(cpt, vf, (uint8_t)mbx.data);
+		if ((vftype != AE_TYPES) && (vftype != SE_TYPES))
+			dev_err(dev, "Queue %d binding to group %llu failed",
+				vf, mbx.data);
+		else {
+			dev_dbg(dev, "Queue %d binding to group %llu successful",
+				vf, mbx.data);
+			mbx.msg = CPT_MSG_QBIND_GRP;
+			mbx.data = vftype;
+			cpt_send_msg_to_vf(cpt, vf, &mbx);
+		}
+		break;
+	case CPT_MSG_VQ_PRIORITY:
+		vfx->priority = mbx.data;
+		cpt_cfg_vq_priority(cpt, vf, vfx->priority);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	default:
+		dev_err(&cpt->pdev->dev, "Invalid msg from VF%d, msg 0x%llx\n",
+			vf, mbx.msg);
+		break;
+	}
+	/* Unlock mailbox */
+	cpt->mbx_lock[vf] = false;
+}
+
+void cpt_mbox_intr_handler (struct cpt_device *cpt, int mbx)
+{
+	uint64_t intr;
+	uint8_t  vf;
+
+	intr = cpt_read_csr64(cpt->reg_base, CPTX_PF_MBOX_INTX(0, 0));
+	dev_dbg(&cpt->pdev->dev, "PF interrupt Mbox%d 0x%llx\n", mbx, intr);
+	for (vf = 0; vf < CPT_MAX_VF_NUM; vf++) {
+		if (intr & (1ULL << vf)) {
+			dev_dbg(&cpt->pdev->dev, "Intr from VF %d\n", vf);
+			cpt_handle_mbox_intr(cpt, vf);
+			cpt_clear_mbox_intr(cpt, vf);
+		}
+	}
+}
-- 
2.1.4

^ permalink raw reply related

* [PATCH 0/3] Add Support for Cavium Cryptographic Accelerarion Unit
From: gcherianv @ 2016-11-18 15:00 UTC (permalink / raw)
  To: linux-kernel, linux-crypto; +Cc: davem, herbert, George Cherian

From: George Cherian <george.cherian@cavium.com>

This series adds the support for Cavium Cryptographic Accelerarion Unit (CPT)
CPT is available in Octeon-Tx SoC series.

George Cherian (3):
  drivers: crypto: Add Support for Octeon-tx CPT Engine
  drivers: crypto: Add the Virtual Function driver for CPT
  drivers: crypto: Enable CPT options crypto for build

 drivers/crypto/Kconfig                       |    1 +
 drivers/crypto/Makefile                      |    1 +
 drivers/crypto/cavium/cpt/Kconfig            |   32 +
 drivers/crypto/cavium/cpt/Makefile           |    4 +
 drivers/crypto/cavium/cpt/cpt.h              |   90 +++
 drivers/crypto/cavium/cpt/cpt_common.h       |  377 ++++++++++
 drivers/crypto/cavium/cpt/cpt_hw_types.h     |  940 +++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cpt_main.c         |  891 ++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cpt_pf_mbox.c      |  174 +++++
 drivers/crypto/cavium/cpt/cptvf.h            |  255 +++++++
 drivers/crypto/cavium/cpt/cptvf_algs.c       |  446 +++++++++++
 drivers/crypto/cavium/cpt/cptvf_algs.h       |  159 ++++
 drivers/crypto/cavium/cpt/cptvf_main.c       | 1038 ++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cptvf_mbox.c       |  208 ++++++
 drivers/crypto/cavium/cpt/cptvf_reqmanager.c |  655 ++++++++++++++++
 drivers/crypto/cavium/cpt/request_manager.h  |  221 ++++++
 16 files changed, 5492 insertions(+)
 create mode 100644 drivers/crypto/cavium/cpt/Kconfig
 create mode 100644 drivers/crypto/cavium/cpt/Makefile
 create mode 100644 drivers/crypto/cavium/cpt/cpt.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_main.c
 create mode 100644 drivers/crypto/cavium/cpt/cpt_pf_mbox.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf.h
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.h
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_main.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_mbox.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_reqmanager.c
 create mode 100644 drivers/crypto/cavium/cpt/request_manager.h

-- 
2.1.4

^ permalink raw reply

* [PATCH 2/3] drivers: crypto: Add the Virtual Function driver for CPT
From: gcherianv @ 2016-11-18 15:00 UTC (permalink / raw)
  To: linux-kernel, linux-crypto; +Cc: davem, herbert, George Cherian
In-Reply-To: <1479481209-11475-1-git-send-email-gcherianv@gmail.com>

From: George Cherian <george.cherian@cavium.com>

Enable the CPT VF driver. CPT is the cryptographic Accelaration Unit
in Octeon-tx series of processors.

Signed-off-by: George Cherian <george.cherian@cavium.com>
---
 drivers/crypto/cavium/cpt/Kconfig            |   10 +
 drivers/crypto/cavium/cpt/Makefile           |    2 +
 drivers/crypto/cavium/cpt/cptvf.h            |  255 +++++++
 drivers/crypto/cavium/cpt/cptvf_algs.c       |  446 +++++++++++
 drivers/crypto/cavium/cpt/cptvf_algs.h       |  159 ++++
 drivers/crypto/cavium/cpt/cptvf_main.c       | 1038 ++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cptvf_mbox.c       |  208 ++++++
 drivers/crypto/cavium/cpt/cptvf_reqmanager.c |  655 ++++++++++++++++
 drivers/crypto/cavium/cpt/request_manager.h  |  221 ++++++
 9 files changed, 2994 insertions(+)
 create mode 100644 drivers/crypto/cavium/cpt/cptvf.h
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.h
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_main.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_mbox.c
 create mode 100644 drivers/crypto/cavium/cpt/cptvf_reqmanager.c
 create mode 100644 drivers/crypto/cavium/cpt/request_manager.h

diff --git a/drivers/crypto/cavium/cpt/Kconfig b/drivers/crypto/cavium/cpt/Kconfig
index 8fe3f44..d8c3f48 100644
--- a/drivers/crypto/cavium/cpt/Kconfig
+++ b/drivers/crypto/cavium/cpt/Kconfig
@@ -20,3 +20,13 @@ config OCTEONTX_CPT_PF
 
 	  To compile this as a module, choose M here: the module will be
 	  called cptpf.
+config OCTEONTX_CPT_VF
+	tristate "Octeon-tx CPT Virtual function driver"
+	depends on ARCH_THUNDER
+	select CRYPTO_DEV_CPT
+	help
+	  Support for Cavium CPT Virtual function found in octeon-tx
+	  series of processors.
+
+	  To compile this as a module, choose M here: the module will be
+	  called cptvf.
diff --git a/drivers/crypto/cavium/cpt/Makefile b/drivers/crypto/cavium/cpt/Makefile
index bf758e2..6f70b15 100644
--- a/drivers/crypto/cavium/cpt/Makefile
+++ b/drivers/crypto/cavium/cpt/Makefile
@@ -1,2 +1,4 @@
 obj-$(CONFIG_OCTEONTX_CPT_PF) += cptpf.o
 cptpf-objs := cpt_main.o cpt_pf_mbox.o
+obj-$(CONFIG_OCTEONTX_CPT_VF) += cptvf.o
+cptvf-objs := cptvf_main.o cptvf_reqmanager.o cptvf_mbox.o cptvf_algs.o
diff --git a/drivers/crypto/cavium/cpt/cptvf.h b/drivers/crypto/cavium/cpt/cptvf.h
new file mode 100644
index 0000000..1fafea8
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf.h
@@ -0,0 +1,255 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPTVF_H
+#define __CPTVF_H
+
+#include <linux/list.h>
+#include "cpt_common.h"
+
+struct command_chunk {
+	uint8_t *head; /* 128-byte aligned real_vaddr */
+	uint8_t *real_vaddr; /* Virtual address after dma_alloc_consistent */
+	dma_addr_t dma_addr; /* 128-byte aligned real_dma_addr */
+	dma_addr_t real_dma_addr; /* DMA address after dma_alloc_consistent */
+	uint32_t size; /* Chunk size, max CPT_INST_CHUNK_MAX_SIZE */
+	struct hlist_node nextchunk;
+};
+
+struct iq_stats {
+	atomic64_t instr_posted;
+	atomic64_t instr_dropped;
+};
+
+/**
+ * comamnd queue structure
+ */
+struct command_queue {
+	spinlock_t lock; /* command queue lock */
+	uint32_t idx; /* Command queue host write idx */
+	uint32_t dbell_count; /* outstanding commands */
+	uint32_t nchunks; /* Number of command chunks */
+	struct command_chunk *qhead;	/* Command queue head, instructions
+					 * are inserted here
+					 */
+	struct hlist_head chead;
+	struct iq_stats stats; /* Queue statistics */
+};
+
+struct command_qinfo {
+	uint32_t dbell_thold; /* Command queue doorbell threshold */
+	uint32_t cmd_size; /* Command size (32/64-Byte) */
+	uint32_t qchunksize; /* Command queue chunk size configured by user */
+	struct command_queue queue[DEFAULT_DEVICE_QUEUES];
+};
+
+/**
+ * pending entry structure
+ */
+struct pending_entry {
+	uint8_t busy; /* Entry status (free/busy) */
+	uint8_t done;
+	uint8_t is_ae;
+
+	volatile uint64_t *completion_addr; /* Completion address */
+	void *post_arg;
+	void (*callback)(int, void *); /* Kernel ASYNC request callabck */
+	void *callback_arg; /* Kernel ASYNC request callabck arg */
+};
+
+/**
+ * pending queue structure
+ */
+struct pending_queue {
+	struct pending_entry *head;	/* head of the queue */
+	uint32_t front; /* Process work from here */
+	uint32_t rear; /* Append new work here */
+	atomic64_t pending_count;
+	spinlock_t lock; /* Queue lock */
+};
+
+struct pending_qinfo {
+	uint32_t nr_queues;	/* Number of queues supported */
+	uint32_t qlen; /* Queue length */
+	struct pending_queue queue[DEFAULT_DEVICE_QUEUES];
+};
+
+#define for_each_pending_queue(qinfo, q, i)	\
+	for (i = 0, q = &qinfo->queue[i]; i < qinfo->nr_queues; i++, \
+	     q = &qinfo->queue[i])
+
+/**
+ * CPT VF device structure
+ */
+struct cpt_vf {
+	uint32_t chip_id; /* CPT Device ID */
+	uint16_t flags; /* Flags to hold device status bits */
+	uint8_t vfid; /* Device Index 0...CPT_MAX_VF_NUM */
+	uint8_t vftype; /* VF type of SE_TYPE(1) or AE_TYPE(1) */
+	uint8_t vfgrp; /* VF group (0 - 8) */
+	uint8_t node; /* Operating node: Bits (46:44) in BAR0 address */
+	uint8_t  priority; /* VF priority ring: 1-High proirity round
+			    * robin ring;0-Low priority round robin ring;
+			    */
+	uint8_t  reqmode; /* Request processing mode POLL/ASYNC */
+	struct pci_dev *pdev; /* pci device handle */
+	void *sysdev; /* sysfs device */
+	void *proc; /* proc dir */
+	void __iomem *reg_base; /* Register start address */
+	void *wqe_info;	/* BH worker threads */
+	void *context;	/* Context Specific Information*/
+	void *nqueue_info; /* Queue Specific Information*/
+	/* MSI-X */
+	bool msix_enabled;
+	uint8_t	num_vec;
+	struct msix_entry msix_entries[CPT_VF_MSIX_VECTORS];
+	bool irq_allocated[CPT_VF_MSIX_VECTORS];
+	cpumask_var_t affinity_mask[CPT_VF_MSIX_VECTORS];
+	uint64_t intcnt;
+	/* Command and Pending queues */
+	uint32_t qlen;
+	uint32_t qsize; /* Calculated queue size */
+	uint32_t nr_queues;
+	uint32_t max_queues;
+	struct command_qinfo cqinfo; /* Command queue information */
+	struct pending_qinfo pqinfo; /* Pending queue information */
+	/* VF-PF mailbox communication */
+	bool pf_acked;
+	bool pf_nacked;
+} ____cacheline_aligned_in_smp;
+
+#define CPT_NODE_ID_SHIFT (44u)
+#define CPT_NODE_ID_MASK (3u)
+
+#define MAX_CPT_AE_CORES 6
+#define MAX_CPT_SE_CORES 10
+
+enum req_mode {
+	BLOCKING,
+	NON_BLOCKING,
+	SPEED,
+	KERN_POLL,
+};
+
+enum dma_mode {
+	DMA_DIRECT_DIRECT, /* Input DIRECT, Output DIRECT */
+	DMA_GATHER_SCATTER
+};
+
+enum inputype {
+	FROM_CTX = 0,
+	FROM_DPTR = 1
+};
+
+enum CspErrorCodes {
+	/*Microcode errors*/
+	NO_ERR = 0x00,
+	ERR_OPCODE_UNSUPPORTED = 0x01,
+
+	/*SCATTER GATHER*/
+	ERR_SCATTER_GATHER_WRITE_LENGTH = 0x02,
+	ERR_SCATTER_GATHER_LIST = 0x03,
+	ERR_SCATTER_GATHER_NOT_SUPPORTED = 0x04,
+
+	/*AE*/
+	ERR_LENGTH_INVALID = 0x05,
+	ERR_MOD_LEN_INVALID = 0x06,
+	ERR_EXP_LEN_INVALID = 0x07,
+	ERR_DATA_LEN_INVALID = 0x08,
+	ERR_MOD_LEN_ODD = 0x09,
+	ERR_PKCS_DECRYPT_INCORRECT = 0x0a,
+	ERR_ECC_PAI = 0xb,
+	ERR_ECC_CURVE_UNSUPPORTED = 0xc,
+	ERR_ECC_SIGN_R_INVALID = 0xd,
+	ERR_ECC_SIGN_S_INVALID = 0xe,
+	ERR_ECC_SIGNATURE_MISMATCH = 0xf,
+
+	/*SE GC*/
+	ERR_GC_LENGTH_INVALID = 0x41,
+	ERR_GC_RANDOM_LEN_INVALID = 0x42,
+	ERR_GC_DATA_LEN_INVALID = 0x43,
+	ERR_GC_DRBG_TYPE_INVALID = 0x44,
+	ERR_GC_CTX_LEN_INVALID = 0x45,
+	ERR_GC_CIPHER_UNSUPPORTED = 0x46,
+	ERR_GC_AUTH_UNSUPPORTED = 0x47,
+	ERR_GC_OFFSET_INVALID = 0x48,
+	ERR_GC_HASH_MODE_UNSUPPORTED = 0x49,
+	ERR_GC_DRBG_ENTROPY_LEN_INVALID = 0x4a,
+	ERR_GC_DRBG_ADDNL_LEN_INVALID = 0x4b,
+	ERR_GC_ICV_MISCOMPARE = 0x4c,
+	ERR_GC_DATA_UNALIGNED = 0x4d,
+
+	/*SE IPSEC*/
+	ERR_IPSEC_AUTH_UNSUPPORTED = 0xB0,
+	ERR_IPSEC_ENCRYPT_UNSUPPORTED = 0xB1,
+	ERR_IPSEC_IP_VERSION = 0xB2,
+	ERR_IPSEC_PROTOCOL = 0xB3,
+	ERR_IPSEC_CONTEXT_INVALID = 0xB4,
+	ERR_IPSEC_CONTEXT_DIRECTION_MISMATCH = 0xB5,
+	ERR_IPSEC_IP_PAYLOAD_TYPE = 0xB6,
+	ERR_IPSEC_CONTEXT_FLAG_MISMATCH = 0xB7,
+	ERR_IPSEC_GRE_HEADER_MISMATCH = 0xB8,
+	ERR_IPSEC_GRE_PROTOCOL = 0xB9,
+	ERR_IPSEC_CUSTOM_HDR_LEN = 0xBA,
+	ERR_IPSEC_ESP_NEXT_HEADER = 0xBB,
+	ERR_IPSEC_IPCOMP_CONFIGURATION = 0xBC,
+	ERR_IPSEC_FRAG_SIZE_CONFIGURATION = 0xBD,
+	ERR_IPSEC_SPI_MISMATCH = 0xBE,
+	ERR_IPSEC_CHECKSUM = 0xBF,
+	ERR_IPSEC_IPCOMP_PACKET_DETECTED = 0xC0,
+	ERR_IPSEC_TFC_PADDING_WITH_PREFRAG = 0xC1,
+	ERR_IPSEC_DSIV_INCORRECT_PARAM = 0xC2,
+	ERR_IPSEC_AUTHENTICATION_MISMATCH = 0xC3,
+	ERR_IPSEC_PADDING = 0xC4,
+	ERR_IPSEC_DUMMY_PAYLOAD = 0xC5,
+	ERR_IPSEC_IPV6_EXTENSION_HEADERS_TOO_BIG = 0xC6,
+	ERR_IPSEC_IPV6_HOP_BY_HOP = 0xC7,
+	ERR_IPSEC_IPV6_RH_LENGTH = 0xC8,
+	ERR_IPSEC_IPV6_OUTBOUND_RH_COPY_ADDR = 0xC9,
+	ERR_IPSEC_IPV6_DECRYPT_RH_SEGS_LEFT = 0xCA,
+	ERR_IPSEC_IPV6_HEADER_INVALID = 0xCB,
+	ERR_IPSEC_SELECTOR_MATCH = 0xCC,
+
+	/*SE SSL*/
+	ERR_SSL_POM_LEN_INVALID = 0x81,
+	ERR_SSL_RECORD_LEN_INVALID = 0x82,
+	ERR_SSL_CTX_LEN_INVALID = 0x83,
+	ERR_SSL_CIPHER_UNSUPPORTED = 0x84,
+	ERR_SSL_MAC_UNSUPPORTED = 0x85,
+	ERR_SSL_VERSION_UNSUPPORTED = 0x86,
+	ERR_SSL_VERIFY_AUTH_UNSUPPORTED = 0x87,
+	ERR_SSL_MS_LEN_INVALID = 0x88,
+	ERR_SSL_MAC_MISMATCH = 0x89,
+
+	/* API Layer */
+	ERR_REQ_TIMEOUT      = (0x40000000 | 0x103),    /* 0x40000103 */
+	ERR_REQ_PENDING      = (0x40000000 | 0x110),    /* 0x40000110 */
+	ERR_BAD_INPUT_LENGTH = (0x40000000 | 384),    /* 0x40000180 */
+	ERR_BAD_KEY_LENGTH,
+	ERR_BAD_KEY_HANDLE,
+	ERR_BAD_CONTEXT_HANDLE,
+	ERR_BAD_SCALAR_LENGTH,
+	ERR_BAD_DIGEST_LENGTH,
+	ERR_BAD_INPUT_ARG,
+	ERR_BAD_SSL_MSG_TYPE,
+	ERR_BAD_RECORD_PADDING,
+	ERR_NB_REQUEST_PENDING,
+};
+
+int cptvf_send_vf_up(struct cpt_vf *cptvf);
+int cptvf_send_vf_down(struct cpt_vf *cptvf);
+int cptvf_send_vf_to_grp_msg(struct cpt_vf *cptvf);
+int cptvf_send_vf_priority_msg(struct cpt_vf *cptvf);
+int cptvf_send_vq_size_msg(struct cpt_vf *cptvf);
+int cptvf_check_pf_ready(struct cpt_vf *cptvf);
+void cptvf_handle_mbox_intr(struct cpt_vf *cptvf);
+void cvm_crypto_exit(void);
+int cvm_crypto_init(struct cpt_vf *cptvf);
+void vq_post_process(struct cpt_vf *cptvf, uint32_t qno);
+void cptvf_write_vq_doorbell(struct cpt_vf *cptvf, uint32_t val);
+#endif /* __CPTVF_H */
diff --git a/drivers/crypto/cavium/cpt/cptvf_algs.c b/drivers/crypto/cavium/cpt/cptvf_algs.c
new file mode 100644
index 0000000..4705e90
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf_algs.c
@@ -0,0 +1,446 @@
+
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/crypto.h>
+#include <crypto/algapi.h>
+#include <crypto/cryptd.h>
+#include <crypto/crypto_wq.h>
+#include <linux/list.h>
+#include <linux/scatterlist.h>
+#include <linux/err.h>
+#include <crypto/aes.h>
+#include <crypto/internal/aead.h>
+#include <crypto/aead.h>
+#include <crypto/authenc.h>
+#include <crypto/aes.h>
+#include <crypto/des.h>
+#include "request_manager.h"
+#include "cptvf.h"
+#include "cptvf_algs.h"
+
+struct cpt_device_handle {
+	void *cdev[MAX_DEVICES];
+	uint32_t dev_count;
+};
+
+static struct cpt_device_handle dev_handle;
+
+static void cvm_callback(uint32_t status, void *arg)
+{
+	struct crypto_async_request *req = (struct crypto_async_request *)arg;
+
+	req->complete(req, !status);
+}
+
+static inline void update_input_iv(struct cpt_request_info *req_info,
+				   uint8_t *iv, uint32_t enc_iv_len,
+				   uint32_t *argcnt)
+{
+	/* Setting the iv information */
+	req_info->in[*argcnt].ptr.addr = (void *)iv;
+	req_info->in[*argcnt].size = enc_iv_len;
+	req_info->in[*argcnt].offset = enc_iv_len;
+	req_info->in[*argcnt].type = UNIT_8_BIT;
+	req_info->req.dlen += enc_iv_len;
+
+	++(*argcnt);
+}
+
+static inline void update_output_iv(struct cpt_request_info *req_info,
+				    uint8_t *iv, uint32_t enc_iv_len,
+				    uint32_t *argcnt)
+{
+	/* Setting the iv information */
+	req_info->out[*argcnt].ptr.addr = (void *)iv;
+	req_info->out[*argcnt].size = enc_iv_len;
+	req_info->out[*argcnt].offset = enc_iv_len;
+	req_info->out[*argcnt].type = UNIT_8_BIT;
+
+	req_info->rlen += enc_iv_len;
+
+	++(*argcnt);
+}
+
+static inline void update_input_data(struct cpt_request_info *req_info,
+				     struct scatterlist *inp_sg,
+				     uint32_t nbytes, uint32_t *argcnt)
+{
+	req_info->req.dlen += nbytes;
+
+	while (nbytes) {
+		uint32_t len = min(nbytes, inp_sg->length);
+		uint8_t *ptr = page_address(sg_page(inp_sg)) + inp_sg->offset;
+
+		req_info->in[*argcnt].ptr.addr = (void *)ptr;
+		req_info->in[*argcnt].size = len;
+		req_info->in[*argcnt].offset = len;
+		req_info->in[*argcnt].type = UNIT_8_BIT;
+		nbytes -= len;
+
+		++(*argcnt);
+		++inp_sg;
+	}
+}
+
+static inline void update_output_data(struct cpt_request_info *req_info,
+				      struct scatterlist *outp_sg,
+				      uint32_t nbytes, uint32_t *argcnt)
+{
+	req_info->rlen += nbytes;
+
+	while (nbytes) {
+		uint32_t len = min(nbytes, outp_sg->length);
+		uint8_t *ptr = page_address(sg_page(outp_sg)) +
+					    outp_sg->offset;
+
+		req_info->out[*argcnt].ptr.addr = (void *)ptr;
+		req_info->out[*argcnt].size = len;
+		req_info->out[*argcnt].offset = len;
+		req_info->out[*argcnt].type = UNIT_8_BIT;
+		nbytes -= len;
+		++(*argcnt);
+		++outp_sg;
+	}
+}
+
+static inline uint32_t create_ctx_hdr(struct ablkcipher_request *req,
+				      uint32_t enc, uint32_t cipher_type,
+				      uint32_t aes_key_type, uint32_t *argcnt)
+{
+	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
+	struct cvm_enc_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+	struct cvm_req_ctx *rctx = ablkcipher_request_ctx(req);
+	struct fc_context *fctx = &rctx->fctx;
+	uint64_t *offset_control = &rctx->control_word;
+	uint32_t enc_iv_len = crypto_ablkcipher_ivsize(tfm);
+	struct cpt_request_info *req_info = &rctx->cpt_req;
+	uint64_t *ctrl_flags = NULL;
+	uint8_t iv_inp = FROM_DPTR;
+	uint8_t dma_mode = DMA_GATHER_SCATTER;
+
+	req_info->ctrl.s.grp = 0;
+	req_info->ctrl.s.dma_mode = dma_mode;
+	req_info->ctrl.s.req_mode = NON_BLOCKING;
+	req_info->ctrl.s.se_req = SE_CORE_REQ;
+
+	req_info->ctxl = sizeof(struct fc_context);
+	req_info->handle = 0;
+
+	req_info->req.opcode.s.major = MAJOR_OP_FC | DMA_MODE_FLAG(dma_mode);
+	if (enc)
+		req_info->req.opcode.s.minor = 2;
+	else
+		req_info->req.opcode.s.minor = 3;
+
+	req_info->req.param1 = req->nbytes; /* Encryption Data length */
+	req_info->req.param2 = 0; /*Auth data length */
+
+	fctx->enc.enc_ctrl.e.enc_cipher = cipher_type;
+	fctx->enc.enc_ctrl.e.aes_key = aes_key_type;
+	fctx->enc.enc_ctrl.e.iv_source = iv_inp;
+
+	memcpy(fctx->enc.encr_key, ctx->enc_key, ctx->key_len);
+	ctrl_flags = (uint64_t *)&fctx->enc.enc_ctrl.flags;
+	*ctrl_flags = cpu_to_be64(*ctrl_flags);
+
+	*offset_control = cpu_to_be64(((uint64_t)(enc_iv_len) << 16));
+	/* Storing  Packet Data Information in offset
+	 * Control Word First 8 bytes
+	 */
+	req_info->in[*argcnt].ptr.addr = (uint8_t *)offset_control;
+	req_info->in[*argcnt].size = CONTROL_WORD_LEN;
+	req_info->in[*argcnt].offset = CONTROL_WORD_LEN;
+	req_info->in[*argcnt].type = UNIT_8_BIT;
+	req_info->req.dlen += CONTROL_WORD_LEN;
+
+	++(*argcnt);
+
+	req_info->in[*argcnt].ptr.addr = (uint8_t *)fctx;
+	req_info->in[*argcnt].size = sizeof(struct fc_context);
+	req_info->in[*argcnt].offset = sizeof(struct fc_context);
+	req_info->in[*argcnt].type = UNIT_8_BIT;
+	req_info->req.dlen += sizeof(struct fc_context);
+
+	++(*argcnt);
+
+	return 0;
+}
+
+static inline uint32_t create_input_list(struct ablkcipher_request  *req,
+					 uint32_t enc, uint32_t cipher_type,
+					 uint32_t aes_key_type,
+					 uint32_t enc_iv_len)
+{
+	struct cvm_req_ctx *rctx = ablkcipher_request_ctx(req);
+	struct cpt_request_info *req_info = &rctx->cpt_req;
+	uint32_t argcnt =  0;
+
+	create_ctx_hdr(req, enc, cipher_type, aes_key_type, &argcnt);
+	update_input_iv(req_info, req->info, enc_iv_len, &argcnt);
+	update_input_data(req_info, req->src, req->nbytes, &argcnt);
+	req_info->incnt = argcnt;
+
+	return 0;
+}
+
+static inline void store_cb_info(struct ablkcipher_request *req,
+				 struct cpt_request_info *req_info)
+{
+	req_info->callback = (void *)cvm_callback;
+	req_info->callback_arg = (void *)&req->base;
+}
+
+static inline void create_output_list(struct ablkcipher_request *req,
+				      uint32_t cipher_type,
+				      uint32_t enc_iv_len)
+{
+	struct cvm_req_ctx *rctx = ablkcipher_request_ctx(req);
+	struct cpt_request_info *req_info = &rctx->cpt_req;
+	uint32_t argcnt = 0;
+
+	/* OUTPUT Buffer Processing
+	 * AES encryption/decryption output would be
+	 * received in the following format
+	 *
+	 * ------IV--------|------ENCRYPTED/DECRYPTED DATA-----|
+	 * [ 16 Bytes/     [   Request Enc/Dec/ DATA Len AES CBC ]
+	 */
+	/* Reading IV information */
+	update_output_iv(req_info, req->info, enc_iv_len, &argcnt);
+	update_output_data(req_info, req->dst, req->nbytes, &argcnt);
+	req_info->outcnt = argcnt;
+}
+
+static inline uint32_t cvm_enc_dec(struct ablkcipher_request *req,
+				   uint32_t enc, uint32_t cipher_type)
+{
+	struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
+	struct cvm_enc_ctx *ctx = crypto_ablkcipher_ctx(tfm);
+	uint32_t key_type = AES_128_BIT;
+	struct cvm_req_ctx *rctx = ablkcipher_request_ctx(req);
+	uint32_t enc_iv_len = crypto_ablkcipher_ivsize(tfm);
+	struct fc_context *fctx = &rctx->fctx;
+	struct cpt_request_info *req_info = &rctx->cpt_req;
+	void *cdev = NULL;
+	uint32_t status = -1;
+
+	switch (ctx->key_len) {
+	case BYTE_16:
+		key_type = AES_128_BIT;
+		break;
+	case BYTE_24:
+		key_type = AES_192_BIT;
+		break;
+	case BYTE_32:
+		key_type = AES_256_BIT;
+		break;
+	default:
+		return ERR_GC_CIPHER_UNSUPPORTED;
+	}
+
+	if (cipher_type == DES3_CBC)
+		key_type = 0;
+
+	memset(req_info, 0, sizeof(struct cpt_request_info));
+	memset(fctx, 0, sizeof(struct fc_context));
+	create_input_list(req, enc, cipher_type, key_type, enc_iv_len);
+	create_output_list(req, cipher_type, enc_iv_len);
+	store_cb_info(req, req_info);
+	cdev = dev_handle.cdev[smp_processor_id()];
+	status = cptvf_do_request(cdev, req_info);
+	/* We perform an asynchronous send and once
+	 * the request is completed the driver would
+	 * intimate through  registered call back functions
+	 */
+
+	if (status)
+		return status;
+	else
+		return -EINPROGRESS;
+}
+
+int cvm_des3_encrypt_cbc(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, true, DES3_CBC);
+}
+
+int cvm_des3_decrypt_cbc(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, false, DES3_CBC);
+}
+
+int cvm_aes_encrypt_xts(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, true, AES_XTS);
+}
+
+int cvm_aes_decrypt_xts(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, false, AES_XTS);
+}
+
+int cvm_aes_encrypt_cbc(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, true, AES_CBC);
+}
+
+int cvm_aes_decrypt_cbc(struct ablkcipher_request *req)
+{
+	return cvm_enc_dec(req, false, AES_CBC);
+}
+
+int cvm_enc_dec_setkey(struct crypto_ablkcipher *cipher, const uint8_t *key,
+		       uint32_t keylen)
+{
+	struct crypto_tfm *tfm = crypto_ablkcipher_tfm(cipher);
+	struct cvm_enc_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	if ((keylen == BYTE_16) || (keylen == BYTE_24) ||
+	    (keylen == BYTE_32)) {
+		ctx->key_len = keylen;
+		memcpy(ctx->enc_key, key, keylen);
+		return 0;
+	}
+	crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_BAD_KEY_LEN);
+
+	return -EINVAL;
+}
+
+int cvm_enc_dec_init(struct crypto_tfm *tfm)
+{
+	struct cvm_enc_ctx *ctx = crypto_tfm_ctx(tfm);
+
+	memset(ctx, 0, sizeof(*ctx));
+	tfm->crt_ablkcipher.reqsize = sizeof(struct cvm_req_ctx) +
+					sizeof(struct ablkcipher_request);
+	/* Additional memory for ablkcipher_request is
+	 * allocated since the cryptd daemon uses
+	 * this memory for request_ctx information
+	 */
+
+	return 0;
+}
+
+void cvm_enc_dec_exit(struct crypto_tfm *tfm)
+{
+	return;
+}
+
+struct crypto_alg algs[] = { {
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+	.cra_blocksize = AES_BLOCK_SIZE,
+	.cra_ctxsize = sizeof(struct cvm_enc_ctx),
+	.cra_alignmask = 7,
+	.cra_priority = CAV_PRIORITY,
+	.cra_name = "xts(aes)",
+	.cra_driver_name = "cavium-xts-aes",
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_u = {
+		.ablkcipher = {
+			.ivsize = AES_BLOCK_SIZE,
+			.min_keysize = AES_MIN_KEY_SIZE,
+			.max_keysize = AES_MAX_KEY_SIZE,
+			.setkey = cvm_enc_dec_setkey,
+			.encrypt = cvm_aes_encrypt_xts,
+			.decrypt = cvm_aes_decrypt_xts,
+		},
+	},
+	.cra_init = cvm_enc_dec_init,
+	.cra_exit = cvm_enc_dec_exit,
+	.cra_module = THIS_MODULE,
+}, {
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+	.cra_blocksize = AES_BLOCK_SIZE,
+	.cra_ctxsize = sizeof(struct cvm_enc_ctx),
+	.cra_alignmask = 7,
+	.cra_priority = CAV_PRIORITY,
+	.cra_name = "cbc(aes)",
+	.cra_driver_name = "cavium-cbc-aes",
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_u = {
+		.ablkcipher = {
+			.ivsize = AES_BLOCK_SIZE,
+			.min_keysize = AES_MIN_KEY_SIZE,
+			.max_keysize = AES_MAX_KEY_SIZE,
+			.setkey = cvm_enc_dec_setkey,
+			.encrypt = cvm_aes_encrypt_cbc,
+			.decrypt = cvm_aes_decrypt_cbc,
+		},
+	},
+	.cra_init = cvm_enc_dec_init,
+	.cra_exit = cvm_enc_dec_exit,
+	.cra_module = THIS_MODULE,
+}, {
+	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
+	.cra_blocksize = DES3_EDE_BLOCK_SIZE,
+	.cra_ctxsize = sizeof(struct cvm_des3_ctx),
+	.cra_alignmask = 7,
+	.cra_priority = CAV_PRIORITY,
+	.cra_name = "cbc(des3_ede)",
+	.cra_driver_name = "cavium-cbc-des3_ede",
+	.cra_type = &crypto_ablkcipher_type,
+	.cra_u = {
+		.ablkcipher = {
+			.min_keysize = DES3_EDE_KEY_SIZE,
+			.max_keysize = DES3_EDE_KEY_SIZE,
+			.ivsize = DES_BLOCK_SIZE,
+			.setkey = cvm_enc_dec_setkey,
+			.encrypt = cvm_des3_encrypt_cbc,
+			.decrypt = cvm_des3_decrypt_cbc,
+		},
+	},
+	.cra_init = cvm_enc_dec_init,
+	.cra_exit = cvm_enc_dec_exit,
+	.cra_module = THIS_MODULE,
+} };
+
+static inline int cav_register_algs(void)
+{
+	int err = 0;
+
+	err = crypto_register_algs(algs, ARRAY_SIZE(algs));
+	if (err) {
+		pr_err("Error in aes module init %d\n", err);
+		return -1;
+	}
+
+	return 0;
+}
+
+static inline void cav_unregister_algs(void)
+{
+	crypto_unregister_algs(algs, ARRAY_SIZE(algs));
+}
+
+int cvm_crypto_init(struct cpt_vf *cptvf)
+{
+	uint32_t dev_count;
+
+	dev_count = dev_handle.dev_count;
+	dev_handle.cdev[dev_count] = cptvf;
+	dev_handle.dev_count++;
+
+	if (!dev_count) {
+		if (cav_register_algs()) {
+			pr_err("Error in registering crypto algorithms\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+void cvm_crypto_exit(void)
+{
+	uint32_t dev_count;
+
+	dev_count = --dev_handle.dev_count;
+	if (!dev_count)
+		cav_unregister_algs();
+}
diff --git a/drivers/crypto/cavium/cpt/cptvf_algs.h b/drivers/crypto/cavium/cpt/cptvf_algs.h
new file mode 100644
index 0000000..2e45797
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf_algs.h
@@ -0,0 +1,159 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef _CAVIUM_SYM_CRYPTO_H_
+#define _CAVIUM_SYM_CRYPTO_H_
+
+#define MAX_DEVICES 16
+/* AE opcodes*/
+#define MAJOR_OP_MISC         0x01
+#define MAJOR_OP_RANDOM       0x02
+#define MAJOR_OP_MODEXP       0x03
+#define MAJOR_OP_ECDSA        0x04
+#define MAJOR_OP_ECC          0x05
+#define MAJOR_OP_GENRSAPRIME  0x06
+#define MAJOR_OP_AE_RANDOM    0x32
+#define MAJOR_OP_AE_PASSTHRU  0x01
+#define MINOR_OP_AE_PASSTHRU  0x07
+
+/*SE opcodes*/
+#define MAJOR_OP_SE_MISC    0x31
+#define MAJOR_OP_SE_RANDOM  0x32
+#define MAJOR_OP_FC         0x33
+#define MAJOR_OP_HASH       0x34
+#define MAJOR_OP_HMAC       0x35
+#define MAJOR_OP_DSIV       0x36
+
+#define MAJOR_OP_SSL_FULL    0x10
+#define MAJOR_OP_SSL_VERIFY  0x11
+#define MAJOR_OP_SSL_RESUME  0x12
+#define MAJOR_OP_SSL_FINISH  0x13
+#define MAJOR_OP_SSL_ENCREC  0x14
+#define MAJOR_OP_SSL_DECREC  0x15
+
+#define MAJOR_OP_WRITESA_OUTBOUND 0x20
+#define MAJOR_OP_WRITESA_INBOUND  0x21
+#define MAJOR_OP_OUTBOUND         0x23
+#define MAJOR_OP_INBOUND          0x24
+
+#define MAJOR_OP_SE_PASSTHRU  0x01
+#define MINOR_OP_SE_PASSTHRU  0x07
+
+#define  CAV_PRIORITY 1000
+#define  MAX_ENC_KEY_SIZE 32
+#define  MAX_HASH_KEY_SIZE 64
+#define  MAX_KEY_SIZE (MAX_ENC_KEY_SIZE + MAX_HASH_KEY_SIZE)
+#define  CONTROL_WORD_LEN 8
+
+#define IV_OFFSET 8   /* Include SPI | SNO 8 Bytes */
+#define AES_CBC_ALG_NAME "cbc(aes)"
+#define AES_XTS_ALG_NAME "xts(aes)"
+#define DES3_ALG_NAME "cbc(des3_ede)"
+
+#define  BYTE_16 16
+#define  BYTE_24 24
+#define  BYTE_32 32
+
+#define DMA_MODE_FLAG(dma_mode) \
+	((dma_mode == DMA_GATHER_SCATTER) ? (1 << 7) : 0)
+
+enum req_type {
+	AE_CORE_REQ,
+	SE_CORE_REQ,
+};
+
+enum cipher_type {
+	DES3_CBC = 0x1,
+	DES3_ECB = 0x2,
+	AES_CBC = 0x3,
+	AES_ECB = 0x4,
+	AES_CFB = 0x5,
+	AES_CTR = 0x6,
+	AES_GCM = 0x7,
+	AES_XTS = 0x8
+};
+
+enum aes_type {
+	AES_128_BIT = 0x1,
+	AES_192_BIT = 0x2,
+	AES_256_BIT = 0x3
+};
+
+/*Context length in words*/
+#define  FC_CTX_LENGTH       23
+#define  ENC_CTX_LENGTH       7
+#define  HASH_CTX_LENGTH     34
+#define  HMAC_CTX_LENGTH     34
+
+union encr_ctrl {
+	uint64_t flags;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		uint64_t enc_cipher:4;
+		uint64_t reserved1:1;
+		uint64_t aes_key:2;
+		uint64_t iv_source:1;
+		uint64_t hash_type:4;
+		uint64_t reserved2:3;
+		uint64_t auth_input_type:1;
+		uint64_t mac_len:8;
+		uint64_t reserved3:8;
+		uint64_t encr_offset:16;
+		uint64_t iv_offset:8;
+		uint64_t auth_offset:8;
+#else
+		uint64_t auth_offset:8;
+		uint64_t iv_offset:8;
+		uint64_t encr_offset:16;
+		uint64_t reserved3:8;
+		uint64_t mac_len:8;
+		uint64_t auth_input_type:1;
+		uint64_t reserved2:3;
+		uint64_t hash_type:4;
+		uint64_t iv_source:1;
+		uint64_t aes_key:2;
+		uint64_t reserved1:1;
+		uint64_t enc_cipher:4;
+#endif
+	} e;
+};
+
+struct enc_context {
+	union encr_ctrl enc_ctrl;
+	uint8_t  encr_key[32];
+	uint8_t  encr_iv[16];
+};
+
+struct fchmac_context {
+	uint8_t  ipad[64];
+	uint8_t  opad[64]; /* or OPAD */
+};
+
+struct fc_context {
+	struct enc_context enc;
+	struct fchmac_context hmac;
+};
+
+struct cvm_enc_ctx {
+	uint32_t key_len;
+	uint8_t enc_key[MAX_KEY_SIZE];
+};
+
+struct cvm_des3_ctx {
+	uint32_t key_len;
+	uint8_t des3_key[MAX_KEY_SIZE];
+};
+
+struct cvm_req_ctx {
+	struct cpt_request_info cpt_req;
+	uint64_t control_word;
+	struct fc_context fctx;
+};
+
+uint32_t cptvf_do_request(void *cptvf, struct cpt_request_info *);
+#endif /*_CAVIUM_SYM_CRYPTO_H_*/
diff --git a/drivers/crypto/cavium/cpt/cptvf_main.c b/drivers/crypto/cavium/cpt/cptvf_main.c
new file mode 100644
index 0000000..57b796f
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf_main.c
@@ -0,0 +1,1038 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/version.h>
+#include <linux/aer.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/printk.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/pci.h>
+#include <linux/cpumask.h>
+
+#include "cptvf.h"
+
+#define DRV_NAME	"thunder-cptvf"
+#define DRV_VERSION	"1.0"
+
+static uint32_t qlen = DEFAULT_CMD_QLEN;
+module_param(qlen, uint, 0644);
+MODULE_PARM_DESC(qlen, "Command queue length");
+
+static uint32_t chunksize = DEFAULT_CMD_QCHUNK_SIZE;
+module_param(chunksize, uint, 0644);
+MODULE_PARM_DESC(chunksize, "Command queue chunk size");
+
+static uint32_t group = 1; /* Default to SE group */
+module_param(group, uint, 0644);
+MODULE_PARM_DESC(group, "VF group (Value between 0 - 7)");
+
+static uint32_t priority;
+module_param(priority, uint, 0644);
+MODULE_PARM_DESC(priority, "VF/VQ Priority (0-1)");
+
+struct cptvf_wqe {
+	struct tasklet_struct twork;
+	void *cptvf;
+	uint32_t qno;
+};
+
+struct cptvf_wqe_info {
+	struct cptvf_wqe vq_wqe[DEFAULT_DEVICE_QUEUES];
+};
+
+static void vq_work_handler(unsigned long data)
+{
+	struct cptvf_wqe_info *cwqe_info = (struct cptvf_wqe_info *)data;
+	struct cptvf_wqe *cwqe = &cwqe_info->vq_wqe[0];
+
+	vq_post_process(cwqe->cptvf, cwqe->qno);
+}
+
+static int init_worker_threads(struct cpt_vf *cptvf)
+{
+	struct pci_dev *pdev = cptvf->pdev;
+	struct cptvf_wqe_info *cwqe_info;
+	int i;
+
+	cwqe_info = kzalloc(sizeof(*cwqe_info), GFP_KERNEL);
+	if (!cwqe_info)
+		return -ENOMEM;
+
+	if (cptvf->nr_queues) {
+		dev_info(&pdev->dev, "Creating VQ worker threads (%d)\n",
+			 cptvf->nr_queues);
+	}
+
+	for (i = 0; i < cptvf->nr_queues; i++) {
+		tasklet_init(&cwqe_info->vq_wqe[i].twork, vq_work_handler,
+			     (uint64_t)cwqe_info);
+		cwqe_info->vq_wqe[i].qno = i;
+		cwqe_info->vq_wqe[i].cptvf = cptvf;
+	}
+
+	cptvf->wqe_info = cwqe_info;
+
+	return 0;
+}
+
+static void cleanup_worker_threads(struct cpt_vf *cptvf)
+{
+	struct cptvf_wqe_info *cwqe_info;
+	struct pci_dev *pdev = cptvf->pdev;
+	int i;
+
+	cwqe_info = (struct cptvf_wqe_info *)cptvf->wqe_info;
+	if (!cwqe_info)
+		return;
+
+	if (cptvf->nr_queues) {
+		dev_info(&pdev->dev, "Cleaning VQ worker threads (%u)\n",
+			 cptvf->nr_queues);
+	}
+
+	for (i = 0; i < cptvf->nr_queues; i++)
+		tasklet_kill(&cwqe_info->vq_wqe[i].twork);
+
+	kzfree(cwqe_info);
+	cptvf->wqe_info = NULL;
+}
+
+static void free_pending_queues(struct pending_qinfo *pqinfo)
+{
+	int32_t i;
+	struct pending_queue *queue;
+
+	for_each_pending_queue(pqinfo, queue, i) {
+		if (!queue->head)
+			continue;
+
+		/* free single queue */
+		kzfree((queue->head));
+
+		queue->front = 0;
+		queue->rear = 0;
+
+		return;
+	}
+
+	pqinfo->qlen = 0;
+	pqinfo->nr_queues = 0;
+}
+
+static int32_t alloc_pending_queues(struct pending_qinfo *pqinfo,
+				    uint32_t qlen, uint32_t nr_queues)
+{
+	uint32_t i;
+	size_t size;
+	int32_t ret;
+	struct pending_queue *queue = NULL;
+
+	pqinfo->nr_queues = nr_queues;
+	pqinfo->qlen = qlen;
+
+	size = (qlen * sizeof(struct pending_entry));
+
+	for_each_pending_queue(pqinfo, queue, i) {
+		queue->head = kzalloc((size), GFP_KERNEL);
+		if (!queue->head) {
+			pr_err("pending Q (%d) allocation failed\n", i);
+			ret = -ENOMEM;
+			goto pending_qfail;
+		}
+
+		queue->front = 0;
+		queue->rear = 0;
+		atomic64_set((&queue->pending_count), (0));
+
+		/* init queue spin lock */
+		spin_lock_init(&queue->lock);
+	}
+
+	return 0;
+
+pending_qfail:
+	free_pending_queues(pqinfo);
+
+	return ret;
+}
+
+static int32_t init_pending_queues(struct cpt_vf *cptvf, uint32_t qlen,
+				   uint32_t nr_queues)
+{
+	int32_t ret;
+
+	if (!nr_queues)
+		return 0;
+
+	ret = alloc_pending_queues(&cptvf->pqinfo, qlen, nr_queues);
+	if (ret) {
+		pr_err("failed to setup pending queues (%u)\n", nr_queues);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void cleanup_pending_queues(struct cpt_vf *cptvf)
+{
+	struct pci_dev *pdev = cptvf->pdev;
+
+	if (!cptvf->nr_queues)
+		return;
+
+	dev_info(&pdev->dev, "Cleaning VQ pending queue (%u)\n",
+		 cptvf->nr_queues);
+	free_pending_queues(&cptvf->pqinfo);
+}
+
+static void free_command_queues(struct cpt_vf *cptvf,
+				struct command_qinfo *cqinfo)
+{
+	int i, j;
+	struct command_queue *queue = NULL;
+	struct command_chunk *chunk = NULL, *next = NULL;
+	struct pci_dev *pdev = cptvf->pdev;
+	struct hlist_node *node;
+
+	/* clean up for each queue */
+	for (i = 0; i < cptvf->nr_queues; i++) {
+		queue = &cqinfo->queue[i];
+		if (hlist_empty(&cqinfo->queue[i].chead))
+			continue;
+
+		hlist_for_each(node, &cqinfo->queue[i].chead) {
+			chunk = hlist_entry(node, struct command_chunk,
+					    nextchunk);
+			break;
+		}
+
+		for (j = 0; j < queue->nchunks; j++) {
+			if (j < queue->nchunks) {
+				node = node->next;
+				next = hlist_entry(node, struct command_chunk,
+						   nextchunk);
+			}
+
+			dma_free_coherent(&pdev->dev, chunk->size,
+					  chunk->real_vaddr,
+					  chunk->real_dma_addr);
+			chunk->real_vaddr = NULL;
+			chunk->real_dma_addr = 0;
+			chunk->head = NULL;
+			chunk->dma_addr = 0;
+			hlist_del(&chunk->nextchunk);
+			kzfree(chunk);
+			chunk = next;
+		}
+		queue->nchunks = 0;
+		queue->idx = 0;
+		queue->dbell_count = 0;
+	}
+
+	/* common cleanup */
+	cqinfo->cmd_size = 0;
+	cqinfo->dbell_thold = 0;
+}
+
+static int32_t alloc_command_queues(struct cpt_vf *cptvf,
+				    struct command_qinfo *cqinfo,
+				    size_t cmd_size, size_t align,
+				    uint32_t qlen, uint32_t nr_queues)
+{
+	int i;
+	size_t q_size;
+	struct command_queue *queue = NULL;
+	struct pci_dev *pdev = cptvf->pdev;
+
+	/* common init */
+	cqinfo->cmd_size = cmd_size;
+	cqinfo->dbell_thold = CPT_DBELL_THOLD;
+
+	/* Qsize in dwords, needed for SADDR config, 1-next chunk pointer */
+	cptvf->qsize = min(qlen, cqinfo->qchunksize) *
+			CPT_NEXT_CHUNK_PTR_SIZE + 1;
+	/* Qsize in bytes to create space for alignment */
+	q_size = qlen * cqinfo->cmd_size;
+
+	/* per queue initialization */
+	for (i = 0; i < cptvf->nr_queues; i++) {
+		size_t c_size = 0;
+		size_t rem_q_size = q_size;
+		struct command_chunk *curr = NULL, *first = NULL, *last = NULL;
+		uint32_t qcsize_bytes = cqinfo->qchunksize * cqinfo->cmd_size;
+
+		queue = &cqinfo->queue[i];
+		INIT_HLIST_HEAD(&cqinfo->queue[i].chead);
+		do {
+			curr = kzalloc(sizeof(*curr), GFP_KERNEL);
+			if (!curr)
+				goto cmd_qfail;
+
+			c_size = (rem_q_size > qcsize_bytes) ? qcsize_bytes :
+					rem_q_size;
+			curr->real_vaddr = (uint8_t *)dma_zalloc_coherent(&pdev->dev,
+					  c_size + CPT_NEXT_CHUNK_PTR_SIZE,
+					  &curr->real_dma_addr, GFP_KERNEL);
+			if (!curr->real_vaddr) {
+				pr_err("Command Q (%d) chunk (%d) allocation failed\n",
+				       i, queue->nchunks);
+				goto cmd_qfail;
+			}
+
+			curr->head = (uint8_t *)PTR_ALIGN(curr->real_vaddr, align);
+			curr->dma_addr = (dma_addr_t)PTR_ALIGN(curr->real_dma_addr,
+								align);
+			curr->size = c_size;
+			if (queue->nchunks == 0) {
+				hlist_add_head(&curr->nextchunk,
+					       &cqinfo->queue[i].chead);
+				first = curr;
+			} else {
+				hlist_add_behind(&curr->nextchunk,
+						 &last->nextchunk);
+			}
+
+			queue->nchunks++;
+			rem_q_size -= c_size;
+			if (last)
+				*((uint64_t *)(&last->head[last->size])) = (uint64_t)curr->dma_addr;
+
+			last = curr;
+		} while (rem_q_size);
+
+		/* Make the queue circular */
+		/* Tie back last chunk entry to head */
+		curr = first;
+		*((uint64_t *)(&last->head[last->size])) = (uint64_t)curr->dma_addr;
+		last->nextchunk.next = &curr->nextchunk;
+		queue->qhead = curr;
+		queue->dbell_count = 0;
+		spin_lock_init(&queue->lock);
+	}
+	return 0;
+
+cmd_qfail:
+	free_command_queues(cptvf, cqinfo);
+	return -ENOMEM;
+}
+
+static int32_t init_command_queues(struct cpt_vf *cptvf, uint32_t qlen,
+				   uint32_t nr_queues)
+{
+	int32_t ret;
+
+	if (!nr_queues)
+		return 0;
+
+	/* setup AE command queues */
+	ret = alloc_command_queues(cptvf, &cptvf->cqinfo, CPT_INST_SIZE,
+				   CPT_VQ_CHUNK_ALIGN, qlen, nr_queues);
+	if (ret) {
+		pr_err("failed to allocate AE command queues (%u)\n",
+		       nr_queues);
+		return ret;
+	}
+
+	return ret;
+}
+
+static void cleanup_command_queues(struct cpt_vf *cptvf)
+{
+	struct pci_dev *pdev = cptvf->pdev;
+
+	if (!cptvf->nr_queues)
+		return;
+
+	dev_info(&pdev->dev, "Cleaning VQ command queue (%u)\n",
+		 cptvf->nr_queues);
+	free_command_queues(cptvf, &cptvf->cqinfo);
+}
+
+static void cptvf_sw_cleanup(struct cpt_vf *cptvf)
+{
+	cleanup_worker_threads(cptvf);
+	cleanup_pending_queues(cptvf);
+	cleanup_command_queues(cptvf);
+}
+
+static int32_t cptvf_sw_init(struct cpt_vf *cptvf, uint32_t qlen,
+			     uint32_t nr_queues)
+{
+	int32_t ret = 0;
+	uint32_t max_dev_queues = 0, nr_cpus = num_online_cpus();
+
+	max_dev_queues = CPT_NUM_QS_PER_VF;
+	/* possible cpus */
+	nr_queues = max_t(uint32_t, nr_cpus, nr_queues);
+	nr_queues = min_t(uint32_t, nr_queues, max_dev_queues);
+	cptvf->max_queues = nr_queues;
+	cptvf->nr_queues = nr_queues;
+	cptvf->qlen = qlen;
+
+	ret = init_command_queues(cptvf, qlen, nr_queues);
+	if (ret) {
+		pr_err("Failed to setup command queues (%u)\n", nr_queues);
+		return ret;
+	}
+
+	ret = init_pending_queues(cptvf, qlen, nr_queues);
+	if (ret) {
+		pr_err("Failed to setup pending queues (%u)\n", nr_queues);
+		goto setup_pqfail;
+	}
+
+	/* Create worker threads for BH processing */
+	ret = init_worker_threads(cptvf);
+	if (ret) {
+		pr_err("Failed to setup worker threads\n");
+		goto init_work_fail;
+	}
+
+	return 0;
+
+init_work_fail:
+	cleanup_worker_threads(cptvf);
+	cleanup_pending_queues(cptvf);
+
+setup_pqfail:
+	cleanup_command_queues(cptvf);
+
+	return ret;
+}
+
+static inline int cptvf_get_node_id(struct pci_dev *pdev)
+{
+	uint64_t addr = pci_resource_start(pdev, CPT_CSR_BAR);
+
+	return ((addr >> CPT_NODE_ID_SHIFT) & CPT_NODE_ID_MASK);
+}
+
+static void cptvf_disable_msix(struct cpt_vf *cptvf)
+{
+	if (cptvf->msix_enabled) {
+		pci_disable_msix(cptvf->pdev);
+		cptvf->msix_enabled = 0;
+		cptvf->num_vec = 0;
+	}
+}
+
+static int cptvf_enable_msix(struct cpt_vf *cptvf)
+{
+	int i, ret;
+
+	cptvf->num_vec = CPT_VF_MSIX_VECTORS;
+
+	for (i = 0; i < cptvf->num_vec; i++)
+		cptvf->msix_entries[i].entry = i;
+
+	ret = pci_enable_msix(cptvf->pdev, cptvf->msix_entries,
+			      cptvf->num_vec);
+	if (ret) {
+		dev_err(&cptvf->pdev->dev, "Request for #%d msix vectors failed\n",
+			cptvf->num_vec);
+		return ret;
+	}
+
+	cptvf->msix_enabled = 1;
+	/* Mark MSIX enabled */
+	cptvf->flags |= CPT_FLAG_MSIX_ENABLED;
+
+	return 0;
+}
+
+static void cptvf_free_all_interrupts(struct cpt_vf *cptvf)
+{
+	int irq;
+
+	for (irq = 0; irq < cptvf->num_vec; irq++) {
+		if (cptvf->irq_allocated[irq])
+			irq_set_affinity_hint(cptvf->msix_entries[irq].vector,
+					      NULL);
+		free_cpumask_var(cptvf->affinity_mask[irq]);
+		free_irq(cptvf->msix_entries[irq].vector, cptvf);
+		cptvf->irq_allocated[irq] = false;
+	}
+}
+
+static void cptvf_write_vq_ctl(struct cpt_vf *cptvf, bool val)
+{
+	union cptx_vqx_ctl vqx_ctl;
+
+	vqx_ctl.u = cpt_read_csr64(cptvf->reg_base, CPTX_VQX_CTL(0, 0));
+	vqx_ctl.s.ena = val;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_CTL(0, 0), vqx_ctl.u);
+}
+
+void cptvf_write_vq_doorbell(struct cpt_vf *cptvf, uint32_t val)
+{
+	union cptx_vqx_doorbell vqx_dbell;
+
+	vqx_dbell.u = cpt_read_csr64(cptvf->reg_base,
+				     CPTX_VQX_DOORBELL(0, 0));
+	vqx_dbell.s.dbell_cnt = val * 8; /* Num of Instructions * 8 words */
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_DOORBELL(0, 0),
+			vqx_dbell.u);
+}
+
+static void cptvf_write_vq_inprog(struct cpt_vf *cptvf, uint8_t val)
+{
+	union cptx_vqx_inprog vqx_inprg;
+
+	vqx_inprg.u = cpt_read_csr64(cptvf->reg_base, CPTX_VQX_INPROG(0, 0));
+	vqx_inprg.s.inflight = val;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_INPROG(0, 0), vqx_inprg.u);
+}
+
+static void cptvf_write_vq_done_numwait(struct cpt_vf *cptvf, uint32_t val)
+{
+	union cptx_vqx_done_wait vqx_dwait;
+
+	vqx_dwait.u = cpt_read_csr64(cptvf->reg_base,
+				     CPTX_VQX_DONE_WAIT(0, 0));
+	vqx_dwait.s.num_wait = val;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_DONE_WAIT(0, 0),
+			vqx_dwait.u);
+}
+
+static void cptvf_write_vq_done_timewait(struct cpt_vf *cptvf, uint16_t val)
+{
+	union cptx_vqx_done_wait vqx_dwait;
+
+	vqx_dwait.u = cpt_read_csr64(cptvf->reg_base,
+				     CPTX_VQX_DONE_WAIT(0, 0));
+	vqx_dwait.s.time_wait = val;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_DONE_WAIT(0, 0),
+			vqx_dwait.u);
+}
+
+static void cptvf_enable_swerr_interrupts(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_ena_w1s vqx_misc_ena;
+
+	vqx_misc_ena.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_ENA_W1S(0, 0));
+	/* Set mbox(0) interupts for the requested vf */
+	vqx_misc_ena.s.swerr = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_ENA_W1S(0, 0),
+			vqx_misc_ena.u);
+}
+
+static void cptvf_enable_mbox_interrupts(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_ena_w1s vqx_misc_ena;
+
+	vqx_misc_ena.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_ENA_W1S(0, 0));
+	/* Set mbox(0) interupts for the requested vf */
+	vqx_misc_ena.s.mbox = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_ENA_W1S(0, 0),
+			vqx_misc_ena.u);
+}
+
+static void cptvf_enable_done_interrupts(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_done_ena_w1s vqx_done_ena;
+
+	vqx_done_ena.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_DONE_ENA_W1S(0, 0));
+	/* Set DONE interrupt for the requested vf */
+	vqx_done_ena.s.done = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_DONE_ENA_W1S(0, 0),
+			vqx_done_ena.u);
+}
+
+static void cptvf_clear_dovf_intr(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_int vqx_misc_int;
+
+	vqx_misc_int.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_INT(0, 0));
+	/* W1C for the VF */
+	vqx_misc_int.s.dovf = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_INT(0, 0),
+			vqx_misc_int.u);
+}
+
+static void cptvf_clear_irde_intr(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_int vqx_misc_int;
+
+	vqx_misc_int.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_INT(0, 0));
+	/* W1C for the VF */
+	vqx_misc_int.s.irde = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_INT(0, 0),
+			vqx_misc_int.u);
+}
+
+static void cptvf_clear_nwrp_intr(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_int vqx_misc_int;
+
+	vqx_misc_int.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_INT(0, 0));
+	/* W1C for the VF */
+	vqx_misc_int.s.nwrp = 1;
+	cpt_write_csr64(cptvf->reg_base,
+			CPTX_VQX_MISC_INT(0, 0), vqx_misc_int.u);
+}
+
+static void cptvf_clear_mbox_intr(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_int vqx_misc_int;
+
+	vqx_misc_int.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_INT(0, 0));
+	/* W1C for the VF */
+	vqx_misc_int.s.mbox = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_INT(0, 0),
+			vqx_misc_int.u);
+}
+
+static void cptvf_clear_swerr_intr(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_misc_int vqx_misc_int;
+
+	vqx_misc_int.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_MISC_INT(0, 0));
+	/* W1C for the VF */
+	vqx_misc_int.s.swerr = 1;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_MISC_INT(0, 0),
+			vqx_misc_int.u);
+}
+
+static uint64_t cptvf_read_vf_misc_intr_status(struct cpt_vf *cptvf)
+{
+	return cpt_read_csr64(cptvf->reg_base, CPTX_VQX_MISC_INT(0, 0));
+}
+
+static irqreturn_t cptvf_misc_intr_handler(int irq, void *cptvf_irq)
+{
+	struct cpt_vf *cptvf = (struct cpt_vf *)cptvf_irq;
+	uint64_t intr;
+
+	intr = cptvf_read_vf_misc_intr_status(cptvf);
+	/*Check for MISC interrupt types*/
+	if (likely(intr & CPT_VF_INTR_MBOX_MASK)) {
+		pr_err("Mailbox interrupt 0x%llx on CPT VF %d\n",
+		       intr, cptvf->vfid);
+		cptvf_handle_mbox_intr(cptvf);
+		cptvf_clear_mbox_intr(cptvf);
+	} else if (unlikely(intr & CPT_VF_INTR_DOVF_MASK)) {
+		cptvf_clear_dovf_intr(cptvf);
+		/*Clear doorbell count*/
+		cptvf_write_vq_doorbell(cptvf, 0);
+		pr_err("Doorbell overflow error interrupt 0x%llx on CPT VF %d\n",
+		       intr, cptvf->vfid);
+	} else if (unlikely(intr & CPT_VF_INTR_IRDE_MASK)) {
+		cptvf_clear_irde_intr(cptvf);
+		pr_err("Instruction NCB read error interrupt 0x%llx on CPT VF %d\n",
+		       intr, cptvf->vfid);
+	} else if (unlikely(intr & CPT_VF_INTR_NWRP_MASK)) {
+		cptvf_clear_nwrp_intr(cptvf);
+		pr_err("NCB response write error interrupt 0x%llx on CPT VF %d\n",
+		       intr, cptvf->vfid);
+	} else if (unlikely(intr & CPT_VF_INTR_SERR_MASK)) {
+		cptvf_clear_swerr_intr(cptvf);
+		pr_err("Software error interrupt 0x%llx on CPT VF %d\n",
+		       intr, cptvf->vfid);
+	} else {
+		pr_err("Unhandled interrupt in CPT VF %d\n", cptvf->vfid);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static inline struct cptvf_wqe *get_cptvf_vq_wqe(struct cpt_vf *cptvf,
+						 int qno)
+{
+	struct cptvf_wqe_info *nwqe_info;
+
+	if (unlikely(qno >= cptvf->nr_queues))
+		return NULL;
+	nwqe_info = (struct cptvf_wqe_info *)cptvf->wqe_info;
+
+	return &nwqe_info->vq_wqe[qno];
+}
+
+static inline uint32_t cptvf_read_vq_done_count(struct cpt_vf *cptvf)
+{
+	union cptx_vqx_done vqx_done;
+
+	vqx_done.u = cpt_read_csr64(cptvf->reg_base, CPTX_VQX_DONE(0, 0));
+	return vqx_done.s.done;
+}
+
+static inline void cptvf_write_vq_done_ack(struct cpt_vf *cptvf,
+					   uint32_t ackcnt)
+{
+	union cptx_vqx_done_ack vqx_dack_cnt;
+
+	vqx_dack_cnt.u = cpt_read_csr64(cptvf->reg_base,
+					CPTX_VQX_DONE_ACK(0, 0));
+	vqx_dack_cnt.s.done_ack = ackcnt;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_DONE_ACK(0, 0),
+			vqx_dack_cnt.u);
+}
+
+static irqreturn_t cptvf_done_intr_handler(int irq, void *cptvf_irq)
+{
+	struct cpt_vf *cptvf = (struct cpt_vf *)cptvf_irq;
+	/* Read the number of completions */
+	uint32_t intr = cptvf_read_vq_done_count(cptvf);
+
+	cptvf->intcnt += intr;
+	if (intr) {
+		struct cptvf_wqe *wqe;
+
+		/* Acknowledge the number of
+		 * scheduled completions for processing
+		 */
+		cptvf_write_vq_done_ack(cptvf, intr);
+		wqe = get_cptvf_vq_wqe(cptvf, 0);
+		if (unlikely(!wqe)) {
+			pr_err("No work to schedule for VF (%d)",
+			       cptvf->vfid);
+			return 1;
+		}
+		tasklet_hi_schedule(&wqe->twork);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static int cptvf_register_misc_intr(struct cpt_vf *cptvf)
+{
+	int ret;
+	struct device *dev = &cptvf->pdev->dev;
+
+	/* Register misc interrupt handlers */
+	ret = request_irq(cptvf->msix_entries[CPT_VF_INT_VEC_E_MISC].vector,
+			  cptvf_misc_intr_handler, 0, "CPT VF misc intr",
+			  cptvf);
+	if (ret)
+		goto fail;
+
+	cptvf->irq_allocated[CPT_VF_INT_VEC_E_MISC] = true;
+
+	/* Enable mailbox interrupt */
+	cptvf_enable_mbox_interrupts(cptvf);
+	cptvf_enable_swerr_interrupts(cptvf);
+
+	return 0;
+
+fail:
+	dev_err(dev, "Request misc irq failed");
+	cptvf_free_all_interrupts(cptvf);
+	return ret;
+}
+
+static int cptvf_register_done_intr(struct cpt_vf *cptvf)
+{
+	int ret;
+	struct device *dev = &cptvf->pdev->dev;
+
+	/* Register DONE interrupt handlers */
+	ret = request_irq(cptvf->msix_entries[CPT_VF_INT_VEC_E_DONE].vector,
+			  cptvf_done_intr_handler, 0, "CPT VF done intr",
+			  cptvf);
+	if (ret)
+		goto fail;
+
+	cptvf->irq_allocated[CPT_VF_INT_VEC_E_DONE] = true;
+
+	/* Enable mailbox interrupt */
+	cptvf_enable_done_interrupts(cptvf);
+	return 0;
+
+fail:
+	dev_err(dev, "Request done irq failed\n");
+	cptvf_free_all_interrupts(cptvf);
+	return ret;
+}
+
+static void cptvf_unregister_interrupts(struct cpt_vf *cptvf)
+{
+	cptvf_free_all_interrupts(cptvf);
+	cptvf_disable_msix(cptvf);
+}
+
+static void cptvf_set_irq_affinity(struct cpt_vf *cptvf)
+{
+	int32_t vec, cpu;
+	int32_t irqnum;
+
+	for (vec = 0; vec < cptvf->num_vec; vec++) {
+		if (!cptvf->irq_allocated[vec])
+			continue;
+
+		if (!zalloc_cpumask_var(&cptvf->affinity_mask[vec],
+					GFP_KERNEL)) {
+			pr_err("Allocation failed for affinity_mask for VF %d",
+			       cptvf->vfid);
+			return;
+		}
+
+		cpu = cptvf->vfid % num_online_cpus();
+		cpumask_set_cpu(cpumask_local_spread(cpu, cptvf->node),
+				cptvf->affinity_mask[vec]);
+		irqnum = cptvf->msix_entries[vec].vector;
+		irq_set_affinity_hint(irqnum, cptvf->affinity_mask[vec]);
+	}
+}
+
+static void cptvf_write_vq_saddr(struct cpt_vf *cptvf, uint64_t val)
+{
+	union cptx_vqx_saddr vqx_saddr;
+
+	vqx_saddr.u = val;
+	cpt_write_csr64(cptvf->reg_base, CPTX_VQX_SADDR(0, 0), vqx_saddr.u);
+}
+
+void cptvf_device_init(struct cpt_vf *cptvf)
+{
+	uint64_t base_addr = 0;
+
+	cptvf->chip_id = CPTVF_81XX_PASS1_0;
+	/* Disable the VQ */
+	cptvf_write_vq_ctl(cptvf, 0);
+	/* Reset the doorbell */
+	cptvf_write_vq_doorbell(cptvf, 0);
+	/* Clear inflight */
+	cptvf_write_vq_inprog(cptvf, 0);
+	/* Write VQ SADDR */
+	/* TODO: for now only one queue, so hard coded */
+	base_addr = (uint64_t)(cptvf->cqinfo.queue[0].qhead->dma_addr);
+	cptvf_write_vq_saddr(cptvf, base_addr);
+	/* Configure timerhold / coalescence */
+	cptvf_write_vq_done_timewait(cptvf, CPT_TIMER_THOLD);
+	cptvf_write_vq_done_numwait(cptvf, CPT_COUNT_THOLD);
+	/* Enable the VQ */
+	cptvf_write_vq_ctl(cptvf, 1);
+	/* Flag the VF ready */
+	cptvf->flags |= CPT_FLAG_DEVICE_READY;
+}
+
+static int cptvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	struct cpt_vf *cptvf;
+	int    err;
+
+	cptvf = devm_kzalloc(dev, sizeof(struct cpt_vf), GFP_KERNEL);
+	if (!cptvf)
+		return -ENOMEM;
+
+	pci_set_drvdata(pdev, cptvf);
+	cptvf->pdev = pdev;
+	err = pci_enable_device(pdev);
+	if (err) {
+		dev_err(dev, "Failed to enable PCI device\n");
+		pci_set_drvdata(pdev, NULL);
+		return err;
+	}
+
+	err = pci_request_regions(pdev, DRV_NAME);
+	if (err) {
+		dev_err(dev, "PCI request regions failed 0x%x\n", err);
+		goto cptvf_err_disable_device;
+	}
+	/* Mark as VF driver */
+	cptvf->flags |= CPT_FLAG_VF_DRIVER;
+	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get usable DMA configuration\n");
+		goto cptvf_err_release_regions;
+	}
+
+	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		goto cptvf_err_release_regions;
+	}
+
+	/* MAP PF's configuration registers */
+	cptvf->reg_base = pcim_iomap(pdev, CPT_CSR_BAR, 0);
+	if (!cptvf->reg_base) {
+		dev_err(dev, "Cannot map config register space, aborting\n");
+		err = -ENOMEM;
+		goto cptvf_err_release_regions;
+	}
+
+	cptvf->node = cptvf_get_node_id(pdev);
+	/* Enable MSI-X */
+	err = cptvf_enable_msix(cptvf);
+	if (err) {
+		dev_err(dev, "cptvf_enable_msix() failed");
+		goto cptvf_err_release_regions;
+	}
+
+	/* Register mailbox interrupts */
+	cptvf_register_misc_intr(cptvf);
+
+	/* Check ready with PF */
+	/* Gets chip ID / device Id from PF if ready */
+	err = cptvf_check_pf_ready(cptvf);
+	if (err) {
+		dev_err(dev, "PF not responding to READY msg");
+		err = -EBUSY;
+		goto cptvf_err_release_regions;
+	}
+
+	/* CPT VF software resources initialization */
+	cptvf->cqinfo.qchunksize = chunksize;
+	err = cptvf_sw_init(cptvf, qlen, CPT_NUM_QS_PER_VF);
+	if (err) {
+		dev_err(dev, "cptvf_sw_init() failed");
+		goto cptvf_err_release_regions;
+	}
+	/* Convey VQ LEN to PF */
+	err = cptvf_send_vq_size_msg(cptvf);
+	if (err) {
+		dev_err(dev, "PF not responding to QLEN msg");
+		err = -EBUSY;
+		goto cptvf_err_release_regions;
+	}
+
+	/* CPT VF device initialization */
+	cptvf_device_init(cptvf);
+	/* Send msg to PF to assign currnet Q to required group */
+	cptvf->vfgrp = group;
+	err = cptvf_send_vf_to_grp_msg(cptvf);
+	if (err) {
+		dev_err(dev, "PF not responding to VF_GRP msg");
+		err = -EBUSY;
+		goto cptvf_err_release_regions;
+	}
+
+	cptvf->priority = priority;
+	err = cptvf_send_vf_priority_msg(cptvf);
+	if (err) {
+		dev_err(dev, "PF not responding to VF_PRIO msg");
+		err = -EBUSY;
+		goto cptvf_err_release_regions;
+	}
+	/* Register DONE interrupts */
+	err = cptvf_register_done_intr(cptvf);
+	if (err)
+		goto cptvf_err_release_regions;
+
+	/* Set irq affinity masks */
+	cptvf_set_irq_affinity(cptvf);
+	/* Convey UP to PF */
+	err = cptvf_send_vf_up(cptvf);
+	if (err) {
+		dev_err(dev, "PF not responding to UP msg");
+		err = -EBUSY;
+		goto cptvf_up_fail;
+	}
+	err = cvm_crypto_init(cptvf);
+	if (err) {
+		dev_err(dev, "Algorithm register failed\n");
+		err = -EBUSY;
+		goto cptvf_up_fail;
+	}
+	return 0;
+
+cptvf_up_fail:
+	cptvf_unregister_interrupts(cptvf);
+cptvf_err_release_regions:
+	pci_release_regions(pdev);
+cptvf_err_disable_device:
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+
+	return err;
+}
+
+static void cptvf_remove(struct pci_dev *pdev)
+{
+	struct cpt_vf *cptvf = pci_get_drvdata(pdev);
+
+	if (!cptvf)
+		pr_err("Invalid CPT-VF device\n");
+
+	/* Convey DOWN to PF */
+	if (cptvf_send_vf_down(cptvf)) {
+		pr_err("PF not responding to DOWN msg");
+	} else {
+		cptvf_unregister_interrupts(cptvf);
+		cptvf_sw_cleanup(cptvf);
+		pci_set_drvdata(pdev, NULL);
+		pci_release_regions(pdev);
+		pci_disable_device(pdev);
+		cvm_crypto_exit();
+	}
+}
+
+static void cptvf_shutdown(struct pci_dev *pdev)
+{
+	cptvf_remove(pdev);
+}
+
+/* Supported devices */
+static const struct pci_device_id cptvf_id_table[] = {
+	{PCI_VDEVICE(CAVIUM, CPT_81XX_PCI_VF_DEVICE_ID), 0},
+	{ 0, }  /* end of table */
+};
+
+static struct pci_driver cptvf_pci_driver = {
+	.name = DRV_NAME,
+	.id_table = cptvf_id_table,
+	.probe = cptvf_probe,
+	.remove = cptvf_remove,
+	.shutdown = cptvf_shutdown,
+};
+
+static int __init cptvf_init_module(void)
+{
+	int ret = -1;
+
+	pr_info("%s, ver %s\n", DRV_NAME, DRV_VERSION);
+	if (group < 0 || group > 7) {
+		pr_warn("Invalid group. Should be (0-7), setting to default 1.\n");
+		group = 1;
+	}
+
+	if (chunksize > CPT_INST_CHUNK_MAX_SIZE || chunksize <= 0) {
+		pr_warn("Invalid instruction chunk size. Should be (1-1023). Setting to default 1023\n");
+		chunksize = CPT_INST_CHUNK_MAX_SIZE;
+	}
+
+	if ((qlen > chunksize) && (qlen % chunksize != 0)) {
+		pr_warn("qlen should be multiple of chunksize when qlen > chunksize, rounding up qlen\n");
+		qlen += chunksize - (qlen % chunksize);
+	}
+
+	if (priority < 0 || priority > 1) {
+		pr_warn("Invalid VQ/VF priority. Should be (0-1), setting to default 0.\n");
+		priority = 0;
+	}
+
+	ret = pci_register_driver(&cptvf_pci_driver);
+	if (ret)
+		pr_err("pci_register_driver() failed");
+
+	return ret;
+}
+
+static void __exit cptvf_cleanup_module(void)
+{
+	pci_unregister_driver(&cptvf_pci_driver);
+}
+
+module_init(cptvf_init_module);
+module_exit(cptvf_cleanup_module);
+
+MODULE_AUTHOR("George Cherian <george.cherian@cavium.com>, Murthy Nidadavolu");
+MODULE_DESCRIPTION("Cavium Thunder CPT Physical Function Driver");
+MODULE_LICENSE("GPL v2");
+MODULE_VERSION(DRV_VERSION);
+MODULE_DEVICE_TABLE(pci, cptvf_id_table);
diff --git a/drivers/crypto/cavium/cpt/cptvf_mbox.c b/drivers/crypto/cavium/cpt/cptvf_mbox.c
new file mode 100644
index 0000000..80de249
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf_mbox.c
@@ -0,0 +1,208 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include "cptvf.h"
+
+static void cptvf_send_msg_to_pf(struct cpt_vf *cptvf, struct cpt_mbox *mbx)
+{
+	/* Writing mbox(1) causes interrupt */
+	cpt_write_csr64(cptvf->reg_base, CPTX_VFX_PF_MBOXX(0, 0, 0),
+			mbx->msg);
+	cpt_write_csr64(cptvf->reg_base, CPTX_VFX_PF_MBOXX(0, 0, 1),
+			mbx->data);
+}
+
+/* ACKs PF's mailbox message
+ */
+void cptvf_mbox_send_ack(struct cpt_vf *cptvf, struct cpt_mbox *mbx)
+{
+	mbx->msg = CPT_MBOX_MSG_TYPE_ACK;
+	cptvf_send_msg_to_pf(cptvf, mbx);
+}
+
+/* NACKs PF's mailbox message that VF is not able to
+ * complete the action
+ */
+void cptvf_mbox_send_nack(struct cpt_vf *cptvf, struct cpt_mbox *mbx)
+{
+	mbx->msg = CPT_MBOX_MSG_TYPE_NACK;
+	cptvf_send_msg_to_pf(cptvf, mbx);
+}
+
+/* Interrupt handler to handle mailbox messages from VFs */
+void cptvf_handle_mbox_intr(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	/*
+	 * MBOX[0] contains msg
+	 * MBOX[1] contains data
+	 */
+	mbx.msg  = cpt_read_csr64(cptvf->reg_base, CPTX_VFX_PF_MBOXX(0, 0, 0));
+	mbx.data = cpt_read_csr64(cptvf->reg_base, CPTX_VFX_PF_MBOXX(0, 0, 1));
+	dev_dbg(&cptvf->pdev->dev, "%s: Mailbox msg 0x%llx from PF\n",
+		__func__, mbx.msg);
+	switch (mbx.msg) {
+	case CPT_MSG_READY:
+	{
+		union cpt_chipid_vfid cid;
+
+		cid.u16 = mbx.data;
+		cptvf->pf_acked = true;
+		cptvf->vfid = cid.s.vfid;
+		dev_dbg(&cptvf->pdev->dev, "Received VFID %d\n", cptvf->vfid);
+		break;
+	}
+	case CPT_MSG_QBIND_GRP:
+		cptvf->pf_acked = true;
+		cptvf->vftype = mbx.data;
+		dev_dbg(&cptvf->pdev->dev, "VF %d type %s group %d\n",
+			cptvf->vfid, ((mbx.data == SE_TYPES) ? "SE" : "AE"),
+			cptvf->vfgrp);
+		break;
+	case CPT_MBOX_MSG_TYPE_ACK:
+		cptvf->pf_acked = true;
+		break;
+	case CPT_MBOX_MSG_TYPE_NACK:
+		cptvf->pf_nacked = true;
+		break;
+	default:
+		dev_err(&cptvf->pdev->dev, "Invalid msg from PF, msg 0x%llx\n",
+			mbx.msg);
+		break;
+	}
+}
+
+static int32_t cptvf_send_msg_to_pf_timeout(struct cpt_vf *cptvf,
+					    struct cpt_mbox *mbx)
+{
+	int timeout = CPT_MBOX_MSG_TIMEOUT;
+	int sleep = 10;
+
+	cptvf->pf_acked = false;
+	cptvf->pf_nacked = false;
+	cptvf_send_msg_to_pf(cptvf, mbx);
+	/* Wait for previous message to be acked, timeout 2sec */
+	while (!cptvf->pf_acked) {
+		if (cptvf->pf_nacked)
+			return -EINVAL;
+		msleep(sleep);
+		if (cptvf->pf_acked)
+			break;
+		timeout -= sleep;
+		if (!timeout) {
+			dev_err(&cptvf->pdev->dev, "PF didn't ack to mbox msg %llx from VF%u\n",
+				(mbx->msg & 0xFF), cptvf->vfid);
+			return -EBUSY;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Checks if VF is able to comminicate with PF
+ * and also gets the CPT number this VF is associated to.
+ */
+int cptvf_check_pf_ready(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_READY;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to READY msg\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Communicate VQs size to PF to program CPT(0)_PF_Q(0-15)_CTL of the VF.
+ * Must be ACKed.
+ */
+int cptvf_send_vq_size_msg(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_QLEN;
+	mbx.data = cptvf->qsize;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to vq_size msg\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Communicate VF group required to PF and get the VQ binded to that group
+ */
+int cptvf_send_vf_to_grp_msg(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_QBIND_GRP;
+	/* Convey group of the VF */
+	mbx.data = cptvf->vfgrp;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to vf_type msg\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Communicate VF group required to PF and get the VQ binded to that group
+ */
+int cptvf_send_vf_priority_msg(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_VQ_PRIORITY;
+	/* Convey group of the VF */
+	mbx.data = cptvf->priority;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to vf_type msg\n");
+		return 1;
+	}
+	return 0;
+}
+
+/*
+ * Communicate to PF that VF is UP and running
+ */
+int cptvf_send_vf_up(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_VF_UP;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to UP msg\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Communicate to PF that VF is DOWN and running
+ */
+int cptvf_send_vf_down(struct cpt_vf *cptvf)
+{
+	struct cpt_mbox mbx = {};
+
+	mbx.msg = CPT_MSG_VF_DOWN;
+	if (cptvf_send_msg_to_pf_timeout(cptvf, &mbx)) {
+		dev_err(&cptvf->pdev->dev, "PF didn't respond to DOWN msg\n");
+		return 1;
+	}
+
+	return 0;
+}
diff --git a/drivers/crypto/cavium/cpt/cptvf_reqmanager.c b/drivers/crypto/cavium/cpt/cptvf_reqmanager.c
new file mode 100644
index 0000000..e6fc3f9
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptvf_reqmanager.c
@@ -0,0 +1,655 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/bitmap.h>
+#include <linux/kdev_t.h>
+#include <linux/fs.h>
+#include <linux/device.h>
+#include <linux/cdev.h>
+#include <linux/poll.h>
+
+#include "cptvf.h"
+#include "request_manager.h"
+
+/**
+ * get_free_pending_entry - get free entry from pending queue
+ * @param pqinfo: pending_qinfo structure
+ * @param qno: queue number
+ */
+static struct pending_entry *get_free_pending_entry(struct pending_queue *q,
+						    int32_t qlen)
+{
+	struct pending_entry *ent = NULL;
+
+	ent = &q->head[q->rear];
+	if (unlikely(ent->busy)) {
+		ent = NULL;
+		goto no_free_entry;
+	}
+
+	q->rear++;
+	if (unlikely(q->rear == qlen))
+		q->rear = 0;
+
+no_free_entry:
+	return ent;
+}
+
+static inline void pending_queue_inc_front(struct pending_qinfo *pqinfo,
+					   int32_t qno)
+{
+	struct pending_queue *queue = &pqinfo->queue[qno];
+
+	queue->front++;
+	if (unlikely(queue->front == pqinfo->qlen))
+		queue->front = 0;
+}
+
+static int32_t setup_sgio_components(struct cpt_vf *cptvf,
+				     struct buf_ptr *list,
+				     int32_t buf_count, uint8_t *buffer)
+{
+	int32_t ret = 0, i, j;
+	int32_t components;
+	struct sglist_component *sg_ptr = NULL;
+	struct pci_dev *pdev = cptvf->pdev;
+
+	if (unlikely(!list)) {
+		pr_err("Input List pointer is NULL\n");
+		ret = -EFAULT;
+		return ret;
+	}
+
+	for (i = 0; i < buf_count; i++) {
+		if (likely(list[i].vptr)) {
+			list[i].dma_addr = dma_map_single(&pdev->dev,
+							  list[i].vptr,
+							  list[i].size,
+							  DMA_BIDIRECTIONAL);
+			if (unlikely(dma_mapping_error(&pdev->dev,
+						       list[i].dma_addr))) {
+				pr_err("DMA map kernel buffer failed for component: %d\n",
+				       i);
+				ret = -EIO;
+				goto sg_cleanup;
+			}
+		}
+	}
+
+	components = buf_count / 4;
+	sg_ptr = (struct sglist_component *)buffer;
+	for (i = 0; i < components; i++) {
+		sg_ptr->u.s.len0 = cpu_to_be16(list[i * 4 + 0].size);
+		sg_ptr->u.s.len1 = cpu_to_be16(list[i * 4 + 1].size);
+		sg_ptr->u.s.len2 = cpu_to_be16(list[i * 4 + 2].size);
+		sg_ptr->u.s.len3 = cpu_to_be16(list[i * 4 + 3].size);
+		sg_ptr->ptr0 = cpu_to_be64(list[i * 4 + 0].dma_addr);
+		sg_ptr->ptr1 = cpu_to_be64(list[i * 4 + 1].dma_addr);
+		sg_ptr->ptr2 = cpu_to_be64(list[i * 4 + 2].dma_addr);
+		sg_ptr->ptr3 = cpu_to_be64(list[i * 4 + 3].dma_addr);
+		sg_ptr++;
+	}
+
+	components = buf_count % 4;
+
+	switch (components) {
+	case 3:
+		sg_ptr->u.s.len2 = cpu_to_be16(list[i * 4 + 2].size);
+		sg_ptr->ptr2 = cpu_to_be64(list[i * 4 + 2].dma_addr);
+		/* Fall through */
+	case 2:
+		sg_ptr->u.s.len1 = cpu_to_be16(list[i * 4 + 1].size);
+		sg_ptr->ptr1 = cpu_to_be64(list[i * 4 + 1].dma_addr);
+		/* Fall through */
+	case 1:
+		sg_ptr->u.s.len0 = cpu_to_be16(list[i * 4 + 0].size);
+		sg_ptr->ptr0 = cpu_to_be64(list[i * 4 + 0].dma_addr);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+
+sg_cleanup:
+	for (j = 0; j < i; j++) {
+		if (list[j].dma_addr) {
+			dma_unmap_single(&pdev->dev, list[i].dma_addr,
+					 list[i].size, DMA_BIDIRECTIONAL);
+		}
+
+		list[j].dma_addr = 0;
+	}
+
+	return ret;
+}
+
+static inline int32_t setup_sgio_list(struct cpt_vf *cptvf,
+				      struct cpt_info_buffer *info,
+				      struct cpt_request_info *req)
+{
+	uint16_t g_size_bytes = 0, s_size_bytes = 0;
+	int32_t i = 0, ret = 0;
+	struct pci_dev *pdev = cptvf->pdev;
+
+	if ((req->incnt + req->outcnt) > MAX_SG_IN_OUT_CNT) {
+		pr_err("Requestes SG components are higher than supported\n");
+		ret = -EINVAL;
+		goto  scatter_gather_clean;
+	}
+
+	/* Setup gather (input) components */
+	info->g_size = (req->incnt + 3) / 4;
+	info->glist_cnt = req->incnt;
+	g_size_bytes = info->g_size * sizeof(struct sglist_component);
+	for (i = 0; i < req->incnt; i++) {
+		info->glist_ptr[i].vptr = req->in[i].ptr.addr;
+		info->glist_ptr[i].size = req->in[i].size;
+	}
+
+	info->gather_components = kzalloc((g_size_bytes), GFP_KERNEL);
+	if (!info->gather_components) {
+		ret = -ENOMEM;
+		goto  scatter_gather_clean;
+	}
+
+	ret = setup_sgio_components(cptvf, info->glist_ptr,
+				    info->glist_cnt,
+				    info->gather_components);
+	if (ret) {
+		pr_err("Failed to setup gather list\n");
+		ret = -EFAULT;
+		goto  scatter_gather_clean;
+	}
+
+	/* Setup scatter (output) components */
+	info->s_size = (req->outcnt + 3) / 4;
+	info->slist_cnt = req->outcnt;
+	s_size_bytes = info->s_size * sizeof(struct sglist_component);
+	for (i = 0; i < info->slist_cnt ; i++) {
+		info->slist_ptr[i].vptr = req->out[i].ptr.addr;
+		info->slist_ptr[i].size = req->out[i].size;
+		info->outptr[i] = req->out[i].ptr.addr;
+		info->outsize[i] = req->out[i].size;
+		info->total_out += info->outsize[i];
+	}
+
+	info->scatter_components = kzalloc((s_size_bytes), GFP_KERNEL);
+	if (!info->scatter_components) {
+		ret = -ENOMEM;
+		goto  scatter_gather_clean;
+	}
+
+	ret = setup_sgio_components(cptvf, info->slist_ptr,
+				    info->slist_cnt,
+				    info->scatter_components);
+	if (ret) {
+		pr_err("Failed to setup gather list\n");
+		ret = -EFAULT;
+		goto  scatter_gather_clean;
+	}
+
+	/* Create and initialize DPTR */
+	info->dlen = g_size_bytes + s_size_bytes + SG_LIST_HDR_SIZE;
+	info->in_buffer = kzalloc((info->dlen), GFP_KERNEL);
+	if (!info->in_buffer) {
+		ret = -ENOMEM;
+		goto  scatter_gather_clean;
+	}
+
+	((uint16_t *)info->in_buffer)[0] = info->slist_cnt;
+	((uint16_t *)info->in_buffer)[1] = info->glist_cnt;
+	((uint16_t *)info->in_buffer)[2] = 0;
+	((uint16_t *)info->in_buffer)[3] = 0;
+	byte_swap_64((uint64_t *)info->in_buffer);
+
+	memcpy(&info->in_buffer[8], info->gather_components,
+	       g_size_bytes);
+	memcpy(&info->in_buffer[8 + g_size_bytes],
+	       info->scatter_components, s_size_bytes);
+
+	info->dptr_baddr = dma_map_single(&pdev->dev,
+					       (void *)info->in_buffer,
+					       info->dlen,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(&pdev->dev, info->dptr_baddr)) {
+		pr_err("Mapping DPTR Failed %d\n", info->dlen);
+		ret = -EIO;
+		goto  scatter_gather_clean;
+	}
+
+	/* Create and initialize RPTR */
+	info->rlen = COMPLETION_CODE_SIZE;
+	info->out_buffer = kzalloc((info->rlen), GFP_KERNEL);
+	if (!info->out_buffer) {
+		ret = -ENOMEM;
+		goto  scatter_gather_clean;
+	}
+
+	*((uint64_t *)info->out_buffer) = ~((uint64_t)COMPLETION_CODE_INIT);
+	info->alternate_caddr = (uint64_t *)info->out_buffer;
+	info->rptr_baddr = dma_map_single(&pdev->dev,
+					       (void *)info->out_buffer,
+					       info->rlen,
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(&pdev->dev, info->rptr_baddr)) {
+		pr_err("Mapping RPTR Failed %d\n", info->rlen);
+		ret = -EIO;
+		goto  scatter_gather_clean;
+	}
+
+	return 0;
+
+scatter_gather_clean:
+	return ret;
+}
+
+int32_t send_cpt_command(struct cpt_vf *cptvf, union cpt_inst_s *cmd,
+			 uint32_t qno)
+{
+	struct command_qinfo *qinfo = NULL;
+	struct command_queue *queue;
+	struct command_chunk *chunk;
+	uint8_t *ent;
+	int32_t ret = 0;
+
+	if (unlikely(qno >= cptvf->nr_queues)) {
+		pr_err("Invalid queue (qno: %d, nr_queues: %d)\n",
+		       qno, cptvf->nr_queues);
+		return -EINVAL;
+	}
+
+	qinfo = &cptvf->cqinfo;
+	queue = &qinfo->queue[qno];
+	/* lock commad queue */
+	spin_lock(&queue->lock);
+	ent = &queue->qhead->head[queue->idx * qinfo->cmd_size];
+	memcpy(ent, (void *)cmd, qinfo->cmd_size);
+
+	if (++queue->idx >= queue->qhead->size / 64) {
+		struct hlist_node *node;
+
+		hlist_for_each(node, &queue->chead) {
+			chunk = hlist_entry(node, struct command_chunk,
+					    nextchunk);
+			if (chunk == queue->qhead) {
+				continue;
+			} else {
+				queue->qhead = chunk;
+				break;
+			}
+		}
+		queue->idx = 0;
+	}
+	/* make sure all memory stores are done before ringing doorbell */
+	smp_wmb();
+	cptvf_write_vq_doorbell(cptvf, 1);
+	/* unlock command queue */
+	spin_unlock(&queue->lock);
+
+	return ret;
+}
+
+void do_request_cleanup(struct cpt_vf *cptvf,
+			struct cpt_info_buffer *info)
+{
+	int32_t i;
+	struct pci_dev *pdev = cptvf->pdev;
+
+	if (info->dptr_baddr) {
+		dma_unmap_single(&pdev->dev, info->dptr_baddr,
+				 info->dlen, DMA_BIDIRECTIONAL);
+		info->dptr_baddr = 0;
+	}
+
+	if (info->rptr_baddr) {
+		dma_unmap_single(&pdev->dev, info->rptr_baddr,
+				 info->rlen, DMA_BIDIRECTIONAL);
+		info->rptr_baddr = 0;
+	}
+
+	if (info->comp_baddr) {
+		dma_unmap_single(&pdev->dev, info->comp_baddr,
+				 sizeof(union cpt_res_s), DMA_BIDIRECTIONAL);
+		info->comp_baddr = 0;
+	}
+
+	if (info->dma_mode == DMA_GATHER_SCATTER) {
+		for (i = 0; i < info->slist_cnt; i++) {
+			if (info->slist_ptr[i].dma_addr) {
+				dma_unmap_single(&pdev->dev,
+						 info->slist_ptr[i].dma_addr,
+						 info->slist_ptr[i].size,
+						 DMA_BIDIRECTIONAL);
+				info->slist_ptr[i].dma_addr = 0ULL;
+			}
+		}
+		info->slist_cnt = 0;
+		if (info->scatter_components)
+			kzfree(info->scatter_components);
+
+		for (i = 0; i < info->glist_cnt; i++) {
+			if (info->glist_ptr[i].dma_addr) {
+				dma_unmap_single(&pdev->dev,
+						 info->glist_ptr[i].dma_addr,
+						 info->glist_ptr[i].size,
+						 DMA_BIDIRECTIONAL);
+				info->glist_ptr[i].dma_addr = 0ULL;
+			}
+		}
+		info->glist_cnt = 0;
+		if (info->gather_components)
+			kzfree((info->gather_components));
+	}
+
+	if (info->out_buffer) {
+		kzfree((info->out_buffer));
+		info->out_buffer = NULL;
+	}
+
+	if (info->in_buffer) {
+		kzfree((info->in_buffer));
+		info->in_buffer = NULL;
+	}
+
+	if (info->completion_addr) {
+		kzfree(((void *)info->completion_addr));
+		info->completion_addr = NULL;
+	}
+
+	if (info) {
+		kzfree((info));
+		info = NULL;
+	}
+}
+
+void do_post_process(struct cpt_vf *cptvf, struct cpt_info_buffer *info)
+{
+	uint64_t *p;
+	uint32_t i;
+
+	if (!info || !cptvf) {
+		pr_err("Input params are incorrect for post processing\n");
+		return;
+	}
+
+	if (info->rlen) {
+		for (i = 0; i < info->slist_cnt; i++) {
+			if (info->outunit[i] == UNIT_64_BIT) {
+				p = (uint64_t *)info->slist_ptr[i].vptr;
+				*p = cpu_to_be64(*p);
+			}
+		}
+	}
+
+	do_request_cleanup(cptvf, info);
+}
+
+static inline void process_pending_queue(struct cpt_vf *cptvf,
+					 struct pending_qinfo *pqinfo,
+					 int32_t qno)
+{
+	struct pending_queue *pqueue = &pqinfo->queue[qno];
+	struct pending_entry *pentry = NULL;
+	struct cpt_info_buffer *info = NULL;
+	union cpt_res_s *status = NULL;
+
+	while (1) {
+		spin_lock_bh(&pqueue->lock);
+		pentry = &pqueue->head[pqueue->front];
+		if (unlikely(!pentry->busy)) {
+			spin_unlock_bh(&pqueue->lock);
+			break;
+		}
+
+		info = (struct cpt_info_buffer *)pentry->post_arg;
+		if (unlikely(!info)) {
+			pr_err("Pending Entry post arg NULL\n");
+			pending_queue_inc_front(pqinfo, qno);
+			spin_unlock_bh(&pqueue->lock);
+			continue;
+		}
+
+		status = (union cpt_res_s *)pentry->completion_addr;
+		if ((status->s.compcode == CPT_COMP_E_FAULT) ||
+		    (status->s.compcode == CPT_COMP_E_SWERR)) {
+			pr_err("Request failed with %s\n",
+			       (status->s.compcode == CPT_COMP_E_FAULT) ?
+			       "DMA Fault" : "Software error");
+			pentry->completion_addr = NULL;
+			pentry->busy = false;
+			atomic64_dec((&pqueue->pending_count));
+			pentry->post_arg = NULL;
+			pending_queue_inc_front(pqinfo, qno);
+			do_request_cleanup(cptvf, info);
+			spin_unlock_bh(&pqueue->lock);
+			break;
+		} else if (status->s.compcode == COMPLETION_CODE_INIT) {
+			/* check for timeout */
+			if (time_after_eq(jiffies,
+			    (info->time_in + (DEFAULT_COMMAND_TIMEOUT * HZ)))) {
+				pr_err("Request timed out");
+				pentry->completion_addr = NULL;
+				pentry->busy = false;
+				atomic64_dec((&pqueue->pending_count));
+				pentry->post_arg = NULL;
+				pending_queue_inc_front(pqinfo, qno);
+				do_request_cleanup(cptvf, info);
+				spin_unlock_bh(&pqueue->lock);
+				break;
+			} else if ((*info->alternate_caddr ==
+				(~COMPLETION_CODE_INIT)) &&
+				(info->extra_time < TIME_IN_RESET_COUNT)) {
+				info->time_in = jiffies;
+				info->extra_time++;
+				spin_unlock_bh(&pqueue->lock);
+				break;
+			}
+		}
+
+		info->status = 0;
+		pentry->completion_addr = NULL;
+		pentry->busy = false;
+		pentry->post_arg = NULL;
+		atomic64_dec((&pqueue->pending_count));
+		pending_queue_inc_front(pqinfo, qno);
+		spin_unlock_bh(&pqueue->lock);
+
+		do_post_process(info->cptvf, info);
+		/*
+		 * Calling callback after we find
+		 * that the request has been serviced
+		 */
+		pentry->callback(status->s.compcode, pentry->callback_arg);
+	}
+}
+
+int32_t process_request(struct cpt_vf *cptvf, struct cpt_request_info *req)
+{
+	int32_t ret = 0, clear = 0, queue = 0;
+	struct cpt_info_buffer *info = NULL;
+	struct cptvf_request *cpt_req = NULL;
+	union ctrl_info *ctrl = NULL;
+	struct pending_entry *pentry = NULL;
+	struct pending_queue *pqueue = NULL;
+	struct pci_dev *pdev = cptvf->pdev;
+	uint64_t key_handle = 0ULL;
+	uint8_t group = 0;
+	struct cpt_vq_command vq_cmd;
+	union cpt_inst_s cptinst;
+
+	if (unlikely(!cptvf || !req)) {
+		pr_err("Invalid inputs (cptvf: %p, req: %p)\n", cptvf, req);
+		return -EINVAL;
+	}
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL | GFP_ATOMIC);
+	if (unlikely(!info)) {
+		pr_err("Unable to allocate memory for info_buffer\n");
+		return -ENOMEM;
+	}
+
+	cpt_req = (struct cptvf_request *)&req->req;
+	ctrl = (union ctrl_info *)&req->ctrl;
+	key_handle = req->handle;
+
+	info->cptvf = cptvf;
+	info->outcnt = req->outcnt;
+	info->req_type = ctrl->s.req_mode;
+	info->dma_mode = ctrl->s.dma_mode;
+	info->dlen   = cpt_req->dlen;
+	/* Add 8-bytes more for microcode completion code */
+	info->rlen   = ROUNDUP8(req->rlen + COMPLETION_CODE_SIZE);
+
+	group = ctrl->s.grp;
+	ret = setup_sgio_list(cptvf, info, req);
+	if (ret) {
+		pr_err("Setting up SG list failed");
+		goto request_cleanup;
+	}
+
+	cpt_req->dlen = info->dlen;
+	info->opcode = cpt_req->opcode.flags;
+	/*
+	 * Get buffer for union cpt_res_s response
+	 * structure and its physical address
+	 */
+	info->completion_addr = kzalloc(sizeof(union cpt_res_s),
+					     GFP_KERNEL | GFP_ATOMIC);
+	*((uint8_t *)(info->completion_addr)) = COMPLETION_CODE_INIT;
+	info->comp_baddr = dma_map_single(&pdev->dev,
+					       (void *)info->completion_addr,
+					       sizeof(union cpt_res_s),
+					       DMA_BIDIRECTIONAL);
+	if (dma_mapping_error(&pdev->dev, info->comp_baddr)) {
+		pr_err("mapping compptr Failed %lu\n", sizeof(union cpt_res_s));
+		ret = -EFAULT;
+		goto  request_cleanup;
+	}
+
+	/* Fill the VQ command */
+	vq_cmd.cmd.u64 = 0;
+	vq_cmd.cmd.s.opcode = cpu_to_be16(cpt_req->opcode.flags);
+	vq_cmd.cmd.s.param1 = cpu_to_be16(cpt_req->param1);
+	vq_cmd.cmd.s.param2 = cpu_to_be16(cpt_req->param2);
+	vq_cmd.cmd.s.dlen   = cpu_to_be16(cpt_req->dlen);
+
+	/* 64-bit swap for microcode data reads, not needed for addresses*/
+	vq_cmd.cmd.u64 = cpu_to_be64(vq_cmd.cmd.u64);
+	vq_cmd.dptr = info->dptr_baddr;
+	vq_cmd.rptr = info->rptr_baddr;
+	vq_cmd.cptr.u64 = 0;
+	vq_cmd.cptr.s.grp = group;
+	/* Get Pending Entry to submit command */
+	/*queue = SMP_PROCESSOR_ID() % cptvf->nr_queues;*/
+	/* Always queue 0, because 1 queue per VF */
+	queue = 0;
+	info->queue = queue;
+	pqueue = &cptvf->pqinfo.queue[queue];
+
+	if (atomic64_read(&pqueue->pending_count) > PENDING_THOLD) {
+		pr_err("pending threshold reached\n");
+		process_pending_queue(cptvf, &cptvf->pqinfo, queue);
+	}
+
+get_pending_entry:
+	spin_lock_bh(&pqueue->lock);
+	pentry = get_free_pending_entry(pqueue, cptvf->pqinfo.qlen);
+	if (unlikely(!pentry)) {
+		spin_unlock_bh(&pqueue->lock);
+		if (clear == 0) {
+			process_pending_queue(cptvf, &cptvf->pqinfo, queue);
+			clear = 1;
+			goto get_pending_entry;
+		}
+		pr_err("Get free entry failed\n");
+		pr_err("queue: %d, rear: %d, front: %d\n",
+		       queue, pqueue->rear, pqueue->front);
+		ret = -EFAULT;
+		goto request_cleanup;
+	}
+
+	pentry->done = false;
+	pentry->completion_addr = info->completion_addr;
+	pentry->post_arg = (void *)info;
+	pentry->callback = req->callback;
+	pentry->callback_arg = req->callback_arg;
+	info->pentry = pentry;
+	pentry->busy = true;
+	atomic64_inc(&pqueue->pending_count);
+
+	/* Send CPT command */
+	info->pentry = pentry;
+	info->status = ERR_REQ_PENDING;
+	info->time_in = jiffies;
+
+	/* Create the CPT_INST_S type command for HW intrepretation */
+	cptinst.s.doneint = true;
+	cptinst.s.res_addr = (uint64_t)info->comp_baddr;
+	cptinst.s.tag = 0;
+	cptinst.s.grp = 0;
+	cptinst.s.wq_ptr = 0;
+	cptinst.s.ei0 = vq_cmd.cmd.u64;
+	cptinst.s.ei1 = vq_cmd.dptr;
+	cptinst.s.ei2 = vq_cmd.rptr;
+	cptinst.s.ei3 = vq_cmd.cptr.u64;
+
+	ret = send_cpt_command(cptvf, &cptinst, queue);
+	spin_unlock_bh(&pqueue->lock);
+	if (unlikely(ret)) {
+		spin_unlock_bh(&pqueue->lock);
+		pr_err("Send command failed for AE\n");
+		ret = -EFAULT;
+		goto request_cleanup;
+	}
+
+	/* Non-Blocking request */
+	req->request_id = (uint64_t)(info);
+	req->status = -EAGAIN;
+
+	return 0;
+
+request_cleanup:
+	pr_debug("Failed to submit CPT command\n");
+	do_request_cleanup(cptvf, info);
+
+	return ret;
+}
+
+void vq_post_process(struct cpt_vf *cptvf, uint32_t qno)
+{
+	if (unlikely(qno > cptvf->nr_queues)) {
+		pr_err("Request for post processing on invalid pending queue: %u\n",
+		       qno);
+		return;
+	}
+
+	process_pending_queue(cptvf, &cptvf->pqinfo, qno);
+}
+
+int32_t cptvf_do_request(void *vfdev, struct cpt_request_info *req)
+{
+	struct cpt_vf *cptvf = (struct cpt_vf *)vfdev;
+
+	if (!cpt_device_ready(cptvf)) {
+		pr_err("CPT Device is not ready");
+		return -ENODEV;
+	}
+
+	if ((cptvf->vftype == SE_TYPES) && (!req->ctrl.s.se_req)) {
+		pr_err("CPTVF-%d of SE TYPE got AE request", cptvf->vfid);
+		return -EINVAL;
+	} else if ((cptvf->vftype == AE_TYPES) && (req->ctrl.s.se_req)) {
+		pr_err("CPTVF-%d of AE TYPE got SE request", cptvf->vfid);
+		return -EINVAL;
+	}
+
+	cptvf->reqmode = req->ctrl.s.req_mode;
+
+	return process_request(cptvf, req);
+}
diff --git a/drivers/crypto/cavium/cpt/request_manager.h b/drivers/crypto/cavium/cpt/request_manager.h
new file mode 100644
index 0000000..d18d95b
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/request_manager.h
@@ -0,0 +1,221 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __REQUEST_MANGER_H
+#define __REQUEST_MANGER_H
+
+#include "cpt_common.h"
+
+#define TIME_IN_RESET_COUNT  5
+#define COMPLETION_CODE_SIZE 8
+#define COMPLETION_CODE_INIT 0
+
+#if defined(__BIG_ENDIAN_BITFIELD)
+#define COMPLETION_CODE_SHIFT     56
+#else
+#define COMPLETION_CODE_SHIFT      0
+#endif
+
+#define PENDING_THOLD  100
+
+#define MAX_SG_IN_OUT_CNT (25u)
+#define SG_LIST_HDR_SIZE  (8u)
+
+union data_ptr {
+	uint64_t addr64;
+	uint8_t *addr;
+};
+
+struct cpt_buffer {
+	uint8_t type; /**< How to interpret the buffer */
+	uint8_t reserved0;
+	uint16_t size; /**< Sizeof of the data */
+	uint16_t offset;
+	uint16_t reserved1;
+	union data_ptr ptr; /**< Pointer to data */
+};
+
+union ctrl_info {
+	uint32_t flags;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		uint32_t reserved0:24;
+		uint32_t grp:3; /**< Group bits */
+		uint32_t dma_mode:2; /**< DMA mode */
+		uint32_t req_mode:2; /**< Requeset mode BLOCKING/NONBLOCKING*/
+		uint32_t se_req:1;/**< To SE core */
+#else
+		uint32_t se_req:1; /**< To SE core */
+		uint32_t req_mode:2; /**< Requeset mode BLOCKING/NONBLOCKING*/
+		uint32_t dma_mode:2; /**< DMA mode */
+		uint32_t grp:3; /* Group bits */
+		uint32_t reserved0:24;
+#endif
+	} s;
+};
+
+union opcode_info {
+	uint16_t flags;
+	struct {
+		uint8_t major;
+		uint8_t minor;
+	} s;
+};
+
+struct cptvf_request {
+	union opcode_info opcode;
+	uint16_t param1;
+	uint16_t param2;
+	uint16_t dlen;
+};
+
+#define MAX_BUF_CNT	16
+
+struct cpt_request_info {
+	uint8_t incnt; /**< Number of input buffers */
+	uint8_t outcnt; /**< Number of output buffers */
+	uint8_t ctxl; /**< Context length, if 0, then INLINE */
+	uint16_t rlen; /**< Output length */
+	union ctrl_info ctrl; /**< User control information */
+
+	struct cptvf_request req; /**< Request Information (Core specific) */
+
+	uint64_t handle; /**< key/context handle */
+	uint64_t request_id; /**< Request ID */
+
+	struct cpt_buffer in[MAX_BUF_CNT];
+	struct cpt_buffer out[MAX_BUF_CNT];
+
+	void (*callback)(int, void *); /**< Kernel ASYNC request callabck */
+	void *callback_arg; /**< Kernel ASYNC request callabck arg */
+
+	uint32_t status; /**< Request status */
+};
+
+enum {
+	UNIT_8_BIT,
+	UNIT_16_BIT,
+	UNIT_32_BIT,
+	UNIT_64_BIT
+};
+
+struct sglist_component {
+	union {
+		uint64_t len;
+		struct {
+			uint16_t len0;
+			uint16_t len1;
+			uint16_t len2;
+			uint16_t len3;
+		} s;
+	} u;
+	uint64_t ptr0;
+	uint64_t ptr1;
+	uint64_t ptr2;
+	uint64_t ptr3;
+};
+
+struct buf_ptr {
+	uint8_t *vptr;
+	dma_addr_t dma_addr;
+	uint16_t size;
+};
+
+#define MAX_OUTCNT	10
+#define MAX_INCNT	10
+
+struct cpt_info_buffer {
+	struct cpt_vf *cptvf;
+	uint8_t req_type;
+	uint8_t dma_mode;
+
+	uint16_t opcode;
+	uint8_t queue;
+	uint8_t extra_time;
+	uint8_t is_ae;
+
+	uint16_t glist_cnt;
+	uint16_t slist_cnt;
+	uint16_t g_size;
+	uint16_t s_size;
+
+	uint32_t outcnt;
+	uint32_t status;
+
+	unsigned long time_in;
+	uint64_t request_id;
+
+	uint32_t dlen;
+	uint32_t rlen;
+	uint32_t total_in;
+	uint32_t total_out;
+	uint64_t dptr_baddr;
+	uint64_t rptr_baddr;
+	uint64_t comp_baddr;
+	uint8_t *in_buffer;
+	uint8_t *out_buffer;
+	uint8_t *gather_components;
+	uint8_t *scatter_components;
+	uint32_t outsize[MAX_OUTCNT];
+	uint32_t outunit[MAX_OUTCNT];
+	uint8_t *outptr[MAX_OUTCNT];
+
+	struct pending_entry *pentry;
+	volatile uint64_t *completion_addr;
+	volatile uint64_t *alternate_caddr;
+
+	struct buf_ptr glist_ptr[MAX_INCNT];
+	struct buf_ptr slist_ptr[MAX_OUTCNT];
+};
+
+/*
+ * CPT_INST_S software command definitions
+ * Words EI (0-3)
+ */
+union vq_cmd_word0 {
+	uint64_t u64;
+	struct {
+		uint16_t opcode;
+		uint16_t param1;
+		uint16_t param2;
+		uint16_t dlen;
+	} s;
+};
+
+union vq_cmd_word3 {
+	uint64_t u64;
+	struct {
+#if defined(__BIG_ENDIAN_BITFIELD)
+		uint64_t grp	: 3;
+		uint64_t cptr	: 61;
+#else
+		uint64_t cptr	: 61;
+		uint64_t grp	: 3;
+#endif
+	} s;
+};
+
+struct cpt_vq_command {
+	union vq_cmd_word0 cmd;
+	uint64_t dptr;
+	uint64_t rptr;
+	union vq_cmd_word3 cptr;
+};
+
+#if defined(__BIG_ENDIAN_BITFIELD)
+#define set_scatter_chunks(value, scatter_component)	{\
+	(value) |= (((uint64_t)scatter_component) << 25); }
+#else
+#define set_scatter_chunks(value, scatter_component)	{\
+	(value) |= (((uint64_t)scatter_component) << 32); }
+#endif
+
+void vq_post_process(struct cpt_vf *cptvf, uint32_t qno);
+int32_t process_request(struct cpt_vf *cptvf,
+			struct cpt_request_info *kern_req);
+#endif /* __REQUEST_MANGER_H */
-- 
2.1.4

^ permalink raw reply related

* [PATCH 3/3] drivers: crypto: Enable CPT options crypto for build
From: gcherianv @ 2016-11-18 15:00 UTC (permalink / raw)
  To: linux-kernel, linux-crypto; +Cc: davem, herbert, George Cherian
In-Reply-To: <1479481209-11475-1-git-send-email-gcherianv@gmail.com>

From: George Cherian <george.cherian@cavium.com>

Add the CPT options in crypto Kconfig and update the
crypto Makefile

Signed-off-by: George Cherian <george.cherian@cavium.com>
---
 drivers/crypto/Kconfig  | 1 +
 drivers/crypto/Makefile | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 4d2b81f..15f9040 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -484,6 +484,7 @@ config CRYPTO_DEV_MXS_DCP
 	  will be called mxs-dcp.
 
 source "drivers/crypto/qat/Kconfig"
+source "drivers/crypto/cavium/cpt/Kconfig"
 
 config CRYPTO_DEV_QCE
 	tristate "Qualcomm crypto engine accelerator"
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index ad7250f..dd33290 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -32,3 +32,4 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
 obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
 obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
 obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
+obj-$(CONFIG_CRYPTO_DEV_CPT) += cavium/cpt/
-- 
2.1.4

^ permalink raw reply related

* [PATCH] hw_random: Make explicit that max >= 32 always
From: PrasannaKumar Muralidharan @ 2016-11-18 17:30 UTC (permalink / raw)
  To: mpm, herbert, daniel.thompson, linux-crypto; +Cc: PrasannaKumar Muralidharan

As hw_random core calls ->read with max > 32 or more, make it explicit.
Also remove checks involving 'max' being less than 8.

Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
---
 drivers/char/hw_random/msm-rng.c     | 4 ----
 drivers/char/hw_random/pic32-rng.c   | 3 ---
 drivers/char/hw_random/pseries-rng.c | 5 ++---
 include/linux/hw_random.h            | 3 +--
 4 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/char/hw_random/msm-rng.c b/drivers/char/hw_random/msm-rng.c
index 96fb986..841fee8 100644
--- a/drivers/char/hw_random/msm-rng.c
+++ b/drivers/char/hw_random/msm-rng.c
@@ -90,10 +90,6 @@ static int msm_rng_read(struct hwrng *hwrng, void *data, size_t max, bool wait)
 	/* calculate max size bytes to transfer back to caller */
 	maxsize = min_t(size_t, MAX_HW_FIFO_SIZE, max);
 
-	/* no room for word data */
-	if (maxsize < WORD_SZ)
-		return 0;
-
 	ret = clk_prepare_enable(rng->clk);
 	if (ret)
 		return ret;
diff --git a/drivers/char/hw_random/pic32-rng.c b/drivers/char/hw_random/pic32-rng.c
index 11dc9b7..9b5e68a 100644
--- a/drivers/char/hw_random/pic32-rng.c
+++ b/drivers/char/hw_random/pic32-rng.c
@@ -62,9 +62,6 @@ static int pic32_rng_read(struct hwrng *rng, void *buf, size_t max,
 	u32 t;
 	unsigned int timeout = RNG_TIMEOUT;
 
-	if (max < 8)
-		return 0;
-
 	do {
 		t = readl(priv->base + RNGRCNT) & RCNT_MASK;
 		if (t == 64) {
diff --git a/drivers/char/hw_random/pseries-rng.c b/drivers/char/hw_random/pseries-rng.c
index 63ce51d..d9f46b4 100644
--- a/drivers/char/hw_random/pseries-rng.c
+++ b/drivers/char/hw_random/pseries-rng.c
@@ -28,7 +28,6 @@
 static int pseries_rng_read(struct hwrng *rng, void *data, size_t max, bool wait)
 {
 	u64 buffer[PLPAR_HCALL_BUFSIZE];
-	size_t size = max < 8 ? max : 8;
 	int rc;
 
 	rc = plpar_hcall(H_RANDOM, (unsigned long *)buffer);
@@ -36,10 +35,10 @@ static int pseries_rng_read(struct hwrng *rng, void *data, size_t max, bool wait
 		pr_err_ratelimited("H_RANDOM call failed %d\n", rc);
 		return -EIO;
 	}
-	memcpy(data, buffer, size);
+	memcpy(data, buffer, 8);
 
 	/* The hypervisor interface returns 64 bits */
-	return size;
+	return 8;
 }
 
 /**
diff --git a/include/linux/hw_random.h b/include/linux/hw_random.h
index 34a0dc1..bee0827 100644
--- a/include/linux/hw_random.h
+++ b/include/linux/hw_random.h
@@ -30,8 +30,7 @@
  *			Must not be NULL.    *OBSOLETE*
  * @read:		New API. drivers can fill up to max bytes of data
  *			into the buffer. The buffer is aligned for any type
- *			and max is guaranteed to be >= to that alignment
- *			(either 4 or 8 depending on architecture).
+ *			and max is a multiple of 4 and >= 32 bytes.
  * @priv:		Private data, for use by the RNG driver.
  * @quality:		Estimation of true entropy in RNG's bitstream
  *			(per mill).
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH net-next] cxgb4: Allocate Tx queues dynamically
From: David Miller @ 2016-11-18 19:04 UTC (permalink / raw)
  To: atul.gupta-ut6Up61K2wZBDgjK7y7TUQ
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-scsi-u79uwXL29TY76Z2rM5mHXA,
	target-devel-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-crypto-u79uwXL29TY76Z2rM5mHXA, nab-IzHhD5pYlfBP7FQvKIMDCQ,
	jejb-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA,
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q,
	leedom-ut6Up61K2wZBDgjK7y7TUQ, nirranjan-ut6Up61K2wZBDgjK7y7TUQ,
	varun-ut6Up61K2wZBDgjK7y7TUQ,
	swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
	hariprasad-ut6Up61K2wZBDgjK7y7TUQ
In-Reply-To: <1479467260-6509-1-git-send-email-atul.gupta-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>

From: Atul Gupta <atul.gupta-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
Date: Fri, 18 Nov 2016 16:37:40 +0530

> From: Hariprasad Shenai <hariprasad-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
> 
> Allocate resources dynamically for Upper layer driver's (ULD) like
> cxgbit, iw_cxgb4, cxgb4i and chcr. The resources allocated include Tx
> queues which are allocated when ULD register with cxgb4 driver and freed
> while un-registering. The Tx queues which are shared by ULD shall be
> allocated by first registering driver and un-allocated by last
> unregistering driver.
> 
> Signed-off-by: Atul Gupta <atul.gupta-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/3] drivers: crypto: Add Support for Octeon-tx CPT Engine
From: David Daney @ 2016-11-18 18:55 UTC (permalink / raw)
  To: gcherianv; +Cc: linux-kernel, linux-crypto, davem, herbert, George Cherian
In-Reply-To: <1479481209-11475-2-git-send-email-gcherianv@gmail.com>

On 11/18/2016 07:00 AM, gcherianv@gmail.com wrote:
> From: George Cherian <george.cherian@cavium.com>
>
> Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
> found in Octeon-tx series of SoC's. CPT is the Cryptographic Acceleration
> Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and
> asymmetric engines (AEs).
>
> Signed-off-by: George Cherian <george.cherian@cavium.com>


How was this tested?

> ---
>   drivers/crypto/cavium/cpt/Kconfig        |  22 +
>   drivers/crypto/cavium/cpt/Makefile       |   2 +
>   drivers/crypto/cavium/cpt/cpt.h          |  90 +++
>   drivers/crypto/cavium/cpt/cpt_common.h   | 377 +++++++++++++
>   drivers/crypto/cavium/cpt/cpt_hw_types.h | 940 +++++++++++++++++++++++++++++++
>   drivers/crypto/cavium/cpt/cpt_main.c     | 891 +++++++++++++++++++++++++++++
>   drivers/crypto/cavium/cpt/cpt_pf_mbox.c  | 174 ++++++
>   7 files changed, 2496 insertions(+)
>   create mode 100644 drivers/crypto/cavium/cpt/Kconfig
>   create mode 100644 drivers/crypto/cavium/cpt/Makefile
>   create mode 100644 drivers/crypto/cavium/cpt/cpt.h
>   create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h
>   create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h
>   create mode 100644 drivers/crypto/cavium/cpt/cpt_main.c
>   create mode 100644 drivers/crypto/cavium/cpt/cpt_pf_mbox.c
>
> diff --git a/drivers/crypto/cavium/cpt/Kconfig b/drivers/crypto/cavium/cpt/Kconfig
> new file mode 100644
> index 0000000..8fe3f44
> --- /dev/null
> +++ b/drivers/crypto/cavium/cpt/Kconfig
> @@ -0,0 +1,22 @@
> +#
> +# Cavium crypto device configuration
> +#
> +
> +config CRYPTO_DEV_CPT
> +	tristate
> +	select HW_RANDOM_OCTEON

This makes no sense.  HW_RANDOM_OCTEON is for a mips64 based SOC and 
isn't present on devices that have this crypto block.  Why select this?


> +	select CRYPTO_AES
> +	select CRYPTO_DES
> +	select CRYPTO_BLKCIPHER
> +	select FW_LOADER
> +
> +config OCTEONTX_CPT_PF
> +	tristate "Octeon-tx CPT Physical function driver"
> +	depends on ARCH_THUNDER
> +	select CRYPTO_DEV_CPT
> +	help
> +	  Support for Cavium CPT block found in octeon-tx series of
> +	  processors.
> +
> +	  To compile this as a module, choose M here: the module will be
> +	  called cptpf.
> diff --git a/drivers/crypto/cavium/cpt/Makefile b/drivers/crypto/cavium/cpt/Makefile
> new file mode 100644
> index 0000000..bf758e2
> --- /dev/null
> +++ b/drivers/crypto/cavium/cpt/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_OCTEONTX_CPT_PF) += cptpf.o
> +cptpf-objs := cpt_main.o cpt_pf_mbox.o
> diff --git a/drivers/crypto/cavium/cpt/cpt.h b/drivers/crypto/cavium/cpt/cpt.h
> new file mode 100644
> index 0000000..63d12da
> --- /dev/null
> +++ b/drivers/crypto/cavium/cpt/cpt.h
> @@ -0,0 +1,90 @@
> +/*
> + * Copyright (C) 2016 Cavium, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of version 2 of the GNU General Public License
> + * as published by the Free Software Foundation.
> + */
> +
> +#ifndef __CPT_H
> +#define __CPT_H
> +
> +#include "cpt_common.h"
> +
> +#define BASE_PROC_DIR	"cavium"
> +
> +#define PF  0
> +#define VF  1
> +
> +struct cpt_device;
> +
> +struct microcode {
> +	uint8_t  is_mc_valid;

s/uint8_t/u8/  ??

That could be done everywhere.

[...]
> diff --git a/drivers/crypto/cavium/cpt/cpt_common.h b/drivers/crypto/cavium/cpt/cpt_common.h
> new file mode 100644
> index 0000000..351ed4a
> --- /dev/null
> +++ b/drivers/crypto/cavium/cpt/cpt_common.h
> @@ -0,0 +1,377 @@
> +/*
> + * Copyright (C) 2016 Cavium, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of version 2 of the GNU General Public License
> + * as published by the Free Software Foundation.
> + */
> +
> +#ifndef __CPT_COMMON_H
> +#define __CPT_COMMON_H
> +
> +#include <asm/byteorder.h>
> +#include <linux/uaccess.h>
> +#include <linux/types.h>
> +#include <linux/spinlock.h>
> +#include <linux/pci.h>
> +#include <linux/cpumask.h>
> +#include <linux/string.h>
> +#include <linux/pci_regs.h>
> +#include <linux/delay.h>
> +#include <linux/printk.h>
> +#include <linux/sched.h>
> +#include <linux/completion.h>
> +#include <asm/arch_timer.h>
> +#include <linux/types.h>
> +
> +#include "cpt_hw_types.h"
> +
> +/* configuration space offsets */
> +#ifndef PCI_VENDOR_ID
> +#define PCI_VENDOR_ID 0x00 /* 16 bits */
> +#endif
> +#ifndef PCI_DEVICE_ID
> +#define PCI_DEVICE_ID 0x02 /* 16 bits */
> +#endif
> +#ifndef PCI_REVISION_ID
> +#define PCI_REVISION_ID 0x08 /* Revision ID */
> +#endif
> +#ifndef PCI_CAPABILITY_LIST
> +#define PCI_CAPABILITY_LIST 0x34 /* first capability list entry */
> +#endif
> +

Standard PCI core functions give you access to all that information, use 
pdev->device, pdev->revision, etc. instead of reinventing the wheel here 
with all these #defines.


> +/* Device ID */
> +#define PCI_VENDOR_ID_CAVIUM 0x177d

This is defined in pci_ids.h, use value from there instead of placing a 
duplicate definition here.

> +#define CPT_81XX_PCI_PF_DEVICE_ID 0xa040
> +#define CPT_81XX_PCI_VF_DEVICE_ID 0xa041
> +
> +#define PASS_1_0 0x0
> +
> +/* CPT Models ((Device ID<<16)|Revision ID) */
> +/* CPT models */
> +#define CPT_81XX_PASS1_0 ((CPT_81XX_PCI_PF_DEVICE_ID << 8) | PASS_1_0)
> +#define CPTVF_81XX_PASS1_0 ((CPT_81XX_PCI_VF_DEVICE_ID << 8) | PASS_1_0)
> +
> +#define PF 0
> +#define VF 1
> +
> +#define DEFAULT_DEVICE_QUEUES CPT_NUM_QS_PER_VF
> +
> +#define SUCCESS	(0)
> +#define FAIL	(1)
> +
> +#ifndef ROUNDUP4
> +#define ROUNDUP4(val) (((val) + 3) & 0xfffffffc)
> +#endif
> +
> +#ifndef ROUNDUP8
> +#define ROUNDUP8(val) (((val) + 7) & 0xfffffff8)
> +#endif
> +
> +#ifndef ROUNDUP16
> +#define ROUNDUP16(val) (((val) + 15) & 0xfffffff0)
> +#endif
> +

kernel.h has round_up(), use that instead of defining all these.

> +#define ERR_ADDR_LEN 8
> +

What is that for?  It looks unused.

[...]
> +/*###### PCIE EP-Mode Configuration Registers #########*/
> +#define PCIEEP0_CFG000 (0x0)
> +#define PCIEEP0_CFG002 (0x8)
> +#define PCIEEP0_CFG011 (0x2C)
> +#define PCIEEP0_CFG020 (0x50)
> +#define PCIEEP0_CFG025 (0x64)
> +#define PCIEEP0_CFG030 (0x78)
> +#define PCIEEP0_CFG044 (0xB0)
> +#define PCIEEP0_CFG045 (0xB4)
> +#define PCIEEP0_CFG082 (0x148)
> +#define PCIEEP0_CFG095 (0x17C)
> +#define PCIEEP0_CFG096 (0x180)
> +#define PCIEEP0_CFG097 (0x184)
> +#define PCIEEP0_CFG103 (0x19C)
> +#define PCIEEP0_CFG460 (0x730)
> +#define PCIEEP0_CFG461 (0x734)
> +#define PCIEEP0_CFG462 (0x738)
> +
> +/*#######  PCIe EP-Mode SR-IOV Configuration Registers  #####*/
> +#define PCIEEPVF0_CFG000 (0x0)
> +#define PCIEEPVF0_CFG002 (0x8)
> +#define PCIEEPVF0_CFG011 (0x2C)
> +#define PCIEEPVF0_CFG030 (0x78)
> +#define PCIEEPVF0_CFG044 (0xB0)
> +

Where are all those defines used?  What are they for?


That's all I can look at for now.

David.

^ permalink raw reply

* Re: [PATCH 1/3] drivers: crypto: Add Support for Octeon-tx CPT Engine
From: George Cherian @ 2016-11-18 19:31 UTC (permalink / raw)
  To: David Daney, gcherianv
  Cc: linux-kernel, linux-crypto, davem, herbert, George Cherian
In-Reply-To: <582F4EA7.9030303@caviumnetworks.com>

Hi David,

Thanks for the review.
On Saturday 19 November 2016 12:25 AM, David Daney wrote:
> On 11/18/2016 07:00 AM, gcherianv@gmail.com wrote:
>> From: George Cherian <george.cherian@cavium.com>
>>
>> Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
>> found in Octeon-tx series of SoC's. CPT is the Cryptographic 
>> Acceleration
>> Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and
>> asymmetric engines (AEs).
>>
>> Signed-off-by: George Cherian <george.cherian@cavium.com>
>
>
> How was this tested?
Using ecryptfs and dm-crypt.
>
>
>> ---
>>   drivers/crypto/cavium/cpt/Kconfig        |  22 +
>>   drivers/crypto/cavium/cpt/Makefile       |   2 +
>>   drivers/crypto/cavium/cpt/cpt.h          |  90 +++
>>   drivers/crypto/cavium/cpt/cpt_common.h   | 377 +++++++++++++
>>   drivers/crypto/cavium/cpt/cpt_hw_types.h | 940 
>> +++++++++++++++++++++++++++++++
>>   drivers/crypto/cavium/cpt/cpt_main.c     | 891 
>> +++++++++++++++++++++++++++++
>>   drivers/crypto/cavium/cpt/cpt_pf_mbox.c  | 174 ++++++
>>   7 files changed, 2496 insertions(+)
>>   create mode 100644 drivers/crypto/cavium/cpt/Kconfig
>>   create mode 100644 drivers/crypto/cavium/cpt/Makefile
>>   create mode 100644 drivers/crypto/cavium/cpt/cpt.h
>>   create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h
>>   create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h
>>   create mode 100644 drivers/crypto/cavium/cpt/cpt_main.c
>>   create mode 100644 drivers/crypto/cavium/cpt/cpt_pf_mbox.c
>>
>> diff --git a/drivers/crypto/cavium/cpt/Kconfig 
>> b/drivers/crypto/cavium/cpt/Kconfig
>> new file mode 100644
>> index 0000000..8fe3f44
>> --- /dev/null
>> +++ b/drivers/crypto/cavium/cpt/Kconfig
>> @@ -0,0 +1,22 @@
>> +#
>> +# Cavium crypto device configuration
>> +#
>> +
>> +config CRYPTO_DEV_CPT
>> +    tristate
>> +    select HW_RANDOM_OCTEON
>
> This makes no sense.  HW_RANDOM_OCTEON is for a mips64 based SOC and 
> isn't present on devices that have this crypto block.  Why select this?
>
Yeah true... I actually wanted to this one instead 
|CONFIG_HW_RANDOM_CAVIUM|
>
>> +    select CRYPTO_AES
>> +    select CRYPTO_DES
>> +    select CRYPTO_BLKCIPHER
>> +    select FW_LOADER
>> +
>> +config OCTEONTX_CPT_PF
>> +    tristate "Octeon-tx CPT Physical function driver"
>> +    depends on ARCH_THUNDER
>> +    select CRYPTO_DEV_CPT
>> +    help
>> +      Support for Cavium CPT block found in octeon-tx series of
>> +      processors.
>> +
>> +      To compile this as a module, choose M here: the module will be
>> +      called cptpf.
>> diff --git a/drivers/crypto/cavium/cpt/Makefile 
>> b/drivers/crypto/cavium/cpt/Makefile
>> new file mode 100644
>> index 0000000..bf758e2
>> --- /dev/null
>> +++ b/drivers/crypto/cavium/cpt/Makefile
>> @@ -0,0 +1,2 @@
>> +obj-$(CONFIG_OCTEONTX_CPT_PF) += cptpf.o
>> +cptpf-objs := cpt_main.o cpt_pf_mbox.o
>> diff --git a/drivers/crypto/cavium/cpt/cpt.h 
>> b/drivers/crypto/cavium/cpt/cpt.h
>> new file mode 100644
>> index 0000000..63d12da
>> --- /dev/null
>> +++ b/drivers/crypto/cavium/cpt/cpt.h
>> @@ -0,0 +1,90 @@
>> +/*
>> + * Copyright (C) 2016 Cavium, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or 
>> modify it
>> + * under the terms of version 2 of the GNU General Public License
>> + * as published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __CPT_H
>> +#define __CPT_H
>> +
>> +#include "cpt_common.h"
>> +
>> +#define BASE_PROC_DIR    "cavium"
>> +
>> +#define PF  0
>> +#define VF  1
>> +
>> +struct cpt_device;
>> +
>> +struct microcode {
>> +    uint8_t  is_mc_valid;
>
> s/uint8_t/u8/  ??
>
> That could be done everywhere.
will do
>
> [...]
>> diff --git a/drivers/crypto/cavium/cpt/cpt_common.h 
>> b/drivers/crypto/cavium/cpt/cpt_common.h
>> new file mode 100644
>> index 0000000..351ed4a
>> --- /dev/null
>> +++ b/drivers/crypto/cavium/cpt/cpt_common.h
>> @@ -0,0 +1,377 @@
>> +/*
>> + * Copyright (C) 2016 Cavium, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or 
>> modify it
>> + * under the terms of version 2 of the GNU General Public License
>> + * as published by the Free Software Foundation.
>> + */
>> +
>> +#ifndef __CPT_COMMON_H
>> +#define __CPT_COMMON_H
>> +
>> +#include <asm/byteorder.h>
>> +#include <linux/uaccess.h>
>> +#include <linux/types.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/pci.h>
>> +#include <linux/cpumask.h>
>> +#include <linux/string.h>
>> +#include <linux/pci_regs.h>
>> +#include <linux/delay.h>
>> +#include <linux/printk.h>
>> +#include <linux/sched.h>
>> +#include <linux/completion.h>
>> +#include <asm/arch_timer.h>
>> +#include <linux/types.h>
>> +
>> +#include "cpt_hw_types.h"
>> +
>> +/* configuration space offsets */
>> +#ifndef PCI_VENDOR_ID
>> +#define PCI_VENDOR_ID 0x00 /* 16 bits */
>> +#endif
>> +#ifndef PCI_DEVICE_ID
>> +#define PCI_DEVICE_ID 0x02 /* 16 bits */
>> +#endif
>> +#ifndef PCI_REVISION_ID
>> +#define PCI_REVISION_ID 0x08 /* Revision ID */
>> +#endif
>> +#ifndef PCI_CAPABILITY_LIST
>> +#define PCI_CAPABILITY_LIST 0x34 /* first capability list entry */
>> +#endif
>> +
>
> Standard PCI core functions give you access to all that information, 
> use pdev->device, pdev->revision, etc. instead of reinventing the 
> wheel here with all these #defines.
>
>
>> +/* Device ID */
>> +#define PCI_VENDOR_ID_CAVIUM 0x177d
>
> This is defined in pci_ids.h, use value from there instead of placing 
> a duplicate definition here.
>
okay will remove them
>> +#define CPT_81XX_PCI_PF_DEVICE_ID 0xa040
>> +#define CPT_81XX_PCI_VF_DEVICE_ID 0xa041
>> +
>> +#define PASS_1_0 0x0
>> +
>> +/* CPT Models ((Device ID<<16)|Revision ID) */
>> +/* CPT models */
>> +#define CPT_81XX_PASS1_0 ((CPT_81XX_PCI_PF_DEVICE_ID << 8) | PASS_1_0)
>> +#define CPTVF_81XX_PASS1_0 ((CPT_81XX_PCI_VF_DEVICE_ID << 8) | 
>> PASS_1_0)
>> +
>> +#define PF 0
>> +#define VF 1
>> +
>> +#define DEFAULT_DEVICE_QUEUES CPT_NUM_QS_PER_VF
>> +
>> +#define SUCCESS    (0)
>> +#define FAIL    (1)
>> +
>> +#ifndef ROUNDUP4
>> +#define ROUNDUP4(val) (((val) + 3) & 0xfffffffc)
>> +#endif
>> +
>> +#ifndef ROUNDUP8
>> +#define ROUNDUP8(val) (((val) + 7) & 0xfffffff8)
>> +#endif
>> +
>> +#ifndef ROUNDUP16
>> +#define ROUNDUP16(val) (((val) + 15) & 0xfffffff0)
>> +#endif
>> +
>
> kernel.h has round_up(), use that instead of defining all these.
>
>> +#define ERR_ADDR_LEN 8
>> +
>
> What is that for?  It looks unused.
>
> [...]
>> +/*###### PCIE EP-Mode Configuration Registers #########*/
>> +#define PCIEEP0_CFG000 (0x0)
>> +#define PCIEEP0_CFG002 (0x8)
>> +#define PCIEEP0_CFG011 (0x2C)
>> +#define PCIEEP0_CFG020 (0x50)
>> +#define PCIEEP0_CFG025 (0x64)
>> +#define PCIEEP0_CFG030 (0x78)
>> +#define PCIEEP0_CFG044 (0xB0)
>> +#define PCIEEP0_CFG045 (0xB4)
>> +#define PCIEEP0_CFG082 (0x148)
>> +#define PCIEEP0_CFG095 (0x17C)
>> +#define PCIEEP0_CFG096 (0x180)
>> +#define PCIEEP0_CFG097 (0x184)
>> +#define PCIEEP0_CFG103 (0x19C)
>> +#define PCIEEP0_CFG460 (0x730)
>> +#define PCIEEP0_CFG461 (0x734)
>> +#define PCIEEP0_CFG462 (0x738)
>> +
>> +/*#######  PCIe EP-Mode SR-IOV Configuration Registers  #####*/
>> +#define PCIEEPVF0_CFG000 (0x0)
>> +#define PCIEEPVF0_CFG002 (0x8)
>> +#define PCIEEPVF0_CFG011 (0x2C)
>> +#define PCIEEPVF0_CFG030 (0x78)
>> +#define PCIEEPVF0_CFG044 (0xB0)
>> +
>
> Where are all those defines used?  What are they for?
>
>
> That's all I can look at for now.
>
I will address your comments in next version.
> David.
>

^ permalink raw reply

* Re: [PATCH 3/3] drivers: crypto: Enable CPT options crypto for build
From: kbuild test robot @ 2016-11-18 20:44 UTC (permalink / raw)
  To: gcherianv
  Cc: kbuild-all, linux-kernel, linux-crypto, davem, herbert,
	George Cherian
In-Reply-To: <1479481209-11475-4-git-send-email-gcherianv@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 16490 bytes --]

Hi George,

[auto build test ERROR on cryptodev/master]
[also build test ERROR on v4.9-rc5 next-20161117]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/gcherianv-gmail-com/Add-Support-for-Cavium-Cryptographic-Accelerarion-Unit/20161119-005337
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm64 

All error/warnings (new ones prefixed by >>):

warning: (CRYPTO_DEV_CPT) selects HW_RANDOM_OCTEON which has unmet direct dependencies (HW_RANDOM && CAVIUM_OCTEON_SOC)
   In file included from drivers/crypto/cavium/cpt/cpt_common.h:27:0,
                    from drivers/crypto/cavium/cpt/cpt.h:12,
                    from drivers/crypto/cavium/cpt/cpt_main.c:19:
>> drivers/crypto/cavium/cpt/cpt_hw_types.h:439:2: warning: no semicolon at end of struct or union
     } s;
     ^
>> drivers/crypto/cavium/cpt/cpt_hw_types.h:608:3: error: expected ',', ';' or '}' before 'uint64_t'
      uint64_t reserved_0_5:6;
      ^~~~~~~~
   drivers/crypto/cavium/cpt/cpt_main.c:236:13: warning: 'cpt_enable_all_interrupts' defined but not used [-Wunused-function]
    static void cpt_enable_all_interrupts(struct cpt_device *cpt)
                ^~~~~~~~~~~~~~~~~~~~~~~~~
--
   In file included from drivers/crypto/cavium/cpt/cpt_common.h:27:0,
                    from drivers/crypto/cavium/cpt/cpt.h:12,
                    from drivers/crypto/cavium/cpt/cpt_pf_mbox.c:11:
>> drivers/crypto/cavium/cpt/cpt_hw_types.h:439:2: warning: no semicolon at end of struct or union
     } s;
     ^
>> drivers/crypto/cavium/cpt/cpt_hw_types.h:608:3: error: expected ',', ';' or '}' before 'uint64_t'
      uint64_t reserved_0_5:6;
      ^~~~~~~~
--
>> drivers/char/hw_random/octeon-rng.c:19:31: fatal error: asm/octeon/octeon.h: No such file or directory
    #include <asm/octeon/octeon.h>
                                  ^
   compilation terminated.

vim +608 drivers/crypto/cavium/cpt/cpt_hw_types.h

fcb2dbd1 George Cherian 2016-11-18  433  		uint64_t reserved_48_63:16;
fcb2dbd1 George Cherian 2016-11-18  434  		uint64_t bstatus:48
fcb2dbd1 George Cherian 2016-11-18  435  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  436  		uint64_t bstatus:48;
fcb2dbd1 George Cherian 2016-11-18  437  		uint64_t reserved_48_63:16;
fcb2dbd1 George Cherian 2016-11-18  438  #endif /* Word 0 - End */
fcb2dbd1 George Cherian 2016-11-18 @439  	} s;
fcb2dbd1 George Cherian 2016-11-18  440  	struct cptx_pf_exe_bist_status_cn81xx {
fcb2dbd1 George Cherian 2016-11-18  441  #if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
fcb2dbd1 George Cherian 2016-11-18  442  		uint64_t reserved_16_63:48;
fcb2dbd1 George Cherian 2016-11-18  443  		uint64_t bstatus:16;
fcb2dbd1 George Cherian 2016-11-18  444  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  445  		uint64_t bstatus:16;
fcb2dbd1 George Cherian 2016-11-18  446  		uint64_t reserved_16_63:48;
fcb2dbd1 George Cherian 2016-11-18  447  #endif /* Word 0 - End */
fcb2dbd1 George Cherian 2016-11-18  448  	} cn81xx;
fcb2dbd1 George Cherian 2016-11-18  449  };
fcb2dbd1 George Cherian 2016-11-18  450  
fcb2dbd1 George Cherian 2016-11-18  451  /**
fcb2dbd1 George Cherian 2016-11-18  452   * Register (NCB) cpt#_pf_exe_ctl
fcb2dbd1 George Cherian 2016-11-18  453   *
fcb2dbd1 George Cherian 2016-11-18  454   * CPT PF Engine Control Register
fcb2dbd1 George Cherian 2016-11-18  455   * This register enables the engines.
fcb2dbd1 George Cherian 2016-11-18  456   * cptx_pf_exe_ctl_s
fcb2dbd1 George Cherian 2016-11-18  457   * Word0
fcb2dbd1 George Cherian 2016-11-18  458   *  enable:64 [63:0](R/W) Individual enables for each of the engines.
fcb2dbd1 George Cherian 2016-11-18  459   */
fcb2dbd1 George Cherian 2016-11-18  460  union cptx_pf_exe_ctl {
fcb2dbd1 George Cherian 2016-11-18  461  	uint64_t u;
fcb2dbd1 George Cherian 2016-11-18  462  	struct cptx_pf_exe_ctl_s {
fcb2dbd1 George Cherian 2016-11-18  463  #if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
fcb2dbd1 George Cherian 2016-11-18  464  		uint64_t enable:64;
fcb2dbd1 George Cherian 2016-11-18  465  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  466  		uint64_t enable:64;
fcb2dbd1 George Cherian 2016-11-18  467  #endif /* Word 0 - End */
fcb2dbd1 George Cherian 2016-11-18  468  	} s;
fcb2dbd1 George Cherian 2016-11-18  469  };
fcb2dbd1 George Cherian 2016-11-18  470  
fcb2dbd1 George Cherian 2016-11-18  471  /**
fcb2dbd1 George Cherian 2016-11-18  472   * Register (NCB) cpt#_pf_q#_ctl
fcb2dbd1 George Cherian 2016-11-18  473   *
fcb2dbd1 George Cherian 2016-11-18  474   * CPT Queue Control Register
fcb2dbd1 George Cherian 2016-11-18  475   * This register configures queues. This register should be changed only
fcb2dbd1 George Cherian 2016-11-18  476   * when quiescent (see CPT()_VQ()_INPROG[INFLIGHT]).
fcb2dbd1 George Cherian 2016-11-18  477   * cptx_pf_qx_ctl_s
fcb2dbd1 George Cherian 2016-11-18  478   * Word0
fcb2dbd1 George Cherian 2016-11-18  479   *  reserved_60_63:4 [63:60] reserved.
fcb2dbd1 George Cherian 2016-11-18  480   *  aura:12; [59:48](R/W) Guest-aura for returning this queue's
fcb2dbd1 George Cherian 2016-11-18  481   *	instruction-chunk buffers to FPA. Only used when [INST_FREE] is set.
fcb2dbd1 George Cherian 2016-11-18  482   *	For the FPA to not discard the request, FPA_PF_MAP() must map
fcb2dbd1 George Cherian 2016-11-18  483   *	[AURA] and CPT()_PF_Q()_GMCTL[GMID] as valid.
fcb2dbd1 George Cherian 2016-11-18  484   *  reserved_45_47:3 [47:45] reserved.
fcb2dbd1 George Cherian 2016-11-18  485   *  size:13 [44:32](R/W) Command-buffer size, in number of 64-bit words per
fcb2dbd1 George Cherian 2016-11-18  486   *	command buffer segment. Must be 8*n + 1, where n is the number of
fcb2dbd1 George Cherian 2016-11-18  487   *	instructions per buffer segment.
fcb2dbd1 George Cherian 2016-11-18  488   *  reserved_11_31:21 [31:11] Reserved.
fcb2dbd1 George Cherian 2016-11-18  489   *  cont_err:1 [10:10](R/W) Continue on error.
fcb2dbd1 George Cherian 2016-11-18  490   *	0 = When CPT()_VQ()_MISC_INT[NWRP], CPT()_VQ()_MISC_INT[IRDE] or
fcb2dbd1 George Cherian 2016-11-18  491   *	CPT()_VQ()_MISC_INT[DOVF] are set by hardware or software via
fcb2dbd1 George Cherian 2016-11-18  492   *	CPT()_VQ()_MISC_INT_W1S, then CPT()_VQ()_CTL[ENA] is cleared.  Due to
fcb2dbd1 George Cherian 2016-11-18  493   *	pipelining, additional instructions may have been processed between the
fcb2dbd1 George Cherian 2016-11-18  494   *	instruction causing the error and the next instruction in the disabled
fcb2dbd1 George Cherian 2016-11-18  495   *	queue (the instruction at CPT()_VQ()_SADDR).
fcb2dbd1 George Cherian 2016-11-18  496   *	1 = Ignore errors and continue processing instructions.
fcb2dbd1 George Cherian 2016-11-18  497   *	For diagnostic use only.
fcb2dbd1 George Cherian 2016-11-18  498   *  inst_free:1 [9:9](R/W) Instruction FPA free. When set, when CPT reaches the
fcb2dbd1 George Cherian 2016-11-18  499   *	end of an instruction chunk, that chunk will be freed to the FPA.
fcb2dbd1 George Cherian 2016-11-18  500   *  inst_be:1 [8:8](R/W) Instruction big-endian control. When set, instructions,
fcb2dbd1 George Cherian 2016-11-18  501   *	instruction next chunk pointers, and result structures are stored in
fcb2dbd1 George Cherian 2016-11-18  502   *	big-endian format in memory.
fcb2dbd1 George Cherian 2016-11-18  503   *  iqb_ldwb:1 [7:7](R/W) Instruction load don't write back.
fcb2dbd1 George Cherian 2016-11-18  504   *	0 = The hardware issues NCB transient load (LDT) towards the cache,
fcb2dbd1 George Cherian 2016-11-18  505   *	which if the line hits and is is dirty will cause the line to be
fcb2dbd1 George Cherian 2016-11-18  506   *	written back before being replaced.
fcb2dbd1 George Cherian 2016-11-18  507   *	1 = The hardware issues NCB LDWB read-and-invalidate command towards
fcb2dbd1 George Cherian 2016-11-18  508   *	the cache when fetching the last word of instructions; as a result the
fcb2dbd1 George Cherian 2016-11-18  509   *	line will not be written back when replaced.  This improves
fcb2dbd1 George Cherian 2016-11-18  510   *	performance, but software must not read the instructions after they are
fcb2dbd1 George Cherian 2016-11-18  511   *	posted to the hardware.	Reads that do not consume the last word of a
fcb2dbd1 George Cherian 2016-11-18  512   *	cache line always use LDI.
fcb2dbd1 George Cherian 2016-11-18  513   *  reserved_4_6:3 [6:4] Reserved.
fcb2dbd1 George Cherian 2016-11-18  514   *  grp:3; [3:1](R/W) Engine group.
fcb2dbd1 George Cherian 2016-11-18  515   *  pri:1; [0:0](R/W) Queue priority.
fcb2dbd1 George Cherian 2016-11-18  516   *	1 = This queue has higher priority. Round-robin between higher
fcb2dbd1 George Cherian 2016-11-18  517   *	priority queues.
fcb2dbd1 George Cherian 2016-11-18  518   *	0 = This queue has lower priority. Round-robin between lower
fcb2dbd1 George Cherian 2016-11-18  519   *	priority queues.
fcb2dbd1 George Cherian 2016-11-18  520   */
fcb2dbd1 George Cherian 2016-11-18  521  union cptx_pf_qx_ctl {
fcb2dbd1 George Cherian 2016-11-18  522  	uint64_t u;
fcb2dbd1 George Cherian 2016-11-18  523  	struct cptx_pf_qx_ctl_s {
fcb2dbd1 George Cherian 2016-11-18  524  #if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
fcb2dbd1 George Cherian 2016-11-18  525  		uint64_t reserved_60_63:4;
fcb2dbd1 George Cherian 2016-11-18  526  		uint64_t aura:12;
fcb2dbd1 George Cherian 2016-11-18  527  		uint64_t reserved_45_47:3;
fcb2dbd1 George Cherian 2016-11-18  528  		uint64_t size:13;
fcb2dbd1 George Cherian 2016-11-18  529  		uint64_t reserved_11_31:21;
fcb2dbd1 George Cherian 2016-11-18  530  		uint64_t cont_err:1;
fcb2dbd1 George Cherian 2016-11-18  531  		uint64_t inst_free:1;
fcb2dbd1 George Cherian 2016-11-18  532  		uint64_t inst_be:1;
fcb2dbd1 George Cherian 2016-11-18  533  		uint64_t iqb_ldwb:1;
fcb2dbd1 George Cherian 2016-11-18  534  		uint64_t reserved_4_6:3;
fcb2dbd1 George Cherian 2016-11-18  535  		uint64_t grp:3;
fcb2dbd1 George Cherian 2016-11-18  536  		uint64_t pri:1;
fcb2dbd1 George Cherian 2016-11-18  537  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  538  		uint64_t pri:1;
fcb2dbd1 George Cherian 2016-11-18  539  		uint64_t grp:3;
fcb2dbd1 George Cherian 2016-11-18  540  		uint64_t reserved_4_6:3;
fcb2dbd1 George Cherian 2016-11-18  541  		uint64_t iqb_ldwb:1;
fcb2dbd1 George Cherian 2016-11-18  542  		uint64_t inst_be:1;
fcb2dbd1 George Cherian 2016-11-18  543  		uint64_t inst_free:1;
fcb2dbd1 George Cherian 2016-11-18  544  		uint64_t cont_err:1;
fcb2dbd1 George Cherian 2016-11-18  545  		uint64_t reserved_11_31:21;
fcb2dbd1 George Cherian 2016-11-18  546  		uint64_t size:13;
fcb2dbd1 George Cherian 2016-11-18  547  		uint64_t reserved_45_47:3;
fcb2dbd1 George Cherian 2016-11-18  548  		uint64_t aura:12;
fcb2dbd1 George Cherian 2016-11-18  549  		uint64_t reserved_60_63:4;
fcb2dbd1 George Cherian 2016-11-18  550  #endif /* Word 0 - End */
fcb2dbd1 George Cherian 2016-11-18  551  	} s;
fcb2dbd1 George Cherian 2016-11-18  552      /* struct cptx_pf_qx_ctl_s cn; */
fcb2dbd1 George Cherian 2016-11-18  553  };
fcb2dbd1 George Cherian 2016-11-18  554  
fcb2dbd1 George Cherian 2016-11-18  555  /**
fcb2dbd1 George Cherian 2016-11-18  556   * Register (NCB) cpt#_pf_g#_en
fcb2dbd1 George Cherian 2016-11-18  557   *
fcb2dbd1 George Cherian 2016-11-18  558   * CPT PF Group Control Register
fcb2dbd1 George Cherian 2016-11-18  559   * This register configures engine groups.
fcb2dbd1 George Cherian 2016-11-18  560   * cptx_pf_gx_en_s
fcb2dbd1 George Cherian 2016-11-18  561   * Word0
fcb2dbd1 George Cherian 2016-11-18  562   *  en: 64; [63:0](R/W/H) Engine group enable. One bit corresponds to each
fcb2dbd1 George Cherian 2016-11-18  563   *	engine, with the bit set to indicate this engine can service this group.
fcb2dbd1 George Cherian 2016-11-18  564   *	Bits corresponding to unimplemented engines read as zero, i.e. only bit
fcb2dbd1 George Cherian 2016-11-18  565   *	numbers	less than CPT()_PF_CONSTANTS[AE] + CPT()_PF_CONSTANTS[SE] are
fcb2dbd1 George Cherian 2016-11-18  566   *	writable. AE engine bits follow SE engine bits.
fcb2dbd1 George Cherian 2016-11-18  567   *	E.g. if CPT()_PF_CONSTANTS[AE] = 0x1, and CPT()_PF_CONSTANTS[SE] = 0x2,
fcb2dbd1 George Cherian 2016-11-18  568   *	then bits <2:0> are read/writable with bit <2> corresponding to AE<0>,
fcb2dbd1 George Cherian 2016-11-18  569   *	and bit <1> to SE<1>, and bit<0> to SE<0>. Before disabling an engine,
fcb2dbd1 George Cherian 2016-11-18  570   *	the corresponding bit in each group must be cleared. CPT()_PF_EXEC_BUSY
fcb2dbd1 George Cherian 2016-11-18  571   *	can then be polled to determing when the engine becomes	idle.
fcb2dbd1 George Cherian 2016-11-18  572   *	At the point, the engine can be disabled.
fcb2dbd1 George Cherian 2016-11-18  573   */
fcb2dbd1 George Cherian 2016-11-18  574  union cptx_pf_gx_en {
fcb2dbd1 George Cherian 2016-11-18  575  	uint64_t u;
fcb2dbd1 George Cherian 2016-11-18  576  	struct cptx_pf_gx_en_s {
fcb2dbd1 George Cherian 2016-11-18  577  #if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
fcb2dbd1 George Cherian 2016-11-18  578  		uint64_t en:64;
fcb2dbd1 George Cherian 2016-11-18  579  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  580  		uint64_t en:64;
fcb2dbd1 George Cherian 2016-11-18  581  #endif /* Word 0 - End */
fcb2dbd1 George Cherian 2016-11-18  582  	} s;
fcb2dbd1 George Cherian 2016-11-18  583  };
fcb2dbd1 George Cherian 2016-11-18  584  
fcb2dbd1 George Cherian 2016-11-18  585  /**
fcb2dbd1 George Cherian 2016-11-18  586   * Register (NCB) cpt#_vq#_saddr
fcb2dbd1 George Cherian 2016-11-18  587   *
fcb2dbd1 George Cherian 2016-11-18  588   * CPT Queue Starting Buffer Address Registers
fcb2dbd1 George Cherian 2016-11-18  589   * These registers set the instruction buffer starting address.
fcb2dbd1 George Cherian 2016-11-18  590   * cptx_vqx_saddr_s
fcb2dbd1 George Cherian 2016-11-18  591   * Word0
fcb2dbd1 George Cherian 2016-11-18  592   *  reserved_49_63:15 [63:49] Reserved.
fcb2dbd1 George Cherian 2016-11-18  593   *  ptr:43 [48:6](R/W/H) Instruction buffer IOVA <48:6> (64-byte aligned).
fcb2dbd1 George Cherian 2016-11-18  594   *	When written, it is the initial buffer starting address; when read,
fcb2dbd1 George Cherian 2016-11-18  595   *	it is the next read pointer to be requested from L2C. The PTR field
fcb2dbd1 George Cherian 2016-11-18  596   *	is overwritten with the next pointer each time that the command buffer
fcb2dbd1 George Cherian 2016-11-18  597   *	segment is exhausted. New commands will then be read from the newly
fcb2dbd1 George Cherian 2016-11-18  598   *	specified command buffer pointer.
fcb2dbd1 George Cherian 2016-11-18  599   *  reserved_0_5:6 [5:0] Reserved.
fcb2dbd1 George Cherian 2016-11-18  600   *
fcb2dbd1 George Cherian 2016-11-18  601   */
fcb2dbd1 George Cherian 2016-11-18  602  union cptx_vqx_saddr {
fcb2dbd1 George Cherian 2016-11-18  603  	uint64_t u;
fcb2dbd1 George Cherian 2016-11-18  604  	struct cptx_vqx_saddr_s {
fcb2dbd1 George Cherian 2016-11-18  605  #if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
fcb2dbd1 George Cherian 2016-11-18  606  		uint64_t reserved_49_63:15;
fcb2dbd1 George Cherian 2016-11-18  607  		uint64_t ptr:43
fcb2dbd1 George Cherian 2016-11-18 @608  		uint64_t reserved_0_5:6;
fcb2dbd1 George Cherian 2016-11-18  609  #else /* Word 0 - Little Endian */
fcb2dbd1 George Cherian 2016-11-18  610  		uint64_t reserved_0_5:6;
fcb2dbd1 George Cherian 2016-11-18  611  		uint64_t ptr:43;

:::::: The code at line 608 was first introduced by commit
:::::: fcb2dbd14b3247c53056bc2b78e907c569da1d44 drivers: crypto: Add Support for Octeon-tx CPT Engine

:::::: TO: George Cherian <george.cherian@cavium.com>
:::::: CC: 0day robot <fengguang.wu@intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 52455 bytes --]

^ permalink raw reply

* Crypto Fixes for 4.9
From: Herbert Xu @ 2016-11-19 10:27 UTC (permalink / raw)
  To: Linus Torvalds, David S. Miller, Linux Kernel Mailing List,
	Linux Crypto Mailing List

Hi Linus:

This push fixes the following issues:

- Compiler warning in caam driver that was the last one remaining.
- Do not register aes-xts in caam drivers on unsupported platforms.
- Regression in algif_hash interface that may lead to an oops.


Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus


Arnd Bergmann (1):
      crypto: caam - fix type mismatch warning

Herbert Xu (1):
      crypto: algif_hash - Fix NULL hash crash with shash

Sven Ebenfeld (1):
      crypto: caam - do not register AES-XTS mode on LP units

 crypto/algif_hash.c           |   17 ++++++++++-------
 drivers/crypto/caam/caamalg.c |   11 ++++++++++-
 2 files changed, 20 insertions(+), 8 deletions(-)

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 2/3] crypto: AF_ALG - disregard AAD buffer space for output
From: Stephan Mueller @ 2016-11-19 21:08 UTC (permalink / raw)
  To: Herbert Xu; +Cc: mathew.j.martineau, linux-crypto
In-Reply-To: <20161116090446.GE29644@gondor.apana.org.au>

Am Mittwoch, 16. November 2016, 17:04:46 CET schrieb Herbert Xu:

Hi Herbert,

> On Wed, Nov 16, 2016 at 10:02:59AM +0100, Stephan Mueller wrote:
> > One thing occurred to me: The copying of the AD would only be done of src
> > != dst. For the AF_ALG interface, I thing we always have src != dst due
> > to the user space/kernel space translation. That means the kernel copies
> > the AD around even in user space src == dst. Isn't that a waste? I.e.
> > shouldn't we handle the AD copying rather in user space than in kernel
> > space?
> 
> No that's not the case.  You can do zero-copy, in which case src
> would be identical to dst.

The way to go on this topic would be to use the same logic as the authenc 
implementation by using a null cipher for the copy operation. Though, finding 
out whether the src and dst buffers are the same is an interesting 
proposition, because we need to traverse the src and dst SGLs to see whether 
the same pages and same offsets are used. A simple check for src SGL == dst 
SGL will not work for the AF_ALG implementation, because the src SGL will 
always be different from the dst SGL because they are constructed in different 
ways (tsgl will always be different from rsgl). What may be the same are the 
pages and offsets that are pointed to by the SGL in case of zerocopy.

Keeping that in mind, I am wondering whether the authenc() implementation 
should be changed to simply remove the copy operation in there. As there seem 
to be no other AEAD cipher implements that copy operation (at least the major 
CCM and GCM implementations applicable to X86 do not do that), it seems that 
it is not necessary at all for in-kernel users. The authenc implementation 
performs the copy operation of the src SGL if it is different from the dst 
SGL. See the following code used by authenc:

        if (req->src != req->dst) {
                err = crypto_authenc_copy_assoc(req);
                if (err)
                        return err;

                dst = scatterwalk_ffwd(areq_ctx->dst, req->dst, req-
>assoclen);
        }

Thus, the authenc implementation will always copy the AAD over in case of 
AF_ALG even though zerocopy with the same buffers are used.

When the in-kernel users of AEAD seemingly do not care about the copying of 
the AAD, and considering that authenc would not do it right for AF_ALG, I am 
wondering whether we should:

1. remove the AAD copy in authenc to make it en-par with the other AEAD 
implementations

2. re-consider the discussed patch

3. tell users to copy the AAD over if they need it in the dst buffers.

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH] crypto: add virtio-crypto driver
From: gong lei @ 2016-11-20  7:11 UTC (permalink / raw)
  To: Benedetto, Salvatore, Gonglei, qemu-devel@nongnu.org,
	virtio-dev@lists.oasis-open.org,
	virtualization@lists.linux-foundation.org,
	linux-crypto@vger.kernel.org
  Cc: pasic@linux.vnet.ibm.com, weidong.huang@huawei.com,
	claudio.fontana@huawei.com, mst@redhat.com, luonengjun@huawei.com,
	hanweidong@huawei.com, Zeng, Xin, peter.huangpeng@huawei.com,
	xuquan8@huawei.com, stefanha@redhat.com, jianjay.zhou@huawei.com,
	cornelia.huck@de.ibm.com, davem@davemloft.net,
	wu.wubin@huawei.com, herbert@gondor.apana.org.au
In-Reply-To: <309B30E91F5E2846B79BD9AA9711D031A12767@IRSMSX102.ger.corp.intel.com>

on 2016/11/17 23:55, Benedetto, Salvatore wrote:

> Hi Gonglei,
>
> ...
>> +
>> +static int virtio_crypto_alg_ablkcipher_init_session(
>> +		struct virtio_crypto_ablkcipher_ctx *ctx,
>> +		int alg, const uint8_t *key,
>> +		unsigned int keylen,
>> +		int encrypt)
>> +{
>> +	struct scatterlist outhdr, key_sg, inhdr, *sgs[3];
>> +	unsigned int tmp;
>> +	struct virtio_crypto_session_input input;
>> +	struct virtio_crypto_op_ctrl_req ctrl;
>> +	struct virtio_crypto *vcrypto = ctx->vcrypto;
>> +	int op = encrypt ? VIRTIO_CRYPTO_OP_ENCRYPT :
>> VIRTIO_CRYPTO_OP_DECRYPT;
>> +	int err;
>> +	unsigned int num_out = 0, num_in = 0;
>> +
>> +	memset(&ctrl, 0, sizeof(ctrl));
>> +	memset(&input, 0, sizeof(input));
>> +	/* Pad ctrl header */
>> +	ctrl.header.opcode =
>> cpu_to_le32(VIRTIO_CRYPTO_CIPHER_CREATE_SESSION);
>> +	ctrl.header.algo = cpu_to_le32((uint32_t)alg);
>> +	/* Set the default dataqueue id to 0 */
>> +	ctrl.header.queue_id = 0;
>> +
>> +	input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
>> +	/* Pad cipher's parameters */
>> +	ctrl.u.sym_create_session.op_type =
>> +		cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
>> +	ctrl.u.sym_create_session.u.cipher.para.algo = ctrl.header.algo;
>> +	ctrl.u.sym_create_session.u.cipher.para.keylen =
>> cpu_to_le32(keylen);
>> +	ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op);
>> +
>> +	sg_init_one(&outhdr, &ctrl, sizeof(ctrl));
> I believe this won't work when the new virtually-mapped kernel stack (VMAP_STACK)
> is enabled.
I see, will fix it in the next version. Thanks for your comments :)
>
> Regards,
> Salvatore

-- 
Regards,
-Gonglei

^ permalink raw reply

* Re: [PATCH v4] crypto: arm64/sha2: integrate OpenSSL implementations of SHA256/SHA512
From: Ard Biesheuvel @ 2016-11-20 11:43 UTC (permalink / raw)
  To: linux-crypto@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Herbert Xu, Will Deacon
  Cc: Andy Polyakov, Ard Biesheuvel
In-Reply-To: <1479642121-17912-1-git-send-email-ard.biesheuvel@linaro.org>

On 20 November 2016 at 11:42, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> This integrates both the accelerated scalar and the NEON implementations
> of SHA-224/256 as well as SHA-384/512 from the OpenSSL project.
>
> Relative performance compared to the respective generic C versions:
>
>                  |  SHA256-scalar  | SHA256-NEON* |  SHA512  |
>      ------------+-----------------+--------------+----------+
>      Cortex-A53  |      1.63x      |     1.63x    |   2.34x  |
>      Cortex-A57  |      1.43x      |     1.59x    |   1.95x  |
>      Cortex-A73  |      1.26x      |     1.56x    |     ?    |
>
> The core crypto code was authored by Andy Polyakov of the OpenSSL
> project, in collaboration with whom the upstream code was adapted so
> that this module can be built from the same version of sha512-armv8.pl.
>
> The version in this patch was taken from OpenSSL commit 32bbb62ea634
> ("sha/asm/sha512-armv8.pl: fix big-endian support in __KERNEL__ case.")
>
> * The core SHA algorithm is fundamentally sequential, but there is a
>   secondary transformation involved, called the schedule update, which
>   can be performed independently. The NEON version of SHA-224/SHA-256
>   only implements this part of the algorithm using NEON instructions,
>   the sequential part is always done using scalar instructions.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---

Missing changelog:

v4: fixed the big-endian build; this required an upstream change (even
    though upstream was not actually broken, since it explicitly defines
    __ARMEB__ on AArch64 big-endian builds), so this patch is now based
    on a more recent upstream OpenSSL commit (the __ILP32__ #ifdefs are
    still present but never active)

v3: at Will's request, the generated assembly files are now included
    as .S_shipped files, for which generic build rules are defined
    already.

Note that sizeable patches like this one have caused issues in the past with
patchwork, so for Herbert's convenience, the patch can be pulled from
http://git.kernel.org/cgit/linux/kernel/git/ardb/linux.git, branch
arm64-sha256 (based on today's cryptodev)

^ permalink raw reply

* Re: vmalloced stacks and scatterwalk_map_and_copy()
From: Andy Lutomirski @ 2016-11-21  2:19 UTC (permalink / raw)
  To: Eric Biggers, regressions
  Cc: linux-crypto, Herbert Xu, linux-kernel@vger.kernel.org,
	Andrew Lutomirski
In-Reply-To: <CALCETrUPuunBT1Zo25wyOwqaWJ=rm9R-WMZGN-7u4-dsdokAnQ@mail.gmail.com>

[Adding Thorsten to help keep this from getting lost]

On Thu, Nov 3, 2016 at 1:30 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 3, 2016 at 11:16 AM, Eric Biggers <ebiggers@google.com> wrote:
>> Hello,
>>
>> I hit the BUG_ON() in arch/x86/mm/physaddr.c:26 while testing some crypto code
>> in an x86_64 kernel with CONFIG_DEBUG_VIRTUAL=y and CONFIG_VMAP_STACK=y:
>>
>>         /* carry flag will be set if starting x was >= PAGE_OFFSET */
>>         VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
>>
>> The problem is the following code in scatterwalk_map_and_copy() in
>> crypto/scatterwalk.c, which tries to determine if the buffer passed in aliases
>> the physical memory of the first segment of the scatterlist:
>>
>>         if (sg_page(sg) == virt_to_page(buf) &&
>>             sg->offset == offset_in_page(buf))
>>                 return;
>
> ...
>
>>
>> Currently I think the best solution would be to require that callers to
>> scatterwalk_map_and_copy() do not alias their source and destination.  Then the
>> alias check could be removed.  This check has only been there since v4.2 (commit
>> 74412fd5d71b6), so I'd hope not many callers rely on the behavior.  I'm not sure
>> exactly which ones do, though.
>>
>> Thoughts on this?
>
> The relevant commit is:
>
> commit 74412fd5d71b6eda0beb302aa467da000f0d530c
> Author: Herbert Xu <herbert@gondor.apana.org.au>
> Date:   Thu May 21 15:11:12 2015 +0800
>
>     crypto: scatterwalk - Check for same address in map_and_copy
>
>     This patch adds a check for in scatterwalk_map_and_copy to avoid
>     copying from the same address to the same address.  This is going
>     to be used for IV copying in AEAD IV generators.
>
>     There is no provision for partial overlaps.
>
>     This patch also uses the new scatterwalk_ffwd instead of doing
>     it by hand in scatterwalk_map_and_copy.
>
>     Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> Herbert, can you clarify this?  The check seems rather bizarre --
> you're doing an incomplete check for aliasing and skipping the whole
> copy if the beginning aliases.  In any event the stack *can't*
> reasonably alias the scatterlist because a scatterlist can't safely
> point to the stack.  Is there any code that actually relies on the
> aliasing-detecting behavior?
>
> Also, Herbert, it seems like the considerable majority of the crypto
> code is acting on kernel virtual memory addresses and does software
> processing.  Would it perhaps make sense to add a kvec-based or
> iov_iter-based interface to the crypto code?  I bet it would be quite
> a bit faster and it would make crypto on stack buffers work directly.


Ping, everyone!

It's getting quite close to 4.9 release time.  Is there an actual bug
here?  Because, if so, we need to fix it.  My preference is to just
delete the weird aliasing check, but it would be really nice to know
if that check is needed for some reason.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply

* PROBLEM: unable to decrypt LUKS partition since v4.9-rc6 (bisected)
From: Patrick Steinhardt @ 2016-11-21  6:56 UTC (permalink / raw)
  To: Herbert Xu, David S. Miller; +Cc: linux-crypto

[-- Attachment #1: Type: text/plain, Size: 530 bytes --]

Hi,

I'm using cryptsetup 1.7.2 via the kernel's crypto API. Since
version v4.9-rc6, I'm unable to decrypt my LUKS partitions
(aes-xts-plain64, sha512). cryptsetup simply aborts with the
message "No such passphrase available" after inputting the
passphrase.

After bisecting the issue, this points to commit a8348bc (crypto:
algif_hash - Fix NULL hash crash with shash, 2016-11-17). After
reverting this particular commit, everything works correctly
again.

Please let me know if you need additional information.

Regards
Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply

* Re: PROBLEM: unable to decrypt LUKS partition since v4.9-rc6 (bisected)
From: Herbert Xu @ 2016-11-21  7:34 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: David S. Miller, linux-crypto
In-Reply-To: <20161121065638.GA540@pks-pc>

On Mon, Nov 21, 2016 at 07:56:38AM +0100, Patrick Steinhardt wrote:
> 
> I'm using cryptsetup 1.7.2 via the kernel's crypto API. Since
> version v4.9-rc6, I'm unable to decrypt my LUKS partitions
> (aes-xts-plain64, sha512). cryptsetup simply aborts with the
> message "No such passphrase available" after inputting the
> passphrase.
> 
> After bisecting the issue, this points to commit a8348bc (crypto:
> algif_hash - Fix NULL hash crash with shash, 2016-11-17). After
> reverting this particular commit, everything works correctly
> again.

Sorry, I screwed up that patch.  Please try the following fix.
Thanks!

---8<---
crypto: algif_hash - Fix result clobbering in recvmsg

Recently an init call was added to hash_recvmsg so as to reset
the hash state in case a sendmsg call was never made.

Unfortunately this ended up clobbering the result if the previous
sendmsg was done with a MSG_MORE flag.  This patch fixes it by
excluding that case when we make the init call.

Fixes: a8348bca2944 ("algif_hash - Fix NULL hash crash with shash")
Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 05e21b4..d19b09c 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -214,7 +214,7 @@ static int hash_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 
 	ahash_request_set_crypt(&ctx->req, NULL, ctx->result, 0);
 
-	if (!result) {
+	if (!result && !ctx->more) {
 		err = af_alg_wait_for_completion(
 				crypto_ahash_init(&ctx->req),
 				&ctx->completion);
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related

* Re: PROBLEM: unable to decrypt LUKS partition since v4.9-rc6 (bisected)
From: Patrick Steinhardt @ 2016-11-21  7:43 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David S. Miller, linux-crypto
In-Reply-To: <20161121073400.GA6357@gondor.apana.org.au>

[-- Attachment #1: Type: text/plain, Size: 812 bytes --]

On Mon, Nov 21, 2016 at 03:34:00PM +0800, Herbert Xu wrote:
> On Mon, Nov 21, 2016 at 07:56:38AM +0100, Patrick Steinhardt wrote:
> > 
> > I'm using cryptsetup 1.7.2 via the kernel's crypto API. Since
> > version v4.9-rc6, I'm unable to decrypt my LUKS partitions
> > (aes-xts-plain64, sha512). cryptsetup simply aborts with the
> > message "No such passphrase available" after inputting the
> > passphrase.
> > 
> > After bisecting the issue, this points to commit a8348bc (crypto:
> > algif_hash - Fix NULL hash crash with shash, 2016-11-17). After
> > reverting this particular commit, everything works correctly
> > again.
> 
> Sorry, I screwed up that patch.  Please try the following fix.
> Thanks!
[snip]

Thanks for the fast response. Your patch fixes the problem.

Regards
Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply

* Re: vmalloced stacks and scatterwalk_map_and_copy()
From: Herbert Xu @ 2016-11-21  8:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Biggers, regressions, linux-crypto,
	linux-kernel@vger.kernel.org, Andrew Lutomirski
In-Reply-To: <CALCETrXg5ytkLHBpKMJG7fBea+xy2wz6kSm2XK5c8K7_Hr9UuA@mail.gmail.com>

On Sun, Nov 20, 2016 at 06:19:48PM -0800, Andy Lutomirski wrote:
>
> > Herbert, can you clarify this?  The check seems rather bizarre --
> > you're doing an incomplete check for aliasing and skipping the whole
> > copy if the beginning aliases.  In any event the stack *can't*
> > reasonably alias the scatterlist because a scatterlist can't safely
> > point to the stack.  Is there any code that actually relies on the
> > aliasing-detecting behavior?

Well at the time the IPsec stack would pass an IV that pointed
into the actual request, which is what prompted that patch.  The
IPsec code has since been changed to provide a separate IV so this
check is no longer necessary.

I will remove it with this patch.

---8<---
crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy

The aliasing check in map_and_copy is no longer necessary because
the IPsec ESP code no longer provides an IV that points into the
actual request data.  As this check is now triggering BUG checks
due to the vmalloced stack code, I'm removing it.

Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/crypto/scatterwalk.c b/crypto/scatterwalk.c
index 52ce17a..c16c94f8 100644
--- a/crypto/scatterwalk.c
+++ b/crypto/scatterwalk.c
@@ -68,10 +68,6 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg,
 
 	sg = scatterwalk_ffwd(tmp, sg, start);
 
-	if (sg_page(sg) == virt_to_page(buf) &&
-	    sg->offset == offset_in_page(buf))
-		return;
-
 	scatterwalk_start(&walk, sg);
 	scatterwalk_copychunks(buf, &walk, nbytes, out);
 	scatterwalk_done(&walk, out, 0);
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related

* [RFC PATCH] IV Generation algorithms for dm-crypt
From: Binoy Jayan @ 2016-11-21 10:10 UTC (permalink / raw)
  To: Oded, Ofir
  Cc: Herbert Xu, David S. Miller, linux-crypto, Mark Brown,
	Arnd Bergmann, linux-kernel, Alasdair Kergon, Mike Snitzer,
	dm-devel, Shaohua Li, linux-raid, Binoy Jayan


===============================================================================
GENIV Template cipher
===============================================================================

Currently, the iv generation algorithms are implemented in dm-crypt.c. The goal
is to move these algorithms from the dm layer to the kernel crypto layer by
implementing them as template ciphers so they can be used in relation with
algorithms like aes, and with multiple modes like cbc, ecb etc. As part of this
patchset, the iv-generation code is moved from the dm layer to the crypto layer.
The dm-layer can later be optimized to encrypt larger block sizes in a single
call to the crypto engine.

One challenge in doing so is with the 'essiv' which creates the IV by hashing
the 512-byte sector number. This infact limits the block sizes to 512 bytes.
A way to get around this problem has to be explored. Another thing to note is
that the algorithms shares its context data structures (cipher context and
request context) with the callee, i.e. dm-crypt here. Not sure if this coupling
is accepted. If not, this has to be decoupled. A new crypto api
'crypto_skcipher_set_ctx' defined in 'include/crypto/skcipher.h' which was
initially written for addressing this is not used now. But even if it is used,
the data structure definition would still be shared.

The following ASCII art decomposes the kernel crypto API layers when using the
skcipher with the automated IV generation. The shown example is used by the
DM layer. For other use cases of cbc(aes), the ASCII art applies as well, but
the caller may not use the same with a separate IV generator. In this case, the
caller must generate the IV. The depicted example decomposes <ivgen>(cbc(aes))
based on the generic C implementations (geniv.c, cbc.c and aes-generic.c).
The generic implementation depicts the dependency between the templates ciphers
used in implementing geniv using the kernel crypto API.
Here, <geniv> indicates one of the following algorithms:

1. plain
2. plain64
3. essiv
4. benbi
5. null
6. lmk
7. tcw

It is possible that some streamlined cipher implementations (like AES-NI)
provide implementations merging aspects which in the view of the kernel crypto
API cannot be decomposed into layers any more. Each block in the following
ASCII art is an independent cipher instance obtained from the kernel crypto
API. Each block is accessed by the caller or by other blocks using the API
functions defined by the kernel crypto API for the cipher implementation type.
The blocks below indicate the cipher type as well as the specific logic
implemented in the cipher.

The ASCII art picture also indicates the call structure, i.e. who calls which
component. The arrows point to the invoked block where the caller uses the API
applicable to the cipher type specified for the block. For the purpose of
illustration, here we take the example of the aes mode 'cbc'. However, the IV
generation algorithm could be used with other aes modes like ecb as well.

-------------------------------------------------------------------------------
Geniv implementation
-------------------------------------------------------------------------------

NB: The ASCII art below is best viewed in a fixed-width font.

                             crypt_convert_block()              (DM Layer)
                                      |
                                      | (1)
                                      |
                                      v
+------------+   +-----------+   +-----------+       +-----------+
|            |   |           |   |           |  (2)  |           |
|  skcipher  |   | skcipher  |   | skcipher  |----+  | skcipher  |   Blocks for
| (plain/64) |   | (benbi)   |   | (essiv)   |    |  |  (null)   |   lmk, tcw
+------------+   +-----------+   +-----------+    |  +-----------+ 
     |                |               |           v         |
     | (3)            | (3)      (3)  |    +-----------+    |
     |                |               |    |           |    |
     |                |               |    |   ahash   |    | (3)
     |                |               |    |           |    |
     |                |               |    +-----------+    |
     |                |               v                     |     (Crypto API
     |                |         +-----------+               |        Layer)
     |                v         |           |               |
     +------------------------> |  skcipher | <-------------+
                                |   (cbc)   |
                                +-----------+  (AES Mode Template cipher)
                                      | (4)
                                      v
                               +-----------+
                               |           |
                               |   cipher  |   (Base generic-AES cipher)
                               |   (aes)   |
                               +-----------+

     
The following call sequence is applicable when the DM layer triggers an
encryption operation with the crypt_convert_block() function. During
configuration, the administrator sets up the use of <geniv>(cbc(aes)) as the
template cipher. 'geniv' can be one among plain, plain64, essiv, benbi, null,
lmk, or tcw which are all implemented as seperate templates. 
The following are the template ciphers implemented as part of 'geniv.c'

1. plain(cbc(aes))
2. plain64(cbc(aes))
3. essiv(cbc(aes))
4. benbi(cbc(aes))
5. null(cbc(aes))
6. lmk(cbc(aes))
7. tcw(cbc(aes))

The following call sequence is now depicted in the ASCII art above:

1. crypt_convert_block invokes crypto_skcipher_encrypt() to trigger encryption
   operation of a single block (i.e. sector) with the IV same as the sector no.
   For example, with essiv, the IV generation implementation is registered with
   a call to 'crypto_register_template(&crypto_essiv_tmpl)'

2. During instantiation of the 'geniv' handle, the IV generation algorithm is
   instantiated. For the purpose of illustration, we take the example of essiv.
   In this case, the ahash cipher is instantiated to calculate the hash of the
   sector to generate the IV.

3. Now, geniv uses the skcipher api calls to invoke the associated cipher. In
   our case, during the instantiation of geniv, the cipher handle for cbc is
   provided to geniv. The geniv skcipher type implementation now invokes the
   skcipher api with the instantiated cbc(aes) cipher handle. During the
   instantiation of the cbc(aes) cipher, the cipher type generic-aes is also
   instantiated. That means that the SKCIPHER implementation of cbc(aes) only
   implements the Cipher-block chaining mode. After performing block chaining
   operation, the cipher implementation of aes is invoked. The skcipher of
   cbc(aes) now invokes the cipher api with the aes cipher handle to encrypt
   one block.

-------------------------------------------------------------------------------
Clarifications
-------------------------------------------------------------------------------

1. Changes to testmgr.c
2. How to encrypt blocks bigger than 512 bytes while using essiv?
   As sectors are tied to IV in case of 'essiv'.
   Will changing block size make it backward incompatible
   and with other platforms (like windows) which support LUKS volumes.
3. Did not move the key management code from dm-crypt to cryto layer
   when keycount > 1 as multiple ciphers are instantiated from dm layer
   with each cipher instance is allotted a part of the key provided.

-------------------------------------------------------------------------------
Test procedure
-------------------------------------------------------------------------------

The algorithms are tested using 'cryptsetup' utility to create LUKS
compatible volumes on Qemu.

NB: '/dev/sdb' is a second disk volume (configured in qemu)

# One time setup - Format the device compatible with LUKS.
# Choose one of the following IV generation alorithms at a time
cryptsetup -y -c aes-cbc-plain -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-plain64 -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-essiv:sha256 -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-benbi -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-null -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-lmk -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes-cbc-tcw -s 256 --hash sha256 luksFormat /dev/sdb

# With a keycount
cryptsetup -y -c aes:2-cbc-plain -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes:2-cbc-plain64 -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes:2-cbc-essiv:sha256 -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes:2-cbc-null -s 256 --hash sha256 luksFormat /dev/sdb
cryptsetup -y -c aes:2-cbc-lmk -s 256 --hash sha256 luksFormat /dev/sdb

# Add additional key - optional
cryptsetup luksAddKey /dev/sdb

# The above lists only a limited number of tests with the aes cipher.
# The IV generation algorithms may also be tested with other ciphers as well.

cryptsetup luksDump --dump-master-key /dev/sdb

# create a luks volume and open the device
cryptsetup luksOpen /dev/sdb crypt_fun
dmsetup table --showkeys

# Write some data to the device
cat data.txt > /dev/mapper/crypt_fun

# Read 100 bytes back
dd if=/dev/mapper/crypt_fun of=out.txt bs=100 count=1
cat out.txt

mkfs.ext4 -j /dev/mapper/crypt_fun

# Mount if fs creation succeeds
mount -t ext4 /dev/mapper/crypt_fun /mnt

<-- Use the encrypted file system -->

umount /mnt
cryptsetup luksClose crypt_fun
cryptsetup luksRemoveKey /dev/sdb

This seems to work well. The file system mounts successfully and the files
written to in the file system remain persistent across reboots.

Binoy Jayan (1):
  crypto: Add IV generation algorithms

 crypto/Kconfig            |    8 +
 crypto/Makefile           |    1 +
 crypto/geniv.c            | 1113 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/md/dm-crypt.c     |  725 +++--------------------------
 include/crypto/geniv.h    |  109 +++++
 include/crypto/skcipher.h |   17 +
 6 files changed, 1309 insertions(+), 664 deletions(-)
 create mode 100644 crypto/geniv.c
 create mode 100644 include/crypto/geniv.h

-- 
Binoy Jayan


^ permalink raw reply

* [RFC PATCH] crypto: Add IV generation algorithms
From: Binoy Jayan @ 2016-11-21 10:10 UTC (permalink / raw)
  To: Oded, Ofir
  Cc: Herbert Xu, David S. Miller, linux-crypto, Mark Brown,
	Arnd Bergmann, linux-kernel, Alasdair Kergon, Mike Snitzer,
	dm-devel, Shaohua Li, linux-raid, Binoy Jayan
In-Reply-To: <1479723009-11113-1-git-send-email-binoy.jayan@linaro.org>

Currently, the iv generation algorithms are implemented in dm-crypt.c.
The goal is to move these algorithms from the dm layer to the kernel
crypto layer by implementing them as template ciphers so they can be used
in relation with algorithms like aes, and with multiple modes like cbc,
ecb etc. As part of this patchset, the iv-generation code is moved from the
dm layer to the crypto layer. The dm-layer can later be optimized to
encrypt larger block sizes in a single call to the crypto engine. The iv
generation algorithms implemented in geniv.c includes plain, plain64,
essiv, benbi, null, lmk and tcw. These templates are to be configured
and has to be invoked as:

crypto_alloc_skcipher("plain(cbc(aes))", 0, 0);
crypto_alloc_skcipher("essiv(cbc(aes))", 0, 0);
...

from the dm layer.

Signed-off-by: Binoy Jayan <binoy.jayan@linaro.org>
---
 crypto/Kconfig            |    8 +
 crypto/Makefile           |    1 +
 crypto/geniv.c            | 1113 +++++++++++++++++++++++++++++++++++++++++++++
 drivers/md/dm-crypt.c     |  725 +++--------------------------
 include/crypto/geniv.h    |  109 +++++
 include/crypto/skcipher.h |   17 +
 6 files changed, 1309 insertions(+), 664 deletions(-)
 create mode 100644 crypto/geniv.c
 create mode 100644 include/crypto/geniv.h

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 84d7148..7125bc2 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -326,6 +326,14 @@ config CRYPTO_CTS
 	  This mode is required for Kerberos gss mechanism support
 	  for AES encryption.
 
+config CRYPTO_GENIV
+	tristate "IV Generation for dm-crypt"
+	select CRYPTO_BLKCIPHER
+	help
+	  GENIV: IV Generation for dm-crypt
+	  Algorithms to generate Initialization Vector for ciphers
+	  used by dm-crypt.
+
 config CRYPTO_ECB
 	tristate "ECB support"
 	select CRYPTO_BLKCIPHER
diff --git a/crypto/Makefile b/crypto/Makefile
index 99cc64a..fc81a82 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -74,6 +74,7 @@ obj-$(CONFIG_CRYPTO_TGR192) += tgr192.o
 obj-$(CONFIG_CRYPTO_GF128MUL) += gf128mul.o
 obj-$(CONFIG_CRYPTO_ECB) += ecb.o
 obj-$(CONFIG_CRYPTO_CBC) += cbc.o
+obj-$(CONFIG_CRYPTO_GENIV) += geniv.o
 obj-$(CONFIG_CRYPTO_PCBC) += pcbc.o
 obj-$(CONFIG_CRYPTO_CTS) += cts.o
 obj-$(CONFIG_CRYPTO_LRW) += lrw.o
diff --git a/crypto/geniv.c b/crypto/geniv.c
new file mode 100644
index 0000000..46988d5
--- /dev/null
+++ b/crypto/geniv.c
@@ -0,0 +1,1113 @@
+/*
+ * geniv: IV generation algorithms
+ *
+ * Copyright (c) 2016, Linaro Ltd.
+ * Copyright (C) 2006-2015 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2013 Milan Broz <gmazyland@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include <crypto/algapi.h>
+#include <crypto/internal/skcipher.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/log2.h>
+#include <linux/module.h>
+#include <linux/scatterlist.h>
+#include <linux/slab.h>
+#include <linux/completion.h>
+#include <linux/crypto.h>
+#include <linux/workqueue.h>
+#include <linux/backing-dev.h>
+#include <linux/atomic.h>
+#include <linux/rbtree.h>
+#include <crypto/hash.h>
+#include <crypto/md5.h>
+#include <crypto/algapi.h>
+#include <crypto/skcipher.h>
+#include <asm/unaligned.h>
+#include <crypto/geniv.h>
+
+struct crypto_geniv_req_ctx {
+	struct skcipher_request subreq CRYPTO_MINALIGN_ATTR;
+};
+
+static struct crypto_skcipher *any_tfm(struct geniv_ctx_data *cd)
+{
+	return cd->tfm;
+}
+
+static int crypt_iv_plain_gen(struct geniv_ctx_data *cd, u8 *iv,
+			      struct dm_crypt_request *dmreq)
+{
+	memset(iv, 0, cd->iv_size);
+	*(__le32 *)iv = cpu_to_le32(dmreq->iv_sector & 0xffffffff);
+
+	return 0;
+}
+
+static int crypt_iv_plain64_gen(struct geniv_ctx_data *cd, u8 *iv,
+				struct dm_crypt_request *dmreq)
+{
+	memset(iv, 0, cd->iv_size);
+	*(__le64 *)iv = cpu_to_le64(dmreq->iv_sector);
+
+	return 0;
+}
+
+/* Initialise ESSIV - compute salt but no local memory allocations */
+static int crypt_iv_essiv_init(struct geniv_ctx_data *cd)
+{
+	struct geniv_essiv_private *essiv = &cd->iv_gen_private.essiv;
+	struct scatterlist sg;
+	struct crypto_cipher *essiv_tfm;
+	int err;
+	AHASH_REQUEST_ON_STACK(req, essiv->hash_tfm);
+
+	sg_init_one(&sg, cd->key, cd->key_size);
+	ahash_request_set_tfm(req, essiv->hash_tfm);
+	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
+	ahash_request_set_crypt(req, &sg, essiv->salt, cd->key_size);
+
+	err = crypto_ahash_digest(req);
+	ahash_request_zero(req);
+	if (err)
+		return err;
+
+	essiv_tfm = cd->iv_private;
+
+	err = crypto_cipher_setkey(essiv_tfm, essiv->salt,
+			    crypto_ahash_digestsize(essiv->hash_tfm));
+	if (err)
+		return err;
+
+	return 0;
+}
+
+/* Wipe salt and reset key derived from volume key */
+static int crypt_iv_essiv_wipe(struct geniv_ctx_data *cd)
+{
+	struct geniv_essiv_private *essiv = &cd->iv_gen_private.essiv;
+	unsigned int salt_size = crypto_ahash_digestsize(essiv->hash_tfm);
+	struct crypto_cipher *essiv_tfm;
+	int r, err = 0;
+
+	memset(essiv->salt, 0, salt_size);
+
+	essiv_tfm = cd->iv_private;
+	r = crypto_cipher_setkey(essiv_tfm, essiv->salt, salt_size);
+	if (r)
+		err = r;
+
+	return err;
+}
+
+/* Set up per cpu cipher state */
+static struct crypto_cipher *setup_essiv_cpu(struct geniv_ctx_data *cd,
+					     u8 *salt, unsigned int saltsize)
+{
+	struct crypto_cipher *essiv_tfm;
+	int err;
+
+	/* Setup the essiv_tfm with the given salt */
+	essiv_tfm = crypto_alloc_cipher(cd->cipher, 0, CRYPTO_ALG_ASYNC);
+
+	if (IS_ERR(essiv_tfm)) {
+		pr_err("Error allocating crypto tfm for ESSIV\n");
+		return essiv_tfm;
+	}
+
+	if (crypto_cipher_blocksize(essiv_tfm) !=
+	    crypto_skcipher_ivsize(any_tfm(cd))) {
+		pr_err("Block size of ESSIV cipher does not match IV size of block cipher\n");
+		crypto_free_cipher(essiv_tfm);
+		return ERR_PTR(-EINVAL);
+	}
+
+	err = crypto_cipher_setkey(essiv_tfm, salt, saltsize);
+	if (err) {
+		pr_err("Failed to set key for ESSIV cipher\n");
+		crypto_free_cipher(essiv_tfm);
+		return ERR_PTR(err);
+	}
+	return essiv_tfm;
+}
+
+static void crypt_iv_essiv_dtr(struct geniv_ctx_data *cd)
+{
+	struct crypto_cipher *essiv_tfm;
+	struct geniv_essiv_private *essiv = &cd->iv_gen_private.essiv;
+
+	crypto_free_ahash(essiv->hash_tfm);
+	essiv->hash_tfm = NULL;
+
+	kzfree(essiv->salt);
+	essiv->salt = NULL;
+
+	essiv_tfm = cd->iv_private;
+
+	if (essiv_tfm)
+		crypto_free_cipher(essiv_tfm);
+
+	cd->iv_private = NULL;
+}
+
+static int crypt_iv_essiv_ctr(struct geniv_ctx_data *cd)
+{
+	struct crypto_cipher *essiv_tfm = NULL;
+	struct crypto_ahash *hash_tfm = NULL;
+	u8 *salt = NULL;
+	int err;
+
+	if (!cd->ivopts) {
+		pr_err("Digest algorithm missing for ESSIV mode\n");
+		return -EINVAL;
+	}
+
+	/* Allocate hash algorithm */
+	hash_tfm = crypto_alloc_ahash(cd->ivopts, 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(hash_tfm)) {
+		err = PTR_ERR(hash_tfm);
+		pr_err("Error initializing ESSIV hash. err=%d\n", err);
+		goto bad;
+	}
+
+	salt = kzalloc(crypto_ahash_digestsize(hash_tfm), GFP_KERNEL);
+	if (!salt) {
+		err = -ENOMEM;
+		goto bad;
+	}
+
+	cd->iv_gen_private.essiv.salt = salt;
+	cd->iv_gen_private.essiv.hash_tfm = hash_tfm;
+
+	essiv_tfm = setup_essiv_cpu(cd, salt,
+				crypto_ahash_digestsize(hash_tfm));
+	if (IS_ERR(essiv_tfm)) {
+		crypt_iv_essiv_dtr(cd);
+		return PTR_ERR(essiv_tfm);
+	}
+	cd->iv_private = essiv_tfm;
+
+	return 0;
+
+bad:
+	if (hash_tfm && !IS_ERR(hash_tfm))
+		crypto_free_ahash(hash_tfm);
+	kfree(salt);
+	return err;
+}
+
+static int crypt_iv_essiv_gen(struct geniv_ctx_data *cd, u8 *iv,
+			      struct dm_crypt_request *dmreq)
+{
+	struct crypto_cipher *essiv_tfm = cd->iv_private;
+
+	memset(iv, 0, cd->iv_size);
+	*(__le64 *)iv = cpu_to_le64(dmreq->iv_sector);
+	crypto_cipher_encrypt_one(essiv_tfm, iv, iv);
+
+	return 0;
+}
+
+static int crypt_iv_benbi_ctr(struct geniv_ctx_data *cd)
+{
+	unsigned int bs = crypto_skcipher_blocksize(any_tfm(cd));
+	int log = ilog2(bs);
+
+	/* we need to calculate how far we must shift the sector count
+	 * to get the cipher block count, we use this shift in _gen
+	 */
+
+	if (1 << log != bs) {
+		pr_err("cypher blocksize is not a power of 2\n");
+		return -EINVAL;
+	}
+
+	if (log > 9) {
+		pr_err("cypher blocksize is > 512\n");
+		return -EINVAL;
+	}
+
+	cd->iv_gen_private.benbi.shift = 9 - log;
+
+	return 0;
+}
+
+static int crypt_iv_benbi_gen(struct geniv_ctx_data *cd, u8 *iv,
+			      struct dm_crypt_request *dmreq)
+{
+	__be64 val;
+
+	memset(iv, 0, cd->iv_size - sizeof(u64)); /* rest is cleared below */
+
+	val = cpu_to_be64(((u64) dmreq->iv_sector <<
+			  cd->iv_gen_private.benbi.shift) + 1);
+	put_unaligned(val, (__be64 *)(iv + cd->iv_size - sizeof(u64)));
+
+	return 0;
+}
+
+static int crypt_iv_null_gen(struct geniv_ctx_data *cd, u8 *iv,
+			     struct dm_crypt_request *dmreq)
+{
+	memset(iv, 0, cd->iv_size);
+
+	return 0;
+}
+
+static void crypt_iv_lmk_dtr(struct geniv_ctx_data *cd)
+{
+	struct geniv_lmk_private *lmk = &cd->iv_gen_private.lmk;
+
+	if (lmk->hash_tfm && !IS_ERR(lmk->hash_tfm))
+		crypto_free_shash(lmk->hash_tfm);
+	lmk->hash_tfm = NULL;
+
+	kzfree(lmk->seed);
+	lmk->seed = NULL;
+}
+
+static int crypt_iv_lmk_ctr(struct geniv_ctx_data *cd)
+{
+	struct geniv_lmk_private *lmk = &cd->iv_gen_private.lmk;
+
+	lmk->hash_tfm = crypto_alloc_shash("md5", 0, 0);
+	if (IS_ERR(lmk->hash_tfm)) {
+		pr_err("Error initializing LMK hash; err=%ld\n",
+				PTR_ERR(lmk->hash_tfm));
+		return PTR_ERR(lmk->hash_tfm);
+	}
+
+	/* No seed in LMK version 2 */
+	if (cd->key_parts == cd->tfms_count) {
+		lmk->seed = NULL;
+		return 0;
+	}
+
+	lmk->seed = kzalloc(LMK_SEED_SIZE, GFP_KERNEL);
+	if (!lmk->seed) {
+		crypt_iv_lmk_dtr(cd);
+		pr_err("Error kmallocing seed storage in LMK\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int crypt_iv_lmk_init(struct geniv_ctx_data *cd)
+{
+	struct geniv_lmk_private *lmk = &cd->iv_gen_private.lmk;
+	int subkey_size = cd->key_size / cd->key_parts;
+
+	/* LMK seed is on the position of LMK_KEYS + 1 key */
+	if (lmk->seed)
+		memcpy(lmk->seed, cd->key + (cd->tfms_count * subkey_size),
+		       crypto_shash_digestsize(lmk->hash_tfm));
+
+	return 0;
+}
+
+static int crypt_iv_lmk_wipe(struct geniv_ctx_data *cd)
+{
+	struct geniv_lmk_private *lmk = &cd->iv_gen_private.lmk;
+
+	if (lmk->seed)
+		memset(lmk->seed, 0, LMK_SEED_SIZE);
+
+	return 0;
+}
+
+static int crypt_iv_lmk_one(struct geniv_ctx_data *cd, u8 *iv,
+			    struct dm_crypt_request *dmreq, u8 *data)
+{
+	struct geniv_lmk_private *lmk = &cd->iv_gen_private.lmk;
+	struct md5_state md5state;
+	__le32 buf[4];
+	int i, r;
+	SHASH_DESC_ON_STACK(desc, lmk->hash_tfm);
+
+	desc->tfm = lmk->hash_tfm;
+	desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+	r = crypto_shash_init(desc);
+	if (r)
+		return r;
+
+	if (lmk->seed) {
+		r = crypto_shash_update(desc, lmk->seed, LMK_SEED_SIZE);
+		if (r)
+			return r;
+	}
+
+	/* Sector is always 512B, block size 16, add data of blocks 1-31 */
+	r = crypto_shash_update(desc, data + 16, 16 * 31);
+	if (r)
+		return r;
+
+	/* Sector is cropped to 56 bits here */
+	buf[0] = cpu_to_le32(dmreq->iv_sector & 0xFFFFFFFF);
+	buf[1] = cpu_to_le32((((u64)dmreq->iv_sector >> 32) & 0x00FFFFFF)
+			     | 0x80000000);
+	buf[2] = cpu_to_le32(4024);
+	buf[3] = 0;
+	r = crypto_shash_update(desc, (u8 *)buf, sizeof(buf));
+	if (r)
+		return r;
+
+	/* No MD5 padding here */
+	r = crypto_shash_export(desc, &md5state);
+	if (r)
+		return r;
+
+	for (i = 0; i < MD5_HASH_WORDS; i++)
+		__cpu_to_le32s(&md5state.hash[i]);
+	memcpy(iv, &md5state.hash, cd->iv_size);
+
+	return 0;
+}
+
+static int crypt_iv_lmk_gen(struct geniv_ctx_data *cd, u8 *iv,
+			      struct dm_crypt_request *dmreq)
+{
+	u8 *src;
+	int r = 0;
+
+	if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) {
+		src = kmap_atomic(sg_page(&dmreq->sg_in));
+		r = crypt_iv_lmk_one(cd, iv, dmreq, src + dmreq->sg_in.offset);
+		kunmap_atomic(src);
+	} else
+		memset(iv, 0, cd->iv_size);
+
+	return r;
+}
+
+static int crypt_iv_lmk_post(struct geniv_ctx_data *cd, u8 *iv,
+			     struct dm_crypt_request *dmreq)
+{
+	u8 *dst;
+	int r;
+
+	if (bio_data_dir(dmreq->ctx->bio_in) == WRITE)
+		return 0;
+
+	dst = kmap_atomic(sg_page(&dmreq->sg_out));
+	r = crypt_iv_lmk_one(cd, iv, dmreq, dst + dmreq->sg_out.offset);
+
+	/* Tweak the first block of plaintext sector */
+	if (!r)
+		crypto_xor(dst + dmreq->sg_out.offset, iv, cd->iv_size);
+
+	kunmap_atomic(dst);
+	return r;
+}
+
+static void crypt_iv_tcw_dtr(struct geniv_ctx_data *cd)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+
+	kzfree(tcw->iv_seed);
+	tcw->iv_seed = NULL;
+	kzfree(tcw->whitening);
+	tcw->whitening = NULL;
+
+	if (tcw->crc32_tfm && !IS_ERR(tcw->crc32_tfm))
+		crypto_free_shash(tcw->crc32_tfm);
+	tcw->crc32_tfm = NULL;
+}
+
+static int crypt_iv_tcw_ctr(struct geniv_ctx_data *cd)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+
+	if (cd->key_size <= (cd->iv_size + TCW_WHITENING_SIZE)) {
+		pr_err("Wrong key size (%d) for TCW. Choose a value > %d bytes\n",
+			cd->key_size,
+			cd->iv_size + TCW_WHITENING_SIZE);
+		return -EINVAL;
+	}
+
+	tcw->crc32_tfm = crypto_alloc_shash("crc32", 0, 0);
+	if (IS_ERR(tcw->crc32_tfm)) {
+		pr_err("Error initializing CRC32 in TCW; err=%ld\n",
+			PTR_ERR(tcw->crc32_tfm));
+		return PTR_ERR(tcw->crc32_tfm);
+	}
+
+	tcw->iv_seed = kzalloc(cd->iv_size, GFP_KERNEL);
+	tcw->whitening = kzalloc(TCW_WHITENING_SIZE, GFP_KERNEL);
+	if (!tcw->iv_seed || !tcw->whitening) {
+		crypt_iv_tcw_dtr(cd);
+		pr_err("Error allocating seed storage in TCW\n");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int crypt_iv_tcw_init(struct geniv_ctx_data *cd)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+	int key_offset = cd->key_size - cd->iv_size - TCW_WHITENING_SIZE;
+
+	memcpy(tcw->iv_seed, &cd->key[key_offset], cd->iv_size);
+	memcpy(tcw->whitening, &cd->key[key_offset + cd->iv_size],
+	       TCW_WHITENING_SIZE);
+
+	return 0;
+}
+
+static int crypt_iv_tcw_wipe(struct geniv_ctx_data *cd)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+
+	memset(tcw->iv_seed, 0, cd->iv_size);
+	memset(tcw->whitening, 0, TCW_WHITENING_SIZE);
+
+	return 0;
+}
+
+static int crypt_iv_tcw_whitening(struct geniv_ctx_data *cd,
+				  struct dm_crypt_request *dmreq, u8 *data)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+	__le64 sector = cpu_to_le64(dmreq->iv_sector);
+	u8 buf[TCW_WHITENING_SIZE];
+	int i, r;
+	SHASH_DESC_ON_STACK(desc, tcw->crc32_tfm);
+
+	/* xor whitening with sector number */
+	memcpy(buf, tcw->whitening, TCW_WHITENING_SIZE);
+	crypto_xor(buf, (u8 *)&sector, 8);
+	crypto_xor(&buf[8], (u8 *)&sector, 8);
+
+	/* calculate crc32 for every 32bit part and xor it */
+	desc->tfm = tcw->crc32_tfm;
+	desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+	for (i = 0; i < 4; i++) {
+		r = crypto_shash_init(desc);
+		if (r)
+			goto out;
+		r = crypto_shash_update(desc, &buf[i * 4], 4);
+		if (r)
+			goto out;
+		r = crypto_shash_final(desc, &buf[i * 4]);
+		if (r)
+			goto out;
+	}
+	crypto_xor(&buf[0], &buf[12], 4);
+	crypto_xor(&buf[4], &buf[8], 4);
+
+	/* apply whitening (8 bytes) to whole sector */
+	for (i = 0; i < ((1 << SECTOR_SHIFT) / 8); i++)
+		crypto_xor(data + i * 8, buf, 8);
+out:
+	memzero_explicit(buf, sizeof(buf));
+	return r;
+}
+
+static int crypt_iv_tcw_gen(struct geniv_ctx_data *cd, u8 *iv,
+			      struct dm_crypt_request *dmreq)
+{
+	struct geniv_tcw_private *tcw = &cd->iv_gen_private.tcw;
+	__le64 sector = cpu_to_le64(dmreq->iv_sector);
+	u8 *src;
+	int r = 0;
+
+	/* Remove whitening from ciphertext */
+	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE) {
+		src = kmap_atomic(sg_page(&dmreq->sg_in));
+		r = crypt_iv_tcw_whitening(cd, dmreq,
+					   src + dmreq->sg_in.offset);
+		kunmap_atomic(src);
+	}
+
+	/* Calculate IV */
+	memcpy(iv, tcw->iv_seed, cd->iv_size);
+	crypto_xor(iv, (u8 *)&sector, 8);
+	if (cd->iv_size > 8)
+		crypto_xor(&iv[8], (u8 *)&sector, cd->iv_size - 8);
+
+	return r;
+}
+
+static int crypt_iv_tcw_post(struct geniv_ctx_data *cd, u8 *iv,
+			     struct dm_crypt_request *dmreq)
+{
+	u8 *dst;
+	int r;
+
+	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE)
+		return 0;
+
+	/* Apply whitening on ciphertext */
+	dst = kmap_atomic(sg_page(&dmreq->sg_out));
+	r = crypt_iv_tcw_whitening(cd, dmreq, dst + dmreq->sg_out.offset);
+	kunmap_atomic(dst);
+
+	return r;
+}
+
+static struct geniv_operations crypt_iv_plain_ops = {
+	.generator = crypt_iv_plain_gen
+};
+
+static struct geniv_operations crypt_iv_plain64_ops = {
+	.generator = crypt_iv_plain64_gen
+};
+
+static struct geniv_operations crypt_iv_essiv_ops = {
+	.ctr       = crypt_iv_essiv_ctr,
+	.dtr       = crypt_iv_essiv_dtr,
+	.init      = crypt_iv_essiv_init,
+	.wipe      = crypt_iv_essiv_wipe,
+	.generator = crypt_iv_essiv_gen
+};
+
+static struct geniv_operations crypt_iv_benbi_ops = {
+	.ctr	   = crypt_iv_benbi_ctr,
+	.generator = crypt_iv_benbi_gen
+};
+
+static struct geniv_operations crypt_iv_null_ops = {
+	.generator = crypt_iv_null_gen
+};
+
+static struct geniv_operations crypt_iv_lmk_ops = {
+	.ctr	   = crypt_iv_lmk_ctr,
+	.dtr	   = crypt_iv_lmk_dtr,
+	.init	   = crypt_iv_lmk_init,
+	.wipe	   = crypt_iv_lmk_wipe,
+	.generator = crypt_iv_lmk_gen,
+	.post	   = crypt_iv_lmk_post
+};
+
+static struct geniv_operations crypt_iv_tcw_ops = {
+	.ctr	   = crypt_iv_tcw_ctr,
+	.dtr	   = crypt_iv_tcw_dtr,
+	.init	   = crypt_iv_tcw_init,
+	.wipe	   = crypt_iv_tcw_wipe,
+	.generator = crypt_iv_tcw_gen,
+	.post	   = crypt_iv_tcw_post
+};
+
+static int geniv_setkey_set(struct geniv_ctx_data *cd)
+{
+	int ret = 0;
+
+	if (cd->iv_gen_ops && cd->iv_gen_ops->init)
+		ret = cd->iv_gen_ops->init(cd);
+	return ret;
+}
+
+static int geniv_setkey_wipe(struct geniv_ctx_data *cd)
+{
+	int ret = 0;
+
+	if (cd->iv_gen_ops && cd->iv_gen_ops->wipe) {
+		ret = cd->iv_gen_ops->wipe(cd);
+		if (ret)
+			return ret;
+	}
+	return ret;
+}
+
+static int geniv_setkey_init_ctx(struct geniv_ctx_data *cd)
+{
+	int ret = -EINVAL;
+
+	pr_debug("IV Generation algorithm : %s\n", cd->ivmode);
+
+	if (cd->ivmode == NULL)
+		cd->iv_gen_ops = NULL;
+	else if (strcmp(cd->ivmode, "plain") == 0)
+		cd->iv_gen_ops = &crypt_iv_plain_ops;
+	else if (strcmp(cd->ivmode, "plain64") == 0)
+		cd->iv_gen_ops = &crypt_iv_plain64_ops;
+	else if (strcmp(cd->ivmode, "essiv") == 0)
+		cd->iv_gen_ops = &crypt_iv_essiv_ops;
+	else if (strcmp(cd->ivmode, "benbi") == 0)
+		cd->iv_gen_ops = &crypt_iv_benbi_ops;
+	else if (strcmp(cd->ivmode, "null") == 0)
+		cd->iv_gen_ops = &crypt_iv_null_ops;
+	else if (strcmp(cd->ivmode, "lmk") == 0)
+		cd->iv_gen_ops = &crypt_iv_lmk_ops;
+	else if (strcmp(cd->ivmode, "tcw") == 0) {
+		cd->iv_gen_ops = &crypt_iv_tcw_ops;
+		cd->key_parts += 2; /* IV + whitening */
+		cd->key_extra_size = cd->iv_size + TCW_WHITENING_SIZE;
+	} else {
+		ret = -EINVAL;
+		pr_err("Invalid IV mode %s\n", cd->ivmode);
+		goto end;
+	}
+
+	/* Allocate IV */
+	if (cd->iv_gen_ops && cd->iv_gen_ops->ctr) {
+		ret = cd->iv_gen_ops->ctr(cd);
+		if (ret < 0) {
+			pr_err("Error creating IV for %s\n", cd->ivmode);
+			goto end;
+		}
+	}
+
+	/* Initialize IV (set keys for ESSIV etc) */
+	if (cd->iv_gen_ops && cd->iv_gen_ops->init) {
+		ret = cd->iv_gen_ops->init(cd);
+		if (ret < 0)
+			pr_err("Error creating IV for %s\n", cd->ivmode);
+	}
+	ret = 0;
+end:
+	return ret;
+}
+
+static int crypto_geniv_set_ctx(struct crypto_skcipher *cipher,
+				void *newctx, unsigned int len)
+{
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(cipher);
+	/*
+	 * TODO:
+	 * Do we really need this API or can we append the context
+	 * 'struct geniv_ctx' to the cipher from dm-crypt and use
+	 * the same here.
+	 */
+	memcpy(ctx, (char *) newctx, len);
+	return geniv_setkey_init_ctx(&ctx->data);
+}
+
+static int crypto_geniv_setkey(struct crypto_skcipher *parent,
+				const u8 *key, unsigned int keylen)
+{
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(parent);
+	struct crypto_skcipher *child = ctx->child;
+	int err;
+
+	pr_debug("SETKEY Operation : %d\n", ctx->data.keyop);
+
+	switch (ctx->data.keyop) {
+	case SETKEY_OP_INIT:
+		err = geniv_setkey_init_ctx(&ctx->data);
+		break;
+	case SETKEY_OP_SET:
+		err = geniv_setkey_set(&ctx->data);
+		break;
+	case SETKEY_OP_WIPE:
+		err = geniv_setkey_wipe(&ctx->data);
+		break;
+	}
+
+	crypto_skcipher_clear_flags(child, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(child, crypto_skcipher_get_flags(parent) &
+					 CRYPTO_TFM_REQ_MASK);
+	err = crypto_skcipher_setkey(child, key, keylen);
+	crypto_skcipher_set_flags(parent, crypto_skcipher_get_flags(child) &
+					  CRYPTO_TFM_RES_MASK);
+	return err;
+}
+
+static struct dm_crypt_request *dmreq_of_req(struct crypto_skcipher *tfm,
+					     struct skcipher_request *req)
+{
+	struct geniv_ctx *ctx;
+
+	ctx = crypto_skcipher_ctx(tfm);
+	return (struct dm_crypt_request *) ((char *) req + ctx->data.dmoffset);
+}
+
+
+static void geniv_async_done(struct crypto_async_request *async_req, int error)
+{
+	struct skcipher_request *req = async_req->data;
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct geniv_ctx_data *cd = &ctx->data;
+	struct dm_crypt_request *dmreq = dmreq_of_req(tfm, req);
+	struct convert_context *cctx = dmreq->ctx;
+	unsigned long align = crypto_skcipher_alignmask(tfm);
+	struct crypto_geniv_req_ctx *rctx =
+		(void *) PTR_ALIGN((u8 *)skcipher_request_ctx(req), align + 1);
+	struct skcipher_request *subreq = &rctx->subreq;
+
+	/*
+	 * A request from crypto driver backlog is going to be processed now,
+	 * finish the completion and continue in crypt_convert().
+	 * (Callback will be called for the second time for this request.)
+	 */
+	if (error == -EINPROGRESS) {
+		complete(&cctx->restart);
+		return;
+	}
+
+	if (!error && cd->iv_gen_ops && cd->iv_gen_ops->post)
+		error = cd->iv_gen_ops->post(cd, req->iv, dmreq);
+
+	skcipher_request_set_callback(subreq, req->base.flags,
+				      req->base.complete, req->base.data);
+	skcipher_request_complete(req, error);
+}
+
+static inline int crypto_geniv_crypt(struct skcipher_request *req, bool encrypt)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct geniv_ctx_data *cd = &ctx->data;
+	struct crypto_skcipher *child = ctx->child;
+	struct dm_crypt_request *dmreq;
+	unsigned long align = crypto_skcipher_alignmask(tfm);
+	struct crypto_geniv_req_ctx *rctx =
+		(void *) PTR_ALIGN((u8 *)skcipher_request_ctx(req), align + 1);
+	struct skcipher_request *subreq = &rctx->subreq;
+	int ret = 0;
+	u8 *iv = req->iv;
+
+	dmreq = dmreq_of_req(tfm, req);
+
+	if (cd->iv_gen_ops)
+		ret = cd->iv_gen_ops->generator(cd, iv, dmreq);
+
+	if (ret < 0) {
+		pr_err("Error in generating IV ret: %d\n", ret);
+		goto end;
+	}
+
+	skcipher_request_set_tfm(subreq, child);
+	skcipher_request_set_callback(subreq, req->base.flags,
+				      geniv_async_done, req);
+	skcipher_request_set_crypt(subreq, req->src, req->dst,
+				   req->cryptlen, iv);
+
+	if (encrypt)
+		ret = crypto_skcipher_encrypt(subreq);
+	else
+		ret = crypto_skcipher_decrypt(subreq);
+
+	if (!ret && cd->iv_gen_ops && cd->iv_gen_ops->post)
+		ret = cd->iv_gen_ops->post(cd, iv, dmreq);
+
+end:
+	return ret;
+}
+
+static int crypto_geniv_encrypt(struct skcipher_request *req)
+{
+	return crypto_geniv_crypt(req, true);
+}
+
+static int crypto_geniv_decrypt(struct skcipher_request *req)
+{
+	return crypto_geniv_crypt(req, false);
+}
+
+static int crypto_geniv_init_tfm(struct crypto_skcipher *tfm)
+{
+	struct skcipher_instance *inst = skcipher_alg_instance(tfm);
+	struct crypto_skcipher_spawn *spawn = skcipher_instance_ctx(inst);
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct geniv_ctx_data *cd;
+	struct crypto_skcipher *cipher;
+	unsigned long align;
+	unsigned int reqsize, extrasize;
+
+	cipher = crypto_spawn_skcipher2(spawn);
+	if (IS_ERR(cipher))
+		return PTR_ERR(cipher);
+
+	ctx->child = cipher;
+
+	/* Setup the current cipher's request structure */
+	align = crypto_skcipher_alignmask(tfm);
+	align &= ~(crypto_tfm_ctx_alignment() - 1);
+	reqsize = align + sizeof(struct crypto_geniv_req_ctx) +
+		  crypto_skcipher_reqsize(cipher);
+	crypto_skcipher_set_reqsize(tfm, reqsize);
+
+	/* Set the current cipher's extra context parameters
+	 * Format of req structure, the context and the extra context
+	 * This is set by the caller of the cipher
+	 *   struct skcipher_request   --+
+	 *      context                  |   Request context
+	 *      padding                --+
+	 *   struct dm_crypt_request   --+
+	 *      padding                  |   Extra context
+	 *   IV                        --+
+	 */
+	cd = &ctx->data;
+	cd->dmoffset  = sizeof(struct skcipher_request);
+	cd->dmoffset += crypto_skcipher_reqsize(tfm);
+	cd->dmoffset  = ALIGN(cd->dmoffset,
+			__alignof__(struct dm_crypt_request));
+	extrasize = cd->dmoffset + sizeof(struct dm_crypt_request);
+
+	return 0;
+}
+
+static void crypto_geniv_exit_tfm(struct crypto_skcipher *tfm)
+{
+	struct geniv_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct geniv_ctx_data *cd = &ctx->data;
+
+	if (cd->iv_gen_ops && cd->iv_gen_ops->dtr)
+		cd->iv_gen_ops->dtr(cd);
+
+	crypto_free_skcipher(ctx->child);
+}
+
+static void crypto_geniv_free(struct skcipher_instance *inst)
+{
+	struct crypto_skcipher_spawn *spawn = skcipher_instance_ctx(inst);
+
+	crypto_drop_skcipher(spawn);
+	kfree(inst);
+}
+
+static int crypto_geniv_create(struct crypto_template *tmpl,
+				 struct rtattr **tb, char *algname)
+{
+	struct crypto_attr_type *algt;
+	struct skcipher_instance *inst;
+	struct skcipher_alg *alg;
+	struct crypto_skcipher_spawn *spawn;
+	const char *cipher_name;
+	int err;
+
+	algt = crypto_get_attr_type(tb);
+
+	if (IS_ERR(algt))
+		return PTR_ERR(algt);
+
+	if ((algt->type ^ CRYPTO_ALG_TYPE_SKCIPHER) & algt->mask)
+		return -EINVAL;
+
+	cipher_name = crypto_attr_alg_name(tb[1]);
+
+	if (IS_ERR(cipher_name))
+		return PTR_ERR(cipher_name);
+
+	inst = kzalloc(sizeof(*inst) + sizeof(*spawn), GFP_KERNEL);
+	if (!inst)
+		return -ENOMEM;
+
+	spawn = skcipher_instance_ctx(inst);
+
+	crypto_set_skcipher_spawn(spawn, skcipher_crypto_instance(inst));
+	err = crypto_grab_skcipher2(spawn, cipher_name, 0,
+				    crypto_requires_sync(algt->type,
+							 algt->mask));
+
+	if (err)
+		goto err_free_inst;
+
+	alg = crypto_spawn_skcipher_alg(spawn);
+
+	/* We only support 16-byte blocks. */
+	err = -EINVAL;
+	/*
+	 * if (crypto_skcipher_alg_ivsize(alg) != 16)
+	 *	goto err_drop_spawn;
+	 */
+
+	if (!is_power_of_2(alg->base.cra_blocksize))
+		goto err_drop_spawn;
+
+	err = -ENAMETOOLONG;
+	if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME, "%s(%s)",
+		     algname, alg->base.cra_name) >= CRYPTO_MAX_ALG_NAME)
+		goto err_drop_spawn;
+	if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
+		     "%s(%s)", algname, alg->base.cra_driver_name) >=
+	    CRYPTO_MAX_ALG_NAME)
+		goto err_drop_spawn;
+
+	inst->alg.base.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER;
+	inst->alg.base.cra_priority = alg->base.cra_priority;
+	inst->alg.base.cra_blocksize = alg->base.cra_blocksize;
+	inst->alg.base.cra_alignmask = alg->base.cra_alignmask;
+	inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC;
+	inst->alg.ivsize = alg->base.cra_blocksize;
+	inst->alg.chunksize = crypto_skcipher_alg_chunksize(alg);
+	inst->alg.min_keysize = crypto_skcipher_alg_min_keysize(alg);
+	inst->alg.max_keysize = crypto_skcipher_alg_max_keysize(alg);
+
+	inst->alg.setkey = crypto_geniv_setkey;
+	inst->alg.set_ctx = crypto_geniv_set_ctx;
+	inst->alg.encrypt = crypto_geniv_encrypt;
+	inst->alg.decrypt = crypto_geniv_decrypt;
+
+	inst->alg.base.cra_ctxsize = sizeof(struct geniv_ctx);
+
+	inst->alg.init = crypto_geniv_init_tfm;
+	inst->alg.exit = crypto_geniv_exit_tfm;
+
+	inst->free = crypto_geniv_free;
+
+	err = skcipher_register_instance(tmpl, inst);
+	if (err)
+		goto err_drop_spawn;
+
+out:
+	return err;
+
+err_drop_spawn:
+	crypto_drop_skcipher(spawn);
+err_free_inst:
+	kfree(inst);
+	goto out;
+}
+
+static int crypto_plain_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "plain");
+}
+
+static int crypto_plain64_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "plain64");
+}
+
+static int crypto_essiv_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "essiv");
+}
+
+static int crypto_benbi_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "benbi");
+}
+
+static int crypto_null_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "null");
+}
+
+static int crypto_lmk_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "lmk");
+}
+
+static int crypto_tcw_create(struct crypto_template *tmpl,
+				struct rtattr **tb)
+{
+	return crypto_geniv_create(tmpl, tb, "tcw");
+}
+
+static struct crypto_template crypto_plain_tmpl = {
+	.name   = "plain",
+	.create = crypto_plain_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_plain64_tmpl = {
+	.name   = "plain64",
+	.create = crypto_plain64_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_essiv_tmpl = {
+	.name   = "essiv",
+	.create = crypto_essiv_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_benbi_tmpl = {
+	.name   = "benbi",
+	.create = crypto_benbi_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_null_tmpl = {
+	.name   = "null",
+	.create = crypto_null_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_lmk_tmpl = {
+	.name   = "lmk",
+	.create = crypto_lmk_create,
+	.module = THIS_MODULE,
+};
+
+static struct crypto_template crypto_tcw_tmpl = {
+	.name   = "tcw",
+	.create = crypto_tcw_create,
+	.module = THIS_MODULE,
+};
+
+static int __init crypto_geniv_module_init(void)
+{
+	int err;
+
+	err = crypto_register_template(&crypto_plain_tmpl);
+	if (err)
+		goto out;
+
+	err = crypto_register_template(&crypto_plain64_tmpl);
+	if (err)
+		goto out_undo_plain;
+
+	err = crypto_register_template(&crypto_essiv_tmpl);
+	if (err)
+		goto out_undo_plain64;
+
+	err = crypto_register_template(&crypto_benbi_tmpl);
+	if (err)
+		goto out_undo_essiv;
+
+	err = crypto_register_template(&crypto_null_tmpl);
+	if (err)
+		goto out_undo_benbi;
+
+	err = crypto_register_template(&crypto_lmk_tmpl);
+	if (err)
+		goto out_undo_null;
+
+	err = crypto_register_template(&crypto_tcw_tmpl);
+	if (!err)
+		goto out;
+
+	crypto_unregister_template(&crypto_lmk_tmpl);
+out_undo_null:
+	crypto_unregister_template(&crypto_null_tmpl);
+out_undo_benbi:
+	crypto_unregister_template(&crypto_benbi_tmpl);
+out_undo_essiv:
+	crypto_unregister_template(&crypto_essiv_tmpl);
+out_undo_plain64:
+	crypto_unregister_template(&crypto_plain64_tmpl);
+out_undo_plain:
+	crypto_unregister_template(&crypto_plain_tmpl);
+out:
+	return err;
+}
+
+static void __exit crypto_geniv_module_exit(void)
+{
+	crypto_unregister_template(&crypto_plain_tmpl);
+	crypto_unregister_template(&crypto_plain64_tmpl);
+	crypto_unregister_template(&crypto_essiv_tmpl);
+	crypto_unregister_template(&crypto_benbi_tmpl);
+	crypto_unregister_template(&crypto_null_tmpl);
+	crypto_unregister_template(&crypto_lmk_tmpl);
+	crypto_unregister_template(&crypto_tcw_tmpl);
+}
+
+module_init(crypto_geniv_module_init);
+module_exit(crypto_geniv_module_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("IV generation algorithms");
+MODULE_ALIAS_CRYPTO("geniv");
+
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index a276883..05c2677 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -29,26 +29,13 @@
 #include <crypto/md5.h>
 #include <crypto/algapi.h>
 #include <crypto/skcipher.h>
+#include <crypto/geniv.h>
 
 #include <linux/device-mapper.h>
 
 #define DM_MSG_PREFIX "crypt"
 
 /*
- * context holding the current state of a multi-part conversion
- */
-struct convert_context {
-	struct completion restart;
-	struct bio *bio_in;
-	struct bio *bio_out;
-	struct bvec_iter iter_in;
-	struct bvec_iter iter_out;
-	sector_t cc_sector;
-	atomic_t cc_pending;
-	struct skcipher_request *req;
-};
-
-/*
  * per bio private data
  */
 struct dm_crypt_io {
@@ -65,13 +52,6 @@ struct dm_crypt_io {
 	struct rb_node rb_node;
 } CRYPTO_MINALIGN_ATTR;
 
-struct dm_crypt_request {
-	struct convert_context *ctx;
-	struct scatterlist sg_in;
-	struct scatterlist sg_out;
-	sector_t iv_sector;
-};
-
 struct crypt_config;
 
 struct crypt_iv_operations {
@@ -141,7 +121,6 @@ struct crypt_config {
 	char *cipher;
 	char *cipher_string;
 
-	struct crypt_iv_operations *iv_gen_ops;
 	union {
 		struct iv_essiv_private essiv;
 		struct iv_benbi_private benbi;
@@ -241,567 +220,6 @@ static struct crypto_skcipher *any_tfm(struct crypt_config *cc)
  * http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/454
  */
 
-static int crypt_iv_plain_gen(struct crypt_config *cc, u8 *iv,
-			      struct dm_crypt_request *dmreq)
-{
-	memset(iv, 0, cc->iv_size);
-	*(__le32 *)iv = cpu_to_le32(dmreq->iv_sector & 0xffffffff);
-
-	return 0;
-}
-
-static int crypt_iv_plain64_gen(struct crypt_config *cc, u8 *iv,
-				struct dm_crypt_request *dmreq)
-{
-	memset(iv, 0, cc->iv_size);
-	*(__le64 *)iv = cpu_to_le64(dmreq->iv_sector);
-
-	return 0;
-}
-
-/* Initialise ESSIV - compute salt but no local memory allocations */
-static int crypt_iv_essiv_init(struct crypt_config *cc)
-{
-	struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
-	AHASH_REQUEST_ON_STACK(req, essiv->hash_tfm);
-	struct scatterlist sg;
-	struct crypto_cipher *essiv_tfm;
-	int err;
-
-	sg_init_one(&sg, cc->key, cc->key_size);
-	ahash_request_set_tfm(req, essiv->hash_tfm);
-	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
-	ahash_request_set_crypt(req, &sg, essiv->salt, cc->key_size);
-
-	err = crypto_ahash_digest(req);
-	ahash_request_zero(req);
-	if (err)
-		return err;
-
-	essiv_tfm = cc->iv_private;
-
-	err = crypto_cipher_setkey(essiv_tfm, essiv->salt,
-			    crypto_ahash_digestsize(essiv->hash_tfm));
-	if (err)
-		return err;
-
-	return 0;
-}
-
-/* Wipe salt and reset key derived from volume key */
-static int crypt_iv_essiv_wipe(struct crypt_config *cc)
-{
-	struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
-	unsigned salt_size = crypto_ahash_digestsize(essiv->hash_tfm);
-	struct crypto_cipher *essiv_tfm;
-	int r, err = 0;
-
-	memset(essiv->salt, 0, salt_size);
-
-	essiv_tfm = cc->iv_private;
-	r = crypto_cipher_setkey(essiv_tfm, essiv->salt, salt_size);
-	if (r)
-		err = r;
-
-	return err;
-}
-
-/* Set up per cpu cipher state */
-static struct crypto_cipher *setup_essiv_cpu(struct crypt_config *cc,
-					     struct dm_target *ti,
-					     u8 *salt, unsigned saltsize)
-{
-	struct crypto_cipher *essiv_tfm;
-	int err;
-
-	/* Setup the essiv_tfm with the given salt */
-	essiv_tfm = crypto_alloc_cipher(cc->cipher, 0, CRYPTO_ALG_ASYNC);
-	if (IS_ERR(essiv_tfm)) {
-		ti->error = "Error allocating crypto tfm for ESSIV";
-		return essiv_tfm;
-	}
-
-	if (crypto_cipher_blocksize(essiv_tfm) !=
-	    crypto_skcipher_ivsize(any_tfm(cc))) {
-		ti->error = "Block size of ESSIV cipher does "
-			    "not match IV size of block cipher";
-		crypto_free_cipher(essiv_tfm);
-		return ERR_PTR(-EINVAL);
-	}
-
-	err = crypto_cipher_setkey(essiv_tfm, salt, saltsize);
-	if (err) {
-		ti->error = "Failed to set key for ESSIV cipher";
-		crypto_free_cipher(essiv_tfm);
-		return ERR_PTR(err);
-	}
-
-	return essiv_tfm;
-}
-
-static void crypt_iv_essiv_dtr(struct crypt_config *cc)
-{
-	struct crypto_cipher *essiv_tfm;
-	struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
-
-	crypto_free_ahash(essiv->hash_tfm);
-	essiv->hash_tfm = NULL;
-
-	kzfree(essiv->salt);
-	essiv->salt = NULL;
-
-	essiv_tfm = cc->iv_private;
-
-	if (essiv_tfm)
-		crypto_free_cipher(essiv_tfm);
-
-	cc->iv_private = NULL;
-}
-
-static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti,
-			      const char *opts)
-{
-	struct crypto_cipher *essiv_tfm = NULL;
-	struct crypto_ahash *hash_tfm = NULL;
-	u8 *salt = NULL;
-	int err;
-
-	if (!opts) {
-		ti->error = "Digest algorithm missing for ESSIV mode";
-		return -EINVAL;
-	}
-
-	/* Allocate hash algorithm */
-	hash_tfm = crypto_alloc_ahash(opts, 0, CRYPTO_ALG_ASYNC);
-	if (IS_ERR(hash_tfm)) {
-		ti->error = "Error initializing ESSIV hash";
-		err = PTR_ERR(hash_tfm);
-		goto bad;
-	}
-
-	salt = kzalloc(crypto_ahash_digestsize(hash_tfm), GFP_KERNEL);
-	if (!salt) {
-		ti->error = "Error kmallocing salt storage in ESSIV";
-		err = -ENOMEM;
-		goto bad;
-	}
-
-	cc->iv_gen_private.essiv.salt = salt;
-	cc->iv_gen_private.essiv.hash_tfm = hash_tfm;
-
-	essiv_tfm = setup_essiv_cpu(cc, ti, salt,
-				crypto_ahash_digestsize(hash_tfm));
-	if (IS_ERR(essiv_tfm)) {
-		crypt_iv_essiv_dtr(cc);
-		return PTR_ERR(essiv_tfm);
-	}
-	cc->iv_private = essiv_tfm;
-
-	return 0;
-
-bad:
-	if (hash_tfm && !IS_ERR(hash_tfm))
-		crypto_free_ahash(hash_tfm);
-	kfree(salt);
-	return err;
-}
-
-static int crypt_iv_essiv_gen(struct crypt_config *cc, u8 *iv,
-			      struct dm_crypt_request *dmreq)
-{
-	struct crypto_cipher *essiv_tfm = cc->iv_private;
-
-	memset(iv, 0, cc->iv_size);
-	*(__le64 *)iv = cpu_to_le64(dmreq->iv_sector);
-	crypto_cipher_encrypt_one(essiv_tfm, iv, iv);
-
-	return 0;
-}
-
-static int crypt_iv_benbi_ctr(struct crypt_config *cc, struct dm_target *ti,
-			      const char *opts)
-{
-	unsigned bs = crypto_skcipher_blocksize(any_tfm(cc));
-	int log = ilog2(bs);
-
-	/* we need to calculate how far we must shift the sector count
-	 * to get the cipher block count, we use this shift in _gen */
-
-	if (1 << log != bs) {
-		ti->error = "cypher blocksize is not a power of 2";
-		return -EINVAL;
-	}
-
-	if (log > 9) {
-		ti->error = "cypher blocksize is > 512";
-		return -EINVAL;
-	}
-
-	cc->iv_gen_private.benbi.shift = 9 - log;
-
-	return 0;
-}
-
-static void crypt_iv_benbi_dtr(struct crypt_config *cc)
-{
-}
-
-static int crypt_iv_benbi_gen(struct crypt_config *cc, u8 *iv,
-			      struct dm_crypt_request *dmreq)
-{
-	__be64 val;
-
-	memset(iv, 0, cc->iv_size - sizeof(u64)); /* rest is cleared below */
-
-	val = cpu_to_be64(((u64)dmreq->iv_sector << cc->iv_gen_private.benbi.shift) + 1);
-	put_unaligned(val, (__be64 *)(iv + cc->iv_size - sizeof(u64)));
-
-	return 0;
-}
-
-static int crypt_iv_null_gen(struct crypt_config *cc, u8 *iv,
-			     struct dm_crypt_request *dmreq)
-{
-	memset(iv, 0, cc->iv_size);
-
-	return 0;
-}
-
-static void crypt_iv_lmk_dtr(struct crypt_config *cc)
-{
-	struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk;
-
-	if (lmk->hash_tfm && !IS_ERR(lmk->hash_tfm))
-		crypto_free_shash(lmk->hash_tfm);
-	lmk->hash_tfm = NULL;
-
-	kzfree(lmk->seed);
-	lmk->seed = NULL;
-}
-
-static int crypt_iv_lmk_ctr(struct crypt_config *cc, struct dm_target *ti,
-			    const char *opts)
-{
-	struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk;
-
-	lmk->hash_tfm = crypto_alloc_shash("md5", 0, 0);
-	if (IS_ERR(lmk->hash_tfm)) {
-		ti->error = "Error initializing LMK hash";
-		return PTR_ERR(lmk->hash_tfm);
-	}
-
-	/* No seed in LMK version 2 */
-	if (cc->key_parts == cc->tfms_count) {
-		lmk->seed = NULL;
-		return 0;
-	}
-
-	lmk->seed = kzalloc(LMK_SEED_SIZE, GFP_KERNEL);
-	if (!lmk->seed) {
-		crypt_iv_lmk_dtr(cc);
-		ti->error = "Error kmallocing seed storage in LMK";
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static int crypt_iv_lmk_init(struct crypt_config *cc)
-{
-	struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk;
-	int subkey_size = cc->key_size / cc->key_parts;
-
-	/* LMK seed is on the position of LMK_KEYS + 1 key */
-	if (lmk->seed)
-		memcpy(lmk->seed, cc->key + (cc->tfms_count * subkey_size),
-		       crypto_shash_digestsize(lmk->hash_tfm));
-
-	return 0;
-}
-
-static int crypt_iv_lmk_wipe(struct crypt_config *cc)
-{
-	struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk;
-
-	if (lmk->seed)
-		memset(lmk->seed, 0, LMK_SEED_SIZE);
-
-	return 0;
-}
-
-static int crypt_iv_lmk_one(struct crypt_config *cc, u8 *iv,
-			    struct dm_crypt_request *dmreq,
-			    u8 *data)
-{
-	struct iv_lmk_private *lmk = &cc->iv_gen_private.lmk;
-	SHASH_DESC_ON_STACK(desc, lmk->hash_tfm);
-	struct md5_state md5state;
-	__le32 buf[4];
-	int i, r;
-
-	desc->tfm = lmk->hash_tfm;
-	desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
-
-	r = crypto_shash_init(desc);
-	if (r)
-		return r;
-
-	if (lmk->seed) {
-		r = crypto_shash_update(desc, lmk->seed, LMK_SEED_SIZE);
-		if (r)
-			return r;
-	}
-
-	/* Sector is always 512B, block size 16, add data of blocks 1-31 */
-	r = crypto_shash_update(desc, data + 16, 16 * 31);
-	if (r)
-		return r;
-
-	/* Sector is cropped to 56 bits here */
-	buf[0] = cpu_to_le32(dmreq->iv_sector & 0xFFFFFFFF);
-	buf[1] = cpu_to_le32((((u64)dmreq->iv_sector >> 32) & 0x00FFFFFF) | 0x80000000);
-	buf[2] = cpu_to_le32(4024);
-	buf[3] = 0;
-	r = crypto_shash_update(desc, (u8 *)buf, sizeof(buf));
-	if (r)
-		return r;
-
-	/* No MD5 padding here */
-	r = crypto_shash_export(desc, &md5state);
-	if (r)
-		return r;
-
-	for (i = 0; i < MD5_HASH_WORDS; i++)
-		__cpu_to_le32s(&md5state.hash[i]);
-	memcpy(iv, &md5state.hash, cc->iv_size);
-
-	return 0;
-}
-
-static int crypt_iv_lmk_gen(struct crypt_config *cc, u8 *iv,
-			    struct dm_crypt_request *dmreq)
-{
-	u8 *src;
-	int r = 0;
-
-	if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) {
-		src = kmap_atomic(sg_page(&dmreq->sg_in));
-		r = crypt_iv_lmk_one(cc, iv, dmreq, src + dmreq->sg_in.offset);
-		kunmap_atomic(src);
-	} else
-		memset(iv, 0, cc->iv_size);
-
-	return r;
-}
-
-static int crypt_iv_lmk_post(struct crypt_config *cc, u8 *iv,
-			     struct dm_crypt_request *dmreq)
-{
-	u8 *dst;
-	int r;
-
-	if (bio_data_dir(dmreq->ctx->bio_in) == WRITE)
-		return 0;
-
-	dst = kmap_atomic(sg_page(&dmreq->sg_out));
-	r = crypt_iv_lmk_one(cc, iv, dmreq, dst + dmreq->sg_out.offset);
-
-	/* Tweak the first block of plaintext sector */
-	if (!r)
-		crypto_xor(dst + dmreq->sg_out.offset, iv, cc->iv_size);
-
-	kunmap_atomic(dst);
-	return r;
-}
-
-static void crypt_iv_tcw_dtr(struct crypt_config *cc)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-
-	kzfree(tcw->iv_seed);
-	tcw->iv_seed = NULL;
-	kzfree(tcw->whitening);
-	tcw->whitening = NULL;
-
-	if (tcw->crc32_tfm && !IS_ERR(tcw->crc32_tfm))
-		crypto_free_shash(tcw->crc32_tfm);
-	tcw->crc32_tfm = NULL;
-}
-
-static int crypt_iv_tcw_ctr(struct crypt_config *cc, struct dm_target *ti,
-			    const char *opts)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-
-	if (cc->key_size <= (cc->iv_size + TCW_WHITENING_SIZE)) {
-		ti->error = "Wrong key size for TCW";
-		return -EINVAL;
-	}
-
-	tcw->crc32_tfm = crypto_alloc_shash("crc32", 0, 0);
-	if (IS_ERR(tcw->crc32_tfm)) {
-		ti->error = "Error initializing CRC32 in TCW";
-		return PTR_ERR(tcw->crc32_tfm);
-	}
-
-	tcw->iv_seed = kzalloc(cc->iv_size, GFP_KERNEL);
-	tcw->whitening = kzalloc(TCW_WHITENING_SIZE, GFP_KERNEL);
-	if (!tcw->iv_seed || !tcw->whitening) {
-		crypt_iv_tcw_dtr(cc);
-		ti->error = "Error allocating seed storage in TCW";
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static int crypt_iv_tcw_init(struct crypt_config *cc)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-	int key_offset = cc->key_size - cc->iv_size - TCW_WHITENING_SIZE;
-
-	memcpy(tcw->iv_seed, &cc->key[key_offset], cc->iv_size);
-	memcpy(tcw->whitening, &cc->key[key_offset + cc->iv_size],
-	       TCW_WHITENING_SIZE);
-
-	return 0;
-}
-
-static int crypt_iv_tcw_wipe(struct crypt_config *cc)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-
-	memset(tcw->iv_seed, 0, cc->iv_size);
-	memset(tcw->whitening, 0, TCW_WHITENING_SIZE);
-
-	return 0;
-}
-
-static int crypt_iv_tcw_whitening(struct crypt_config *cc,
-				  struct dm_crypt_request *dmreq,
-				  u8 *data)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-	__le64 sector = cpu_to_le64(dmreq->iv_sector);
-	u8 buf[TCW_WHITENING_SIZE];
-	SHASH_DESC_ON_STACK(desc, tcw->crc32_tfm);
-	int i, r;
-
-	/* xor whitening with sector number */
-	memcpy(buf, tcw->whitening, TCW_WHITENING_SIZE);
-	crypto_xor(buf, (u8 *)&sector, 8);
-	crypto_xor(&buf[8], (u8 *)&sector, 8);
-
-	/* calculate crc32 for every 32bit part and xor it */
-	desc->tfm = tcw->crc32_tfm;
-	desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
-	for (i = 0; i < 4; i++) {
-		r = crypto_shash_init(desc);
-		if (r)
-			goto out;
-		r = crypto_shash_update(desc, &buf[i * 4], 4);
-		if (r)
-			goto out;
-		r = crypto_shash_final(desc, &buf[i * 4]);
-		if (r)
-			goto out;
-	}
-	crypto_xor(&buf[0], &buf[12], 4);
-	crypto_xor(&buf[4], &buf[8], 4);
-
-	/* apply whitening (8 bytes) to whole sector */
-	for (i = 0; i < ((1 << SECTOR_SHIFT) / 8); i++)
-		crypto_xor(data + i * 8, buf, 8);
-out:
-	memzero_explicit(buf, sizeof(buf));
-	return r;
-}
-
-static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 *iv,
-			    struct dm_crypt_request *dmreq)
-{
-	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
-	__le64 sector = cpu_to_le64(dmreq->iv_sector);
-	u8 *src;
-	int r = 0;
-
-	/* Remove whitening from ciphertext */
-	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE) {
-		src = kmap_atomic(sg_page(&dmreq->sg_in));
-		r = crypt_iv_tcw_whitening(cc, dmreq, src + dmreq->sg_in.offset);
-		kunmap_atomic(src);
-	}
-
-	/* Calculate IV */
-	memcpy(iv, tcw->iv_seed, cc->iv_size);
-	crypto_xor(iv, (u8 *)&sector, 8);
-	if (cc->iv_size > 8)
-		crypto_xor(&iv[8], (u8 *)&sector, cc->iv_size - 8);
-
-	return r;
-}
-
-static int crypt_iv_tcw_post(struct crypt_config *cc, u8 *iv,
-			     struct dm_crypt_request *dmreq)
-{
-	u8 *dst;
-	int r;
-
-	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE)
-		return 0;
-
-	/* Apply whitening on ciphertext */
-	dst = kmap_atomic(sg_page(&dmreq->sg_out));
-	r = crypt_iv_tcw_whitening(cc, dmreq, dst + dmreq->sg_out.offset);
-	kunmap_atomic(dst);
-
-	return r;
-}
-
-static struct crypt_iv_operations crypt_iv_plain_ops = {
-	.generator = crypt_iv_plain_gen
-};
-
-static struct crypt_iv_operations crypt_iv_plain64_ops = {
-	.generator = crypt_iv_plain64_gen
-};
-
-static struct crypt_iv_operations crypt_iv_essiv_ops = {
-	.ctr       = crypt_iv_essiv_ctr,
-	.dtr       = crypt_iv_essiv_dtr,
-	.init      = crypt_iv_essiv_init,
-	.wipe      = crypt_iv_essiv_wipe,
-	.generator = crypt_iv_essiv_gen
-};
-
-static struct crypt_iv_operations crypt_iv_benbi_ops = {
-	.ctr	   = crypt_iv_benbi_ctr,
-	.dtr	   = crypt_iv_benbi_dtr,
-	.generator = crypt_iv_benbi_gen
-};
-
-static struct crypt_iv_operations crypt_iv_null_ops = {
-	.generator = crypt_iv_null_gen
-};
-
-static struct crypt_iv_operations crypt_iv_lmk_ops = {
-	.ctr	   = crypt_iv_lmk_ctr,
-	.dtr	   = crypt_iv_lmk_dtr,
-	.init	   = crypt_iv_lmk_init,
-	.wipe	   = crypt_iv_lmk_wipe,
-	.generator = crypt_iv_lmk_gen,
-	.post	   = crypt_iv_lmk_post
-};
-
-static struct crypt_iv_operations crypt_iv_tcw_ops = {
-	.ctr	   = crypt_iv_tcw_ctr,
-	.dtr	   = crypt_iv_tcw_dtr,
-	.init	   = crypt_iv_tcw_init,
-	.wipe	   = crypt_iv_tcw_wipe,
-	.generator = crypt_iv_tcw_gen,
-	.post	   = crypt_iv_tcw_post
-};
-
 static void crypt_convert_init(struct crypt_config *cc,
 			       struct convert_context *ctx,
 			       struct bio *bio_out, struct bio *bio_in,
@@ -862,12 +280,6 @@ static int crypt_convert_block(struct crypt_config *cc,
 	bio_advance_iter(ctx->bio_in, &ctx->iter_in, 1 << SECTOR_SHIFT);
 	bio_advance_iter(ctx->bio_out, &ctx->iter_out, 1 << SECTOR_SHIFT);
 
-	if (cc->iv_gen_ops) {
-		r = cc->iv_gen_ops->generator(cc, iv, dmreq);
-		if (r < 0)
-			return r;
-	}
-
 	skcipher_request_set_crypt(req, &dmreq->sg_in, &dmreq->sg_out,
 				   1 << SECTOR_SHIFT, iv);
 
@@ -876,9 +288,6 @@ static int crypt_convert_block(struct crypt_config *cc,
 	else
 		r = crypto_skcipher_decrypt(req);
 
-	if (!r && cc->iv_gen_ops && cc->iv_gen_ops->post)
-		r = cc->iv_gen_ops->post(cc, iv, dmreq);
-
 	return r;
 }
 
@@ -1363,19 +772,6 @@ static void kcryptd_async_done(struct crypto_async_request *async_req,
 	struct dm_crypt_io *io = container_of(ctx, struct dm_crypt_io, ctx);
 	struct crypt_config *cc = io->cc;
 
-	/*
-	 * A request from crypto driver backlog is going to be processed now,
-	 * finish the completion and continue in crypt_convert().
-	 * (Callback will be called for the second time for this request.)
-	 */
-	if (error == -EINPROGRESS) {
-		complete(&ctx->restart);
-		return;
-	}
-
-	if (!error && cc->iv_gen_ops && cc->iv_gen_ops->post)
-		error = cc->iv_gen_ops->post(cc, iv_of_dmreq(cc, dmreq), dmreq);
-
 	if (error < 0)
 		io->error = -EIO;
 
@@ -1517,6 +913,39 @@ static int crypt_set_key(struct crypt_config *cc, char *key)
 	return r;
 }
 
+static void crypt_init_context(struct dm_target *ti, char *key,
+			      struct crypto_skcipher *tfm,
+			      char *ivmode, char *ivopts)
+{
+	struct crypt_config *cc = ti->private;
+	struct geniv_ctx *ctx = (struct geniv_ctx *) (tfm + 1);
+
+	ctx->data.iv_size = crypto_skcipher_ivsize(tfm);
+	ctx->data.cipher = cc->cipher;
+	ctx->data.ivmode = ivmode;
+	ctx->data.tfms_count = cc->tfms_count;
+	ctx->data.tfm = tfm;
+	ctx->data.ivopts = ivopts;
+	ctx->data.key_size = cc->key_size;
+	ctx->data.key_parts = cc->key_parts;
+	ctx->data.key = cc->key;
+}
+
+static int crypt_init_all_cpus(struct dm_target *ti, char *key,
+			       char *ivmode, char *ivopts)
+{
+	struct crypt_config *cc = ti->private;
+	int ret, i;
+
+	for (i = 0; i < cc->tfms_count; i++)
+		crypt_init_context(ti, key, cc->tfms[i], ivmode, ivopts);
+
+	ret = crypt_set_key(cc, key);
+	if (ret < 0)
+		ti->error = "Error decoding and setting key";
+	return ret;
+}
+
 static int crypt_wipe_key(struct crypt_config *cc)
 {
 	clear_bit(DM_CRYPT_KEY_VALID, &cc->flags);
@@ -1550,9 +979,6 @@ static void crypt_dtr(struct dm_target *ti)
 	mempool_destroy(cc->page_pool);
 	mempool_destroy(cc->req_pool);
 
-	if (cc->iv_gen_ops && cc->iv_gen_ops->dtr)
-		cc->iv_gen_ops->dtr(cc);
-
 	if (cc->dev)
 		dm_put_device(ti, cc->dev);
 
@@ -1629,8 +1055,14 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 	if (!cipher_api)
 		goto bad_mem;
 
-	ret = snprintf(cipher_api, CRYPTO_MAX_ALG_NAME,
-		       "%s(%s)", chainmode, cipher);
+create_cipher:
+	/* Call underlying cipher directly if it does not support iv */
+	if (ivmode)
+		ret = snprintf(cipher_api, CRYPTO_MAX_ALG_NAME, "%s(%s(%s))",
+				ivmode, chainmode, cipher);
+	else
+		ret = snprintf(cipher_api, CRYPTO_MAX_ALG_NAME, "%s(%s)",
+				chainmode, cipher);
 	if (ret < 0) {
 		kfree(cipher_api);
 		goto bad_mem;
@@ -1652,23 +1084,10 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 	else if (ivmode) {
 		DMWARN("Selected cipher does not support IVs");
 		ivmode = NULL;
+		goto create_cipher;
 	}
 
-	/* Choose ivmode, see comments at iv code. */
-	if (ivmode == NULL)
-		cc->iv_gen_ops = NULL;
-	else if (strcmp(ivmode, "plain") == 0)
-		cc->iv_gen_ops = &crypt_iv_plain_ops;
-	else if (strcmp(ivmode, "plain64") == 0)
-		cc->iv_gen_ops = &crypt_iv_plain64_ops;
-	else if (strcmp(ivmode, "essiv") == 0)
-		cc->iv_gen_ops = &crypt_iv_essiv_ops;
-	else if (strcmp(ivmode, "benbi") == 0)
-		cc->iv_gen_ops = &crypt_iv_benbi_ops;
-	else if (strcmp(ivmode, "null") == 0)
-		cc->iv_gen_ops = &crypt_iv_null_ops;
-	else if (strcmp(ivmode, "lmk") == 0) {
-		cc->iv_gen_ops = &crypt_iv_lmk_ops;
+	if (strcmp(ivmode, "lmk") == 0) {
 		/*
 		 * Version 2 and 3 is recognised according
 		 * to length of provided multi-key string.
@@ -1680,39 +1099,14 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 			cc->key_extra_size = cc->key_size / cc->key_parts;
 		}
 	} else if (strcmp(ivmode, "tcw") == 0) {
-		cc->iv_gen_ops = &crypt_iv_tcw_ops;
 		cc->key_parts += 2; /* IV + whitening */
 		cc->key_extra_size = cc->iv_size + TCW_WHITENING_SIZE;
-	} else {
-		ret = -EINVAL;
-		ti->error = "Invalid IV mode";
-		goto bad;
 	}
 
 	/* Initialize and set key */
-	ret = crypt_set_key(cc, key);
-	if (ret < 0) {
-		ti->error = "Error decoding and setting key";
+	ret = crypt_init_all_cpus(ti, key, ivmode, ivopts);
+	if (ret < 0)
 		goto bad;
-	}
-
-	/* Allocate IV */
-	if (cc->iv_gen_ops && cc->iv_gen_ops->ctr) {
-		ret = cc->iv_gen_ops->ctr(cc, ti, ivopts);
-		if (ret < 0) {
-			ti->error = "Error creating IV";
-			goto bad;
-		}
-	}
-
-	/* Initialize IV (set keys for ESSIV etc) */
-	if (cc->iv_gen_ops && cc->iv_gen_ops->init) {
-		ret = cc->iv_gen_ops->init(cc);
-		if (ret < 0) {
-			ti->error = "Error initialising IV";
-			goto bad;
-		}
-	}
 
 	ret = 0;
 bad:
@@ -2007,6 +1401,18 @@ static void crypt_resume(struct dm_target *ti)
 	clear_bit(DM_CRYPT_SUSPENDED, &cc->flags);
 }
 
+static void crypt_setkey_op_allcpus(struct crypt_config *cc,
+				    enum setkey_op keyop)
+{
+	int i;
+	struct geniv_ctx *ctx;
+
+	for (i = 0; i < cc->tfms_count; i++) {
+		ctx = (struct geniv_ctx *) (cc->tfms[i] + 1);
+		ctx->data.keyop = keyop;
+	}
+}
+
 /* Message interface
  *	key set <key>
  *	key wipe
@@ -2014,7 +1420,6 @@ static void crypt_resume(struct dm_target *ti)
 static int crypt_message(struct dm_target *ti, unsigned argc, char **argv)
 {
 	struct crypt_config *cc = ti->private;
-	int ret = -EINVAL;
 
 	if (argc < 2)
 		goto error;
@@ -2025,19 +1430,11 @@ static int crypt_message(struct dm_target *ti, unsigned argc, char **argv)
 			return -EINVAL;
 		}
 		if (argc == 3 && !strcasecmp(argv[1], "set")) {
-			ret = crypt_set_key(cc, argv[2]);
-			if (ret)
-				return ret;
-			if (cc->iv_gen_ops && cc->iv_gen_ops->init)
-				ret = cc->iv_gen_ops->init(cc);
-			return ret;
+			crypt_setkey_op_allcpus(cc, SETKEY_OP_SET);
+			return crypt_set_key(cc, argv[2]);
 		}
 		if (argc == 2 && !strcasecmp(argv[1], "wipe")) {
-			if (cc->iv_gen_ops && cc->iv_gen_ops->wipe) {
-				ret = cc->iv_gen_ops->wipe(cc);
-				if (ret)
-					return ret;
-			}
+			crypt_setkey_op_allcpus(cc, SETKEY_OP_WIPE);
 			return crypt_wipe_key(cc);
 		}
 	}
diff --git a/include/crypto/geniv.h b/include/crypto/geniv.h
new file mode 100644
index 0000000..1325843
--- /dev/null
+++ b/include/crypto/geniv.h
@@ -0,0 +1,109 @@
+/*
+ * geniv: common data structures for IV generation algorithms
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+#ifndef _CRYPTO_GENIV_
+#define _CRYPTO_GENIV_
+
+#define SECTOR_SHIFT            9
+
+struct geniv_essiv_private {
+	struct crypto_ahash *hash_tfm;
+	u8 *salt;
+};
+
+struct geniv_benbi_private {
+	int shift;
+};
+
+#define LMK_SEED_SIZE 64 /* hash + 0 */
+struct geniv_lmk_private {
+	struct crypto_shash *hash_tfm;
+	u8 *seed;
+};
+
+#define TCW_WHITENING_SIZE 16
+struct geniv_tcw_private {
+	struct crypto_shash *crc32_tfm;
+	u8 *iv_seed;
+	u8 *whitening;
+};
+
+enum setkey_op {
+	SETKEY_OP_INIT,
+	SETKEY_OP_SET,
+	SETKEY_OP_WIPE,
+};
+
+/*
+ * context holding the current state of a multi-part conversion
+ */
+struct convert_context {
+	struct completion restart;
+	struct bio *bio_in;
+	struct bio *bio_out;
+	struct bvec_iter iter_in;
+	struct bvec_iter iter_out;
+	sector_t cc_sector;
+	atomic_t cc_pending;
+	struct skcipher_request *req;
+};
+
+struct dm_crypt_request {
+	struct convert_context *ctx;
+	struct scatterlist sg_in;
+	struct scatterlist sg_out;
+	sector_t iv_sector;
+};
+
+
+struct geniv_ctx_data;
+
+struct geniv_operations {
+	int (*ctr)(struct geniv_ctx_data *cd);
+	void (*dtr)(struct geniv_ctx_data *cd);
+	int (*init)(struct geniv_ctx_data *cd);
+	int (*wipe)(struct geniv_ctx_data *cd);
+	int (*generator)(struct geniv_ctx_data *req, u8 *iv,
+			 struct dm_crypt_request *dmreq);
+	int (*post)(struct geniv_ctx_data *cd, u8 *iv,
+			 struct dm_crypt_request *dmreq);
+};
+
+struct geniv_ctx_data {
+	unsigned int tfms_count;
+	char *ivmode;
+	unsigned int iv_size;
+	char *ivopts;
+	unsigned int dmoffset;
+
+	char *cipher;
+	struct geniv_operations *iv_gen_ops;
+	union {
+		struct geniv_essiv_private essiv;
+		struct geniv_benbi_private benbi;
+		struct geniv_lmk_private lmk;
+		struct geniv_tcw_private tcw;
+	} iv_gen_private;
+	void *iv_private;
+	struct crypto_skcipher *tfm;
+	unsigned int key_size;
+	unsigned int key_extra_size;
+	unsigned int key_parts;      /* independent parts in key buffer */
+	enum setkey_op keyop;
+	char *msg;
+	u8 *key;
+};
+
+struct geniv_ctx {
+	struct crypto_skcipher *child;
+	struct geniv_ctx_data data;
+};
+
+#endif
+
diff --git a/include/crypto/skcipher.h b/include/crypto/skcipher.h
index cc4d98a..290c848 100644
--- a/include/crypto/skcipher.h
+++ b/include/crypto/skcipher.h
@@ -122,6 +122,8 @@ struct crypto_skcipher {
 struct skcipher_alg {
 	int (*setkey)(struct crypto_skcipher *tfm, const u8 *key,
 	              unsigned int keylen);
+	int (*set_ctx)(struct crypto_skcipher *tfm, void *ctx,
+		       unsigned int len);
 	int (*encrypt)(struct skcipher_request *req);
 	int (*decrypt)(struct skcipher_request *req);
 	int (*init)(struct crypto_skcipher *tfm);
@@ -366,6 +368,21 @@ static inline int crypto_skcipher_setkey(struct crypto_skcipher *tfm,
 {
 	return tfm->setkey(tfm, key, keylen);
 }
+/**
+ * crypto_skcipher_set_ctx() - set initial context for cipher
+ * @tfm: cipher handle
+ * @ctx: buffer holding the context data
+ * @len: length of the context data structure
+ *
+ */
+static inline void crypto_skcipher_set_ctx(struct crypto_skcipher *tfm,
+					 void *ctx, unsigned int len)
+{
+	struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+
+	alg->set_ctx(tfm, ctx, len);
+}
+
 
 static inline bool crypto_skcipher_has_setkey(struct crypto_skcipher *tfm)
 {
-- 
Binoy Jayan

^ permalink raw reply related

* Re: [PATCH] crypto: CTR DRBG - advance output buffer pointer
From: Herbert Xu @ 2016-11-21 14:55 UTC (permalink / raw)
  To: Stephan Mueller; +Cc: linux-crypto
In-Reply-To: <18729386.1pHbKfYFYP@positron.chronox.de>

On Fri, Nov 18, 2016 at 12:27:56PM +0100, Stephan Mueller wrote:
> The CTR DRBG segments the number of random bytes to be generated into
> 128 byte blocks. The current code misses the advancement of the output
> buffer pointer when the requestor asks for more than 128 bytes of data.
> In this case, the next 128 byte block of random numbers is copied to
> the beginning of the output buffer again. This implies that only the
> first 128 bytes of the output buffer would ever be filled.
> 
> The patch adds the advancement of the buffer pointer to fill the entire
> buffer.
> 
> Signed-off-by: Stephan Mueller <smueller@chronox.de>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] hw_random: Make explicit that max >= 32 always
From: Herbert Xu @ 2016-11-21 14:55 UTC (permalink / raw)
  To: PrasannaKumar Muralidharan; +Cc: mpm, daniel.thompson, linux-crypto
In-Reply-To: <20161118173010.5448-1-prasannatsmkumar@gmail.com>

On Fri, Nov 18, 2016 at 11:00:10PM +0530, PrasannaKumar Muralidharan wrote:
> As hw_random core calls ->read with max > 32 or more, make it explicit.
> Also remove checks involving 'max' being less than 8.
> 
> Signed-off-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox