Linux cryptographic layer development

Linux cryptographic layer development
 help / color / mirror / Atom feed

* Re: [PATCH v2 2/2] crypto: mediatek - add DT bindings documentation
From: Rob Herring @ 2016-12-13 20:06 UTC (permalink / raw)
  To: Ryder Lee
  Cc: Herbert Xu, David S. Miller, Matthias Brugger, devicetree,
	linux-mediatek, linux-kernel, linux-crypto, linux-arm-kernel,
	Sean Wang, Roy Luo
In-Reply-To: <1481592676-2248-3-git-send-email-ryder.lee@mediatek.com>

On Tue, Dec 13, 2016 at 09:31:16AM +0800, Ryder Lee wrote:
> Add DT bindings documentation for the crypto driver
> 
> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
> ---
>  .../devicetree/bindings/crypto/mediatek-crypto.txt | 32 ++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
> 
> diff --git a/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
> new file mode 100644
> index 0000000..47a786e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/crypto/mediatek-crypto.txt
> @@ -0,0 +1,32 @@
> +MediaTek cryptographic accelerators
> +
> +Required properties:
> +- compatible: Should be "mediatek,eip97-crypto"
> +- reg: Address and length of the register set for the device
> +- interrupts: Should contain the five crypto engines interrupts in numeric
> +	order. These are global system and four descriptor rings.
> +- clocks: the clock used by the core
> +- clock-names: the names of the clock listed in the clocks property. These are
> +	"ethif", "cryp"
> +- power-domains: Must contain a reference to the PM domain.
> +
> +
> +Optional properties:
> +- interrupt-parent: Should be the phandle for the interrupt controller
> +  that services interrupts for this device

This is not optional. It's perhaps inherited from the parent. You can 
drop it as it's implied by interrupts property.

Rob

^ permalink raw reply

* Re: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: David Howells @ 2016-12-13 20:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: dhowells, David Laight, Joerg Roedel, David Woodhouse,
	Linus Torvalds, Ingo Molnar, Andy Lutomirski,
	linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
	keyrings@vger.kernel.org, Eric Biggers,
	linux-crypto@vger.kernel.org, Herbert Xu, Stephan Mueller
In-Reply-To: <CALCETrVbEhxRWFmgrePeLriQU5J6ZZaN15tTAYja1hJwYFrRpg@mail.gmail.com>

Andy Lutomirski <luto@amacapital.net> wrote:

> I don't know whether you're right, but that sounds a bit silly to me.
> This is a *tiny* amount of memory.

Assuming a 1MiB kernel image in 4K pages, that gets you back a couple of pages
I think - useful if you've only got a few MiB of RAM.

David

^ permalink raw reply

* Re: [PATCH v3] siphash: add cryptographically secure hashtable function
From: Linus Torvalds @ 2016-12-13 19:26 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Jason A. Donenfeld, kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andi Kleen, Andy Lutomirski, Greg KH, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <20161213083948.GA8994@zzz>

On Tue, Dec 13, 2016 at 12:39 AM, Eric Biggers <ebiggers3@gmail.com> wrote:
>
> Hmm, I don't think you can really do load_unaligned_zeropad() without first
> checking for 'left != 0'.

Right you are. If the allocation is at the end of a page, the 0-size
case would be entirely outside the page and there's no fixup.

Of course, that never happens in normal code, but DEBUG_PAGE_ALLOC can
trigger it.

              Linus

^ permalink raw reply

* Re: [PATCH v3] siphash: add cryptographically secure hashtable function
From: Linus Torvalds @ 2016-12-13 19:25 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Andi Kleen, kernel-hardening@lists.openwall.com, LKML,
	Linux Crypto Mailing List, George Spelvin, Scott Bauer,
	Andy Lutomirski, Greg KH, Eric Biggers, Jean-Philippe Aumasson,
	Daniel J . Bernstein
In-Reply-To: <CAHmME9qk+Z8CdhTFQnTkwT8S-n56tzi77-uAV3dNigcGMQx7uQ@mail.gmail.com>

On Mon, Dec 12, 2016 at 3:04 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Indeed this would be a great first candidate. There are lots of places
> where MD5 (!!) is pulled in for this sort of thing, when SipHash could
> be a faster and leaner replacement (and arguably more secure than
> rusty MD5).

Yeah,. the TCP sequence number md5_transform() cases are likely the
best example of something where siphash might be good. That tends to
be really just a couple words of data (the address and port info) plus
the net_secret[] hash. I think they currently simply just fill in the
fixed-sized 64-byte md5-round area.

I wonder it's worth it to have a special spihash version that does
that same "fixed 64-byte area" thing.

But please talk to the netwotrking people. Maybe that's the proper way
to get this merged?

            Linus

^ permalink raw reply

* Re: [PATCH] crypto: aesni-intel - RFC4106 can zero copy when !PageHighMem
From: Dave Watson @ 2016-12-13 19:07 UTC (permalink / raw)
  To: Ilya Lesokhin; +Cc: linux-crypto, tadeusz.struk, herbert, tls-fpga-sw-dev
In-Reply-To: <1481639526-71743-1-git-send-email-ilyal@mellanox.com>

On 12/13/16 04:32 PM, Ilya Lesokhin wrote:
> --- a/arch/x86/crypto/aesni-intel_glue.c
> +++ b/arch/x86/crypto/aesni-intel_glue.c
> @@ -903,9 +903,11 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
>  	*((__be32 *)(iv+12)) = counter;
>  
>  	if (sg_is_last(req->src) &&
> -	    req->src->offset + req->src->length <= PAGE_SIZE &&
> +	    (!PageHighMem(sg_page(req->src)) ||
> +	    req->src->offset + req->src->length <= PAGE_SIZE) &&
>  	    sg_is_last(req->dst) &&
> -	    req->dst->offset + req->dst->length <= PAGE_SIZE) {
> +	    (!PageHighMem(sg_page(req->dst)) ||
> +	    req->dst->offset + req->dst->length <= PAGE_SIZE)) {
>  		one_entry_in_sg = 1;
>  		scatterwalk_start(&src_sg_walk, req->src);
>  		assoc = scatterwalk_map(&src_sg_walk);

I was also experimenting with a similar patch that loosened up the
restrictions here, checking for highmem.  Note that you can go even
further and check the AAD, data, and TAG all separately, the current
aesni crypto routines take them as separate buffers.  (This might fix
the RFC5288 patch AAD size issue?)

Long term it would be nice to improve the asm routines instead to
support scatter / gather IO and any AAD len, as the newer intel
routines do:

https://github.com/01org/isa-l_crypto/tree/master/aes

^ permalink raw reply

* Re: Remaining crypto API regressions with CONFIG_VMAP_STACK
From: Andy Lutomirski @ 2016-12-13 17:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Eric Biggers, linux-crypto, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, kernel-hardening@lists.openwall.com,
	Andrew Lutomirski, Stephan Mueller
In-Reply-To: <20161213033928.GB5601@gondor.apana.org.au>

On Mon, Dec 12, 2016 at 7:39 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Dec 12, 2016 at 10:34:10AM -0800, Andy Lutomirski wrote:
>>
>> Here's my status.
>>
>> >         drivers/crypto/bfin_crc.c:351
>> >         drivers/crypto/qce/sha.c:299
>> >         drivers/crypto/sahara.c:973,988
>> >         drivers/crypto/talitos.c:1910
>> >         drivers/crypto/qce/sha.c:325
>>
>> I have a patch to make these depend on !VMAP_STACK.
>
> Why? They're all marked as ASYNC AFAIK.
>
>> I have a patch to convert this to, drumroll please:
>>
>>     priv->tx_tfm_mic = crypto_alloc_shash("michael_mic", 0,
>>                           CRYPTO_ALG_ASYNC);
>>
>> Herbert, I'm at a loss as what a "shash" that's "ASYNC" even means.
>
> Having 0 as type and CRYPTO_ALG_ASYNC as mask in general means
> that we're requesting a sync algorithm (i.e., ASYNC bit off).
>
> However, it is completely unnecessary for shash as they can never
> be async.  So this could be changed to just ("michael_mic", 0, 0).

I'm confused by a bunch of this.

1. Is it really the case that crypto_alloc_xyz(..., CRYPTO_ALG_ASYNC)
means to allocate a *synchronous* transform?  That's not what I
expected.

2. What guarantees that an async request is never allocated on the
stack?  If it's just convention, could an assertion be added
somewhere?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations
From: Kalle Valo @ 2016-12-13 17:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, linux-kernel@vger.kernel.org, USB list,
	Linux Wireless List, Eric Biggers,
	linux-crypto-u79uwXL29TY76Z2rM5mHXA, Herbert Xu, Stephan Mueller
In-Reply-To: <CALCETrXxQ9FxuqV5A1rkj2SpeFfd89njDP9h5VBuNx387ieKdQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> writes:

> On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
>> Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> writes:
>>
>>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>>> at the stack.
>>>
>>> Fix it by switching from ahash to shash.  The result should be
>>> simpler, faster, and more correct.
>>>
>>> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org # 4.9 only
>>> Reported-by: Eric Biggers <ebiggers3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Signed-off-by: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
>>
>> "more correct"? Does this fix a real user visible bug or what? And why
>> just stable 4.9, does this maybe have something to do with
>> CONFIG_VMAP_STACK?
>
> Whoops, I had that text in some other patches but forgot to put it in
> this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
> like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
> debugging is off.

Makes sense now, thanks. I'll add that to the commit log and queue this
to 4.10.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: Andy Lutomirski @ 2016-12-13 17:02 UTC (permalink / raw)
  To: David Howells
  Cc: David Laight, Joerg Roedel, David Woodhouse, Linus Torvalds,
	Ingo Molnar, Andy Lutomirski, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org, keyrings@vger.kernel.org, Eric Biggers,
	linux-crypto@vger.kernel.org, Herbert Xu, Stephan Mueller
In-Reply-To: <2661.1481647538@warthog.procyon.org.uk>

On Tue, Dec 13, 2016 at 8:45 AM, David Howells <dhowells@redhat.com> wrote:
> Andy Lutomirski <luto@amacapital.net> wrote:
>
>> After all, rodata is ordinary memory, is backed by struct page, etc.
>
> Is that actually true?  I thought some arches excluded the kernel image from
> the page struct array to make the array consume less memory.

I don't know whether you're right, but that sounds a bit silly to me.
This is a *tiny* amount of memory.

But there's yet another snag.  Alpha doesn't have empty_zero_page --
it only has ZERO_PAGE.  I could do page_address(ZERO_PAGE(0))...

--Andy

^ permalink raw reply

* Re: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: David Howells @ 2016-12-13 16:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: dhowells, David Laight, Joerg Roedel, David Woodhouse,
	Linus Torvalds, Ingo Molnar, Andy Lutomirski,
	linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
	keyrings@vger.kernel.org, Eric Biggers,
	linux-crypto@vger.kernel.org, Herbert Xu, Stephan Mueller
In-Reply-To: <CALCETrWsTKq0NOpwiJtB50OU7w99-m82NhPG_Uxs2Fqbpz0LLA@mail.gmail.com>

Andy Lutomirski <luto@amacapital.net> wrote:

> After all, rodata is ordinary memory, is backed by struct page, etc.

Is that actually true?  I thought some arches excluded the kernel image from
the page struct array to make the array consume less memory.

David

^ permalink raw reply

* Re: [PATCH] orinoco: Use shash instead of ahash for MIC calculations
From: Andy Lutomirski @ 2016-12-13 16:41 UTC (permalink / raw)
  To: Kalle Valo
  Cc: Andy Lutomirski, linux-kernel@vger.kernel.org, USB list,
	Linux Wireless List, Eric Biggers, linux-crypto, Herbert Xu,
	Stephan Mueller
In-Reply-To: <87mvg0kqno.fsf@purkki.adurom.net>

On Tue, Dec 13, 2016 at 3:35 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
> Andy Lutomirski <luto@kernel.org> writes:
>
>> Eric Biggers pointed out that the orinoco driver pointed scatterlists
>> at the stack.
>>
>> Fix it by switching from ahash to shash.  The result should be
>> simpler, faster, and more correct.
>>
>> Cc: stable@vger.kernel.org # 4.9 only
>> Reported-by: Eric Biggers <ebiggers3@gmail.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>
> "more correct"? Does this fix a real user visible bug or what? And why
> just stable 4.9, does this maybe have something to do with
> CONFIG_VMAP_STACK?

Whoops, I had that text in some other patches but forgot to put it in
this one.  It'll blow up with CONFIG_VMAP_STACK=y if a debug option
like CONFIG_DEBUG_VIRTUAL=y is set.  It may work by accident if
debugging is off.

--Andy

^ permalink raw reply

* Re: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs
From: Andy Lutomirski @ 2016-12-13 16:40 UTC (permalink / raw)
  To: David Laight, Joerg Roedel, David Woodhouse, Linus Torvalds,
	Ingo Molnar
  Cc: Andy Lutomirski, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org, dhowells@redhat.com,
	keyrings@vger.kernel.org, Eric Biggers,
	linux-crypto@vger.kernel.org, Herbert Xu, Stephan Mueller
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB023CA99@AcuExch.aculab.com>

[add some people who might know]

On Tue, Dec 13, 2016 at 4:20 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Andy Lutomirski
>> Sent: 12 December 2016 20:53
>> The driver put a constant buffer of all zeros on the stack and
>> pointed a scatterlist entry at it in two places.  This doesn't work
>> with virtual stacks.  Use a static 16-byte buffer of zeros instead.
> ...
>
> I didn't think you could dma from static data either.

According to lib/dma-debug.c, you can't dma to or from kernel text or
rodata, but you can dma to or from kernel bss or data.  So
empty_zero_page should be okay, because it's not rodata right now.

But I think this is rather silly.  Joerg, Linus, etc: would it be okay
to change lib/dma-debug.c to allow DMA *from* rodata?  After all,
rodata is ordinary memory, is backed by struct page, etc.  And DMA
from the zero page had better be okay because I think it happens if
you mmap some zeros, don't write to them, and then direct I/O them to
a device.  Then I could also move empty_zero_page to rodata.

--Andy

^ permalink raw reply

* Re: [PATCH v6 2/2] crypto: add virtio-crypto driver
From: Halil Pasic @ 2016-12-13 16:11 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: virtio-dev@lists.oasis-open.org, Xuquan (Quan Xu),
	Huangweidong (C), Herbert Xu, Michael S. Tsirkin, Claudio Fontana,
	Hanweidong (Randy), Luonengjun, qemu-devel@nongnu.org,
	Wanzongshun (Vincent), linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, stefanha@redhat.com,
	Zhoujian (jay, Euler), longpeng,
	virtualization@lists.linux-foundation.org, davem@davemloft.net,
	"Wubin \(H\)" <
In-Reply-To: <20161212234941-mutt-send-email-mst@kernel.org>



On 12/12/2016 11:05 PM, Michael S. Tsirkin wrote:
> On Mon, Dec 12, 2016 at 06:54:07PM +0800, Herbert Xu wrote:
>> On Mon, Dec 12, 2016 at 06:25:12AM +0000, Gonglei (Arei) wrote:
>>> Hi, Michael & Herbert
>>>
>>> Because the virtio-crypto device emulation had been in QEMU 2.8,
>>> would you please merge the virtio-crypto driver for 4.10 if no other
>>> comments? If so, Miachel pls ack and/or review the patch, then
>>> Herbert will take it (I asked him last week). Thank you!
>>>
>>> Ps: Note on 4.10 merge window timing from Linus
>>>  https://lkml.org/lkml/2016/12/7/506
>>>
>>> Dec 23rd is the deadline for 4.10 merge window.
>>
>> Sorry but it's too late for 4.10.  It needed to have been in my
>> tree before the merge window opened to make it for this cycle.
>>
>> Cheers,
> 
> 
> Objections to me merging this? I'm preparing my tree right now.

Got this when testing the most recent version on s390x 

[   20.391074] test 0 (128 bit key, 16 byte blocks): [   20.391078] BUG: using smp_processor_id() in preemptible [00000000] code: insmod/97
[   20.391082] caller is virtio_crypto_ablkcipher_setkey+0x44/0x198
[   20.391085] CPU: 0 PID: 97 Comm: insmod Not tainted 4.9.0-02683-gb62a1ab #46
[   20.391088] Hardware name: IBM              2964 NC9              704              (KVM)
[   20.391405] Stack:
[   20.391407]        000000000c0eb6d0 000000000c0eb760 0000000000000003 0000000000000000
[   20.391414]        000000000c0eb800 000000000c0eb778 000000000c0eb778 0000000000000020
[   20.391420]        0000000000000000 000000000000000a 0000000000000020 000003ff0000000a
[   20.391426]        000003ff0000000c 000000000c0eb7c8 0000000000000000 0000000000000000
[   20.391432]        0700000000c173c8 00000000001126ba 000000000c0eb760 000000000c0eb7b8
[   20.391439] Call Trace:
[   20.391442] ([<000000000011259e>] show_trace+0x8e/0xe0)
[   20.391446]  [<0000000000112670>] show_stack+0x80/0xd8 
[   20.391449]  [<0000000000753ab6>] dump_stack+0x96/0xd8 
[   20.391453]  [<00000000007872e6>] check_preemption_disabled+0xfe/0x128 
[   20.391456]  [<0000000000839cc4>] virtio_crypto_ablkcipher_setkey+0x44/0x198 
[   20.391459]  [<0000000000705a40>] skcipher_setkey_ablkcipher+0x50/0x70 
[   20.391476]  [<000003ff80002a48>] test_skcipher_speed+0x328/0xb98 [tcrypt] 
[   20.391492]  [<000003ff800063dc>] do_test+0x1c24/0x28e0 [tcrypt] 
[   20.391509]  [<000003ff8001006a>] tcrypt_mod_init+0x6a/0x1000 [tcrypt] 
[   20.391512]  [<00000000001002cc>] do_one_initcall+0xb4/0x148 
[   20.391515]  [<0000000000298632>] do_init_module+0x7a/0x228 
[   20.391519]  [<00000000001fd380>] load_module+0x2428/0x2de0 
[   20.391522]  [<00000000001fde8a>] SyS_init_module+0x152/0x160 
[   20.391526]  [<00000000009f1306>] system_call+0xd6/0x270 
[   20.391528] no locks held by insmod/97.

Gonglei, any idea? Did not look into it myself yet.

Halil

> 
>> -- 
>> Email: Herbert Xu <herbert@gondor.apana.org.au>
>> Home Page: http://gondor.apana.org.au/~herbert/
>> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> 

^ permalink raw reply

* [PATCH v2 1/3] drivers: crypto: Add Support for Octeon-tx CPT Engine
From: George Cherian @ 2016-12-13 14:03 UTC (permalink / raw)
  To: herbert, davem; +Cc: linux-kernel, linux-crypto, George Cherian
In-Reply-To: <1481637801-1076-1-git-send-email-george.cherian@cavium.com>

Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
found in Octeon-tx series of SoC's. CPT is the Cryptographic Acceleration
Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and
asymmetric engines (AEs).

Signed-off-by: George Cherian <george.cherian@cavium.com>
---
 drivers/crypto/cavium/cpt/Kconfig        |  16 +
 drivers/crypto/cavium/cpt/Makefile       |   2 +
 drivers/crypto/cavium/cpt/cpt_common.h   | 166 +++++++
 drivers/crypto/cavium/cpt/cpt_hw_types.h | 736 +++++++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cptpf.h        |  69 +++
 drivers/crypto/cavium/cpt/cptpf_main.c   | 733 ++++++++++++++++++++++++++++++
 drivers/crypto/cavium/cpt/cptpf_mbox.c   | 163 +++++++
 7 files changed, 1885 insertions(+)
 create mode 100644 drivers/crypto/cavium/cpt/Kconfig
 create mode 100644 drivers/crypto/cavium/cpt/Makefile
 create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h
 create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h
 create mode 100644 drivers/crypto/cavium/cpt/cptpf.h
 create mode 100644 drivers/crypto/cavium/cpt/cptpf_main.c
 create mode 100644 drivers/crypto/cavium/cpt/cptpf_mbox.c

diff --git a/drivers/crypto/cavium/cpt/Kconfig b/drivers/crypto/cavium/cpt/Kconfig
new file mode 100644
index 0000000..247f1cb
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/Kconfig
@@ -0,0 +1,16 @@
+#
+# Cavium crypto device configuration
+#
+
+config CRYPTO_DEV_CPT
+	tristate
+
+config CAVIUM_CPT
+	tristate "Cavium Cryptographic Accelerator driver"
+	depends on ARCH_THUNDER
+	select CRYPTO_DEV_CPT
+	help
+	  Support for Cavium CPT block found in octeon-tx series of
+	  processors.
+
+	  To compile this as a module, choose M here.
diff --git a/drivers/crypto/cavium/cpt/Makefile b/drivers/crypto/cavium/cpt/Makefile
new file mode 100644
index 0000000..fe3d454
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_CAVIUM_CPT) += cptpf.o
+cptpf-objs := cptpf_main.o cptpf_mbox.o
diff --git a/drivers/crypto/cavium/cpt/cpt_common.h b/drivers/crypto/cavium/cpt/cpt_common.h
new file mode 100644
index 0000000..ae542f4
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_common.h
@@ -0,0 +1,166 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPT_COMMON_H
+#define __CPT_COMMON_H
+
+#include <asm/byteorder.h>
+#include <linux/delay.h>
+#include <linux/pci.h>
+
+#include "cpt_hw_types.h"
+
+/* Device ID */
+#define CPT_81XX_PCI_PF_DEVICE_ID 0xa040
+#define CPT_81XX_PCI_VF_DEVICE_ID 0xa041
+
+/**< flags to indicate the features supported */
+#define CPT_FLAG_MSIX_ENABLED BIT(0)
+#define CPT_FLAG_SRIOV_ENABLED BIT(1)
+#define CPT_FLAG_VF_DRIVER BIT(2)
+#define CPT_FLAG_DEVICE_READY BIT(3)
+
+#define cpt_msix_enabled(cpt) ((cpt)->flags & CPT_FLAG_MSIX_ENABLED)
+#define cpt_sriov_enabled(cpt) ((cpt)->flags & CPT_FLAG_SRIOV_ENABLED)
+#define cpt_vf_driver(cpt) ((cpt)->flags & CPT_FLAG_VF_DRIVER)
+#define cpt_device_ready(cpt) ((cpt)->flags & CPT_FLAG_DEVICE_READY)
+
+#define CPT_MBOX_MSG_TYPE_ACK 1
+#define CPT_MBOX_MSG_TYPE_NACK 2
+#define CPT_MBOX_MSG_TIMEOUT 2000
+#define VF_STATE_DOWN 0
+#define VF_STATE_UP 1
+
+/*
+ * CPT Registers map for 81xx
+ */
+
+/* PF registers */
+#define CPTX_PF_CONSTANTS(a) (0x0ll + ((u64)(a) << 36))
+#define CPTX_PF_RESET(a) (0x100ll + ((u64)(a) << 36))
+#define CPTX_PF_DIAG(a) (0x120ll + ((u64)(a) << 36))
+#define CPTX_PF_BIST_STATUS(a) (0x160ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_CTL(a) (0x200ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_FLIP(a) (0x210ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_INT(a) (0x220ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_INT_W1S(a) (0x230ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_ENA_W1S(a)	(0x240ll + ((u64)(a) << 36))
+#define CPTX_PF_ECC0_ENA_W1C(a)	(0x250ll + ((u64)(a) << 36))
+#define CPTX_PF_MBOX_INTX(a, b)	\
+	(0x400ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_MBOX_INT_W1SX(a, b) \
+	(0x420ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_MBOX_ENA_W1CX(a, b) \
+	(0x440ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_MBOX_ENA_W1SX(a, b) \
+	(0x460ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_EXEC_INT(a) (0x500ll + 0x1000000000ll * ((a) & 0x1))
+#define CPTX_PF_EXEC_INT_W1S(a)	(0x520ll + ((u64)(a) << 36))
+#define CPTX_PF_EXEC_ENA_W1C(a)	(0x540ll + ((u64)(a) << 36))
+#define CPTX_PF_EXEC_ENA_W1S(a)	(0x560ll + ((u64)(a) << 36))
+#define CPTX_PF_GX_EN(a, b) \
+	(0x600ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_EXEC_INFO(a) (0x700ll + ((u64)(a) << 36))
+#define CPTX_PF_EXEC_BUSY(a) (0x800ll + ((u64)(a) << 36))
+#define CPTX_PF_EXEC_INFO0(a) (0x900ll + ((u64)(a) << 36))
+#define CPTX_PF_EXEC_INFO1(a) (0x910ll + ((u64)(a) << 36))
+#define CPTX_PF_INST_REQ_PC(a) (0x10000ll + ((u64)(a) << 36))
+#define CPTX_PF_INST_LATENCY_PC(a) \
+	(0x10020ll + ((u64)(a) << 36))
+#define CPTX_PF_RD_REQ_PC(a) (0x10040ll + ((u64)(a) << 36))
+#define CPTX_PF_RD_LATENCY_PC(a) (0x10060ll + ((u64)(a) << 36))
+#define CPTX_PF_RD_UC_PC(a) (0x10080ll + ((u64)(a) << 36))
+#define CPTX_PF_ACTIVE_CYCLES_PC(a) (0x10100ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_CTL(a) (0x4000000ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_STATUS(a) (0x4000008ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_CLK(a) (0x4000010ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_DBG_CTL(a) (0x4000018ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_DBG_DATA(a)	(0x4000020ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_BIST_STATUS(a) (0x4000028ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_REQ_TIMER(a) (0x4000030ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_MEM_CTL(a) (0x4000038ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_PERF_CTL(a)	(0x4001000ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_DBG_CNTX(a, b) \
+	(0x4001100ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_EXE_PERF_EVENT_CNT(a) (0x4001180ll + ((u64)(a) << 36))
+#define CPTX_PF_EXE_EPCI_INBX_CNT(a, b) \
+	(0x4001200ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_EXE_EPCI_OUTBX_CNT(a, b) \
+	(0x4001240ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_ENGX_UCODE_BASE(a, b) \
+	(0x4002000ll + ((u64)(a) << 36) + ((b) << 3))
+#define CPTX_PF_QX_CTL(a, b) \
+	(0x8000000ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_PF_QX_GMCTL(a, b) \
+	(0x8000020ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_PF_QX_CTL2(a, b) \
+	(0x8000100ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_PF_VFX_MBOXX(a, b, c) \
+	(0x8001000ll + ((u64)(a) << 36) + ((b) << 20) + ((c) << 8))
+
+/* VF registers */
+#define CPTX_VQX_CTL(a, b) (0x100ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_SADDR(a, b) (0x200ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_WAIT(a, b) (0x400ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_INPROG(a, b) (0x410ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE(a, b) (0x420ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_ACK(a, b) (0x440ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_INT_W1S(a, b) (0x460ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_INT_W1C(a, b) (0x468ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_ENA_W1S(a, b) (0x470ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DONE_ENA_W1C(a, b) (0x478ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_MISC_INT(a, b)	(0x500ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_MISC_INT_W1S(a, b) (0x508ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_MISC_ENA_W1S(a, b) (0x510ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_MISC_ENA_W1C(a, b) (0x518ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VQX_DOORBELL(a, b) (0x600ll + ((u64)(a) << 36) + ((b) << 20))
+#define CPTX_VFX_PF_MBOXX(a, b, c) \
+	(0x1000ll + ((u64)(a) << 36) + ((b) << 20) + ((c) << 3))
+
+enum vftype {
+	AE_TYPES = 1,
+	SE_TYPES = 2,
+	BAD_CPT_TYPES,
+};
+
+/* Max CPT devices supported */
+enum cpt_mbox_opcode {
+	CPT_MSG_VF_UP = 1,
+	CPT_MSG_VF_DOWN,
+	CPT_MSG_READY,
+	CPT_MSG_QLEN,
+	CPT_MSG_QBIND_GRP,
+	CPT_MSG_VQ_PRIORITY,
+};
+
+/* CPT mailbox structure */
+struct cpt_mbox {
+	u64 msg; /* Message type MBOX[0] */
+	u64 data;/* Data         MBOX[1] */
+};
+
+/* The Cryptographic Acceleration Unit can *only* be found in SoCs
+ * containing the ThunderX ARM64 CPU implementation.  All accesses to the device
+ * registers on this platform are implicitly strongly ordered with respect
+ * to memory accesses. So writeq_relaxed() and readq_relaxed() are safe to use
+ * with no memory barriers in this driver.  The readq()/writeq() functions add
+ * explicit ordering operation which in this case are redundant, and only
+ * add overhead.
+ */
+/* Register read/write APIs */
+static inline void cpt_write_csr64(u8 __iomem *hw_addr, u64 offset,
+				   u64 val)
+{
+	writeq_relaxed(val, hw_addr + offset);
+}
+
+static inline u64 cpt_read_csr64(u8 __iomem *hw_addr, u64 offset)
+{
+	return readq_relaxed(hw_addr + offset);
+}
+#endif /* __CPT_COMMON_H */
diff --git a/drivers/crypto/cavium/cpt/cpt_hw_types.h b/drivers/crypto/cavium/cpt/cpt_hw_types.h
new file mode 100644
index 0000000..3798803
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cpt_hw_types.h
@@ -0,0 +1,736 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPT_HW_TYPES_H
+#define __CPT_HW_TYPES_H
+
+#include "cpt_common.h"
+
+/**
+ * Enumeration cpt_comp_e
+ *
+ * CPT Completion Enumeration
+ * Enumerates the values of CPT_RES_S[COMPCODE].
+ */
+enum cpt_comp_e {
+	CPT_COMP_E_NOTDONE = 0x00,
+	CPT_COMP_E_GOOD = 0x01,
+	CPT_COMP_E_FAULT = 0x02,
+	CPT_COMP_E_SWERR = 0x03,
+	CPT_COMP_E_LAST_ENTRY = 0xFF
+};
+
+/**
+ * Structure cpt_inst_s
+ *
+ * CPT Instruction Structure
+ * This structure specifies the instruction layout. Instructions are
+ * stored in memory as little-endian unless CPT()_PF_Q()_CTL[INST_BE] is set.
+ * cpt_inst_s_s
+ * Word 0
+ * doneint:1 Done interrupt.
+ *	0 = No interrupts related to this instruction.
+ *	1 = When the instruction completes, CPT()_VQ()_DONE[DONE] will be
+ *	incremented,and based on the rules described there an interrupt may
+ *	occur.
+ * Word 1
+ * res_addr:64 [127: 64] Result IOVA.
+ *	If nonzero, specifies where to write CPT_RES_S.
+ *	If zero, no result structure will be written.
+ *	Address must be 16-byte aligned.
+ *	Bits <63:49> are ignored by hardware; software should use a
+ *	sign-extended bit <48> for forward compatibility.
+ * Word 2
+ *  grp:10 [171:162] If [WQ_PTR] is nonzero, the SSO guest-group to use when
+ *	CPT submits work SSO.
+ *	For the SSO to not discard the add-work request, FPA_PF_MAP() must map
+ *	[GRP] and CPT()_PF_Q()_GMCTL[GMID] as valid.
+ *  tt:2 [161:160] If [WQ_PTR] is nonzero, the SSO tag type to use when CPT
+ *	submits work to SSO
+ *  tag:32 [159:128] If [WQ_PTR] is nonzero, the SSO tag to use when CPT
+ *	submits work to SSO.
+ * Word 3
+ *  wq_ptr:64 [255:192] If [WQ_PTR] is nonzero, it is a pointer to a
+ *	work-queue entry that CPT submits work to SSO after all context,
+ *	output data, and result write operations are visible to other
+ *	CNXXXX units and the cores. Bits <2:0> must be zero.
+ *	Bits <63:49> are ignored by hardware; software should
+ *	use a sign-extended bit <48> for forward compatibility.
+ *	Internal:
+ *	Bits <63:49>, <2:0> are ignored by hardware, treated as always 0x0.
+ * Word 4
+ *  ei0:64; [319:256] Engine instruction word 0. Passed to the AE/SE.
+ * Word 5
+ *  ei1:64; [383:320] Engine instruction word 1. Passed to the AE/SE.
+ * Word 6
+ *  ei2:64; [447:384] Engine instruction word 1. Passed to the AE/SE.
+ * Word 7
+ *  ei3:64; [511:448] Engine instruction word 1. Passed to the AE/SE.
+ *
+ */
+union cpt_inst_s {
+	u64 u[8];
+	struct cpt_inst_s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_17_63:47;
+		u64 doneint:1;
+		u64 reserved_0_1:16;
+#else /* Word 0 - Little Endian */
+		u64 reserved_0_15:16;
+		u64 doneint:1;
+		u64 reserved_17_63:47;
+#endif /* Word 0 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 1 - Big Endian */
+		u64 res_addr:64;
+#else /* Word 1 - Little Endian */
+		u64 res_addr:64;
+#endif /* Word 1 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 2 - Big Endian */
+		u64 reserved_172_19:20;
+		u64 grp:10;
+		u64 tt:2;
+		u64 tag:32;
+#else /* Word 2 - Little Endian */
+		u64 tag:32;
+		u64 tt:2;
+		u64 grp:10;
+		u64 reserved_172_191:20;
+#endif /* Word 2 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 3 - Big Endian */
+		u64 wq_ptr:64;
+#else /* Word 3 - Little Endian */
+		u64 wq_ptr:64;
+#endif /* Word 3 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 4 - Big Endian */
+		u64 ei0:64;
+#else /* Word 4 - Little Endian */
+		u64 ei0:64;
+#endif /* Word 4 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 5 - Big Endian */
+		u64 ei1:64;
+#else /* Word 5 - Little Endian */
+		u64 ei1:64;
+#endif /* Word 5 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 6 - Big Endian */
+		u64 ei2:64;
+#else /* Word 6 - Little Endian */
+		u64 ei2:64;
+#endif /* Word 6 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 7 - Big Endian */
+		u64 ei3:64;
+#else /* Word 7 - Little Endian */
+		u64 ei3:64;
+#endif /* Word 7 - End */
+	} s;
+};
+
+/**
+ * Structure cpt_res_s
+ *
+ * CPT Result Structure
+ * The CPT coprocessor writes the result structure after it completes a
+ * CPT_INST_S instruction. The result structure is exactly 16 bytes, and
+ * each instruction completion produces exactly one result structure.
+ *
+ * This structure is stored in memory as little-endian unless
+ * CPT()_PF_Q()_CTL[INST_BE] is set.
+ * cpt_res_s_s
+ * Word 0
+ *  doneint:1 [16:16] Done interrupt. This bit is copied from the
+ *	corresponding instruction's CPT_INST_S[DONEINT].
+ *  compcode:8 [7:0] Indicates completion/error status of the CPT coprocessor
+ *	for the	associated instruction, as enumerated by CPT_COMP_E.
+ *	Core software may write the memory location containing [COMPCODE] to
+ *	0x0 before ringing the doorbell, and then poll for completion by
+ *	checking for a nonzero value.
+ *	Once the core observes a nonzero [COMPCODE] value in this case,the CPT
+ *	coprocessor will have also completed L2/DRAM write operations.
+ * Word 1
+ *  reserved
+ *
+ */
+union cpt_res_s {
+	u64 u[2];
+	struct cpt_res_s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_17_63:47;
+		u64 doneint:1;
+		u64 reserved_8_15:8;
+		u64 compcode:8;
+#else /* Word 0 - Little Endian */
+		u64 compcode:8;
+		u64 reserved_8_15:8;
+		u64 doneint:1;
+		u64 reserved_17_63:47;
+#endif /* Word 0 - End */
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 1 - Big Endian */
+		u64 reserved_64_127:64;
+#else /* Word 1 - Little Endian */
+		u64 reserved_64_127:64;
+#endif /* Word 1 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_bist_status
+ *
+ * CPT PF Control Bist Status Register
+ * This register has the BIST status of memories. Each bit is the BIST result
+ * of an individual memory (per bit, 0 = pass and 1 = fail).
+ * cptx_pf_bist_status_s
+ * Word0
+ *  bstatus [29:0](RO/H) BIST status. One bit per memory, enumerated by
+ *	CPT_RAMS_E.
+ */
+union cptx_pf_bist_status {
+	u64 u;
+	struct cptx_pf_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_30_63:34;
+		u64 bstatus:30;
+#else /* Word 0 - Little Endian */
+		u64 bstatus:30;
+		u64 reserved_30_63:34;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_constants
+ *
+ * CPT PF Constants Register
+ * This register contains implementation-related parameters of CPT in CNXXXX.
+ * cptx_pf_constants_s
+ * Word 0
+ *  reserved_40_63:24 [63:40] Reserved.
+ *  epcis:8 [39:32](RO) Number of EPCI busses.
+ *  grps:8 [31:24](RO) Number of engine groups implemented.
+ *  ae:8 [23:16](RO/H) Number of AEs. In CNXXXX, for CPT0 returns 0x0,
+ *	for CPT1 returns 0x18, or less if there are fuse-disables.
+ *  se:8 [15:8](RO/H) Number of SEs. In CNXXXX, for CPT0 returns 0x30,
+ *	or less if there are fuse-disables, for CPT1 returns 0x0.
+ *  vq:8 [7:0](RO) Number of VQs.
+ */
+union cptx_pf_constants {
+	u64 u;
+	struct cptx_pf_constants_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_40_63:24;
+		u64 epcis:8;
+		u64 grps:8;
+		u64 ae:8;
+		u64 se:8;
+		u64 vq:8;
+#else /* Word 0 - Little Endian */
+		u64 vq:8;
+		u64 se:8;
+		u64 ae:8;
+		u64 grps:8;
+		u64 epcis:8;
+		u64 reserved_40_63:24;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_exe_bist_status
+ *
+ * CPT PF Engine Bist Status Register
+ * This register has the BIST status of each engine.  Each bit is the
+ * BIST result of an individual engine (per bit, 0 = pass and 1 = fail).
+ * cptx_pf_exe_bist_status_s
+ * Word0
+ *  reserved_48_63:16 [63:48] reserved
+ *  bstatus:48 [47:0](RO/H) BIST status. One bit per engine.
+ *
+ */
+union cptx_pf_exe_bist_status {
+	u64 u;
+	struct cptx_pf_exe_bist_status_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_48_63:16;
+		u64 bstatus:48;
+#else /* Word 0 - Little Endian */
+		u64 bstatus:48;
+		u64 reserved_48_63:16;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_exe_ctl
+ *
+ * CPT PF Engine Control Register
+ * This register enables the engines.
+ * cptx_pf_exe_ctl_s
+ * Word0
+ *  enable:64 [63:0](R/W) Individual enables for each of the engines.
+ */
+union cptx_pf_exe_ctl {
+	u64 u;
+	struct cptx_pf_exe_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 enable:64;
+#else /* Word 0 - Little Endian */
+		u64 enable:64;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_q#_ctl
+ *
+ * CPT Queue Control Register
+ * This register configures queues. This register should be changed only
+ * when quiescent (see CPT()_VQ()_INPROG[INFLIGHT]).
+ * cptx_pf_qx_ctl_s
+ * Word0
+ *  reserved_60_63:4 [63:60] reserved.
+ *  aura:12; [59:48](R/W) Guest-aura for returning this queue's
+ *	instruction-chunk buffers to FPA. Only used when [INST_FREE] is set.
+ *	For the FPA to not discard the request, FPA_PF_MAP() must map
+ *	[AURA] and CPT()_PF_Q()_GMCTL[GMID] as valid.
+ *  reserved_45_47:3 [47:45] reserved.
+ *  size:13 [44:32](R/W) Command-buffer size, in number of 64-bit words per
+ *	command buffer segment. Must be 8*n + 1, where n is the number of
+ *	instructions per buffer segment.
+ *  reserved_11_31:21 [31:11] Reserved.
+ *  cont_err:1 [10:10](R/W) Continue on error.
+ *	0 = When CPT()_VQ()_MISC_INT[NWRP], CPT()_VQ()_MISC_INT[IRDE] or
+ *	CPT()_VQ()_MISC_INT[DOVF] are set by hardware or software via
+ *	CPT()_VQ()_MISC_INT_W1S, then CPT()_VQ()_CTL[ENA] is cleared.  Due to
+ *	pipelining, additional instructions may have been processed between the
+ *	instruction causing the error and the next instruction in the disabled
+ *	queue (the instruction at CPT()_VQ()_SADDR).
+ *	1 = Ignore errors and continue processing instructions.
+ *	For diagnostic use only.
+ *  inst_free:1 [9:9](R/W) Instruction FPA free. When set, when CPT reaches the
+ *	end of an instruction chunk, that chunk will be freed to the FPA.
+ *  inst_be:1 [8:8](R/W) Instruction big-endian control. When set, instructions,
+ *	instruction next chunk pointers, and result structures are stored in
+ *	big-endian format in memory.
+ *  iqb_ldwb:1 [7:7](R/W) Instruction load don't write back.
+ *	0 = The hardware issues NCB transient load (LDT) towards the cache,
+ *	which if the line hits and is is dirty will cause the line to be
+ *	written back before being replaced.
+ *	1 = The hardware issues NCB LDWB read-and-invalidate command towards
+ *	the cache when fetching the last word of instructions; as a result the
+ *	line will not be written back when replaced.  This improves
+ *	performance, but software must not read the instructions after they are
+ *	posted to the hardware.	Reads that do not consume the last word of a
+ *	cache line always use LDI.
+ *  reserved_4_6:3 [6:4] Reserved.
+ *  grp:3; [3:1](R/W) Engine group.
+ *  pri:1; [0:0](R/W) Queue priority.
+ *	1 = This queue has higher priority. Round-robin between higher
+ *	priority queues.
+ *	0 = This queue has lower priority. Round-robin between lower
+ *	priority queues.
+ */
+union cptx_pf_qx_ctl {
+	u64 u;
+	struct cptx_pf_qx_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_60_63:4;
+		u64 aura:12;
+		u64 reserved_45_47:3;
+		u64 size:13;
+		u64 reserved_11_31:21;
+		u64 cont_err:1;
+		u64 inst_free:1;
+		u64 inst_be:1;
+		u64 iqb_ldwb:1;
+		u64 reserved_4_6:3;
+		u64 grp:3;
+		u64 pri:1;
+#else /* Word 0 - Little Endian */
+		u64 pri:1;
+		u64 grp:3;
+		u64 reserved_4_6:3;
+		u64 iqb_ldwb:1;
+		u64 inst_be:1;
+		u64 inst_free:1;
+		u64 cont_err:1;
+		u64 reserved_11_31:21;
+		u64 size:13;
+		u64 reserved_45_47:3;
+		u64 aura:12;
+		u64 reserved_60_63:4;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_pf_g#_en
+ *
+ * CPT PF Group Control Register
+ * This register configures engine groups.
+ * cptx_pf_gx_en_s
+ * Word0
+ *  en: 64; [63:0](R/W/H) Engine group enable. One bit corresponds to each
+ *	engine, with the bit set to indicate this engine can service this group.
+ *	Bits corresponding to unimplemented engines read as zero, i.e. only bit
+ *	numbers	less than CPT()_PF_CONSTANTS[AE] + CPT()_PF_CONSTANTS[SE] are
+ *	writable. AE engine bits follow SE engine bits.
+ *	E.g. if CPT()_PF_CONSTANTS[AE] = 0x1, and CPT()_PF_CONSTANTS[SE] = 0x2,
+ *	then bits <2:0> are read/writable with bit <2> corresponding to AE<0>,
+ *	and bit <1> to SE<1>, and bit<0> to SE<0>. Before disabling an engine,
+ *	the corresponding bit in each group must be cleared. CPT()_PF_EXEC_BUSY
+ *	can then be polled to determing when the engine becomes	idle.
+ *	At the point, the engine can be disabled.
+ */
+union cptx_pf_gx_en {
+	u64 u;
+	struct cptx_pf_gx_en_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 en:64;
+#else /* Word 0 - Little Endian */
+		u64 en:64;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_saddr
+ *
+ * CPT Queue Starting Buffer Address Registers
+ * These registers set the instruction buffer starting address.
+ * cptx_vqx_saddr_s
+ * Word0
+ *  reserved_49_63:15 [63:49] Reserved.
+ *  ptr:43 [48:6](R/W/H) Instruction buffer IOVA <48:6> (64-byte aligned).
+ *	When written, it is the initial buffer starting address; when read,
+ *	it is the next read pointer to be requested from L2C. The PTR field
+ *	is overwritten with the next pointer each time that the command buffer
+ *	segment is exhausted. New commands will then be read from the newly
+ *	specified command buffer pointer.
+ *  reserved_0_5:6 [5:0] Reserved.
+ *
+ */
+union cptx_vqx_saddr {
+	u64 u;
+	struct cptx_vqx_saddr_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_49_63:15;
+		u64 ptr:43;
+		u64 reserved_0_5:6;
+#else /* Word 0 - Little Endian */
+		u64 reserved_0_5:6;
+		u64 ptr:43;
+		u64 reserved_49_63:15;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_misc_ena_w1s
+ *
+ * CPT Queue Misc Interrupt Enable Set Register
+ * This register sets interrupt enable bits.
+ * cptx_vqx_misc_ena_w1s_s
+ * Word0
+ * reserved_5_63:59 [63:5] Reserved.
+ * swerr:1 [4:4](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[SWERR].
+ * nwrp:1 [3:3](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[NWRP].
+ * irde:1 [2:2](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[IRDE].
+ * dovf:1 [1:1](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[DOVF].
+ * mbox:1 [0:0](R/W1S/H) Reads or sets enable for
+ *	CPT(0..1)_VQ(0..63)_MISC_INT[MBOX].
+ *
+ */
+union cptx_vqx_misc_ena_w1s {
+	u64 u;
+	struct cptx_vqx_misc_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_5_63:59;
+		u64 swerr:1;
+		u64 nwrp:1;
+		u64 irde:1;
+		u64 dovf:1;
+		u64 mbox:1;
+#else /* Word 0 - Little Endian */
+		u64 mbox:1;
+		u64 dovf:1;
+		u64 irde:1;
+		u64 nwrp:1;
+		u64 swerr:1;
+		u64 reserved_5_63:59;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_doorbell
+ *
+ * CPT Queue Doorbell Registers
+ * Doorbells for the CPT instruction queues.
+ * cptx_vqx_doorbell_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  dbell_cnt:20 [19:0](R/W/H) Number of instruction queue 64-bit words to add
+ *	to the CPT instruction doorbell count. Readback value is the the
+ *	current number of pending doorbell requests. If counter overflows
+ *	CPT()_VQ()_MISC_INT[DBELL_DOVF] is set. To reset the count back to
+ *	zero, write one to clear CPT()_VQ()_MISC_INT_ENA_W1C[DBELL_DOVF],
+ *	then write a value of 2^20 minus the read [DBELL_CNT], then write one
+ *	to CPT()_VQ()_MISC_INT_W1C[DBELL_DOVF] and
+ *	CPT()_VQ()_MISC_INT_ENA_W1S[DBELL_DOVF]. Must be a multiple of 8.
+ *	All CPT instructions are 8 words and require a doorbell count of
+ *	multiple of 8.
+ */
+union cptx_vqx_doorbell {
+	u64 u;
+	struct cptx_vqx_doorbell_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_20_63:44;
+		u64 dbell_cnt:20;
+#else /* Word 0 - Little Endian */
+		u64 dbell_cnt:20;
+		u64 reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_inprog
+ *
+ * CPT Queue In Progress Count Registers
+ * These registers contain the per-queue instruction in flight registers.
+ * cptx_vqx_inprog_s
+ * Word0
+ *  reserved_8_63:56 [63:8] Reserved.
+ *  inflight:8 [7:0](RO/H) Inflight count. Counts the number of instructions
+ *	for the VF for which CPT is fetching, executing or responding to
+ *	instructions. However this does not include any interrupts that are
+ *	awaiting software handling (CPT()_VQ()_DONE[DONE] != 0x0).
+ *	A queue may not be reconfigured until:
+ *	1. CPT()_VQ()_CTL[ENA] is cleared by software.
+ *	2. [INFLIGHT] is polled until equals to zero.
+ */
+union cptx_vqx_inprog {
+	u64 u;
+	struct cptx_vqx_inprog_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_8_63:56;
+		u64 inflight:8;
+#else /* Word 0 - Little Endian */
+		u64 inflight:8;
+		u64 reserved_8_63:56;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_misc_int
+ *
+ * CPT Queue Misc Interrupt Register
+ * These registers contain the per-queue miscellaneous interrupts.
+ * cptx_vqx_misc_int_s
+ * Word 0
+ *  reserved_5_63:59 [63:5] Reserved.
+ *  swerr:1 [4:4](R/W1C/H) Software error from engines.
+ *  nwrp:1  [3:3](R/W1C/H) NCB result write response error.
+ *  irde:1  [2:2](R/W1C/H) Instruction NCB read response error.
+ *  dovf:1 [1:1](R/W1C/H) Doorbell overflow.
+ *  mbox:1 [0:0](R/W1C/H) PF to VF mailbox interrupt. Set when
+ *	CPT()_VF()_PF_MBOX(0) is written.
+ *
+ */
+union cptx_vqx_misc_int {
+	u64 u;
+	struct cptx_vqx_misc_int_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_5_63:59;
+		u64 swerr:1;
+		u64 nwrp:1;
+		u64 irde:1;
+		u64 dovf:1;
+		u64 mbox:1;
+#else /* Word 0 - Little Endian */
+		u64 mbox:1;
+		u64 dovf:1;
+		u64 irde:1;
+		u64 nwrp:1;
+		u64 swerr:1;
+		u64 reserved_5_63:59;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_ack
+ *
+ * CPT Queue Done Count Ack Registers
+ * This register is written by software to acknowledge interrupts.
+ * cptx_vqx_done_ack_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  done_ack:20 [19:0](R/W/H) Number of decrements to CPT()_VQ()_DONE[DONE].
+ *	Reads CPT()_VQ()_DONE[DONE]. Written by software to acknowledge
+ *	interrupts. If CPT()_VQ()_DONE[DONE] is still nonzero the interrupt
+ *	will be re-sent if the conditions described in CPT()_VQ()_DONE[DONE]
+ *	are satisfied.
+ *
+ */
+union cptx_vqx_done_ack {
+	u64 u;
+	struct cptx_vqx_done_ack_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_20_63:44;
+		u64 done_ack:20;
+#else /* Word 0 - Little Endian */
+		u64 done_ack:20;
+		u64 reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done
+ *
+ * CPT Queue Done Count Registers
+ * These registers contain the per-queue instruction done count.
+ * cptx_vqx_done_s
+ * Word0
+ *  reserved_20_63:44 [63:20] Reserved.
+ *  done:20 [19:0](R/W/H) Done count. When CPT_INST_S[DONEINT] set and that
+ *	instruction completes, CPT()_VQ()_DONE[DONE] is incremented when the
+ *	instruction finishes. Write to this field are for diagnostic use only;
+ *	instead software writes CPT()_VQ()_DONE_ACK with the number of
+ *	decrements for this field.
+ *	Interrupts are sent as follows:
+ *	* When CPT()_VQ()_DONE[DONE] = 0, then no results are pending, the
+ *	interrupt coalescing timer is held to zero, and an interrupt is not
+ *	sent.
+ *	* When CPT()_VQ()_DONE[DONE] != 0, then the interrupt coalescing timer
+ *	counts. If the counter is >= CPT()_VQ()_DONE_WAIT[TIME_WAIT]*1024, or
+ *	CPT()_VQ()_DONE[DONE] >= CPT()_VQ()_DONE_WAIT[NUM_WAIT], i.e. enough
+ *	time has passed or enough results have arrived, then the interrupt is
+ *	sent.
+ *	* When CPT()_VQ()_DONE_ACK is written (or CPT()_VQ()_DONE is written
+ *	but this is not typical), the interrupt coalescing timer restarts.
+ *	Note after decrementing this interrupt equation is recomputed,
+ *	for example if CPT()_VQ()_DONE[DONE] >= CPT()_VQ()_DONE_WAIT[NUM_WAIT]
+ *	and because the timer is zero, the interrupt will be resent immediately.
+ *	(This covers the race case between software acknowledging an interrupt
+ *	and a result returning.)
+ *	* When CPT()_VQ()_DONE_ENA_W1S[DONE] = 0, interrupts are not sent,
+ *	but the counting described above still occurs.
+ *	Since CPT instructions complete out-of-order, if software is using
+ *	completion interrupts the suggested scheme is to request a DONEINT on
+ *	each request, and when an interrupt arrives perform a "greedy" scan for
+ *	completions; even if a later command is acknowledged first this will
+ *	not result in missing a completion.
+ *	Software is responsible for making sure [DONE] does not overflow;
+ *	for example by insuring there are not more than 2^20-1 instructions in
+ *	flight that may request interrupts.
+ *
+ */
+union cptx_vqx_done {
+	u64 u;
+	struct cptx_vqx_done_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_20_63:44;
+		u64 done:20;
+#else /* Word 0 - Little Endian */
+		u64 done:20;
+		u64 reserved_20_63:44;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_wait
+ *
+ * CPT Queue Done Interrupt Coalescing Wait Registers
+ * Specifies the per queue interrupt coalescing settings.
+ * cptx_vqx_done_wait_s
+ * Word0
+ *  reserved_48_63:16 [63:48] Reserved.
+ *  time_wait:16; [47:32](R/W) Time hold-off. When CPT()_VQ()_DONE[DONE] = 0
+ *	or CPT()_VQ()_DONE_ACK is written a timer is cleared. When the timer
+ *	reaches [TIME_WAIT]*1024 then interrupt coalescing ends.
+ *	see CPT()_VQ()_DONE[DONE]. If 0x0, time coalescing is disabled.
+ *  reserved_20_31:12 [31:20] Reserved.
+ *  num_wait:20 [19:0](R/W) Number of messages hold-off.
+ *	When CPT()_VQ()_DONE[DONE] >= [NUM_WAIT] then interrupt coalescing ends
+ *	see CPT()_VQ()_DONE[DONE]. If 0x0, same behavior as 0x1.
+ *
+ */
+union cptx_vqx_done_wait {
+	u64 u;
+	struct cptx_vqx_done_wait_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_48_63:16;
+		u64 time_wait:16;
+		u64 reserved_20_31:12;
+		u64 num_wait:20;
+#else /* Word 0 - Little Endian */
+		u64 num_wait:20;
+		u64 reserved_20_31:12;
+		u64 time_wait:16;
+		u64 reserved_48_63:16;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_done_ena_w1s
+ *
+ * CPT Queue Done Interrupt Enable Set Registers
+ * Write 1 to these registers will enable the DONEINT interrupt for the queue.
+ * cptx_vqx_done_ena_w1s_s
+ * Word0
+ *  reserved_1_63:63 [63:1] Reserved.
+ *  done:1 [0:0](R/W1S/H) Write 1 will enable DONEINT for this queue.
+ *	Write 0 has no effect. Read will return the enable bit.
+ */
+union cptx_vqx_done_ena_w1s {
+	u64 u;
+	struct cptx_vqx_done_ena_w1s_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_1_63:63;
+		u64 done:1;
+#else /* Word 0 - Little Endian */
+		u64 done:1;
+		u64 reserved_1_63:63;
+#endif /* Word 0 - End */
+	} s;
+};
+
+/**
+ * Register (NCB) cpt#_vq#_ctl
+ *
+ * CPT VF Queue Control Registers
+ * This register configures queues. This register should be changed (other than
+ * clearing [ENA]) only when quiescent (see CPT()_VQ()_INPROG[INFLIGHT]).
+ * cptx_vqx_ctl_s
+ * Word0
+ *  reserved_1_63:63 [63:1] Reserved.
+ *  ena:1 [0:0](R/W/H) Enables the logical instruction queue.
+ *	See also CPT()_PF_Q()_CTL[CONT_ERR] and	CPT()_VQ()_INPROG[INFLIGHT].
+ *	1 = Queue is enabled.
+ *	0 = Queue is disabled.
+ */
+union cptx_vqx_ctl {
+	u64 u;
+	struct cptx_vqx_ctl_s {
+#if defined(__BIG_ENDIAN_BITFIELD) /* Word 0 - Big Endian */
+		u64 reserved_1_63:63;
+		u64 ena:1;
+#else /* Word 0 - Little Endian */
+		u64 ena:1;
+		u64 reserved_1_63:63;
+#endif /* Word 0 - End */
+	} s;
+};
+#endif /*__CPT_HW_TYPES_H*/
diff --git a/drivers/crypto/cavium/cpt/cptpf.h b/drivers/crypto/cavium/cpt/cptpf.h
new file mode 100644
index 0000000..4511a21
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptpf.h
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#ifndef __CPTPF_H
+#define __CPTPF_H
+
+#include "cpt_common.h"
+
+#define CSR_DELAY 30
+#define CPT_MAX_CORE_GROUPS 8
+#define CPT_MAX_SE_CORES 10
+#define CPT_MAX_AE_CORES 6
+#define CPT_MAX_TOTAL_CORES (CPT_MAX_SE_CORES + CPT_MAX_AE_CORES)
+#define CPT_MAX_VF_NUM 16
+#define	CPT_PF_MSIX_VECTORS 3
+#define CPT_PF_INT_VEC_E_MBOXX(a) (0x02 + (a))
+
+struct cpt_device;
+
+struct microcode {
+	u8 is_mc_valid;
+	u8 is_ae;
+	u8 group;
+	u8 num_cores;
+	u32 code_size;
+	u64 core_mask;
+	u8 version[32];
+	/* Base info */
+	dma_addr_t phys_base;
+	void *code;
+};
+
+struct cpt_vf_info {
+	u8 state;
+	u8 priority;
+	u8 id;
+	u32 qlen;
+};
+
+/**
+ * cpt device structure
+ */
+struct cpt_device {
+	u16 flags;	/**< Flags to hold device status bits */
+	u8 num_vf_en; /**< Number of VFs enabled (0...CPT_MAX_VF_NUM) */
+	struct cpt_vf_info vfinfo[CPT_MAX_VF_NUM]; /* Per VF info */
+
+	void __iomem *reg_base; /* Register start address */
+	/* MSI-X */
+	u8 num_vec;
+	bool msix_enabled;
+	struct msix_entry msix_entries[CPT_PF_MSIX_VECTORS];
+	bool irq_allocated[CPT_PF_MSIX_VECTORS];
+	struct pci_dev *pdev; /**< pci device handle */
+
+	struct microcode mcode[CPT_MAX_CORE_GROUPS];
+	u8 next_mc_idx; /**< next microcode index */
+	u8 next_group;
+	u8 max_se_cores;
+	u8 max_ae_cores;
+};
+
+void cpt_mbox_intr_handler(struct cpt_device *cpt, s32 mbx);
+#endif /* __CPTPF_H */
diff --git a/drivers/crypto/cavium/cpt/cptpf_main.c b/drivers/crypto/cavium/cpt/cptpf_main.c
new file mode 100644
index 0000000..ff6674b
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptpf_main.c
@@ -0,0 +1,733 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/version.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/printk.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/firmware.h>
+#include <linux/pci.h>
+
+#include "cptpf.h"
+
+#define DRV_NAME	"thunder-cpt"
+#define DRV_VERSION	"1.0"
+
+static u32 num_vfs = 4; /* Default 4 VF enabled */
+module_param(num_vfs, uint, 0444);
+MODULE_PARM_DESC(num_vfs, "Number of VFs to enable(1-16)");
+
+static u64 get_mask_from_value(s32 value)
+{
+	u64 mask = 0ULL;
+	s32 i;
+
+	for (i = 0; i < value; i++)
+		mask |= ((u64)1 << i);
+
+	return mask;
+}
+
+/*
+ * Disable cores specified by coremask
+ */
+static void cpt_disable_cores(struct cpt_device *cpt, u64 coremask,
+			      u8 type, u8 grp)
+{
+	union cptx_pf_exe_ctl pf_exe_ctl;
+	u32 timeout = 0xFFFFFFFF;
+	u64 grpmask = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	/* Disengage the cores from groups */
+	grpmask = cpt_read_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp),
+			(grpmask & ~coremask));
+	udelay(CSR_DELAY);
+	grp = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXEC_BUSY(0));
+	while (grp & coremask) {
+		dev_err(dev, "Cores still busy %llx", coremask);
+		grp = cpt_read_csr64(cpt->reg_base,
+				     CPTX_PF_EXEC_BUSY(0));
+		if (timeout--)
+			break;
+	}
+
+	/* Disable the cores */
+	pf_exe_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0),
+			(pf_exe_ctl.u & ~coremask));
+	udelay(CSR_DELAY);
+}
+
+/*
+ * Enable cores specified by coremask
+ */
+static void cpt_enable_cores(struct cpt_device *cpt, u64 coremask,
+			     u8 type)
+{
+	union cptx_pf_exe_ctl pf_exe_ctl;
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	pf_exe_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0),
+			(pf_exe_ctl.u | coremask));
+	udelay(CSR_DELAY);
+}
+
+static void cpt_configure_group(struct cpt_device *cpt, u8 grp,
+				u64 coremask, u8 type)
+{
+	union cptx_pf_gx_en pf_gx_en = {0};
+
+	if (type == AE_TYPES)
+		coremask = (coremask << cpt->max_se_cores);
+
+	pf_gx_en.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp));
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp),
+			(pf_gx_en.u | coremask));
+	udelay(CSR_DELAY);
+}
+
+static void cpt_disable_mbox_interrupts(struct cpt_device *cpt)
+{
+	/* Clear mbox(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_ENA_W1CX(0, 0), ~0ull);
+}
+
+static void cpt_disable_ecc_interrupts(struct cpt_device *cpt)
+{
+	/* Clear ecc(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_ECC0_ENA_W1C(0), ~0ull);
+}
+
+static void cpt_disable_exec_interrupts(struct cpt_device *cpt)
+{
+	/* Clear exec interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXEC_ENA_W1C(0), ~0ull);
+}
+
+static void cpt_disable_all_interrupts(struct cpt_device *cpt)
+{
+	cpt_disable_mbox_interrupts(cpt);
+	cpt_disable_ecc_interrupts(cpt);
+	cpt_disable_exec_interrupts(cpt);
+}
+
+static void cpt_enable_mbox_interrupts(struct cpt_device *cpt)
+{
+	/* Set mbox(0) interupts for all vfs */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_ENA_W1SX(0, 0), ~0ull);
+}
+
+static s32 cpt_load_microcode(struct cpt_device *cpt, struct microcode *mcode)
+{
+	s32 ret = 0, core = 0, shift = 0;
+	u32 total_cores = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (!mcode || !mcode->code) {
+		dev_err(dev, "Either the mcode is null or data is NULL\n");
+		return 1;
+	}
+
+	if (mcode->code_size == 0) {
+		dev_err(dev, "microcode size is 0\n");
+		return 1;
+	}
+
+	/* Assumes 0-9 are SE cores for UCODE_BASE registers and
+	 * AE core bases follow
+	 */
+	if (mcode->is_ae) {
+		core = CPT_MAX_SE_CORES; /* start couting from 10 */
+		total_cores = CPT_MAX_TOTAL_CORES; /* upto 15 */
+	} else {
+		core = 0; /* start couting from 0 */
+		total_cores = CPT_MAX_SE_CORES; /* upto 9 */
+	}
+
+	/* Point to microcode for each core of the group */
+	for (; core < total_cores ; core++, shift++) {
+		if (mcode->core_mask & (1 << shift)) {
+			cpt_write_csr64(cpt->reg_base,
+					CPTX_PF_ENGX_UCODE_BASE(0, core),
+					(u64)mcode->phys_base);
+		}
+	}
+	return ret;
+}
+
+static s32 do_cpt_init(struct cpt_device *cpt, struct microcode *mcode)
+{
+	s32 ret = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Make device not ready */
+	cpt->flags &= ~CPT_FLAG_DEVICE_READY;
+	/* Disable All PF interrupts */
+	cpt_disable_all_interrupts(cpt);
+	/* Calculate mcode group and coremasks */
+	if (mcode->is_ae) {
+		if (mcode->num_cores > cpt->max_ae_cores) {
+			dev_err(dev, "Requested for more cores than available AE cores\n");
+			ret = -1;
+			goto cpt_init_fail;
+		}
+
+		if (cpt->next_group >= CPT_MAX_CORE_GROUPS) {
+			dev_err(dev, "Can't load, all eight microcode groups in use");
+			return -ENFILE;
+		}
+
+		mcode->group = cpt->next_group;
+		/* Convert requested cores to mask */
+		mcode->core_mask = get_mask_from_value(mcode->num_cores);
+		cpt_disable_cores(cpt, mcode->core_mask, AE_TYPES,
+				  mcode->group);
+		/* Load microcode for AE engines */
+		if (cpt_load_microcode(cpt, mcode)) {
+			dev_err(dev, "Microcode load Failed for %s\n",
+				mcode->version);
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		cpt->next_group++;
+		/* Configure group mask for the mcode */
+		cpt_configure_group(cpt, mcode->group, mcode->core_mask,
+				    AE_TYPES);
+		/* Enable AE cores for the group mask */
+		cpt_enable_cores(cpt, mcode->core_mask, AE_TYPES);
+	} else {
+		if (mcode->num_cores > cpt->max_se_cores) {
+			dev_err(dev, "Requested for more cores than available SE cores\n");
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		if (cpt->next_group >= CPT_MAX_CORE_GROUPS) {
+			dev_err(dev, "Can't load, all eight microcode groups in use");
+			return -ENFILE;
+		}
+
+		mcode->group = cpt->next_group;
+		/* Covert requested cores to mask */
+		mcode->core_mask = get_mask_from_value(mcode->num_cores);
+		cpt_disable_cores(cpt, mcode->core_mask, SE_TYPES,
+				  mcode->group);
+		/* Load microcode for SE engines */
+		if (cpt_load_microcode(cpt, mcode)) {
+			dev_err(dev, "Microcode load Failed for %s\n",
+				mcode->version);
+			ret = -1;
+			goto cpt_init_fail;
+		}
+		cpt->next_group++;
+		/* Configure group mask for the mcode */
+		cpt_configure_group(cpt, mcode->group, mcode->core_mask,
+				    SE_TYPES);
+		/* Enable SE cores for the group mask */
+		cpt_enable_cores(cpt, mcode->core_mask, SE_TYPES);
+	}
+
+	/* Enabled PF mailbox interrupts */
+	cpt_enable_mbox_interrupts(cpt);
+	cpt->flags |= CPT_FLAG_DEVICE_READY;
+
+	return ret;
+
+cpt_init_fail:
+	/* Enabled PF mailbox interrupts */
+	cpt_enable_mbox_interrupts(cpt);
+
+	return ret;
+}
+
+struct ucode_header {
+	u8 version[32];
+	u32 code_length;
+	u32 data_length;
+	u64 sram_address;
+};
+
+static s32 cpt_ucode_load_fw(struct cpt_device *cpt, const u8 *fw, bool is_ae)
+{
+	const struct firmware *fw_entry;
+	struct device *dev = &cpt->pdev->dev;
+	struct ucode_header *ucode;
+	struct microcode *mcode;
+	int j, ret = 0;
+
+	ret = request_firmware(&fw_entry, fw, dev);
+	if (ret)
+		return ret;
+
+	mcode = &cpt->mcode[cpt->next_mc_idx];
+	ucode = (struct ucode_header *)fw_entry->data;
+	memcpy(mcode->version, (u8 *)fw_entry->data, 32);
+	mcode->code_size = ntohl(ucode->code_length) * 2;
+	mcode->is_ae = is_ae;
+	mcode->core_mask = 0ULL;
+	mcode->num_cores = is_ae ? 6 : 10;
+
+	/*  Allocate DMAable space */
+	mcode->code = dma_zalloc_coherent(&cpt->pdev->dev, mcode->code_size,
+					  &mcode->phys_base, GFP_KERNEL);
+	if (!mcode->code) {
+		dev_err(dev, "Unable to allocate space for microcode");
+		return -ENOMEM;
+	}
+
+	memcpy((void *)mcode->code, (void *)(fw_entry->data + sizeof(*ucode)),
+	       mcode->code_size);
+
+	/* Byte swap 64-bit */
+	for (j = 0; j < (mcode->code_size / 8); j++)
+		((u64 *)mcode->code)[j] = cpu_to_be64(((u64 *)mcode->code)[j]);
+	/*  MC needs 16-bit swap */
+	for (j = 0; j < (mcode->code_size / 2); j++)
+		((u16 *)mcode->code)[j] = cpu_to_be16(((u16 *)mcode->code)[j]);
+
+	dev_dbg(dev, "mcode->code_size = %u\n", mcode->code_size);
+	dev_dbg(dev, "mcode->is_ae = %u\n", mcode->is_ae);
+	dev_dbg(dev, "mcode->num_cores = %u\n", mcode->num_cores);
+	dev_dbg(dev, "mcode->code = %llx\n", (u64)mcode->code);
+	dev_dbg(dev, "mcode->phys_base = %llx\n", mcode->phys_base);
+
+	ret = do_cpt_init(cpt, mcode);
+	if (ret) {
+		dev_err(dev, "do_cpt_init failed with ret: %d\n", ret);
+		return ret;
+	}
+
+	dev_info(dev, "Microcode Loaded %s\n", mcode->version);
+	mcode->is_mc_valid = 1;
+	cpt->next_mc_idx++;
+	release_firmware(fw_entry);
+
+	return ret;
+}
+
+static s32 cpt_ucode_load(struct cpt_device *cpt)
+{
+	s32 ret = 0;
+	struct device *dev = &cpt->pdev->dev;
+
+	ret = cpt_ucode_load_fw(cpt, "cpt8x-mc-ae.out", true);
+	if (ret) {
+		dev_err(dev, "ae:cpt_ucode_load failed with ret: %d\n", ret);
+		return ret;
+	}
+	ret = cpt_ucode_load_fw(cpt, "cpt8x-mc-se.out", false);
+	if (ret) {
+		dev_err(dev, "se:cpt_ucode_load failed with ret: %d\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+
+static s32 cpt_enable_msix(struct cpt_device *cpt)
+{
+	s32 i, ret;
+
+	cpt->num_vec = CPT_PF_MSIX_VECTORS;
+
+	for (i = 0; i < cpt->num_vec; i++)
+		cpt->msix_entries[i].entry = i;
+
+	ret = pci_enable_msix(cpt->pdev, cpt->msix_entries, cpt->num_vec);
+	if (ret) {
+		dev_err(&cpt->pdev->dev, "Request for #%d msix vectors failed\n",
+			cpt->num_vec);
+		return ret;
+	}
+
+	cpt->msix_enabled = 1;
+	return 0;
+}
+
+static irqreturn_t cpt_mbx0_intr_handler (s32 irq, void *cpt_irq)
+{
+	struct cpt_device *cpt = (struct cpt_device *)cpt_irq;
+
+	cpt_mbox_intr_handler(cpt, 0);
+
+	return IRQ_HANDLED;
+}
+
+static void cpt_disable_msix(struct cpt_device *cpt)
+{
+	if (cpt->msix_enabled) {
+		pci_disable_msix(cpt->pdev);
+		cpt->msix_enabled = 0;
+		cpt->num_vec = 0;
+	}
+}
+
+static void cpt_free_all_interrupts(struct cpt_device *cpt)
+{
+	s32 irq;
+
+	for (irq = 0; irq < cpt->num_vec; irq++) {
+		if (cpt->irq_allocated[irq])
+			free_irq(cpt->msix_entries[irq].vector, cpt);
+		cpt->irq_allocated[irq] = false;
+	}
+}
+
+static void cpt_reset(struct cpt_device *cpt)
+{
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_RESET(0), 1);
+}
+
+static void cpt_find_max_enabled_cores(struct cpt_device *cpt)
+{
+	union cptx_pf_constants pf_cnsts = {0};
+
+	pf_cnsts.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_CONSTANTS(0));
+	cpt->max_se_cores = pf_cnsts.s.se;
+	cpt->max_ae_cores = pf_cnsts.s.ae;
+}
+
+static u32 cpt_check_bist_status(struct cpt_device *cpt)
+{
+	union cptx_pf_bist_status bist_sts = {0};
+
+	bist_sts.u = cpt_read_csr64(cpt->reg_base,
+				    CPTX_PF_BIST_STATUS(0));
+
+	return bist_sts.u;
+}
+
+static u64 cpt_check_exe_bist_status(struct cpt_device *cpt)
+{
+	union cptx_pf_exe_bist_status bist_sts = {0};
+
+	bist_sts.u = cpt_read_csr64(cpt->reg_base,
+				    CPTX_PF_EXE_BIST_STATUS(0));
+
+	return bist_sts.u;
+}
+
+static void cpt_disable_all_cores(struct cpt_device *cpt)
+{
+	u32 grp, timeout = 0xFFFFFFFF;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Disengage the cores from groups */
+	for (grp = 0; grp < CPT_MAX_CORE_GROUPS; grp++) {
+		cpt_write_csr64(cpt->reg_base, CPTX_PF_GX_EN(0, grp), 0);
+		udelay(CSR_DELAY);
+	}
+
+	grp = cpt_read_csr64(cpt->reg_base, CPTX_PF_EXEC_BUSY(0));
+	while (grp) {
+		dev_err(dev, "Cores still busy");
+		grp = cpt_read_csr64(cpt->reg_base,
+				     CPTX_PF_EXEC_BUSY(0));
+		if (timeout--)
+			break;
+	}
+	/* Disable the cores */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_EXE_CTL(0), 0);
+}
+
+/**
+ * Ensure all cores are disenganed from all groups by
+ * calling cpt_disable_all_cores() before calling this
+ * function.
+ */
+static void cpt_unload_microcode(struct cpt_device *cpt)
+{
+	u32 grp = 0, core;
+
+	/* Free microcode bases and reset group masks */
+	for (grp = 0; grp < CPT_MAX_CORE_GROUPS; grp++) {
+		struct microcode *mcode = &cpt->mcode[grp];
+
+		if (cpt->mcode[grp].code)
+			dma_free_coherent(&cpt->pdev->dev, mcode->code_size,
+					  mcode->code, mcode->phys_base);
+		mcode->code = NULL;
+		//mcode->base = NULL;
+	}
+	/* Clear UCODE_BASE registers for all engines */
+	for (core = 0; core < CPT_MAX_TOTAL_CORES; core++)
+		cpt_write_csr64(cpt->reg_base,
+				CPTX_PF_ENGX_UCODE_BASE(0, core), 0ull);
+}
+
+static s32 cpt_device_init(struct cpt_device *cpt)
+{
+	u64 bist;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Reset the PF when probed first */
+	cpt_reset(cpt);
+	mdelay((100));
+
+	/*Check BIST status*/
+	bist = (u64)cpt_check_bist_status(cpt);
+	if (bist) {
+		dev_err(dev, "RAM BIST failed with code 0x%llx", bist);
+		return -ENODEV;
+	}
+
+	bist = cpt_check_exe_bist_status(cpt);
+	if (bist) {
+		dev_err(dev, "Engine BIST failed with code 0x%llx", bist);
+	return -ENODEV;
+	}
+
+	/*Get CLK frequency*/
+	/*Get max enabled cores */
+	cpt_find_max_enabled_cores(cpt);
+	/*Disable all cores*/
+	cpt_disable_all_cores(cpt);
+	/*Reset device parameters*/
+	cpt->next_mc_idx   = 0;
+	cpt->next_group = 0;
+	/* PF is ready */
+	cpt->flags |= CPT_FLAG_DEVICE_READY;
+
+	return 0;
+}
+
+static s32 cpt_register_interrupts(struct cpt_device *cpt)
+{
+	s32 ret;
+	struct device *dev = &cpt->pdev->dev;
+
+	/* Enable MSI-X */
+	ret = cpt_enable_msix(cpt);
+	if (ret)
+		return ret;
+
+	/* Register mailbox interrupt handlers */
+	ret = request_irq(cpt->msix_entries[CPT_PF_INT_VEC_E_MBOXX(0)].vector,
+			  cpt_mbx0_intr_handler, 0, "CPT Mbox0", cpt);
+	if (ret)
+		goto fail;
+
+	cpt->irq_allocated[CPT_PF_INT_VEC_E_MBOXX(0)] = true;
+
+	/* Enable mailbox interrupt */
+	cpt_enable_mbox_interrupts(cpt);
+	return 0;
+
+fail:
+	dev_err(dev, "Request irq failed\n");
+	cpt_free_all_interrupts(cpt);
+	return ret;
+}
+
+static void cpt_unregister_interrupts(struct cpt_device *cpt)
+{
+	cpt_free_all_interrupts(cpt);
+	cpt_disable_msix(cpt);
+}
+
+static s32 cpt_sriov_init(struct cpt_device *cpt, s32 num_vfs)
+{
+	s32 pos = 0;
+	s32 err;
+	u16 total_vf_cnt;
+	struct pci_dev *pdev = cpt->pdev;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+	if (!pos) {
+		dev_err(&pdev->dev, "SRIOV capability is not found in PCIe config space\n");
+		return -ENODEV;
+	}
+
+	cpt->num_vf_en = num_vfs; /* User requested VFs */
+	pci_read_config_word(pdev, (pos + PCI_SRIOV_TOTAL_VF), &total_vf_cnt);
+	if (total_vf_cnt < cpt->num_vf_en)
+		cpt->num_vf_en = total_vf_cnt;
+
+	if (!total_vf_cnt)
+		return 0;
+
+	/*Enabled the available VFs */
+	err = pci_enable_sriov(pdev, cpt->num_vf_en);
+	if (err) {
+		dev_err(&pdev->dev, "SRIOV enable failed, num VF is %d\n",
+			cpt->num_vf_en);
+		cpt->num_vf_en = 0;
+		return err;
+	}
+
+	/* TODO: Optionally enable static VQ priorities feature */
+
+	dev_info(&pdev->dev, "SRIOV enabled, number of VF available %d\n",
+		 cpt->num_vf_en);
+
+	cpt->flags |= CPT_FLAG_SRIOV_ENABLED;
+
+	return 0;
+}
+
+static s32 cpt_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	struct cpt_device *cpt;
+	s32    err;
+
+	cpt = devm_kzalloc(dev, sizeof(struct cpt_device), GFP_KERNEL);
+	if (!cpt)
+		return -ENOMEM;
+
+	pci_set_drvdata(pdev, cpt);
+	cpt->pdev = pdev;
+	err = pci_enable_device(pdev);
+	if (err) {
+		dev_err(dev, "Failed to enable PCI device\n");
+		pci_set_drvdata(pdev, NULL);
+		return err;
+	}
+
+	err = pci_request_regions(pdev, DRV_NAME);
+	if (err) {
+		dev_err(dev, "PCI request regions failed 0x%x\n", err);
+		goto cpt_err_disable_device;
+	}
+
+	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get usable DMA configuration\n");
+		goto cpt_err_release_regions;
+	}
+
+	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
+	if (err) {
+		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		goto cpt_err_release_regions;
+	}
+
+	/* MAP PF's configuration registers */
+	cpt->reg_base = pcim_iomap(pdev, 0, 0);
+	if (!cpt->reg_base) {
+		dev_err(dev, "Cannot map config register space, aborting\n");
+		err = -ENOMEM;
+		goto cpt_err_release_regions;
+	}
+
+	/* CPT device HW initialization */
+	cpt_device_init(cpt);
+
+	/* Register interrupts */
+	err = cpt_register_interrupts(cpt);
+	if (err)
+		goto cpt_err_release_regions;
+
+	err = cpt_ucode_load(cpt);
+	if (err)
+		goto cpt_err_unregister_interrupts;
+
+	/* Configure SRIOV */
+	err = cpt_sriov_init(cpt, num_vfs);
+	if (err)
+		goto cpt_err_unregister_interrupts;
+
+	return 0;
+
+cpt_err_unregister_interrupts:
+	cpt_unregister_interrupts(cpt);
+cpt_err_release_regions:
+	pci_release_regions(pdev);
+cpt_err_disable_device:
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+	return err;
+}
+
+static void cpt_remove(struct pci_dev *pdev)
+{
+	struct cpt_device *cpt = pci_get_drvdata(pdev);
+
+	/* Disengage SE and AE cores from all groups*/
+	cpt_disable_all_cores(cpt);
+	/* Unload microcodes */
+	cpt_unload_microcode(cpt);
+	cpt_unregister_interrupts(cpt);
+	pci_disable_sriov(pdev);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+}
+
+static void cpt_shutdown(struct pci_dev *pdev)
+{
+	struct cpt_device *cpt = pci_get_drvdata(pdev);
+
+	if (!cpt)
+		return;
+
+	dev_info(&pdev->dev, "Shutdown device %x:%x.\n",
+		 (u32)pdev->vendor, (u32)pdev->device);
+
+	cpt_unregister_interrupts(cpt);
+	pci_release_regions(pdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
+	kzfree(cpt);
+}
+
+/* Supported devices */
+static const struct pci_device_id cpt_id_table[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, CPT_81XX_PCI_PF_DEVICE_ID) },
+	{ 0, }  /* end of table */
+};
+
+static struct pci_driver cpt_pci_driver = {
+	.name = DRV_NAME,
+	.id_table = cpt_id_table,
+	.probe = cpt_probe,
+	.remove = cpt_remove,
+	.shutdown = cpt_shutdown,
+};
+
+static s32 __init cpt_init_module(void)
+{
+	s32 ret = -1;
+
+	pr_info("%s, ver %s\n", DRV_NAME, DRV_VERSION);
+
+	if (num_vfs > 16) {
+		pr_warn("Invalid vf count %d, Resetting it to 1(default)\n",
+			num_vfs);
+		num_vfs = 4;
+	}
+
+	ret = pci_register_driver(&cpt_pci_driver);
+	if (ret)
+		pr_err("pci_register_driver() failed");
+
+	return ret;
+}
+
+static void __exit cpt_cleanup_module(void)
+{
+	pci_unregister_driver(&cpt_pci_driver);
+}
+
+module_init(cpt_init_module);
+module_exit(cpt_cleanup_module);
+
+MODULE_AUTHOR("George Cherian <george.cherian@cavium.com>");
+MODULE_DESCRIPTION("Cavium Thunder CPT Physical Function Driver");
+MODULE_LICENSE("GPL v2");
+MODULE_VERSION(DRV_VERSION);
+MODULE_DEVICE_TABLE(pci, cpt_id_table);
diff --git a/drivers/crypto/cavium/cpt/cptpf_mbox.c b/drivers/crypto/cavium/cpt/cptpf_mbox.c
new file mode 100644
index 0000000..1039a5f
--- /dev/null
+++ b/drivers/crypto/cavium/cpt/cptpf_mbox.c
@@ -0,0 +1,163 @@
+/*
+ * Copyright (C) 2016 Cavium, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+#include <linux/module.h>
+#include "cptpf.h"
+
+static void cpt_send_msg_to_vf(struct cpt_device *cpt, s32 vf,
+			       struct cpt_mbox *mbx)
+{
+	/* Writing mbox(0) causes interrupt */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 1),
+			mbx->data);
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 0), mbx->msg);
+}
+
+/* ACKs VF's mailbox message
+ * @vf: VF to which ACK to be sent
+ */
+static void cpt_mbox_send_ack(struct cpt_device *cpt, s32 vf,
+			      struct cpt_mbox *mbx)
+{
+	mbx->data = 0ull;
+	mbx->msg = CPT_MBOX_MSG_TYPE_ACK;
+	cpt_send_msg_to_vf(cpt, vf, mbx);
+}
+
+static void cpt_clear_mbox_intr(struct cpt_device *cpt, u32 vf)
+{
+	/* W1C for the VF */
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_MBOX_INTX(0, 0), (1 << vf));
+}
+
+/*
+ *  Configure QLEN/Chunk sizes for VF
+ */
+static void cpt_cfg_qlen_for_vf(struct cpt_device *cpt, s32 vf, u32 size)
+{
+	union cptx_pf_qx_ctl pf_qx_ctl;
+
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf));
+	pf_qx_ctl.s.size = size;
+	pf_qx_ctl.s.cont_err = true;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf), pf_qx_ctl.u);
+}
+
+/*
+ * Configure VQ priority
+ */
+static void cpt_cfg_vq_priority(struct cpt_device *cpt, s32 vf, u32 pri)
+{
+	union cptx_pf_qx_ctl pf_qx_ctl;
+
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf));
+	pf_qx_ctl.s.pri = pri;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, vf), pf_qx_ctl.u);
+}
+
+static u8 cpt_bind_vq_to_grp(struct cpt_device *cpt, u8 q, u8 grp)
+{
+	struct microcode *mcode = cpt->mcode;
+	union cptx_pf_qx_ctl pf_qx_ctl;
+	struct device *dev = &cpt->pdev->dev;
+
+	if (q >= CPT_MAX_VF_NUM) {
+		dev_err(dev, "Queues are more than cores in the group");
+		return -EINVAL;
+	}
+	if (grp >= CPT_MAX_CORE_GROUPS) {
+		dev_err(dev, "Request group is more than possible groups");
+		return -EINVAL;
+	}
+	if (grp >= cpt->next_mc_idx) {
+		dev_err(dev, "Request group is higher than available functional groups");
+		return -EINVAL;
+	}
+	pf_qx_ctl.u = cpt_read_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, q));
+	pf_qx_ctl.s.grp = mcode[grp].group;
+	cpt_write_csr64(cpt->reg_base, CPTX_PF_QX_CTL(0, q), pf_qx_ctl.u);
+	dev_dbg(dev, "VF %d TYPE %s", q, (mcode[grp].is_ae ? "AE" : "SE"));
+
+	return mcode[grp].is_ae ? AE_TYPES : SE_TYPES;
+}
+
+/* Interrupt handler to handle mailbox messages from VFs */
+static void cpt_handle_mbox_intr(struct cpt_device *cpt, s32 vf)
+{
+	struct cpt_vf_info *vfx = &cpt->vfinfo[vf];
+	struct cpt_mbox mbx = {};
+	u8 vftype;
+	struct device *dev = &cpt->pdev->dev;
+	/*
+	 * MBOX[0] contains msg
+	 * MBOX[1] contains data
+	 */
+	mbx.msg  = cpt_read_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 0));
+	mbx.data = cpt_read_csr64(cpt->reg_base, CPTX_PF_VFX_MBOXX(0, vf, 1));
+	dev_dbg(dev, "%s: Mailbox msg 0x%llx from VF%d", __func__, mbx.msg, vf);
+	switch (mbx.msg) {
+	case CPT_MSG_VF_UP:
+		vfx->state = VF_STATE_UP;
+		try_module_get(THIS_MODULE);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_READY:
+		mbx.msg  = CPT_MSG_READY;
+		mbx.data = vf;
+		cpt_send_msg_to_vf(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_VF_DOWN:
+		/* First msg in VF teardown sequence */
+		vfx->state = VF_STATE_DOWN;
+		module_put(THIS_MODULE);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_QLEN:
+		vfx->qlen = mbx.data;
+		cpt_cfg_qlen_for_vf(cpt, vf, vfx->qlen);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	case CPT_MSG_QBIND_GRP:
+		vftype = cpt_bind_vq_to_grp(cpt, vf, (u8)mbx.data);
+		if ((vftype != AE_TYPES) && (vftype != SE_TYPES))
+			dev_err(dev, "Queue %d binding to group %llu failed",
+				vf, mbx.data);
+		else {
+			dev_dbg(dev, "Queue %d binding to group %llu successful",
+				vf, mbx.data);
+			mbx.msg = CPT_MSG_QBIND_GRP;
+			mbx.data = vftype;
+			cpt_send_msg_to_vf(cpt, vf, &mbx);
+		}
+		break;
+	case CPT_MSG_VQ_PRIORITY:
+		vfx->priority = mbx.data;
+		cpt_cfg_vq_priority(cpt, vf, vfx->priority);
+		cpt_mbox_send_ack(cpt, vf, &mbx);
+		break;
+	default:
+		dev_err(&cpt->pdev->dev, "Invalid msg from VF%d, msg 0x%llx\n",
+			vf, mbx.msg);
+		break;
+	}
+}
+
+void cpt_mbox_intr_handler (struct cpt_device *cpt, s32 mbx)
+{
+	u64 intr;
+	u8  vf;
+
+	intr = cpt_read_csr64(cpt->reg_base, CPTX_PF_MBOX_INTX(0, 0));
+	dev_dbg(&cpt->pdev->dev, "PF interrupt Mbox%d 0x%llx\n", mbx, intr);
+	for (vf = 0; vf < CPT_MAX_VF_NUM; vf++) {
+		if (intr & (1ULL << vf)) {
+			dev_dbg(&cpt->pdev->dev, "Intr from VF %d\n", vf);
+			cpt_handle_mbox_intr(cpt, vf);
+			cpt_clear_mbox_intr(cpt, vf);
+		}
+	}
+}
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG
From: Corentin Labbe @ 2016-12-13 15:33 UTC (permalink / raw)
  To: PrasannaKumar Muralidharan
  Cc: Herbert Xu, linux-kernel, Chen-Yu Tsai, linux-crypto,
	maxime.ripard, davem, linux-arm-kernel
In-Reply-To: <CANc+2y5zegViTsPnuWZwWLUwdRk+ac6upaWOjH7iRFL_zEBxGg@mail.gmail.com>

On Tue, Dec 13, 2016 at 08:53:54PM +0530, PrasannaKumar Muralidharan wrote:
> > What do you think about those two solutions ?
> 
> I prefer the second solution's idea of using two files (/dev/hwrng and
> /dev/hwprng). Upon having a quick glance it looks like (based on
> current_rng == prng check) that your current implementation allows
> only one rng device to be in use at a time. It would be better to have
> both usable at the same time. So applications that need pseudo random
> data at high speed can use /dev/prng while applications that require
> true random number can use /dev/rng. Please feel free to correct if my
> understanding of the code is incorrect. Along with this change I think
> changing the algif_rng to use this code if this solution is going to
> be used.
> 

No, there could be both device at the same time.

^ permalink raw reply

* Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG
From: PrasannaKumar Muralidharan @ 2016-12-13 15:23 UTC (permalink / raw)
  To: Corentin Labbe
  Cc: Herbert Xu, davem, maxime.ripard, Chen-Yu Tsai, linux-kernel,
	linux-crypto, linux-arm-kernel
In-Reply-To: <20161213141059.GB10647@Red>

> What do you think about those two solutions ?

I prefer the second solution's idea of using two files (/dev/hwrng and
/dev/hwprng). Upon having a quick glance it looks like (based on
current_rng == prng check) that your current implementation allows
only one rng device to be in use at a time. It would be better to have
both usable at the same time. So applications that need pseudo random
data at high speed can use /dev/prng while applications that require
true random number can use /dev/rng. Please feel free to correct if my
understanding of the code is incorrect. Along with this change I think
changing the algif_rng to use this code if this solution is going to
be used.

Regards,
PrasannaKumar

^ permalink raw reply

* [PATCH v2 5/7] hwrng: core: Move hwrng miscdev minor number to include/linux/miscdevice.h
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

This patch move the define for hwrng's miscdev minor number to
include/linux/miscdevice.h.
It's better that all minor number are in the same place.
Rename it to HWRNG_MINOR (from RNG_MISCDEV_MINOR) in he process since
no other miscdev define have MISCDEV in their name.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 3 +--
 include/linux/miscdevice.h    | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 7a2e496..1e1e385 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -26,7 +26,6 @@
 
 #define RNG_MODULE_NAME		"hw_random"
 #define PFX			RNG_MODULE_NAME ": "
-#define RNG_MISCDEV_MINOR	183 /* official */
 
 static struct hwrng *current_rng;
 static struct task_struct *hwrng_fill;
@@ -283,7 +282,7 @@ static const struct file_operations rng_chrdev_ops = {
 static const struct attribute_group *rng_dev_groups[];
 
 static struct miscdevice rng_miscdev = {
-	.minor		= RNG_MISCDEV_MINOR,
+	.minor		= HWRNG_MINOR,
 	.name		= RNG_MODULE_NAME,
 	.nodename	= "hwrng",
 	.fops		= &rng_chrdev_ops,
diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index 722698a..659f586 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -31,6 +31,7 @@
 #define SGI_MMTIMER		153
 #define STORE_QUEUE_MINOR	155	/* unused */
 #define I2O_MINOR		166
+#define HWRNG_MINOR		183
 #define MICROCODE_MINOR		184
 #define VFIO_MINOR		196
 #define TUN_MINOR		200
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 7/7] hwrng: core: Remove linux/sched.h from includes
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

linux/sched.h is useless for hw_random/core.c.
This patch remove it.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---

Change since v1:
- linux/fs.h was needed, keep it

 drivers/char/hw_random/core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 5c654b5..1c5949b 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -20,7 +20,6 @@
 #include <linux/miscdevice.h>
 #include <linux/module.h>
 #include <linux/random.h>
-#include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 6/7] hwrng: core: remove unused PFX macro
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

This patch remove the unused PFX macro.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 1e1e385..5c654b5 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -25,7 +25,6 @@
 #include <linux/uaccess.h>
 
 #define RNG_MODULE_NAME		"hw_random"
-#define PFX			RNG_MODULE_NAME ": "
 
 static struct hwrng *current_rng;
 static struct task_struct *hwrng_fill;
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 4/7] hwrng: core: Replace asm/uaccess.h by linux/uaccess.h
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

This patch fix the checkpatch warning about asm/uaccess.h.
In the same time, we sort the headers in alphabetical order.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index a8e63ae..7a2e496 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -10,19 +10,19 @@
  * of the GNU General Public License, incorporated herein by reference.
  */
 
+#include <linux/delay.h>
 #include <linux/device.h>
+#include <linux/err.h>
+#include <linux/fs.h>
 #include <linux/hw_random.h>
-#include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/fs.h>
-#include <linux/sched.h>
-#include <linux/miscdevice.h>
 #include <linux/kthread.h>
-#include <linux/delay.h>
-#include <linux/slab.h>
+#include <linux/miscdevice.h>
+#include <linux/module.h>
 #include <linux/random.h>
-#include <linux/err.h>
-#include <asm/uaccess.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
 
 #define RNG_MODULE_NAME		"hw_random"
 #define PFX			RNG_MODULE_NAME ": "
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 3/7] hwrng: core: Rewrite the header
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

checkpatch have lot of complaint about header.
Furthermore, the header have some offtopic/useless information.

This patch rewrite a proper header.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 38 +++++++++-----------------------------
 1 file changed, 9 insertions(+), 29 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 7029246..a8e63ae 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -1,33 +1,13 @@
 /*
-        Added support for the AMD Geode LX RNG
-	(c) Copyright 2004-2005 Advanced Micro Devices, Inc.
-
-	derived from
-
- 	Hardware driver for the Intel/AMD/VIA Random Number Generators (RNG)
-	(c) Copyright 2003 Red Hat Inc <jgarzik@redhat.com>
-
- 	derived from
-
-        Hardware driver for the AMD 768 Random Number Generator (RNG)
-        (c) Copyright 2001 Red Hat Inc <alan@redhat.com>
-
- 	derived from
-
-	Hardware driver for Intel i810 Random Number Generator (RNG)
-	Copyright 2000,2001 Jeff Garzik <jgarzik@pobox.com>
-	Copyright 2000,2001 Philipp Rumpf <prumpf@mandrakesoft.com>
-
-	Added generic RNG API
-	Copyright 2006 Michael Buesch <m@bues.ch>
-	Copyright 2005 (c) MontaVista Software, Inc.
-
-	Please read Documentation/hw_random.txt for details on use.
-
-	----------------------------------------------------------
-	This software may be used and distributed according to the terms
-        of the GNU General Public License, incorporated herein by reference.
-
+ * hw_random/core.c: HWRNG core API
+ *
+ * Copyright 2006 Michael Buesch <m@bues.ch>
+ * Copyright 2005 (c) MontaVista Software, Inc.
+ *
+ * Please read Documentation/hw_random.txt for details on use.
+ *
+ * This software may be used and distributed according to the terms
+ * of the GNU General Public License, incorporated herein by reference.
  */
 
 #include <linux/device.h>
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 2/7] hwrng: core: rewrite better comparison to NULL
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe
In-Reply-To: <20161213145115.30082-1-clabbe.montjoie@gmail.com>

This patch fix the checkpatch warning "Comparison to NULL could be written "!ptr"

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 00cbb81..7029246 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -439,8 +439,7 @@ int hwrng_register(struct hwrng *rng)
 	int err = -EINVAL;
 	struct hwrng *old_rng, *tmp;
 
-	if (rng->name == NULL ||
-	    (rng->data_read == NULL && rng->read == NULL))
+	if (!rng->name || (!rng->data_read && !rng->read))
 		goto out;
 
 	mutex_lock(&rng_mutex);
-- 
2.10.2

^ permalink raw reply related

* [PATCH v2 1/7] hwrng: core: do not use multiple blank lines
From: Corentin Labbe @ 2016-12-13 14:51 UTC (permalink / raw)
  To: mpm, herbert, arnd, gregkh; +Cc: linux-crypto, linux-kernel, Corentin Labbe

This patch fix the checkpatch warning "Please don't use multiple blank lines"

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/char/hw_random/core.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index d2d2c89..00cbb81 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -30,7 +30,6 @@
 
  */
 
-
 #include <linux/device.h>
 #include <linux/hw_random.h>
 #include <linux/module.h>
@@ -45,12 +44,10 @@
 #include <linux/err.h>
 #include <asm/uaccess.h>
 
-
 #define RNG_MODULE_NAME		"hw_random"
 #define PFX			RNG_MODULE_NAME ": "
 #define RNG_MISCDEV_MINOR	183 /* official */
 
-
 static struct hwrng *current_rng;
 static struct task_struct *hwrng_fill;
 static LIST_HEAD(rng_list);
@@ -296,7 +293,6 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 	goto out;
 }
 
-
 static const struct file_operations rng_chrdev_ops = {
 	.owner		= THIS_MODULE,
 	.open		= rng_dev_open,
@@ -314,7 +310,6 @@ static struct miscdevice rng_miscdev = {
 	.groups		= rng_dev_groups,
 };
 
-
 static ssize_t hwrng_attr_current_store(struct device *dev,
 					struct device_attribute *attr,
 					const char *buf, size_t len)
-- 
2.10.2

^ permalink raw reply related

* [PATCH] crypto: aesni-intel - RFC4106 can zero copy when !PageHighMem
From: Ilya Lesokhin @ 2016-12-13 14:32 UTC (permalink / raw)
  To: linux-crypto, tadeusz.struk, herbert, davejwatson
  Cc: tls-fpga-sw-dev, Ilya Lesokhin

In the common case of !PageHighMem we can do zero copy crypto
even if sg crosses a pages boundary.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
---
 arch/x86/crypto/aesni-intel_glue.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index b1ab0cb..b494ad7 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -903,9 +903,11 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
 	*((__be32 *)(iv+12)) = counter;
 
 	if (sg_is_last(req->src) &&
-	    req->src->offset + req->src->length <= PAGE_SIZE &&
+	    (!PageHighMem(sg_page(req->src)) ||
+	    req->src->offset + req->src->length <= PAGE_SIZE) &&
 	    sg_is_last(req->dst) &&
-	    req->dst->offset + req->dst->length <= PAGE_SIZE) {
+	    (!PageHighMem(sg_page(req->dst)) ||
+	    req->dst->offset + req->dst->length <= PAGE_SIZE)) {
 		one_entry_in_sg = 1;
 		scatterwalk_start(&src_sg_walk, req->src);
 		assoc = scatterwalk_map(&src_sg_walk);
@@ -990,9 +992,11 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
 	*((__be32 *)(iv+12)) = counter;
 
 	if (sg_is_last(req->src) &&
-	    req->src->offset + req->src->length <= PAGE_SIZE &&
+	    (!PageHighMem(sg_page(req->src)) ||
+	    req->src->offset + req->src->length <= PAGE_SIZE) &&
 	    sg_is_last(req->dst) &&
-	    req->dst->offset + req->dst->length <= PAGE_SIZE) {
+	    (!PageHighMem(sg_page(req->dst)) ||
+	    req->dst->offset + req->dst->length <= PAGE_SIZE)) {
 		one_entry_in_sg = 1;
 		scatterwalk_start(&src_sg_walk, req->src);
 		assoc = scatterwalk_map(&src_sg_walk);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH] crypto: bring back alphabetical order of Makefile
From: Corentin Labbe @ 2016-12-13 14:30 UTC (permalink / raw)
  To: herbert, davem; +Cc: linux-crypto, linux-kernel, Corentin Labbe

THe major content of drivers/crypto/Makefile is sorted, only recent
addition break this sort.

This patch bring back this alphabetical sorting.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/crypto/Makefile | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index ad7250f..a8c8a56 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_CRYPTO_DEV_ATMEL_SHA) += atmel-sha.o
 obj-$(CONFIG_CRYPTO_DEV_ATMEL_TDES) += atmel-tdes.o
 obj-$(CONFIG_CRYPTO_DEV_BFIN_CRC) += bfin_crc.o
 obj-$(CONFIG_CRYPTO_DEV_CCP) += ccp/
+obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/
 obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o
 obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o
@@ -11,6 +12,7 @@ obj-$(CONFIG_CRYPTO_DEV_IXP4XX) += ixp4xx_crypto.o
 obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o
 obj-$(CONFIG_CRYPTO_DEV_MARVELL_CESA) += marvell/
 obj-$(CONFIG_CRYPTO_DEV_MXS_DCP) += mxs-dcp.o
+obj-$(CONFIG_CRYPTO_DEV_MXC_SCC) += mxc-scc.o
 obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
 n2_crypto-y := n2_core.o n2_asm.o
 obj-$(CONFIG_CRYPTO_DEV_NX) += nx/
@@ -21,14 +23,12 @@ obj-$(CONFIG_CRYPTO_DEV_PADLOCK_AES) += padlock-aes.o
 obj-$(CONFIG_CRYPTO_DEV_PADLOCK_SHA) += padlock-sha.o
 obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o
 obj-$(CONFIG_CRYPTO_DEV_PPC4XX) += amcc/
+obj-$(CONFIG_CRYPTO_DEV_QAT) += qat/
+obj-$(CONFIG_CRYPTO_DEV_QCE) += qce/
+obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
 obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
 obj-$(CONFIG_CRYPTO_DEV_SAHARA) += sahara.o
-obj-$(CONFIG_CRYPTO_DEV_MXC_SCC) += mxc-scc.o
+obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
 obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
 obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
-obj-$(CONFIG_CRYPTO_DEV_QAT) += qat/
-obj-$(CONFIG_CRYPTO_DEV_QCE) += qce/
 obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
-obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
-obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
-obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
-- 
2.10.2

^ permalink raw reply related

* Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG
From: Corentin Labbe @ 2016-12-13 14:10 UTC (permalink / raw)
  To: Herbert Xu
  Cc: davem, maxime.ripard, wens, linux-kernel, linux-crypto,
	linux-arm-kernel
In-Reply-To: <20161208090618.GB22932@gondor.apana.org.au>

On Thu, Dec 08, 2016 at 05:06:18PM +0800, Herbert Xu wrote:
> On Wed, Dec 07, 2016 at 01:51:27PM +0100, Corentin Labbe wrote:
> > 
> > So I must expose it as a crypto_rng ?
> 
> If it is to be exposed at all then algif_rng would be the best
> place.
> 
> > Could you explain why PRNG must not be used as hw_random ?
> 
> The hwrng interface was always meant to be an interface for real
> hardware random number generators.  People rely on that so we
> should not provide bogus entropy sources through this interface.
> 
> Cheers,

I have found two solutions:

The simplier is to add an attribute isprng to hwrng like that:
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -150,7 +150,8 @@ static int hwrng_init(struct hwrng *rng)
        reinit_completion(&rng->cleanup_done);
 
 skip_init:
-       add_early_randomness(rng);
+       if (!rng->isprng)
+               add_early_randomness(rng);
 
        current_quality = rng->quality ? : default_quality;
        if (current_quality > 1024)
@@ -158,7 +159,7 @@ static int hwrng_init(struct hwrng *rng)
 
        if (current_quality == 0 && hwrng_fill)
                kthread_stop(hwrng_fill);
-       if (current_quality > 0 && !hwrng_fill)
+       if (current_quality > 0 && !hwrng_fill && !rng->isprng)
                start_khwrngd();
 
        return 0;
@@ -439,7 +440,7 @@ int hwrng_register(struct hwrng *rng)
        }
        list_add_tail(&rng->list, &rng_list);
 
-       if (old_rng && !rng->init) {
+       if (old_rng && !rng->init && !rng->isprng) {
                /*
                 * Use a new device's input to add some randomness to
                 * the system.  If this rng device isn't going to be
diff --git a/include/linux/hw_random.h b/include/linux/hw_random.h
index 34a0dc1..5a5b8dc 100644
--- a/include/linux/hw_random.h
+++ b/include/linux/hw_random.h
@@ -50,6 +50,7 @@ struct hwrng {
        struct list_head list;
        struct kref ref;
        struct completion cleanup_done;
+       bool isprng;
 };


With that, we still could register prng, but they dont provide any entropy.
An optional Kconfig/"module parameter" could still be added for people still wanting this old behavour.


The other solution is to "duplicate" /dev/hwrng to /dev/hwprng like that:
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -24,8 +24,11 @@
 #include <linux/uaccess.h>
 
 #define RNG_MODULE_NAME		"hw_random"
+#define PRNG_MODULE_NAME	"hw_prng"
+#define HWPRNG_MINOR	185 /* not official */
 
 static struct hwrng *current_rng;
+static struct hwrng *current_prng;
 static struct task_struct *hwrng_fill;
 static LIST_HEAD(rng_list);
 /* Protects rng_list and current_rng */
@@ -44,7 +47,7 @@ module_param(default_quality, ushort, 0644);
 MODULE_PARM_DESC(default_quality,
 		 "default entropy content of hwrng per mill");
 
-static void drop_current_rng(void);
+static void drop_current_rng(bool prng);
 static int hwrng_init(struct hwrng *rng);
 static void start_khwrngd(void);
 
@@ -56,6 +59,14 @@ static size_t rng_buffer_size(void)
 	return SMP_CACHE_BYTES < 32 ? 32 : SMP_CACHE_BYTES;
 }
 
+static bool is_current_xrng(struct hwrng *rng)
+{
+	if (rng->isprng)
+		return (rng == current_prng);
+	else
+		return (rng == current_rng);
+}
+
 static void add_early_randomness(struct hwrng *rng)
 {
 	int bytes_read;
@@ -88,32 +99,46 @@ static int set_current_rng(struct hwrng *rng)
 	if (err)
 		return err;
 
-	drop_current_rng();
-	current_rng = rng;
+	drop_current_rng(rng->isprng);
+	if (rng->isprng)
+		current_prng = rng;
+	else
+		current_rng = rng;
 
 	return 0;
 }
 
-static void drop_current_rng(void)
+static void drop_current_rng(bool prng)
 {
 	BUG_ON(!mutex_is_locked(&rng_mutex));
-	if (!current_rng)
-		return;
-
-	/* decrease last reference for triggering the cleanup */
-	kref_put(&current_rng->ref, cleanup_rng);
-	current_rng = NULL;
+	if (prng) {
+		if (!current_prng)
+			return;
+		/* decrease last reference for triggering the cleanup */
+		kref_put(&current_prng->ref, cleanup_rng);
+		current_prng = NULL;
+	} else {
+		if (!current_rng)
+			return;
+		/* decrease last reference for triggering the cleanup */
+		kref_put(&current_rng->ref, cleanup_rng);
+		current_rng = NULL;
+	}
 }
 
 /* Returns ERR_PTR(), NULL or refcounted hwrng */
-static struct hwrng *get_current_rng(void)
+static struct hwrng *get_current_rng(bool prng)
 {
 	struct hwrng *rng;
 
 	if (mutex_lock_interruptible(&rng_mutex))
 		return ERR_PTR(-ERESTARTSYS);
 
-	rng = current_rng;
+	if (prng)
+		rng = current_prng;
+	else
+		rng = current_rng;
+
 	if (rng)
 		kref_get(&rng->ref);
 
@@ -193,8 +218,8 @@ static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
 	return 0;
 }
 
-static ssize_t rng_dev_read(struct file *filp, char __user *buf,
-			    size_t size, loff_t *offp)
+static ssize_t genrng_dev_read(struct file *filp, char __user *buf,
+			    size_t size, loff_t *offp, bool prng)
 {
 	ssize_t ret = 0;
 	int err = 0;
@@ -202,7 +227,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 	struct hwrng *rng;
 
 	while (size) {
-		rng = get_current_rng();
+		rng = get_current_rng(prng);
 		if (IS_ERR(rng)) {
 			err = PTR_ERR(rng);
 			goto out;
@@ -270,6 +295,18 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 	goto out;
 }
 
+static ssize_t rng_dev_read(struct file *filp, char __user *buf,
+			    size_t size, loff_t *offp)
+{
+	return genrng_dev_read(filp, buf, size, offp, false);
+}
+
+static ssize_t prng_dev_read(struct file *filp, char __user *buf,
+			    size_t size, loff_t *offp)
+{
+	return genrng_dev_read(filp, buf, size, offp, true);
+}
+
 static const struct file_operations rng_chrdev_ops = {
 	.owner		= THIS_MODULE,
 	.open		= rng_dev_open,
@@ -278,6 +315,7 @@ static const struct file_operations rng_chrdev_ops = {
 };
 
 static const struct attribute_group *rng_dev_groups[];
+static const struct attribute_group *prng_dev_groups[];
 
 static struct miscdevice rng_miscdev = {
 	.minor		= HWRNG_MINOR,
@@ -287,9 +325,24 @@ static struct miscdevice rng_miscdev = {
 	.groups		= rng_dev_groups,
 };
 
-static ssize_t hwrng_attr_current_store(struct device *dev,
-					struct device_attribute *attr,
-					const char *buf, size_t len)
+static const struct file_operations prng_chrdev_ops = {
+	.owner		= THIS_MODULE,
+	.open		= rng_dev_open,
+	.read		= prng_dev_read,
+	.llseek		= noop_llseek,
+};
+
+static struct miscdevice prng_miscdev = {
+	.minor		= HWPRNG_MINOR,
+	.name		= PRNG_MODULE_NAME,
+	.nodename	= "hwprng",
+	.fops		= &prng_chrdev_ops,
+	.groups		= prng_dev_groups,
+};
+
+static ssize_t genrng_attr_current_store(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t len, bool prng)
 {
 	int err;
 	struct hwrng *rng;
@@ -301,8 +354,13 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
 	list_for_each_entry(rng, &rng_list, list) {
 		if (sysfs_streq(rng->name, buf)) {
 			err = 0;
-			if (rng != current_rng)
-				err = set_current_rng(rng);
+			if (prng) {
+				if (rng != current_prng)
+					err = set_current_rng(rng);
+			} else {
+				if (rng != current_rng)
+					err = set_current_rng(rng);
+			}
 			break;
 		}
 	}
@@ -311,14 +369,28 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
 	return err ? : len;
 }
 
-static ssize_t hwrng_attr_current_show(struct device *dev,
-				       struct device_attribute *attr,
-				       char *buf)
+static ssize_t hwrng_attr_current_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t len)
+{
+	return genrng_attr_current_store(dev, attr, buf, len, false);
+}
+
+static ssize_t hwprng_attr_current_store(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t len)
+{
+	return genrng_attr_current_store(dev, attr, buf, len, true);
+}
+
+static ssize_t genrng_attr_current_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf, bool prng)
 {
 	ssize_t ret;
 	struct hwrng *rng;
 
-	rng = get_current_rng();
+	rng = get_current_rng(prng);
 	if (IS_ERR(rng))
 		return PTR_ERR(rng);
 
@@ -328,9 +400,24 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
 	return ret;
 }
 
-static ssize_t hwrng_attr_available_show(struct device *dev,
+static ssize_t hwrng_attr_current_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	return genrng_attr_current_show(dev, attr, buf, false);
+}
+
+static ssize_t hwprng_attr_current_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	return genrng_attr_current_show(dev, attr, buf, true);
+}
+
+
+static ssize_t hwgenrng_attr_available_show(struct device *dev,
 					 struct device_attribute *attr,
-					 char *buf)
+					 char *buf, bool prng)
 {
 	int err;
 	struct hwrng *rng;
@@ -340,8 +427,10 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
 		return -ERESTARTSYS;
 	buf[0] = '\0';
 	list_for_each_entry(rng, &rng_list, list) {
-		strlcat(buf, rng->name, PAGE_SIZE);
-		strlcat(buf, " ", PAGE_SIZE);
+		if (rng->isprng == prng) {
+			strlcat(buf, rng->name, PAGE_SIZE);
+			strlcat(buf, " ", PAGE_SIZE);
+		}
 	}
 	strlcat(buf, "\n", PAGE_SIZE);
 	mutex_unlock(&rng_mutex);
@@ -349,6 +438,20 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
 	return strlen(buf);
 }
 
+static ssize_t hwrng_attr_available_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	return hwgenrng_attr_available_show(dev, attr, buf, false);
+}
+
+static ssize_t hwprng_attr_available_show(struct device *dev,
+					  struct device_attribute *attr,
+					  char *buf)
+{
+	return hwgenrng_attr_available_show(dev, attr, buf, true);
+}
+
 static DEVICE_ATTR(rng_current, S_IRUGO | S_IWUSR,
 		   hwrng_attr_current_show,
 		   hwrng_attr_current_store);
@@ -356,6 +459,13 @@ static DEVICE_ATTR(rng_available, S_IRUGO,
 		   hwrng_attr_available_show,
 		   NULL);
 
+static DEVICE_ATTR(prng_current, S_IRUGO | S_IWUSR,
+		   hwprng_attr_current_show,
+		   hwprng_attr_current_store);
+static DEVICE_ATTR(prng_available, S_IRUGO,
+		   hwprng_attr_available_show,
+		   NULL);
+
 static struct attribute *rng_dev_attrs[] = {
 	&dev_attr_rng_current.attr,
 	&dev_attr_rng_available.attr,
@@ -364,14 +474,35 @@ static struct attribute *rng_dev_attrs[] = {
 
 ATTRIBUTE_GROUPS(rng_dev);
 
+static struct attribute *prng_dev_attrs[] = {
+	&dev_attr_prng_current.attr,
+	&dev_attr_prng_available.attr,
+	NULL
+};
+
+ATTRIBUTE_GROUPS(prng_dev);
+
 static void __exit unregister_miscdev(void)
 {
 	misc_deregister(&rng_miscdev);
+	misc_deregister(&prng_miscdev);
 }
 
 static int __init register_miscdev(void)
 {
-	return misc_register(&rng_miscdev);
+	int err;
+
+	err = misc_register(&rng_miscdev);
+	if (err)
+		return err;
+	err = misc_register(&prng_miscdev);
+	if (err)
+		goto reg_error;
+	else
+		return 0;
+reg_error:
+	misc_deregister(&rng_miscdev);
+	return err;
 }
 
 static int hwrng_fillfn(void *unused)
@@ -381,7 +512,7 @@ static int hwrng_fillfn(void *unused)
 	while (!kthread_should_stop()) {
 		struct hwrng *rng;
 
-		rng = get_current_rng();
+		rng = get_current_rng(false);
 		if (IS_ERR(rng) || !rng)
 			break;
 		mutex_lock(&reading_mutex);
@@ -462,8 +593,8 @@ void hwrng_unregister(struct hwrng *rng)
 	mutex_lock(&rng_mutex);
 
 	list_del(&rng->list);
-	if (current_rng == rng) {
-		drop_current_rng();
+	if (is_current_xrng(rng)) {
+		drop_current_rng(rng->isprng);
 		if (!list_empty(&rng_list)) {
 			struct hwrng *tail;
 
@@ -553,7 +684,8 @@ static int __init hwrng_modinit(void)
 static void __exit hwrng_modexit(void)
 {
 	mutex_lock(&rng_mutex);
-	BUG_ON(current_rng);
+	WARN_ON(current_rng);
+	WARN_ON(current_prng);
 	kfree(rng_buffer);
 	kfree(rng_fillbuf);
 	mutex_unlock(&rng_mutex);
diff --git a/include/linux/hw_random.h b/include/linux/hw_random.h
index 34a0dc1..5a5b8dc 100644
--- a/include/linux/hw_random.h
+++ b/include/linux/hw_random.h
@@ -50,6 +50,7 @@ struct hwrng {
 	struct list_head list;
 	struct kref ref;
 	struct completion cleanup_done;
+	bool isprng;
 };

What do you think about those two solutions ?

Regards
Corentin Labbe

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox