* Re: [PATCH v5 1/1] crypto: add virtio-crypto driver
From: Herbert Xu @ 2016-12-07 9:48 UTC (permalink / raw)
To: Gonglei (Arei)
Cc: linux-kernel@vger.kernel.org, qemu-devel@nongnu.org,
virtio-dev@lists.oasis-open.org,
virtualization@lists.linux-foundation.org,
linux-crypto@vger.kernel.org, Luonengjun, mst@redhat.com,
stefanha@redhat.com, Huangweidong (C), Wubin (H),
xin.zeng@intel.com, Claudio Fontana, pasic@linux.vnet.ibm.com,
davem@davemloft.net, Zhoujian (jay, Euler)
In-Reply-To: <33183CC9F5247A488A2544077AF19020DA1546F2@DGGEMA505-MBX.china.huawei.com>
On Tue, Dec 06, 2016 at 09:01:49AM +0000, Gonglei (Arei) wrote:
>
> Would you please review and/or ack the virtio_crypto_algs.c?
> It is the realization of specified algs based on Linux Crypto Framework.
I have no issues with it. If the virtio folks are happy with
the interface with the host then I'll take this patch.
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH 10/10] virtio: enable endian checks for sparse builds
From: Christoph Hellwig @ 2016-12-07 7:30 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-kernel, Jason Wang, linux-kbuild, Michal Marek,
Arnd Bergmann, Greg Kroah-Hartman, Matt Mackall, Herbert Xu,
David Airlie, Gerd Hoffmann, Ohad Ben-Cohen,
Christian Borntraeger, Cornelia Huck, James E.J. Bottomley,
David S. Miller, Jens Axboe, Neil Armstrong, Stefan Hajnoczi,
Asias He, linux-crypto, dri-devel
In-Reply-To: <1481038106-24899-11-git-send-email-mst@redhat.com>
On Tue, Dec 06, 2016 at 05:41:05PM +0200, Michael S. Tsirkin wrote:
> __CHECK_ENDIAN__ isn't on by default presumably because
> it triggers too many sparse warnings for correct code.
> But virtio is now clean of these warnings, and
> we want to keep it this way - enable this for
> sparse builds.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Nah. Please just enable it globally when using sparse. I actually
had a chat with Linus about that a while ago and he seemed generally
fine with it, I just didn't manage to actually do it..
^ permalink raw reply
* Re: [PATCH 10/10] virtio: enable endian checks for sparse builds
From: Johannes Berg @ 2016-12-07 6:25 UTC (permalink / raw)
To: Michael S. Tsirkin, linux-kernel
Cc: kvm, Neil Armstrong, David Airlie, linux-remoteproc, dri-devel,
virtualization, linux-s390, James E.J. Bottomley, Herbert Xu,
linux-scsi, v9fs-developer, Asias He, Arnd Bergmann, linux-kbuild,
Jens Axboe, Michal Marek, Stefan Hajnoczi, Matt Mackall,
Greg Kroah-Hartman, linux-crypto, netdev, David S. Miller
In-Reply-To: <1481038106-24899-11-git-send-email-mst@redhat.com>
On Tue, 2016-12-06 at 17:41 +0200, Michael S. Tsirkin wrote:
> It seems that there should be a better way to do it,
> but this works too.
In some cases there might be:
> --- a/drivers/s390/virtio/Makefile
> +++ b/drivers/s390/virtio/Makefile
> @@ -6,6 +6,8 @@
> # it under the terms of the GNU General Public License (version 2
> only)
> # as published by the Free Software Foundation.
>
> +CFLAGS_virtio_ccw.o += -D__CHECK_ENDIAN__
> +CFLAGS_kvm_virtio.o += -D__CHECK_ENDIAN__
> s390-virtio-objs := virtio_ccw.o
> ifdef CONFIG_S390_GUEST_OLD_TRANSPORT
> s390-virtio-objs += kvm_virtio.o
Here you could use
ccflags-y += -D__CHECK_ENDIAN__
for example, or even
subdir-ccflags-y += -D__CHECK_ENDIAN__
(in case any subdirs ever get added here)
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -1,3 +1,4 @@
> +ccflags-y := -D__CHECK_ENDIAN__
Looks like you did that here and in some other places though - so
perhaps the s390 one was intentionally different?
> --- a/net/packet/Makefile
> +++ b/net/packet/Makefile
> @@ -2,6 +2,7 @@
> # Makefile for the packet AF.
> #
>
> +ccflags-y := -D__CHECK_ENDIAN__
Technically this is slightly more than advertised, but I guess that
still makes sense if it's clean now.
johannes
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [PATCH 10/10] virtio: enable endian checks for sparse builds
From: Jason Wang @ 2016-12-07 5:27 UTC (permalink / raw)
To: Michael S. Tsirkin, linux-kernel
Cc: kvm, Neil Armstrong, David Airlie, linux-remoteproc, dri-devel,
virtualization, linux-s390, James E.J. Bottomley, Herbert Xu,
linux-scsi, v9fs-developer, Asias He, Arnd Bergmann, linux-kbuild,
Jens Axboe, Michal Marek, Stefan Hajnoczi, Matt Mackall,
Greg Kroah-Hartman, linux-crypto, netdev, David S. Miller
In-Reply-To: <1481038106-24899-11-git-send-email-mst@redhat.com>
On 2016年12月06日 23:41, Michael S. Tsirkin wrote:
> __CHECK_ENDIAN__ isn't on by default presumably because
> it triggers too many sparse warnings for correct code.
> But virtio is now clean of these warnings, and
> we want to keep it this way - enable this for
> sparse builds.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> It seems that there should be a better way to do it,
> but this works too.
Reviewed-by: Jason Wang <jasowang@redhat.com>
>
> drivers/block/Makefile | 1 +
> drivers/char/Makefile | 1 +
> drivers/char/hw_random/Makefile | 2 ++
> drivers/gpu/drm/virtio/Makefile | 1 +
> drivers/net/Makefile | 3 +++
> drivers/net/caif/Makefile | 1 +
> drivers/rpmsg/Makefile | 1 +
> drivers/s390/virtio/Makefile | 2 ++
> drivers/scsi/Makefile | 1 +
> drivers/vhost/Makefile | 1 +
> drivers/virtio/Makefile | 3 +++
> net/9p/Makefile | 1 +
> net/packet/Makefile | 1 +
> net/vmw_vsock/Makefile | 2 ++
> 14 files changed, 21 insertions(+)
>
> diff --git a/drivers/block/Makefile b/drivers/block/Makefile
> index 1e9661e..597481c 100644
> --- a/drivers/block/Makefile
> +++ b/drivers/block/Makefile
> @@ -27,6 +27,7 @@ obj-$(CONFIG_BLK_DEV_OSD) += osdblk.o
> obj-$(CONFIG_BLK_DEV_UMEM) += umem.o
> obj-$(CONFIG_BLK_DEV_NBD) += nbd.o
> obj-$(CONFIG_BLK_DEV_CRYPTOLOOP) += cryptoloop.o
> +CFLAGS_virtio_blk.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_VIRTIO_BLK) += virtio_blk.o
>
> obj-$(CONFIG_BLK_DEV_SX8) += sx8.o
> diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> index 6e6c244..a99467d 100644
> --- a/drivers/char/Makefile
> +++ b/drivers/char/Makefile
> @@ -6,6 +6,7 @@ obj-y += mem.o random.o
> obj-$(CONFIG_TTY_PRINTK) += ttyprintk.o
> obj-y += misc.o
> obj-$(CONFIG_ATARI_DSP56K) += dsp56k.o
> +CFLAGS_virtio_console.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_VIRTIO_CONSOLE) += virtio_console.o
> obj-$(CONFIG_RAW_DRIVER) += raw.o
> obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o
> diff --git a/drivers/char/hw_random/Makefile b/drivers/char/hw_random/Makefile
> index 5f52b1e..a2b0931 100644
> --- a/drivers/char/hw_random/Makefile
> +++ b/drivers/char/hw_random/Makefile
> @@ -17,6 +17,8 @@ obj-$(CONFIG_HW_RANDOM_IXP4XX) += ixp4xx-rng.o
> obj-$(CONFIG_HW_RANDOM_OMAP) += omap-rng.o
> obj-$(CONFIG_HW_RANDOM_OMAP3_ROM) += omap3-rom-rng.o
> obj-$(CONFIG_HW_RANDOM_PASEMI) += pasemi-rng.o
> +CFLAGS_virtio_transport.o += -D__CHECK_ENDIAN__
> +CFLAGS_virtio-rng.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_HW_RANDOM_VIRTIO) += virtio-rng.o
> obj-$(CONFIG_HW_RANDOM_TX4939) += tx4939-rng.o
> obj-$(CONFIG_HW_RANDOM_MXC_RNGA) += mxc-rnga.o
> diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile
> index 3fb8eac..1162366 100644
> --- a/drivers/gpu/drm/virtio/Makefile
> +++ b/drivers/gpu/drm/virtio/Makefile
> @@ -3,6 +3,7 @@
> # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
>
> ccflags-y := -Iinclude/drm
> +ccflags-y += -D__CHECK_ENDIAN__
>
> virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_drm_bus.o virtgpu_gem.o \
> virtgpu_fb.o virtgpu_display.o virtgpu_vq.o virtgpu_ttm.o \
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 7336cbd..3f587de 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_EQUALIZER) += eql.o
> obj-$(CONFIG_IFB) += ifb.o
> obj-$(CONFIG_MACSEC) += macsec.o
> obj-$(CONFIG_MACVLAN) += macvlan.o
> +CFLAGS_macvtap.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_MACVTAP) += macvtap.o
> obj-$(CONFIG_MII) += mii.o
> obj-$(CONFIG_MDIO) += mdio.o
> @@ -20,8 +21,10 @@ obj-$(CONFIG_NETCONSOLE) += netconsole.o
> obj-$(CONFIG_PHYLIB) += phy/
> obj-$(CONFIG_RIONET) += rionet.o
> obj-$(CONFIG_NET_TEAM) += team/
> +CFLAGS_tun.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_TUN) += tun.o
> obj-$(CONFIG_VETH) += veth.o
> +CFLAGS_virtio_net.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> obj-$(CONFIG_VXLAN) += vxlan.o
> obj-$(CONFIG_GENEVE) += geneve.o
> diff --git a/drivers/net/caif/Makefile b/drivers/net/caif/Makefile
> index 9bbd453..d1a922c 100644
> --- a/drivers/net/caif/Makefile
> +++ b/drivers/net/caif/Makefile
> @@ -12,3 +12,4 @@ obj-$(CONFIG_CAIF_HSI) += caif_hsi.o
>
> # Virtio interface
> obj-$(CONFIG_CAIF_VIRTIO) += caif_virtio.o
> +CFLAGS_caif_virtio.o += -D__CHECK_ENDIAN__
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> index ae9c913..23c8b66 100644
> --- a/drivers/rpmsg/Makefile
> +++ b/drivers/rpmsg/Makefile
> @@ -1,3 +1,4 @@
> obj-$(CONFIG_RPMSG) += rpmsg_core.o
> obj-$(CONFIG_RPMSG_QCOM_SMD) += qcom_smd.o
> obj-$(CONFIG_RPMSG_VIRTIO) += virtio_rpmsg_bus.o
> +CFLAGS_virtio_rpmsg_bus.o += -D__CHECK_ENDIAN__
> diff --git a/drivers/s390/virtio/Makefile b/drivers/s390/virtio/Makefile
> index df40692..270ada5 100644
> --- a/drivers/s390/virtio/Makefile
> +++ b/drivers/s390/virtio/Makefile
> @@ -6,6 +6,8 @@
> # it under the terms of the GNU General Public License (version 2 only)
> # as published by the Free Software Foundation.
>
> +CFLAGS_virtio_ccw.o += -D__CHECK_ENDIAN__
> +CFLAGS_kvm_virtio.o += -D__CHECK_ENDIAN__
> s390-virtio-objs := virtio_ccw.o
> ifdef CONFIG_S390_GUEST_OLD_TRANSPORT
> s390-virtio-objs += kvm_virtio.o
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index 38d938d..9f70d46 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -135,6 +135,7 @@ obj-$(CONFIG_SCSI_BNX2_ISCSI) += libiscsi.o bnx2i/
> obj-$(CONFIG_BE2ISCSI) += libiscsi.o be2iscsi/
> obj-$(CONFIG_SCSI_ESAS2R) += esas2r/
> obj-$(CONFIG_SCSI_PMCRAID) += pmcraid.o
> +CFLAGS_virtio_scsi.o += -D__CHECK_ENDIAN__
> obj-$(CONFIG_SCSI_VIRTIO) += virtio_scsi.o
> obj-$(CONFIG_VMWARE_PVSCSI) += vmw_pvscsi.o
> obj-$(CONFIG_XEN_SCSI_FRONTEND) += xen-scsifront.o
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index 6b012b9..619e2cd 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -1,3 +1,4 @@
> +ccflags-y := -D__CHECK_ENDIAN__
> obj-$(CONFIG_VHOST_NET) += vhost_net.o
> vhost_net-y := net.o
>
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index 41e30e3..d331f19 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,3 +1,6 @@
> +#virtio must be kept clean wrt endian tags,
> +#otherwise we'll get to maintain broken host/guest ABIs
> +ccflags-y := -D__CHECK_ENDIAN__
> obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
> obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
> diff --git a/net/9p/Makefile b/net/9p/Makefile
> index a0874cc..acf1225 100644
> --- a/net/9p/Makefile
> +++ b/net/9p/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_NET_9P_RDMA) += 9pnet_rdma.o
> trans_fd.o \
> trans_common.o \
>
> +CFLAGS_trans_virtio.o += -D__CHECK_ENDIAN__
> 9pnet_virtio-objs := \
> trans_virtio.o \
>
> diff --git a/net/packet/Makefile b/net/packet/Makefile
> index 9df6134..a13bcb3 100644
> --- a/net/packet/Makefile
> +++ b/net/packet/Makefile
> @@ -2,6 +2,7 @@
> # Makefile for the packet AF.
> #
>
> +ccflags-y := -D__CHECK_ENDIAN__
> obj-$(CONFIG_PACKET) += af_packet.o
> obj-$(CONFIG_PACKET_DIAG) += af_packet_diag.o
> af_packet_diag-y += diag.o
> diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
> index bc27c70..a61eccb 100644
> --- a/net/vmw_vsock/Makefile
> +++ b/net/vmw_vsock/Makefile
> @@ -8,6 +8,8 @@ vsock-y += af_vsock.o vsock_addr.o
> vmw_vsock_vmci_transport-y += vmci_transport.o vmci_transport_notify.o \
> vmci_transport_notify_qstate.o
>
> +CFLAGS_virtio_transport.o += -D__CHECK_ENDIAN__
> +CFLAGS_virtio_transport_common.o += -D__CHECK_ENDIAN__
> vmw_vsock_virtio_transport-y += virtio_transport.o
>
> vmw_vsock_virtio_transport_common-y += virtio_transport_common.o
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: [PATCH v2] crypto/mcryptd: Check mcryptd algorithm compatibility
From: Mikulas Patocka @ 2016-12-06 21:07 UTC (permalink / raw)
To: Tim Chen
Cc: Herbert Xu, David S. Miller, megha.dey, linux-crypto, dm-devel,
Milan Broz, Eric Biggers, stable
In-Reply-To: <6825659d1f6a96d4ab2327aa0dcf357f6252e915.1480967128.git.tim.c.chen@linux.intel.com>
I think a proper fix would be to find the reason why mcryptd crashes and
fix that reason (i.e. make it fail in mcryptd_create_hash rather than
crash).
Testing flags could fix the bug for now but it could backfire later when
more algorithms are added.
Mikulas
On Mon, 5 Dec 2016, Tim Chen wrote:
> Algorithms not compatible with mcryptd could be spawned by mcryptd
> with a direct crypto_alloc_tfm invocation using a "mcryptd(alg)" name
> construct. This causes mcryptd to crash the kernel if an arbitrary
> "alg" is incompatible and not intended to be used with mcryptd. It is
> an issue if AF_ALG tries to spawn mcryptd(alg) to expose it externally.
> But such algorithms must be used internally and not be exposed.
>
> We added a check to enforce that only internal algorithms are allowed
> with mcryptd at the time mcryptd is spawning an algorithm.
>
> Link: http://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2
> Cc: stable@vger.kernel.org
> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> crypto/mcryptd.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/crypto/mcryptd.c b/crypto/mcryptd.c
> index 94ee44a..c207458 100644
> --- a/crypto/mcryptd.c
> +++ b/crypto/mcryptd.c
> @@ -254,18 +254,22 @@ static void *mcryptd_alloc_instance(struct crypto_alg *alg, unsigned int head,
> goto out;
> }
>
> -static inline void mcryptd_check_internal(struct rtattr **tb, u32 *type,
> +static inline bool mcryptd_check_internal(struct rtattr **tb, u32 *type,
> u32 *mask)
> {
> struct crypto_attr_type *algt;
>
> algt = crypto_get_attr_type(tb);
> if (IS_ERR(algt))
> - return;
> - if ((algt->type & CRYPTO_ALG_INTERNAL))
> - *type |= CRYPTO_ALG_INTERNAL;
> - if ((algt->mask & CRYPTO_ALG_INTERNAL))
> - *mask |= CRYPTO_ALG_INTERNAL;
> + return false;
> +
> + *type |= algt->type & CRYPTO_ALG_INTERNAL;
> + *mask |= algt->mask & CRYPTO_ALG_INTERNAL;
> +
> + if (*type & *mask & CRYPTO_ALG_INTERNAL)
> + return true;
> + else
> + return false;
> }
>
> static int mcryptd_hash_init_tfm(struct crypto_tfm *tfm)
> @@ -492,7 +496,8 @@ static int mcryptd_create_hash(struct crypto_template *tmpl, struct rtattr **tb,
> u32 mask = 0;
> int err;
>
> - mcryptd_check_internal(tb, &type, &mask);
> + if (!mcryptd_check_internal(tb, &type, &mask))
> + return -EINVAL;
>
> halg = ahash_attr_alg(tb[1], type, mask);
> if (IS_ERR(halg))
> --
> 2.5.5
>
^ permalink raw reply
* [PATCH 10/10] virtio: enable endian checks for sparse builds
From: Michael S. Tsirkin @ 2016-12-06 15:41 UTC (permalink / raw)
To: linux-kernel
Cc: kvm, Neil Armstrong, Jason Wang, linux-remoteproc, dri-devel,
virtualization, Gerd Hoffmann, linux-s390, James E.J. Bottomley,
Herbert Xu, linux-scsi, Christian Borntraeger, v9fs-developer,
Asias He, Ohad Ben-Cohen, Arnd Bergmann, linux-kbuild, Jens Axboe,
Michal Marek, Stefan Hajnoczi, Matt Mackall, Cornelia Huck,
Greg Kroah-Hartman, linux-crypto, netdev, "David S
In-Reply-To: <1481038106-24899-1-git-send-email-mst@redhat.com>
__CHECK_ENDIAN__ isn't on by default presumably because
it triggers too many sparse warnings for correct code.
But virtio is now clean of these warnings, and
we want to keep it this way - enable this for
sparse builds.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
It seems that there should be a better way to do it,
but this works too.
drivers/block/Makefile | 1 +
drivers/char/Makefile | 1 +
drivers/char/hw_random/Makefile | 2 ++
drivers/gpu/drm/virtio/Makefile | 1 +
drivers/net/Makefile | 3 +++
drivers/net/caif/Makefile | 1 +
drivers/rpmsg/Makefile | 1 +
drivers/s390/virtio/Makefile | 2 ++
drivers/scsi/Makefile | 1 +
drivers/vhost/Makefile | 1 +
drivers/virtio/Makefile | 3 +++
net/9p/Makefile | 1 +
net/packet/Makefile | 1 +
net/vmw_vsock/Makefile | 2 ++
14 files changed, 21 insertions(+)
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index 1e9661e..597481c 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_BLK_DEV_OSD) += osdblk.o
obj-$(CONFIG_BLK_DEV_UMEM) += umem.o
obj-$(CONFIG_BLK_DEV_NBD) += nbd.o
obj-$(CONFIG_BLK_DEV_CRYPTOLOOP) += cryptoloop.o
+CFLAGS_virtio_blk.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_VIRTIO_BLK) += virtio_blk.o
obj-$(CONFIG_BLK_DEV_SX8) += sx8.o
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index 6e6c244..a99467d 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -6,6 +6,7 @@ obj-y += mem.o random.o
obj-$(CONFIG_TTY_PRINTK) += ttyprintk.o
obj-y += misc.o
obj-$(CONFIG_ATARI_DSP56K) += dsp56k.o
+CFLAGS_virtio_console.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_VIRTIO_CONSOLE) += virtio_console.o
obj-$(CONFIG_RAW_DRIVER) += raw.o
obj-$(CONFIG_SGI_SNSC) += snsc.o snsc_event.o
diff --git a/drivers/char/hw_random/Makefile b/drivers/char/hw_random/Makefile
index 5f52b1e..a2b0931 100644
--- a/drivers/char/hw_random/Makefile
+++ b/drivers/char/hw_random/Makefile
@@ -17,6 +17,8 @@ obj-$(CONFIG_HW_RANDOM_IXP4XX) += ixp4xx-rng.o
obj-$(CONFIG_HW_RANDOM_OMAP) += omap-rng.o
obj-$(CONFIG_HW_RANDOM_OMAP3_ROM) += omap3-rom-rng.o
obj-$(CONFIG_HW_RANDOM_PASEMI) += pasemi-rng.o
+CFLAGS_virtio_transport.o += -D__CHECK_ENDIAN__
+CFLAGS_virtio-rng.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_HW_RANDOM_VIRTIO) += virtio-rng.o
obj-$(CONFIG_HW_RANDOM_TX4939) += tx4939-rng.o
obj-$(CONFIG_HW_RANDOM_MXC_RNGA) += mxc-rnga.o
diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile
index 3fb8eac..1162366 100644
--- a/drivers/gpu/drm/virtio/Makefile
+++ b/drivers/gpu/drm/virtio/Makefile
@@ -3,6 +3,7 @@
# Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
ccflags-y := -Iinclude/drm
+ccflags-y += -D__CHECK_ENDIAN__
virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_drm_bus.o virtgpu_gem.o \
virtgpu_fb.o virtgpu_display.o virtgpu_vq.o virtgpu_ttm.o \
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 7336cbd..3f587de 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_EQUALIZER) += eql.o
obj-$(CONFIG_IFB) += ifb.o
obj-$(CONFIG_MACSEC) += macsec.o
obj-$(CONFIG_MACVLAN) += macvlan.o
+CFLAGS_macvtap.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_MACVTAP) += macvtap.o
obj-$(CONFIG_MII) += mii.o
obj-$(CONFIG_MDIO) += mdio.o
@@ -20,8 +21,10 @@ obj-$(CONFIG_NETCONSOLE) += netconsole.o
obj-$(CONFIG_PHYLIB) += phy/
obj-$(CONFIG_RIONET) += rionet.o
obj-$(CONFIG_NET_TEAM) += team/
+CFLAGS_tun.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_TUN) += tun.o
obj-$(CONFIG_VETH) += veth.o
+CFLAGS_virtio_net.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
obj-$(CONFIG_VXLAN) += vxlan.o
obj-$(CONFIG_GENEVE) += geneve.o
diff --git a/drivers/net/caif/Makefile b/drivers/net/caif/Makefile
index 9bbd453..d1a922c 100644
--- a/drivers/net/caif/Makefile
+++ b/drivers/net/caif/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_CAIF_HSI) += caif_hsi.o
# Virtio interface
obj-$(CONFIG_CAIF_VIRTIO) += caif_virtio.o
+CFLAGS_caif_virtio.o += -D__CHECK_ENDIAN__
diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
index ae9c913..23c8b66 100644
--- a/drivers/rpmsg/Makefile
+++ b/drivers/rpmsg/Makefile
@@ -1,3 +1,4 @@
obj-$(CONFIG_RPMSG) += rpmsg_core.o
obj-$(CONFIG_RPMSG_QCOM_SMD) += qcom_smd.o
obj-$(CONFIG_RPMSG_VIRTIO) += virtio_rpmsg_bus.o
+CFLAGS_virtio_rpmsg_bus.o += -D__CHECK_ENDIAN__
diff --git a/drivers/s390/virtio/Makefile b/drivers/s390/virtio/Makefile
index df40692..270ada5 100644
--- a/drivers/s390/virtio/Makefile
+++ b/drivers/s390/virtio/Makefile
@@ -6,6 +6,8 @@
# it under the terms of the GNU General Public License (version 2 only)
# as published by the Free Software Foundation.
+CFLAGS_virtio_ccw.o += -D__CHECK_ENDIAN__
+CFLAGS_kvm_virtio.o += -D__CHECK_ENDIAN__
s390-virtio-objs := virtio_ccw.o
ifdef CONFIG_S390_GUEST_OLD_TRANSPORT
s390-virtio-objs += kvm_virtio.o
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 38d938d..9f70d46 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -135,6 +135,7 @@ obj-$(CONFIG_SCSI_BNX2_ISCSI) += libiscsi.o bnx2i/
obj-$(CONFIG_BE2ISCSI) += libiscsi.o be2iscsi/
obj-$(CONFIG_SCSI_ESAS2R) += esas2r/
obj-$(CONFIG_SCSI_PMCRAID) += pmcraid.o
+CFLAGS_virtio_scsi.o += -D__CHECK_ENDIAN__
obj-$(CONFIG_SCSI_VIRTIO) += virtio_scsi.o
obj-$(CONFIG_VMWARE_PVSCSI) += vmw_pvscsi.o
obj-$(CONFIG_XEN_SCSI_FRONTEND) += xen-scsifront.o
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 6b012b9..619e2cd 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -1,3 +1,4 @@
+ccflags-y := -D__CHECK_ENDIAN__
obj-$(CONFIG_VHOST_NET) += vhost_net.o
vhost_net-y := net.o
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 41e30e3..d331f19 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,3 +1,6 @@
+#virtio must be kept clean wrt endian tags,
+#otherwise we'll get to maintain broken host/guest ABIs
+ccflags-y := -D__CHECK_ENDIAN__
obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
obj-$(CONFIG_VIRTIO_MMIO) += virtio_mmio.o
obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
diff --git a/net/9p/Makefile b/net/9p/Makefile
index a0874cc..acf1225 100644
--- a/net/9p/Makefile
+++ b/net/9p/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_NET_9P_RDMA) += 9pnet_rdma.o
trans_fd.o \
trans_common.o \
+CFLAGS_trans_virtio.o += -D__CHECK_ENDIAN__
9pnet_virtio-objs := \
trans_virtio.o \
diff --git a/net/packet/Makefile b/net/packet/Makefile
index 9df6134..a13bcb3 100644
--- a/net/packet/Makefile
+++ b/net/packet/Makefile
@@ -2,6 +2,7 @@
# Makefile for the packet AF.
#
+ccflags-y := -D__CHECK_ENDIAN__
obj-$(CONFIG_PACKET) += af_packet.o
obj-$(CONFIG_PACKET_DIAG) += af_packet_diag.o
af_packet_diag-y += diag.o
diff --git a/net/vmw_vsock/Makefile b/net/vmw_vsock/Makefile
index bc27c70..a61eccb 100644
--- a/net/vmw_vsock/Makefile
+++ b/net/vmw_vsock/Makefile
@@ -8,6 +8,8 @@ vsock-y += af_vsock.o vsock_addr.o
vmw_vsock_vmci_transport-y += vmci_transport.o vmci_transport_notify.o \
vmci_transport_notify_qstate.o
+CFLAGS_virtio_transport.o += -D__CHECK_ENDIAN__
+CFLAGS_virtio_transport_common.o += -D__CHECK_ENDIAN__
vmw_vsock_virtio_transport-y += virtio_transport.o
vmw_vsock_virtio_transport_common-y += virtio_transport_common.o
--
MST
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related
* [PATCH 00/10] virtio: sparse fixes
From: Michael S. Tsirkin @ 2016-12-06 15:40 UTC (permalink / raw)
To: linux-kernel
Cc: linux-s390, linux-scsi, kvm, linux-kbuild, netdev,
linux-remoteproc, dri-devel, virtualization, Michal Marek,
linux-crypto, v9fs-developer
I run latest sparse from git on virtio drivers
(turns out the version I had was rather outdated).
This patchset fixes a couple of bugs this uncovered,
and adds some annotations to make it sparse-clean.
In particular, endian-ness is often tricky,
so this patchset enabled endian-ness checks for sparse
builds.
Michael S. Tsirkin (10):
virtio_console: drop unused config fields
drm/virtio: fix endianness in primary_plane_update
drm/virtio: fix lock context imbalance
drm/virtio: annotate virtio_gpu_queue_ctrl_buffer_locked
vhost: make interval tree static inline
vhost: add missing __user annotations
vsock/virtio: add a missing __le annotation
vsock/virtio: mark an internal function static
vsock/virtio: fix src/dst cid format
virtio: enable endian checks for sparse builds
drivers/char/virtio_console.c | 14 +++++++-------
drivers/gpu/drm/virtio/virtgpu_plane.c | 4 ++--
drivers/gpu/drm/virtio/virtgpu_vq.c | 6 +++++-
drivers/vhost/vhost.c | 12 ++++++------
net/vmw_vsock/virtio_transport.c | 2 +-
net/vmw_vsock/virtio_transport_common.c | 16 ++++++++--------
drivers/block/Makefile | 1 +
drivers/char/Makefile | 1 +
drivers/char/hw_random/Makefile | 2 ++
drivers/gpu/drm/virtio/Makefile | 1 +
drivers/net/Makefile | 3 +++
drivers/net/caif/Makefile | 1 +
drivers/rpmsg/Makefile | 1 +
drivers/s390/virtio/Makefile | 2 ++
drivers/scsi/Makefile | 1 +
drivers/vhost/Makefile | 1 +
drivers/virtio/Makefile | 3 +++
net/9p/Makefile | 1 +
net/packet/Makefile | 1 +
net/vmw_vsock/Makefile | 2 ++
20 files changed, 50 insertions(+), 25 deletions(-)
--
MST
^ permalink raw reply
* Re: [PATCH 2/3] crypto: brcm: Add Broadcom SPU driver
From: Mark Rutland @ 2016-12-06 14:18 UTC (permalink / raw)
To: Rob Rice
Cc: Herbert Xu, David S. Miller, Rob Herring,
linux-crypto-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Ray Jui, Scott Branden,
Jon Mason, bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w,
Catalin Marinas, Will Deacon,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Steve Lin
In-Reply-To: <1480536453-24781-3-git-send-email-rob.rice-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
On Wed, Nov 30, 2016 at 03:07:32PM -0500, Rob Rice wrote:
> +static const struct of_device_id bcm_spu_dt_ids[] = {
> + {
> + .compatible = "brcm,spum-crypto",
> + .data = &spum_ns2_types,
> + },
> + {
> + .compatible = "brcm,spum-nsp-crypto",
> + .data = &spum_nsp_types,
> + },
> + {
> + .compatible = "brcm,spu2-crypto",
> + .data = &spu2_types,
> + },
> + {
> + .compatible = "brcm,spu2-v2-crypto",
> + .data = &spu2_v2_types,
> + },
These last two weren't in the binding document.
> + { /* sentinel */ }
> +};
> +
> +MODULE_DEVICE_TABLE(of, bcm_spu_dt_ids);
> +
> +static int spu_dt_read(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct spu_hw *spu = &iproc_priv.spu;
> + struct device_node *dn = pdev->dev.of_node;
> + struct resource *spu_ctrl_regs;
> + const struct of_device_id *match;
> + struct spu_type_subtype *matched_spu_type;
> + void __iomem *spu_reg_vbase[MAX_SPUS];
> + int i;
> + int err;
> +
> + if (!of_device_is_available(dn)) {
> + dev_crit(dev, "SPU device not available");
> + return -ENODEV;
> + }
How can this happen?
> + /* Count number of mailbox channels */
> + spu->num_chan = of_count_phandle_with_args(dn, "mboxes", "#mbox-cells");
> + dev_dbg(dev, "Device has %d SPU channels", spu->num_chan);
> +
> + match = of_match_device(of_match_ptr(bcm_spu_dt_ids), dev);
> + matched_spu_type = (struct spu_type_subtype *)match->data;
This cast usn't necessary.
> + spu->spu_type = matched_spu_type->type;
> + spu->spu_subtype = matched_spu_type->subtype;
> +
> + /* Read registers and count number of SPUs */
> + i = 0;
> + while ((i < MAX_SPUS) && ((spu_ctrl_regs =
> + platform_get_resource(pdev, IORESOURCE_MEM, i)) != NULL)) {
> + dev_dbg(dev,
> + "SPU %d control register region res.start = %#x, res.end = %#x",
> + i,
> + (unsigned int)spu_ctrl_regs->start,
> + (unsigned int)spu_ctrl_regs->end);
> +
> + spu_reg_vbase[i] = devm_ioremap_resource(dev, spu_ctrl_regs);
> + if (IS_ERR(spu_reg_vbase[i])) {
> + err = PTR_ERR(spu_reg_vbase[i]);
> + dev_err(&pdev->dev, "Failed to map registers: %d\n",
> + err);
> + spu_reg_vbase[i] = NULL;
> + return err;
> + }
> + i++;
> + }
These *really* sound like independent devices. There are no shared
registers, and each has its own mbox.
Why do we group them like this?
Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 1/3] crypto: brcm: DT documentation for Broadcom SPU driver
From: Mark Rutland @ 2016-12-06 14:06 UTC (permalink / raw)
To: Rob Rice
Cc: Herbert Xu, David S. Miller, Rob Herring, linux-crypto,
devicetree, linux-kernel, Ray Jui, Scott Branden, Jon Mason,
bcm-kernel-feedback-list, Catalin Marinas, Will Deacon,
linux-arm-kernel, Steve Lin
In-Reply-To: <1480536453-24781-2-git-send-email-rob.rice@broadcom.com>
On Wed, Nov 30, 2016 at 03:07:31PM -0500, Rob Rice wrote:
> Device tree documentation for Broadcom Secure Processing Unit
> (SPU) crypto driver.
>
> Signed-off-by: Steve Lin <steven.lin1@broadcom.com>
> Signed-off-by: Rob Rice <rob.rice@broadcom.com>
> ---
> .../devicetree/bindings/crypto/brcm,spu-crypto.txt | 25 ++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
>
> diff --git a/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> new file mode 100644
> index 0000000..e5fe942
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> @@ -0,0 +1,25 @@
> +The Broadcom Secure Processing Unit (SPU) driver supports symmetric
> +cryptographic offload for Broadcom SoCs with SPU hardware. A SoC may have
> +multiple SPU hardware blocks.
Bindings shound describe *hardware*, not *drivers*. Please drop mention
of the driver, and just decribe the hardware.
> +Required properties:
> +- compatible : Should be "brcm,spum-crypto" for devices with SPU-M hardware
> + (e.g., Northstar2) or "brcm,spum-nsp-crypto" for the Northstar Plus variant
> + of the SPU-M hardware.
> +
> +- reg: Should contain SPU registers location and length.
> +- mboxes: A list of mailbox channels to be used by the kernel driver. Mailbox
> +channels correspond to DMA rings on the device.
> +
> +Example:
> + spu-crypto@612d0000 {
> + compatible = "brcm,spum-crypto";
> + reg = <0 0x612d0000 0 0x900>, /* SPU 0 control regs */
> + <0 0x612f0000 0 0x900>, /* SPU 1 control regs */
> + <0 0x61310000 0 0x900>, /* SPU 2 control regs */
> + <0 0x61330000 0 0x900>; /* SPU 3 control regs */
The above didn't mention there were several register sets, and the
comment beside each makes them sound like they're separate SPU
instances, so I don't think it makes sense to group them as one node.
What's going on here?
> + mboxes = <&pdc0 0>,
> + <&pdc1 0>,
> + <&pdc2 0>,
> + <&pdc3 0>;
Does each mbox correspond to one of the SPUs above? Or is there a shared
pool?
Thanks,
Mark.
^ permalink raw reply
* RE: [PATCH v5 1/1] crypto: add virtio-crypto driver
From: Gonglei (Arei) @ 2016-12-06 9:01 UTC (permalink / raw)
To: Gonglei (Arei), linux-kernel@vger.kernel.org,
qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
virtualization@lists.linux-foundation.org,
linux-crypto@vger.kernel.org
Cc: Huangweidong (C), Claudio Fontana, mst@redhat.com, Luonengjun,
Hanweidong (Randy), Xuquan (Quan Xu), Wanzongshun (Vincent),
stefanha@redhat.com, Zhoujian (jay, Euler), longpeng,
arei.gonglei@hotmail.com, davem@davemloft.net, Wubin (H),
herbert@gondor.apana.org.au
In-Reply-To: <1480595945-63656-2-git-send-email-arei.gonglei@huawei.com>
Hi Herbert,
Would you please review and/or ack the virtio_crypto_algs.c?
It is the realization of specified algs based on Linux Crypto Framework.
Thanks!
Regards,
-Gonglei
> -----Original Message-----
> From: Gonglei (Arei)
> Sent: Thursday, December 01, 2016 8:39 PM
> To: linux-kernel@vger.kernel.org; qemu-devel@nongnu.org;
> virtio-dev@lists.oasis-open.org; virtualization@lists.linux-foundation.org;
> linux-crypto@vger.kernel.org
> Cc: Luonengjun; mst@redhat.com; stefanha@redhat.com; Huangweidong (C);
> Wubin (H); xin.zeng@intel.com; Claudio Fontana;
> herbert@gondor.apana.org.au; pasic@linux.vnet.ibm.com;
> davem@davemloft.net; Zhoujian (jay, Euler); Hanweidong (Randy);
> arei.gonglei@hotmail.com; cornelia.huck@de.ibm.com; Xuquan (Quan Xu);
> longpeng; Wanzongshun (Vincent); Gonglei (Arei)
> Subject: [PATCH v5 1/1] crypto: add virtio-crypto driver
>
> This patch introduces virtio-crypto driver for Linux Kernel.
>
> The virtio crypto device is a virtual cryptography device
> as well as a kind of virtual hardware accelerator for
> virtual machines. The encryption anddecryption requests
> are placed in the data queue and are ultimately handled by
> thebackend crypto accelerators. The second queue is the
> control queue used to create or destroy sessions for
> symmetric algorithms and will control some advanced features
> in the future. The virtio crypto device provides the following
> cryptoservices: CIPHER, MAC, HASH, and AEAD.
>
> For more information about virtio-crypto device, please see:
> http://qemu-project.org/Features/VirtioCrypto
>
> CC: Michael S. Tsirkin <mst@redhat.com>
> CC: Cornelia Huck <cornelia.huck@de.ibm.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Herbert Xu <herbert@gondor.apana.org.au>
> CC: Halil Pasic <pasic@linux.vnet.ibm.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: Zeng Xin <xin.zeng@intel.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> MAINTAINERS | 9 +
> drivers/crypto/Kconfig | 2 +
> drivers/crypto/Makefile | 1 +
> drivers/crypto/virtio/Kconfig | 10 +
> drivers/crypto/virtio/Makefile | 5 +
> drivers/crypto/virtio/virtio_crypto_algs.c | 537
> +++++++++++++++++++++++++++
> drivers/crypto/virtio/virtio_crypto_common.h | 122 ++++++
> drivers/crypto/virtio/virtio_crypto_core.c | 464
> +++++++++++++++++++++++
> drivers/crypto/virtio/virtio_crypto_mgr.c | 264 +++++++++++++
> include/uapi/linux/Kbuild | 1 +
> include/uapi/linux/virtio_crypto.h | 450
> ++++++++++++++++++++++
> include/uapi/linux/virtio_ids.h | 1 +
> 12 files changed, 1866 insertions(+)
> create mode 100644 drivers/crypto/virtio/Kconfig
> create mode 100644 drivers/crypto/virtio/Makefile
> create mode 100644 drivers/crypto/virtio/virtio_crypto_algs.c
> create mode 100644 drivers/crypto/virtio/virtio_crypto_common.h
> create mode 100644 drivers/crypto/virtio/virtio_crypto_core.c
> create mode 100644 drivers/crypto/virtio/virtio_crypto_mgr.c
> create mode 100644 include/uapi/linux/virtio_crypto.h
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ad9b965..cccaaf0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12810,6 +12810,7 @@ F: drivers/net/virtio_net.c
> F: drivers/block/virtio_blk.c
> F: include/linux/virtio_*.h
> F: include/uapi/linux/virtio_*.h
> +F: drivers/crypto/virtio/
>
> VIRTIO DRIVERS FOR S390
> M: Christian Borntraeger <borntraeger@de.ibm.com>
> @@ -12846,6 +12847,14 @@ S: Maintained
> F: drivers/virtio/virtio_input.c
> F: include/uapi/linux/virtio_input.h
>
> +VIRTIO CRYPTO DRIVER
> +M: Gonglei <arei.gonglei@huawei.com>
> +L: virtualization@lists.linux-foundation.org
> +L: linux-crypto@vger.kernel.org
> +S: Maintained
> +F: drivers/crypto/virtio/
> +F: include/uapi/linux/virtio_crypto.h
> +
> VIA RHINE NETWORK DRIVER
> S: Orphan
> F: drivers/net/ethernet/via/via-rhine.c
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 4d2b81f..7956478 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -555,4 +555,6 @@ config CRYPTO_DEV_ROCKCHIP
>
> source "drivers/crypto/chelsio/Kconfig"
>
> +source "drivers/crypto/virtio/Kconfig"
> +
> endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index ad7250f..bc53cb8 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -32,3 +32,4 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
> obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
> obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
> obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
> +obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/
> diff --git a/drivers/crypto/virtio/Kconfig b/drivers/crypto/virtio/Kconfig
> new file mode 100644
> index 0000000..d80f733
> --- /dev/null
> +++ b/drivers/crypto/virtio/Kconfig
> @@ -0,0 +1,10 @@
> +config CRYPTO_DEV_VIRTIO
> + tristate "VirtIO crypto driver"
> + depends on VIRTIO
> + select CRYPTO_AEAD
> + select CRYPTO_AUTHENC
> + select CRYPTO_BLKCIPHER
> + default m
> + help
> + This driver provides support for virtio crypto device. If you
> + choose 'M' here, this module will be called virtio_crypto.
> diff --git a/drivers/crypto/virtio/Makefile b/drivers/crypto/virtio/Makefile
> new file mode 100644
> index 0000000..dd342c9
> --- /dev/null
> +++ b/drivers/crypto/virtio/Makefile
> @@ -0,0 +1,5 @@
> +obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio_crypto.o
> +virtio_crypto-objs := \
> + virtio_crypto_algs.o \
> + virtio_crypto_mgr.o \
> + virtio_crypto_core.o
> diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c
> b/drivers/crypto/virtio/virtio_crypto_algs.c
> new file mode 100644
> index 0000000..5100056
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_algs.c
> @@ -0,0 +1,537 @@
> + /* Algorithms supported by virtio crypto device
> + *
> + * Authors: Gonglei <arei.gonglei@huawei.com>
> + *
> + * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/scatterlist.h>
> +#include <crypto/algapi.h>
> +#include <linux/err.h>
> +#include <crypto/scatterwalk.h>
> +#include <linux/atomic.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +/*
> + * The algs_lock protects the below global virtio_crypto_active_devs
> + * and crypto algorithms registion.
> + */
> +static DEFINE_MUTEX(algs_lock);
> +static unsigned int virtio_crypto_active_devs;
> +
> +static u64 virtio_crypto_alg_sg_nents_length(struct scatterlist *sg)
> +{
> + u64 total = 0;
> +
> + for (total = 0; sg; sg = sg_next(sg))
> + total += sg->length;
> +
> + return total;
> +}
> +
> +static int
> +virtio_crypto_alg_validate_key(int key_len, uint32_t *alg)
> +{
> + switch (key_len) {
> + case AES_KEYSIZE_128:
> + case AES_KEYSIZE_192:
> + case AES_KEYSIZE_256:
> + *alg = VIRTIO_CRYPTO_CIPHER_AES_CBC;
> + break;
> + default:
> + pr_err("virtio_crypto: Unsupported key length: %d\n",
> + key_len);
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_init_session(
> + struct virtio_crypto_ablkcipher_ctx *ctx,
> + uint32_t alg, const uint8_t *key,
> + unsigned int keylen,
> + int encrypt)
> +{
> + struct scatterlist outhdr, key_sg, inhdr, *sgs[3];
> + unsigned int tmp;
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + int op = encrypt ? VIRTIO_CRYPTO_OP_ENCRYPT :
> VIRTIO_CRYPTO_OP_DECRYPT;
> + int err;
> + unsigned int num_out = 0, num_in = 0;
> +
> + /*
> + * Avoid to do DMA from the stack, switch to using
> + * dynamically-allocated for the key
> + */
> + uint8_t *cipher_key = kmalloc(keylen, GFP_ATOMIC);
> +
> + if (!cipher_key)
> + return -ENOMEM;
> +
> + memcpy(cipher_key, key, keylen);
> +
> + spin_lock(&vcrypto->ctrl_lock);
> + /* Pad ctrl header */
> + vcrypto->ctrl.header.opcode =
> + cpu_to_le32(VIRTIO_CRYPTO_CIPHER_CREATE_SESSION);
> + vcrypto->ctrl.header.algo = cpu_to_le32(alg);
> + /* Set the default dataqueue id to 0 */
> + vcrypto->ctrl.header.queue_id = 0;
> +
> + vcrypto->input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
> + /* Pad cipher's parameters */
> + vcrypto->ctrl.u.sym_create_session.op_type =
> + cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
> + vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo =
> + vcrypto->ctrl.header.algo;
> + vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen =
> + cpu_to_le32(keylen);
> + vcrypto->ctrl.u.sym_create_session.u.cipher.para.op =
> + cpu_to_le32(op);
> +
> + sg_init_one(&outhdr, &vcrypto->ctrl, sizeof(vcrypto->ctrl));
> + sgs[num_out++] = &outhdr;
> +
> + /* Set key */
> + sg_init_one(&key_sg, cipher_key, keylen);
> + sgs[num_out++] = &key_sg;
> +
> + /* Return status and session id back */
> + sg_init_one(&inhdr, &vcrypto->input, sizeof(vcrypto->input));
> + sgs[num_out + num_in++] = &inhdr;
> +
> + err = virtqueue_add_sgs(vcrypto->ctrl_vq, sgs, num_out,
> + num_in, vcrypto, GFP_ATOMIC);
> + if (err < 0) {
> + spin_unlock(&vcrypto->ctrl_lock);
> + kzfree(cipher_key);
> + return err;
> + }
> + virtqueue_kick(vcrypto->ctrl_vq);
> +
> + /*
> + * Trapping into the hypervisor, so the request should be
> + * handled immediately.
> + */
> + while (!virtqueue_get_buf(vcrypto->ctrl_vq, &tmp) &&
> + !virtqueue_is_broken(vcrypto->ctrl_vq))
> + cpu_relax();
> +
> + if (le32_to_cpu(vcrypto->input.status) != VIRTIO_CRYPTO_OK) {
> + spin_unlock(&vcrypto->ctrl_lock);
> + pr_err("virtio_crypto: Create session failed status: %u\n",
> + le32_to_cpu(vcrypto->input.status));
> + kzfree(cipher_key);
> + return -EINVAL;
> + }
> +
> + if (encrypt)
> + ctx->enc_sess_info.session_id =
> + le64_to_cpu(vcrypto->input.session_id);
> + else
> + ctx->dec_sess_info.session_id =
> + le64_to_cpu(vcrypto->input.session_id);
> +
> + spin_unlock(&vcrypto->ctrl_lock);
> +
> + kzfree(cipher_key);
> + return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_close_session(
> + struct virtio_crypto_ablkcipher_ctx *ctx,
> + int encrypt)
> +{
> + struct scatterlist outhdr, status_sg, *sgs[2];
> + unsigned int tmp;
> + struct virtio_crypto_destroy_session_req *destroy_session;
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + int err;
> + unsigned int num_out = 0, num_in = 0;
> +
> + spin_lock(&vcrypto->ctrl_lock);
> + vcrypto->ctrl_status.status = VIRTIO_CRYPTO_ERR;
> + /* Pad ctrl header */
> + vcrypto->ctrl.header.opcode =
> + cpu_to_le32(VIRTIO_CRYPTO_CIPHER_DESTROY_SESSION);
> + /* Set the default virtqueue id to 0 */
> + vcrypto->ctrl.header.queue_id = 0;
> +
> + destroy_session = &vcrypto->ctrl.u.destroy_session;
> +
> + if (encrypt)
> + destroy_session->session_id =
> + cpu_to_le64(ctx->enc_sess_info.session_id);
> + else
> + destroy_session->session_id =
> + cpu_to_le64(ctx->dec_sess_info.session_id);
> +
> + sg_init_one(&outhdr, &vcrypto->ctrl, sizeof(vcrypto->ctrl));
> + sgs[num_out++] = &outhdr;
> +
> + /* Return status and session id back */
> + sg_init_one(&status_sg, &vcrypto->ctrl_status.status,
> + sizeof(vcrypto->ctrl_status.status));
> + sgs[num_out + num_in++] = &status_sg;
> +
> + err = virtqueue_add_sgs(vcrypto->ctrl_vq, sgs, num_out,
> + num_in, vcrypto, GFP_ATOMIC);
> + if (err < 0) {
> + spin_unlock(&vcrypto->ctrl_lock);
> + return err;
> + }
> + virtqueue_kick(vcrypto->ctrl_vq);
> +
> + while (!virtqueue_get_buf(vcrypto->ctrl_vq, &tmp) &&
> + !virtqueue_is_broken(vcrypto->ctrl_vq))
> + cpu_relax();
> +
> + if (vcrypto->ctrl_status.status != VIRTIO_CRYPTO_OK) {
> + spin_unlock(&vcrypto->ctrl_lock);
> + pr_err("virtio_crypto: Close session failed status: %u, session_id:
> 0x%llx\n",
> + vcrypto->ctrl_status.status,
> + destroy_session->session_id);
> +
> + return -EINVAL;
> + }
> + spin_unlock(&vcrypto->ctrl_lock);
> +
> + return 0;
> +}
> +
> +static int virtio_crypto_alg_ablkcipher_init_sessions(
> + struct virtio_crypto_ablkcipher_ctx *ctx,
> + const uint8_t *key, unsigned int keylen)
> +{
> + uint32_t alg;
> + int ret;
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> +
> + if (keylen > vcrypto->max_cipher_key_len) {
> + pr_err("virtio_crypto: the key is too long\n");
> + goto bad_key;
> + }
> +
> + if (virtio_crypto_alg_validate_key(keylen, &alg))
> + goto bad_key;
> +
> + /* Create encryption session */
> + ret = virtio_crypto_alg_ablkcipher_init_session(ctx,
> + alg, key, keylen, 1);
> + if (ret)
> + return ret;
> + /* Create decryption session */
> + ret = virtio_crypto_alg_ablkcipher_init_session(ctx,
> + alg, key, keylen, 0);
> + if (ret) {
> + virtio_crypto_alg_ablkcipher_close_session(ctx, 1);
> + return ret;
> + }
> + return 0;
> +
> +bad_key:
> + crypto_tfm_set_flags(ctx->tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
> + return -EINVAL;
> +}
> +
> +/* Note: kernel crypto API realization */
> +static int virtio_crypto_ablkcipher_setkey(struct crypto_ablkcipher *tfm,
> + const uint8_t *key,
> + unsigned int keylen)
> +{
> + struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(tfm);
> + int ret;
> +
> + if (!ctx->vcrypto) {
> + /* New key */
> + int node = virtio_crypto_get_current_node();
> + struct virtio_crypto *vcrypto =
> + virtcrypto_get_dev_node(node);
> + if (!vcrypto) {
> + pr_err("virtio_crypto: Could not find a virtio device in the
> system");
> + return -ENODEV;
> + }
> +
> + ctx->vcrypto = vcrypto;
> + }
> +
> + ret = virtio_crypto_alg_ablkcipher_init_sessions(ctx, key, keylen);
> + if (ret) {
> + virtcrypto_dev_put(ctx->vcrypto);
> + ctx->vcrypto = NULL;
> +
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int
> +__virtio_crypto_ablkcipher_do_req(struct virtio_crypto_request *vc_req,
> + struct ablkcipher_request *req,
> + struct data_queue *data_vq,
> + __u8 op)
> +{
> + struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(req);
> + unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
> + struct virtio_crypto_ablkcipher_ctx *ctx = vc_req->ablkcipher_ctx;
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + struct virtio_crypto_op_data_req *req_data;
> + int src_nents, dst_nents;
> + int err;
> + unsigned long flags;
> + struct scatterlist outhdr, iv_sg, status_sg, **sgs;
> + int i;
> + u64 dst_len;
> + unsigned int num_out = 0, num_in = 0;
> + int sg_total;
> + uint8_t *iv;
> +
> + src_nents = sg_nents_for_len(req->src, req->nbytes);
> + dst_nents = sg_nents(req->dst);
> +
> + pr_debug("virtio_crypto: Number of sgs (src_nents: %d,
> dst_nents: %d)\n",
> + src_nents, dst_nents);
> +
> + /* Why 3? outhdr + iv + inhdr */
> + sg_total = src_nents + dst_nents + 3;
> + sgs = kzalloc_node(sg_total * sizeof(*sgs), GFP_ATOMIC,
> + dev_to_node(&vcrypto->vdev->dev));
> + if (!sgs)
> + return -ENOMEM;
> +
> + req_data = kzalloc_node(sizeof(*req_data), GFP_ATOMIC,
> + dev_to_node(&vcrypto->vdev->dev));
> + if (!req_data) {
> + kfree(sgs);
> + return -ENOMEM;
> + }
> +
> + vc_req->req_data = req_data;
> + vc_req->type = VIRTIO_CRYPTO_SYM_OP_CIPHER;
> + /* Head of operation */
> + if (op) {
> + req_data->header.session_id =
> + cpu_to_le64(ctx->enc_sess_info.session_id);
> + req_data->header.opcode =
> + cpu_to_le32(VIRTIO_CRYPTO_CIPHER_ENCRYPT);
> + } else {
> + req_data->header.session_id =
> + cpu_to_le64(ctx->dec_sess_info.session_id);
> + req_data->header.opcode =
> + cpu_to_le32(VIRTIO_CRYPTO_CIPHER_DECRYPT);
> + }
> + req_data->u.sym_req.op_type =
> cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
> + req_data->u.sym_req.u.cipher.para.iv_len = cpu_to_le32(ivsize);
> + req_data->u.sym_req.u.cipher.para.src_data_len =
> + cpu_to_le32(req->nbytes);
> +
> + dst_len = virtio_crypto_alg_sg_nents_length(req->dst);
> + if (unlikely(dst_len > U32_MAX)) {
> + pr_err("virtio_crypto: The dst_len is beyond U32_MAX\n");
> + err = -EINVAL;
> + goto free;
> + }
> +
> + pr_debug("virtio_crypto: src_len: %u, dst_len: %llu\n",
> + req->nbytes, dst_len);
> +
> + if (unlikely(req->nbytes + dst_len + ivsize +
> + sizeof(vc_req->status) > vcrypto->max_size)) {
> + pr_err("virtio_crypto: The length is too big\n");
> + err = -EINVAL;
> + goto free;
> + }
> +
> + req_data->u.sym_req.u.cipher.para.dst_data_len =
> + cpu_to_le32((uint32_t)dst_len);
> +
> + /* Outhdr */
> + sg_init_one(&outhdr, req_data, sizeof(*req_data));
> + sgs[num_out++] = &outhdr;
> +
> + /* IV */
> +
> + /*
> + * Avoid to do DMA from the stack, switch to using
> + * dynamically-allocated for the IV
> + */
> + iv = kzalloc_node(ivsize, GFP_ATOMIC,
> + dev_to_node(&vcrypto->vdev->dev));
> + if (!iv) {
> + err = -ENOMEM;
> + goto free;
> + }
> + memcpy(iv, req->info, ivsize);
> + sg_init_one(&iv_sg, iv, ivsize);
> + sgs[num_out++] = &iv_sg;
> + vc_req->iv = iv;
> +
> + /* Source data */
> + for (i = 0; i < src_nents; i++)
> + sgs[num_out++] = &req->src[i];
> +
> + /* Destination data */
> + for (i = 0; i < dst_nents; i++)
> + sgs[num_out + num_in++] = &req->dst[i];
> +
> + /* Status */
> + sg_init_one(&status_sg, &vc_req->status, sizeof(vc_req->status));
> + sgs[num_out + num_in++] = &status_sg;
> +
> + vc_req->sgs = sgs;
> +
> + spin_lock_irqsave(&vcrypto->lock, flags);
> + err = virtqueue_add_sgs(data_vq->vq, sgs, num_out,
> + num_in, vc_req, GFP_ATOMIC);
> + spin_unlock_irqrestore(&vcrypto->lock, flags);
> + if (unlikely(err < 0))
> + goto free_iv;
> +
> + return 0;
> +
> +free_iv:
> + kzfree(iv);
> +free:
> + kzfree(req_data);
> + kfree(sgs);
> + return err;
> +}
> +
> +static int virtio_crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
> +{
> + struct crypto_ablkcipher *atfm = crypto_ablkcipher_reqtfm(req);
> + struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(atfm);
> + struct virtio_crypto_request *vc_req = ablkcipher_request_ctx(req);
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + int ret;
> + /* Use the first data virtqueue as default */
> + struct data_queue *data_vq = &vcrypto->data_vq[0];
> +
> + vc_req->ablkcipher_ctx = ctx;
> + vc_req->ablkcipher_req = req;
> + ret = __virtio_crypto_ablkcipher_do_req(vc_req, req, data_vq, 1);
> + if (ret < 0) {
> + pr_err("virtio_crypto: Encryption failed!\n");
> + return ret;
> + }
> + virtqueue_kick(data_vq->vq);
> +
> + return -EINPROGRESS;
> +}
> +
> +static int virtio_crypto_ablkcipher_decrypt(struct ablkcipher_request *req)
> +{
> + struct crypto_ablkcipher *atfm = crypto_ablkcipher_reqtfm(req);
> + struct virtio_crypto_ablkcipher_ctx *ctx = crypto_ablkcipher_ctx(atfm);
> + struct virtio_crypto_request *vc_req = ablkcipher_request_ctx(req);
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + int ret;
> + /* Use the first data virtqueue as default */
> + struct data_queue *data_vq = &vcrypto->data_vq[0];
> +
> + vc_req->ablkcipher_ctx = ctx;
> + vc_req->ablkcipher_req = req;
> +
> + ret = __virtio_crypto_ablkcipher_do_req(vc_req, req, data_vq, 0);
> + if (ret < 0) {
> + pr_err("virtio_crypto: Decryption failed!\n");
> + return ret;
> + }
> + virtqueue_kick(data_vq->vq);
> +
> + return -EINPROGRESS;
> +}
> +
> +static int virtio_crypto_ablkcipher_init(struct crypto_tfm *tfm)
> +{
> + struct virtio_crypto_ablkcipher_ctx *ctx = crypto_tfm_ctx(tfm);
> +
> + tfm->crt_ablkcipher.reqsize = sizeof(struct virtio_crypto_request);
> + ctx->tfm = tfm;
> +
> + return 0;
> +}
> +
> +static void virtio_crypto_ablkcipher_exit(struct crypto_tfm *tfm)
> +{
> + struct virtio_crypto_ablkcipher_ctx *ctx = crypto_tfm_ctx(tfm);
> +
> + if (!ctx->vcrypto)
> + return;
> +
> + virtio_crypto_alg_ablkcipher_close_session(ctx, 1);
> + virtio_crypto_alg_ablkcipher_close_session(ctx, 0);
> + virtcrypto_dev_put(ctx->vcrypto);
> + ctx->vcrypto = NULL;
> +}
> +
> +static struct crypto_alg virtio_crypto_algs[] = { {
> + .cra_name = "cbc(aes)",
> + .cra_driver_name = "virtio_crypto_aes_cbc",
> + .cra_priority = 4001,
> + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
> + .cra_blocksize = AES_BLOCK_SIZE,
> + .cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx),
> + .cra_alignmask = 0,
> + .cra_module = THIS_MODULE,
> + .cra_type = &crypto_ablkcipher_type,
> + .cra_init = virtio_crypto_ablkcipher_init,
> + .cra_exit = virtio_crypto_ablkcipher_exit,
> + .cra_u = {
> + .ablkcipher = {
> + .setkey = virtio_crypto_ablkcipher_setkey,
> + .decrypt = virtio_crypto_ablkcipher_decrypt,
> + .encrypt = virtio_crypto_ablkcipher_encrypt,
> + .min_keysize = AES_MIN_KEY_SIZE,
> + .max_keysize = AES_MAX_KEY_SIZE,
> + .ivsize = AES_BLOCK_SIZE,
> + },
> + },
> +} };
> +
> +int virtio_crypto_algs_register(void)
> +{
> + int ret = 0;
> +
> + mutex_lock(&algs_lock);
> + if (++virtio_crypto_active_devs != 1)
> + goto unlock;
> +
> + ret = crypto_register_algs(virtio_crypto_algs,
> + ARRAY_SIZE(virtio_crypto_algs));
> + if (ret)
> + virtio_crypto_active_devs--;
> +
> +unlock:
> + mutex_unlock(&algs_lock);
> + return ret;
> +}
> +
> +void virtio_crypto_algs_unregister(void)
> +{
> + mutex_lock(&algs_lock);
> + if (--virtio_crypto_active_devs != 0)
> + goto unlock;
> +
> + crypto_unregister_algs(virtio_crypto_algs,
> + ARRAY_SIZE(virtio_crypto_algs));
> +
> +unlock:
> + mutex_unlock(&algs_lock);
> +}
> diff --git a/drivers/crypto/virtio/virtio_crypto_common.h
> b/drivers/crypto/virtio/virtio_crypto_common.h
> new file mode 100644
> index 0000000..975404b
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_common.h
> @@ -0,0 +1,122 @@
> +/* Common header for Virtio crypto device.
> + *
> + * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef _VIRTIO_CRYPTO_COMMON_H
> +#define _VIRTIO_CRYPTO_COMMON_H
> +
> +#include <linux/virtio.h>
> +#include <linux/crypto.h>
> +#include <linux/spinlock.h>
> +#include <crypto/aead.h>
> +#include <crypto/aes.h>
> +#include <crypto/authenc.h>
> +
> +
> +/* Internal representation of a data virtqueue */
> +struct data_queue {
> + /* Virtqueue associated with this send _queue */
> + struct virtqueue *vq;
> +
> + /* Name of the tx queue: dataq.$index */
> + char name[32];
> +};
> +
> +struct virtio_crypto {
> + struct virtio_device *vdev;
> + struct virtqueue *ctrl_vq;
> + struct data_queue *data_vq;
> +
> + /* To protect the vq operations for the dataq */
> + spinlock_t lock;
> +
> + /* To protect the vq operations for the controlq */
> + spinlock_t ctrl_lock;
> +
> + /* Maximum of data queues supported by the device */
> + u32 max_data_queues;
> +
> + /* Number of queue currently used by the driver */
> + u32 curr_queue;
> +
> + /* Maximum length of cipher key */
> + u32 max_cipher_key_len;
> + /* Maximum length of authenticated key */
> + u32 max_auth_key_len;
> + /* Maximum size of per request */
> + u64 max_size;
> +
> + /* Control VQ buffers: protected by the ctrl_lock */
> + struct virtio_crypto_op_ctrl_req ctrl;
> + struct virtio_crypto_session_input input;
> + struct virtio_crypto_inhdr ctrl_status;
> +
> + unsigned long status;
> + atomic_t ref_count;
> + struct list_head list;
> + struct module *owner;
> + uint8_t dev_id;
> +
> + /* Does the affinity hint is set for virtqueues? */
> + bool affinity_hint_set;
> +};
> +
> +struct virtio_crypto_sym_session_info {
> + /* Backend session id, which come from the host side */
> + __u64 session_id;
> +};
> +
> +struct virtio_crypto_ablkcipher_ctx {
> + struct virtio_crypto *vcrypto;
> + struct crypto_tfm *tfm;
> +
> + struct virtio_crypto_sym_session_info enc_sess_info;
> + struct virtio_crypto_sym_session_info dec_sess_info;
> +};
> +
> +struct virtio_crypto_request {
> + /* Cipher or aead */
> + uint32_t type;
> + uint8_t status;
> + struct virtio_crypto_ablkcipher_ctx *ablkcipher_ctx;
> + struct ablkcipher_request *ablkcipher_req;
> + struct virtio_crypto_op_data_req *req_data;
> + struct scatterlist **sgs;
> + uint8_t *iv;
> +};
> +
> +int virtcrypto_devmgr_add_dev(struct virtio_crypto *vcrypto_dev);
> +struct list_head *virtcrypto_devmgr_get_head(void);
> +void virtcrypto_devmgr_rm_dev(struct virtio_crypto *vcrypto_dev);
> +struct virtio_crypto *virtcrypto_devmgr_get_first(void);
> +int virtcrypto_dev_in_use(struct virtio_crypto *vcrypto_dev);
> +int virtcrypto_dev_get(struct virtio_crypto *vcrypto_dev);
> +void virtcrypto_dev_put(struct virtio_crypto *vcrypto_dev);
> +int virtcrypto_dev_started(struct virtio_crypto *vcrypto_dev);
> +struct virtio_crypto *virtcrypto_get_dev_node(int node);
> +int virtcrypto_dev_start(struct virtio_crypto *vcrypto);
> +void virtcrypto_dev_stop(struct virtio_crypto *vcrypto);
> +
> +static inline int virtio_crypto_get_current_node(void)
> +{
> + return topology_physical_package_id(smp_processor_id());
> +}
> +
> +int virtio_crypto_algs_register(void);
> +void virtio_crypto_algs_unregister(void);
> +
> +#endif /* _VIRTIO_CRYPTO_COMMON_H */
> diff --git a/drivers/crypto/virtio/virtio_crypto_core.c
> b/drivers/crypto/virtio/virtio_crypto_core.c
> new file mode 100644
> index 0000000..286d829
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_core.c
> @@ -0,0 +1,464 @@
> + /* Driver for Virtio crypto device.
> + *
> + * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/err.h>
> +#include <linux/module.h>
> +#include <linux/virtio_config.h>
> +#include <linux/cpu.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +
> +static void virtcrypto_dataq_callback(struct virtqueue *vq)
> +{
> + struct virtio_crypto *vcrypto = vq->vdev->priv;
> + struct virtio_crypto_request *vc_req;
> + unsigned long flags;
> + unsigned int len;
> + struct ablkcipher_request *ablk_req;
> + int error;
> +
> + spin_lock_irqsave(&vcrypto->lock, flags);
> + do {
> + virtqueue_disable_cb(vq);
> + while ((vc_req = virtqueue_get_buf(vq, &len)) != NULL) {
> + if (vc_req->type == VIRTIO_CRYPTO_SYM_OP_CIPHER) {
> + switch (vc_req->status) {
> + case VIRTIO_CRYPTO_OK:
> + error = 0;
> + break;
> + case VIRTIO_CRYPTO_INVSESS:
> + case VIRTIO_CRYPTO_ERR:
> + error = -EINVAL;
> + break;
> + case VIRTIO_CRYPTO_BADMSG:
> + error = -EBADMSG;
> + break;
> + default:
> + error = -EIO;
> + break;
> + }
> + ablk_req = vc_req->ablkcipher_req;
> + /* Finish the encrypt or decrypt process */
> + ablk_req->base.complete(&ablk_req->base, error);
> + }
> +
> + kzfree(vc_req->iv);
> + kzfree(vc_req->req_data);
> + kfree(vc_req->sgs);
> + }
> + } while (!virtqueue_enable_cb(vq));
> + spin_unlock_irqrestore(&vcrypto->lock, flags);
> +}
> +
> +static int virtcrypto_find_vqs(struct virtio_crypto *vi)
> +{
> + vq_callback_t **callbacks;
> + struct virtqueue **vqs;
> + int ret = -ENOMEM;
> + int i, total_vqs;
> + const char **names;
> +
> + /*
> + * We expect 1 data virtqueue, followed by
> + * possible N-1 data queues used in multiqueue mode,
> + * followed by control vq.
> + */
> + total_vqs = vi->max_data_queues + 1;
> +
> + /* Allocate space for find_vqs parameters */
> + vqs = kcalloc(total_vqs, sizeof(*vqs), GFP_KERNEL);
> + if (!vqs)
> + goto err_vq;
> + callbacks = kcalloc(total_vqs, sizeof(*callbacks), GFP_KERNEL);
> + if (!callbacks)
> + goto err_callback;
> + names = kcalloc(total_vqs, sizeof(*names), GFP_KERNEL);
> + if (!names)
> + goto err_names;
> +
> + /* Parameters for control virtqueue */
> + callbacks[total_vqs - 1] = NULL;
> + names[total_vqs - 1] = "controlq";
> +
> + /* Allocate/initialize parameters for data virtqueues */
> + for (i = 0; i < vi->max_data_queues; i++) {
> + callbacks[i] = virtcrypto_dataq_callback;
> + snprintf(vi->data_vq[i].name, sizeof(vi->data_vq[i].name),
> + "dataq.%d", i);
> + names[i] = vi->data_vq[i].name;
> + }
> +
> + ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> + names);
> + if (ret)
> + goto err_find;
> +
> + vi->ctrl_vq = vqs[total_vqs - 1];
> +
> + for (i = 0; i < vi->max_data_queues; i++)
> + vi->data_vq[i].vq = vqs[i];
> +
> + kfree(names);
> + kfree(callbacks);
> + kfree(vqs);
> +
> + return 0;
> +
> +err_find:
> + kfree(names);
> +err_names:
> + kfree(callbacks);
> +err_callback:
> + kfree(vqs);
> +err_vq:
> + return ret;
> +}
> +
> +static int virtcrypto_alloc_queues(struct virtio_crypto *vi)
> +{
> + vi->data_vq = kcalloc(vi->max_data_queues, sizeof(*vi->data_vq),
> + GFP_KERNEL);
> + if (!vi->data_vq)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void virtcrypto_clean_affinity(struct virtio_crypto *vi, long hcpu)
> +{
> + int i;
> +
> + if (vi->affinity_hint_set) {
> + for (i = 0; i < vi->max_data_queues; i++)
> + virtqueue_set_affinity(vi->data_vq[i].vq, -1);
> +
> + vi->affinity_hint_set = false;
> + }
> +}
> +
> +static void virtcrypto_set_affinity(struct virtio_crypto *vcrypto)
> +{
> + int i = 0;
> + int cpu;
> +
> + /*
> + * In single queue mode, we don't set the cpu affinity.
> + */
> + if (vcrypto->curr_queue == 1 || vcrypto->max_data_queues == 1) {
> + virtcrypto_clean_affinity(vcrypto, -1);
> + return;
> + }
> +
> + /*
> + * In multiqueue mode, we let the queue to be private to one cpu
> + * by setting the affinity hint to eliminate the contention.
> + *
> + * TODO: adds cpu hotplug support by register cpu notifier.
> + *
> + */
> + for_each_online_cpu(cpu) {
> + virtqueue_set_affinity(vcrypto->data_vq[i].vq, cpu);
> + if (++i >= vcrypto->max_data_queues)
> + break;
> + }
> +
> + vcrypto->affinity_hint_set = true;
> +}
> +
> +static void virtcrypto_free_queues(struct virtio_crypto *vi)
> +{
> + kfree(vi->data_vq);
> +}
> +
> +static int virtcrypto_init_vqs(struct virtio_crypto *vi)
> +{
> + int ret;
> +
> + /* Allocate send & receive queues */
> + ret = virtcrypto_alloc_queues(vi);
> + if (ret)
> + goto err;
> +
> + ret = virtcrypto_find_vqs(vi);
> + if (ret)
> + goto err_free;
> +
> + get_online_cpus();
> + virtcrypto_set_affinity(vi);
> + put_online_cpus();
> +
> + return 0;
> +
> +err_free:
> + virtcrypto_free_queues(vi);
> +err:
> + return ret;
> +}
> +
> +static int virtcrypto_update_status(struct virtio_crypto *vcrypto)
> +{
> + u32 status;
> + int err;
> + unsigned long flags;
> +
> + virtio_cread(vcrypto->vdev,
> + struct virtio_crypto_config, status, &status);
> +
> + /*
> + * Unknown status bits would be a host error and the driver
> + * should consider the device to be broken.
> + */
> + if (status & (~VIRTIO_CRYPTO_S_HW_READY)) {
> + dev_warn(&vcrypto->vdev->dev,
> + "Unknown status bits: 0x%x\n", status);
> +
> + spin_lock_irqsave(&vcrypto->lock, flags);
> + virtio_break_device(vcrypto->vdev);
> + spin_unlock_irqrestore(&vcrypto->lock, flags);
> + return -EPERM;
> + }
> +
> + if (vcrypto->status == status)
> + return 0;
> +
> + vcrypto->status = status;
> +
> + if (vcrypto->status & VIRTIO_CRYPTO_S_HW_READY) {
> + err = virtcrypto_dev_start(vcrypto);
> + if (err) {
> + dev_err(&vcrypto->vdev->dev,
> + "Failed to start virtio crypto device.\n");
> +
> + return -EPERM;
> + }
> + dev_info(&vcrypto->vdev->dev, "Accelerator is ready\n");
> + } else {
> + virtcrypto_dev_stop(vcrypto);
> + dev_info(&vcrypto->vdev->dev, "Accelerator is not ready\n");
> + }
> +
> + return 0;
> +}
> +
> +static void virtcrypto_del_vqs(struct virtio_crypto *vcrypto)
> +{
> + struct virtio_device *vdev = vcrypto->vdev;
> +
> + virtcrypto_clean_affinity(vcrypto, -1);
> +
> + vdev->config->del_vqs(vdev);
> +
> + virtcrypto_free_queues(vcrypto);
> +}
> +
> +static int virtcrypto_probe(struct virtio_device *vdev)
> +{
> + int err = -EFAULT;
> + struct virtio_crypto *vcrypto;
> + u32 max_data_queues = 0, max_cipher_key_len = 0;
> + u32 max_auth_key_len = 0;
> + u64 max_size = 0;
> +
> + if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
> + return -ENODEV;
> +
> + if (!vdev->config->get) {
> + dev_err(&vdev->dev, "%s failure: config access disabled\n",
> + __func__);
> + return -EINVAL;
> + }
> +
> + if (num_possible_nodes() > 1 && dev_to_node(&vdev->dev) < 0) {
> + /*
> + * If the accelerator is connected to a node with no memory
> + * there is no point in using the accelerator since the remote
> + * memory transaction will be very slow.
> + */
> + dev_err(&vdev->dev, "Invalid NUMA configuration.\n");
> + return -EINVAL;
> + }
> +
> + vcrypto = kzalloc_node(sizeof(*vcrypto), GFP_KERNEL,
> + dev_to_node(&vdev->dev));
> + if (!vcrypto)
> + return -ENOMEM;
> +
> + virtio_cread(vdev, struct virtio_crypto_config,
> + max_dataqueues, &max_data_queues);
> + if (max_data_queues < 1)
> + max_data_queues = 1;
> +
> + virtio_cread(vdev, struct virtio_crypto_config,
> + max_cipher_key_len, &max_cipher_key_len);
> + virtio_cread(vdev, struct virtio_crypto_config,
> + max_auth_key_len, &max_auth_key_len);
> + virtio_cread(vdev, struct virtio_crypto_config,
> + max_size, &max_size);
> +
> + /* Add virtio crypto device to global table */
> + err = virtcrypto_devmgr_add_dev(vcrypto);
> + if (err) {
> + dev_err(&vdev->dev, "Failed to add new virtio crypto device.\n");
> + goto free;
> + }
> + vcrypto->owner = THIS_MODULE;
> + vcrypto = vdev->priv = vcrypto;
> + vcrypto->vdev = vdev;
> + spin_lock_init(&vcrypto->lock);
> + spin_lock_init(&vcrypto->ctrl_lock);
> +
> + /* Use single data queue as default */
> + vcrypto->curr_queue = 1;
> + vcrypto->max_data_queues = max_data_queues;
> + vcrypto->max_cipher_key_len = max_cipher_key_len;
> + vcrypto->max_auth_key_len = max_auth_key_len;
> + vcrypto->max_size = max_size;
> +
> + dev_info(&vdev->dev,
> + "max_queues: %u, max_cipher_key_len: %u, max_auth_key_len: %u,
> max_size 0x%llx\n",
> + vcrypto->max_data_queues,
> + vcrypto->max_cipher_key_len,
> + vcrypto->max_auth_key_len,
> + vcrypto->max_size);
> +
> + err = virtcrypto_init_vqs(vcrypto);
> + if (err) {
> + dev_err(&vdev->dev, "Failed to initialize vqs.\n");
> + goto free_dev;
> + }
> + virtio_device_ready(vdev);
> +
> + err = virtcrypto_update_status(vcrypto);
> + if (err)
> + goto free_vqs;
> +
> + return 0;
> +
> +free_vqs:
> + vcrypto->vdev->config->reset(vdev);
> + virtcrypto_del_vqs(vcrypto);
> +free_dev:
> + virtcrypto_devmgr_rm_dev(vcrypto);
> +free:
> + kfree(vcrypto);
> + return err;
> +}
> +
> +static void virtcrypto_free_unused_reqs(struct virtio_crypto *vcrypto)
> +{
> + struct virtio_crypto_request *vc_req;
> + int i;
> + struct virtqueue *vq;
> +
> + for (i = 0; i < vcrypto->max_data_queues; i++) {
> + vq = vcrypto->data_vq[i].vq;
> + while ((vc_req = virtqueue_detach_unused_buf(vq)) != NULL) {
> + kfree(vc_req->req_data);
> + kfree(vc_req->sgs);
> + }
> + }
> +}
> +
> +static void virtcrypto_remove(struct virtio_device *vdev)
> +{
> + struct virtio_crypto *vcrypto = vdev->priv;
> +
> + dev_info(&vdev->dev, "Start virtcrypto_remove.\n");
> +
> + if (virtcrypto_dev_started(vcrypto))
> + virtcrypto_dev_stop(vcrypto);
> + vdev->config->reset(vdev);
> + virtcrypto_free_unused_reqs(vcrypto);
> + virtcrypto_del_vqs(vcrypto);
> + virtcrypto_devmgr_rm_dev(vcrypto);
> + kfree(vcrypto);
> +}
> +
> +static void virtcrypto_config_changed(struct virtio_device *vdev)
> +{
> + struct virtio_crypto *vcrypto = vdev->priv;
> +
> + virtcrypto_update_status(vcrypto);
> +}
> +
> +#ifdef CONFIG_PM_SLEEP
> +static int virtcrypto_freeze(struct virtio_device *vdev)
> +{
> + struct virtio_crypto *vcrypto = vdev->priv;
> +
> + vdev->config->reset(vdev);
> + virtcrypto_free_unused_reqs(vcrypto);
> + if (virtcrypto_dev_started(vcrypto))
> + virtcrypto_dev_stop(vcrypto);
> +
> + virtcrypto_del_vqs(vcrypto);
> + return 0;
> +}
> +
> +static int virtcrypto_restore(struct virtio_device *vdev)
> +{
> + struct virtio_crypto *vcrypto = vdev->priv;
> + int err;
> +
> + err = virtcrypto_init_vqs(vcrypto);
> + if (err)
> + return err;
> +
> + virtio_device_ready(vdev);
> + err = virtcrypto_dev_start(vcrypto);
> + if (err) {
> + dev_err(&vdev->dev, "Failed to start virtio crypto device.\n");
> + return -EFAULT;
> + }
> +
> + return 0;
> +}
> +#endif
> +
> +static unsigned int features[] = {
> + /* none */
> +};
> +
> +static struct virtio_device_id id_table[] = {
> + { VIRTIO_ID_CRYPTO, VIRTIO_DEV_ANY_ID },
> + { 0 },
> +};
> +
> +static struct virtio_driver virtio_crypto_driver = {
> + .driver.name = KBUILD_MODNAME,
> + .driver.owner = THIS_MODULE,
> + .feature_table = features,
> + .feature_table_size = ARRAY_SIZE(features),
> + .id_table = id_table,
> + .probe = virtcrypto_probe,
> + .remove = virtcrypto_remove,
> + .config_changed = virtcrypto_config_changed,
> +#ifdef CONFIG_PM_SLEEP
> + .freeze = virtcrypto_freeze,
> + .restore = virtcrypto_restore,
> +#endif
> +};
> +
> +module_virtio_driver(virtio_crypto_driver);
> +
> +MODULE_DEVICE_TABLE(virtio, id_table);
> +MODULE_DESCRIPTION("virtio crypto device driver");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Gonglei <arei.gonglei@huawei.com>");
> diff --git a/drivers/crypto/virtio/virtio_crypto_mgr.c
> b/drivers/crypto/virtio/virtio_crypto_mgr.c
> new file mode 100644
> index 0000000..a69ff71
> --- /dev/null
> +++ b/drivers/crypto/virtio/virtio_crypto_mgr.c
> @@ -0,0 +1,264 @@
> + /* Management for virtio crypto devices (refer to adf_dev_mgr.c)
> + *
> + * Copyright 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/mutex.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/virtio_crypto.h>
> +#include "virtio_crypto_common.h"
> +
> +static LIST_HEAD(virtio_crypto_table);
> +static uint32_t num_devices;
> +
> +/* The table_lock protects the above global list and num_devices */
> +static DEFINE_MUTEX(table_lock);
> +
> +#define VIRTIO_CRYPTO_MAX_DEVICES 32
> +
> +
> +/*
> + * virtcrypto_devmgr_add_dev() - Add vcrypto_dev to the acceleration
> + * framework.
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Function adds virtio crypto device to the global list.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 on success, error code othewise.
> + */
> +int virtcrypto_devmgr_add_dev(struct virtio_crypto *vcrypto_dev)
> +{
> + struct list_head *itr;
> +
> + mutex_lock(&table_lock);
> + if (num_devices == VIRTIO_CRYPTO_MAX_DEVICES) {
> + pr_info("virtio_crypto: only support up to %d devices\n",
> + VIRTIO_CRYPTO_MAX_DEVICES);
> + mutex_unlock(&table_lock);
> + return -EFAULT;
> + }
> +
> + list_for_each(itr, &virtio_crypto_table) {
> + struct virtio_crypto *ptr =
> + list_entry(itr, struct virtio_crypto, list);
> +
> + if (ptr == vcrypto_dev) {
> + mutex_unlock(&table_lock);
> + return -EEXIST;
> + }
> + }
> + atomic_set(&vcrypto_dev->ref_count, 0);
> + list_add_tail(&vcrypto_dev->list, &virtio_crypto_table);
> + vcrypto_dev->dev_id = num_devices++;
> + mutex_unlock(&table_lock);
> + return 0;
> +}
> +
> +struct list_head *virtcrypto_devmgr_get_head(void)
> +{
> + return &virtio_crypto_table;
> +}
> +
> +/*
> + * virtcrypto_devmgr_rm_dev() - Remove vcrypto_dev from the acceleration
> + * framework.
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Function removes virtio crypto device from the acceleration framework.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_devmgr_rm_dev(struct virtio_crypto *vcrypto_dev)
> +{
> + mutex_lock(&table_lock);
> + list_del(&vcrypto_dev->list);
> + num_devices--;
> + mutex_unlock(&table_lock);
> +}
> +
> +/*
> + * virtcrypto_devmgr_get_first()
> + *
> + * Function returns the first virtio crypto device from the acceleration
> + * framework.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: pointer to vcrypto_dev or NULL if not found.
> + */
> +struct virtio_crypto *virtcrypto_devmgr_get_first(void)
> +{
> + struct virtio_crypto *dev = NULL;
> +
> + mutex_lock(&table_lock);
> + if (!list_empty(&virtio_crypto_table))
> + dev = list_first_entry(&virtio_crypto_table,
> + struct virtio_crypto,
> + list);
> + mutex_unlock(&table_lock);
> + return dev;
> +}
> +
> +/*
> + * virtcrypto_dev_in_use() - Check whether vcrypto_dev is currently in use
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 1 when device is in use, 0 otherwise.
> + */
> +int virtcrypto_dev_in_use(struct virtio_crypto *vcrypto_dev)
> +{
> + return atomic_read(&vcrypto_dev->ref_count) != 0;
> +}
> +
> +/*
> + * virtcrypto_dev_get() - Increment vcrypto_dev reference count
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Increment the vcrypto_dev refcount and if this is the first time
> + * incrementing it during this period the vcrypto_dev is in use,
> + * increment the module refcount too.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 when successful, EFAULT when fail to bump module refcount
> + */
> +int virtcrypto_dev_get(struct virtio_crypto *vcrypto_dev)
> +{
> + if (atomic_add_return(1, &vcrypto_dev->ref_count) == 1)
> + if (!try_module_get(vcrypto_dev->owner))
> + return -EFAULT;
> + return 0;
> +}
> +
> +/*
> + * virtcrypto_dev_put() - Decrement vcrypto_dev reference count
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * Decrement the vcrypto_dev refcount and if this is the last time
> + * decrementing it during this period the vcrypto_dev is in use,
> + * decrement the module refcount too.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_dev_put(struct virtio_crypto *vcrypto_dev)
> +{
> + if (atomic_sub_return(1, &vcrypto_dev->ref_count) == 0)
> + module_put(vcrypto_dev->owner);
> +}
> +
> +/*
> + * virtcrypto_dev_started() - Check whether device has started
> + * @vcrypto_dev: Pointer to virtio crypto device.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 1 when the device has started, 0 otherwise
> + */
> +int virtcrypto_dev_started(struct virtio_crypto *vcrypto_dev)
> +{
> + return (vcrypto_dev->status & VIRTIO_CRYPTO_S_HW_READY);
> +}
> +
> +/*
> + * virtcrypto_get_dev_node() - Get vcrypto_dev on the node.
> + * @node: Node id the driver works.
> + *
> + * Function returns the virtio crypto device used fewest on the node.
> + *
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: pointer to vcrypto_dev or NULL if not found.
> + */
> +struct virtio_crypto *virtcrypto_get_dev_node(int node)
> +{
> + struct virtio_crypto *vcrypto_dev = NULL, *tmp_dev;
> + unsigned long best = ~0;
> + unsigned long ctr;
> +
> + mutex_lock(&table_lock);
> + list_for_each_entry(tmp_dev, virtcrypto_devmgr_get_head(), list) {
> +
> + if ((node == dev_to_node(&tmp_dev->vdev->dev) ||
> + dev_to_node(&tmp_dev->vdev->dev) < 0) &&
> + virtcrypto_dev_started(tmp_dev)) {
> + ctr = atomic_read(&tmp_dev->ref_count);
> + if (best > ctr) {
> + vcrypto_dev = tmp_dev;
> + best = ctr;
> + }
> + }
> + }
> +
> + if (!vcrypto_dev) {
> + pr_info("virtio_crypto: Could not find a device on node %d\n",
> + node);
> + /* Get any started device */
> + list_for_each_entry(tmp_dev,
> + virtcrypto_devmgr_get_head(), list) {
> + if (virtcrypto_dev_started(tmp_dev)) {
> + vcrypto_dev = tmp_dev;
> + break;
> + }
> + }
> + }
> + mutex_unlock(&table_lock);
> + if (!vcrypto_dev)
> + return NULL;
> +
> + virtcrypto_dev_get(vcrypto_dev);
> + return vcrypto_dev;
> +}
> +
> +/*
> + * virtcrypto_dev_start() - Start virtio crypto device
> + * @vcrypto: Pointer to virtio crypto device.
> + *
> + * Function notifies all the registered services that the virtio crypto device
> + * is ready to be used.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: 0 on success, EFAULT when fail to register algorithms
> + */
> +int virtcrypto_dev_start(struct virtio_crypto *vcrypto)
> +{
> + if (virtio_crypto_algs_register()) {
> + pr_err("virtio_crypto: Failed to register crypto algs\n");
> + return -EFAULT;
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * virtcrypto_dev_stop() - Stop virtio crypto device
> + * @vcrypto: Pointer to virtio crypto device.
> + *
> + * Function notifies all the registered services that the virtio crypto device
> + * is ready to be used.
> + * To be used by virtio crypto device specific drivers.
> + *
> + * Return: void
> + */
> +void virtcrypto_dev_stop(struct virtio_crypto *vcrypto)
> +{
> + virtio_crypto_algs_unregister();
> +}
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index cd2be1c..4bdb84c 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -460,6 +460,7 @@ header-y += virtio_rng.h
> header-y += virtio_scsi.h
> header-y += virtio_types.h
> header-y += virtio_vsock.h
> +header-y += virtio_crypto.h
> header-y += vm_sockets.h
> header-y += vt.h
> header-y += vtpm_proxy.h
> diff --git a/include/uapi/linux/virtio_crypto.h
> b/include/uapi/linux/virtio_crypto.h
> new file mode 100644
> index 0000000..50cdc8a
> --- /dev/null
> +++ b/include/uapi/linux/virtio_crypto.h
> @@ -0,0 +1,450 @@
> +#ifndef _VIRTIO_CRYPTO_H
> +#define _VIRTIO_CRYPTO_H
> +/* This header is BSD licensed so anyone can use the definitions to implement
> + * compatible drivers/servers.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + * 3. Neither the name of IBM nor the names of its contributors
> + * may be used to endorse or promote products derived from this
> software
> + * without specific prior written permission.
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS
> + * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL IBM
> OR
> + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF
> + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> OUT
> + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +#include <linux/types.h>
> +#include <linux/virtio_types.h>
> +#include <linux/virtio_ids.h>
> +#include <linux/virtio_config.h>
> +
> +
> +#define VIRTIO_CRYPTO_SERVICE_CIPHER 0
> +#define VIRTIO_CRYPTO_SERVICE_HASH 1
> +#define VIRTIO_CRYPTO_SERVICE_MAC 2
> +#define VIRTIO_CRYPTO_SERVICE_AEAD 3
> +
> +#define VIRTIO_CRYPTO_OPCODE(service, op) (((service) << 8) | (op))
> +
> +struct virtio_crypto_ctrl_header {
> +#define VIRTIO_CRYPTO_CIPHER_CREATE_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x02)
> +#define VIRTIO_CRYPTO_CIPHER_DESTROY_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x03)
> +#define VIRTIO_CRYPTO_HASH_CREATE_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x02)
> +#define VIRTIO_CRYPTO_HASH_DESTROY_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x03)
> +#define VIRTIO_CRYPTO_MAC_CREATE_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x02)
> +#define VIRTIO_CRYPTO_MAC_DESTROY_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x03)
> +#define VIRTIO_CRYPTO_AEAD_CREATE_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x02)
> +#define VIRTIO_CRYPTO_AEAD_DESTROY_SESSION \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x03)
> + __le32 opcode;
> + __le32 algo;
> + __le32 flag;
> + /* data virtqueue id */
> + __le32 queue_id;
> +};
> +
> +struct virtio_crypto_cipher_session_para {
> +#define VIRTIO_CRYPTO_NO_CIPHER 0
> +#define VIRTIO_CRYPTO_CIPHER_ARC4 1
> +#define VIRTIO_CRYPTO_CIPHER_AES_ECB 2
> +#define VIRTIO_CRYPTO_CIPHER_AES_CBC 3
> +#define VIRTIO_CRYPTO_CIPHER_AES_CTR 4
> +#define VIRTIO_CRYPTO_CIPHER_DES_ECB 5
> +#define VIRTIO_CRYPTO_CIPHER_DES_CBC 6
> +#define VIRTIO_CRYPTO_CIPHER_3DES_ECB 7
> +#define VIRTIO_CRYPTO_CIPHER_3DES_CBC 8
> +#define VIRTIO_CRYPTO_CIPHER_3DES_CTR 9
> +#define VIRTIO_CRYPTO_CIPHER_KASUMI_F8 10
> +#define VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2 11
> +#define VIRTIO_CRYPTO_CIPHER_AES_F8 12
> +#define VIRTIO_CRYPTO_CIPHER_AES_XTS 13
> +#define VIRTIO_CRYPTO_CIPHER_ZUC_EEA3 14
> + __le32 algo;
> + /* length of key */
> + __le32 keylen;
> +
> +#define VIRTIO_CRYPTO_OP_ENCRYPT 1
> +#define VIRTIO_CRYPTO_OP_DECRYPT 2
> + /* encrypt or decrypt */
> + __le32 op;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_session_input {
> + /* Device-writable part */
> + __le64 session_id;
> + __le32 status;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_cipher_session_req {
> + struct virtio_crypto_cipher_session_para para;
> + __u8 padding[32];
> +};
> +
> +struct virtio_crypto_hash_session_para {
> +#define VIRTIO_CRYPTO_NO_HASH 0
> +#define VIRTIO_CRYPTO_HASH_MD5 1
> +#define VIRTIO_CRYPTO_HASH_SHA1 2
> +#define VIRTIO_CRYPTO_HASH_SHA_224 3
> +#define VIRTIO_CRYPTO_HASH_SHA_256 4
> +#define VIRTIO_CRYPTO_HASH_SHA_384 5
> +#define VIRTIO_CRYPTO_HASH_SHA_512 6
> +#define VIRTIO_CRYPTO_HASH_SHA3_224 7
> +#define VIRTIO_CRYPTO_HASH_SHA3_256 8
> +#define VIRTIO_CRYPTO_HASH_SHA3_384 9
> +#define VIRTIO_CRYPTO_HASH_SHA3_512 10
> +#define VIRTIO_CRYPTO_HASH_SHA3_SHAKE128 11
> +#define VIRTIO_CRYPTO_HASH_SHA3_SHAKE256 12
> + __le32 algo;
> + /* hash result length */
> + __le32 hash_result_len;
> + __u8 padding[8];
> +};
> +
> +struct virtio_crypto_hash_create_session_req {
> + struct virtio_crypto_hash_session_para para;
> + __u8 padding[40];
> +};
> +
> +struct virtio_crypto_mac_session_para {
> +#define VIRTIO_CRYPTO_NO_MAC 0
> +#define VIRTIO_CRYPTO_MAC_HMAC_MD5 1
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA1 2
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_224 3
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_256 4
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_384 5
> +#define VIRTIO_CRYPTO_MAC_HMAC_SHA_512 6
> +#define VIRTIO_CRYPTO_MAC_CMAC_3DES 25
> +#define VIRTIO_CRYPTO_MAC_CMAC_AES 26
> +#define VIRTIO_CRYPTO_MAC_KASUMI_F9 27
> +#define VIRTIO_CRYPTO_MAC_SNOW3G_UIA2 28
> +#define VIRTIO_CRYPTO_MAC_GMAC_AES 41
> +#define VIRTIO_CRYPTO_MAC_GMAC_TWOFISH 42
> +#define VIRTIO_CRYPTO_MAC_CBCMAC_AES 49
> +#define VIRTIO_CRYPTO_MAC_CBCMAC_KASUMI_F9 50
> +#define VIRTIO_CRYPTO_MAC_XCBC_AES 53
> + __le32 algo;
> + /* hash result length */
> + __le32 hash_result_len;
> + /* length of authenticated key */
> + __le32 auth_key_len;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_mac_create_session_req {
> + struct virtio_crypto_mac_session_para para;
> + __u8 padding[40];
> +};
> +
> +struct virtio_crypto_aead_session_para {
> +#define VIRTIO_CRYPTO_NO_AEAD 0
> +#define VIRTIO_CRYPTO_AEAD_GCM 1
> +#define VIRTIO_CRYPTO_AEAD_CCM 2
> +#define VIRTIO_CRYPTO_AEAD_CHACHA20_POLY1305 3
> + __le32 algo;
> + /* length of key */
> + __le32 key_len;
> + /* hash result length */
> + __le32 hash_result_len;
> + /* length of the additional authenticated data (AAD) in bytes */
> + __le32 aad_len;
> + /* encrypt or decrypt, See above VIRTIO_CRYPTO_OP_* */
> + __le32 op;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_aead_create_session_req {
> + struct virtio_crypto_aead_session_para para;
> + __u8 padding[32];
> +};
> +
> +struct virtio_crypto_alg_chain_session_para {
> +#define VIRTIO_CRYPTO_SYM_ALG_CHAIN_ORDER_HASH_THEN_CIPHER 1
> +#define VIRTIO_CRYPTO_SYM_ALG_CHAIN_ORDER_CIPHER_THEN_HASH 2
> + __le32 alg_chain_order;
> +/* Plain hash */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_PLAIN 1
> +/* Authenticated hash (mac) */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_AUTH 2
> +/* Nested hash */
> +#define VIRTIO_CRYPTO_SYM_HASH_MODE_NESTED 3
> + __le32 hash_mode;
> + struct virtio_crypto_cipher_session_para cipher_param;
> + union {
> + struct virtio_crypto_hash_session_para hash_param;
> + struct virtio_crypto_mac_session_para mac_param;
> + __u8 padding[16];
> + } u;
> + /* length of the additional authenticated data (AAD) in bytes */
> + __le32 aad_len;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_alg_chain_session_req {
> + struct virtio_crypto_alg_chain_session_para para;
> +};
> +
> +struct virtio_crypto_sym_create_session_req {
> + union {
> + struct virtio_crypto_cipher_session_req cipher;
> + struct virtio_crypto_alg_chain_session_req chain;
> + __u8 padding[48];
> + } u;
> +
> + /* Device-readable part */
> +
> +/* No operation */
> +#define VIRTIO_CRYPTO_SYM_OP_NONE 0
> +/* Cipher only operation on the data */
> +#define VIRTIO_CRYPTO_SYM_OP_CIPHER 1
> +/*
> + * Chain any cipher with any hash or mac operation. The order
> + * depends on the value of alg_chain_order param
> + */
> +#define VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING 2
> + __le32 op_type;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_destroy_session_req {
> + /* Device-readable part */
> + __le64 session_id;
> + __u8 padding[48];
> +};
> +
> +/* The request of the control virtqueue's packet */
> +struct virtio_crypto_op_ctrl_req {
> + struct virtio_crypto_ctrl_header header;
> +
> + union {
> + struct virtio_crypto_sym_create_session_req
> + sym_create_session;
> + struct virtio_crypto_hash_create_session_req
> + hash_create_session;
> + struct virtio_crypto_mac_create_session_req
> + mac_create_session;
> + struct virtio_crypto_aead_create_session_req
> + aead_create_session;
> + struct virtio_crypto_destroy_session_req
> + destroy_session;
> + __u8 padding[56];
> + } u;
> +};
> +
> +struct virtio_crypto_op_header {
> +#define VIRTIO_CRYPTO_CIPHER_ENCRYPT \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x00)
> +#define VIRTIO_CRYPTO_CIPHER_DECRYPT \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_CIPHER, 0x01)
> +#define VIRTIO_CRYPTO_HASH \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_HASH, 0x00)
> +#define VIRTIO_CRYPTO_MAC \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_MAC, 0x00)
> +#define VIRTIO_CRYPTO_AEAD_ENCRYPT \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x00)
> +#define VIRTIO_CRYPTO_AEAD_DECRYPT \
> + VIRTIO_CRYPTO_OPCODE(VIRTIO_CRYPTO_SERVICE_AEAD, 0x01)
> + __le32 opcode;
> + /* algo should be service-specific algorithms */
> + __le32 algo;
> + /* session_id should be service-specific algorithms */
> + __le64 session_id;
> + /* control flag to control the request */
> + __le32 flag;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_cipher_para {
> + /*
> + * Byte Length of valid IV/Counter
> + *
> + * For block ciphers in CBC or F8 mode, or for Kasumi in F8 mode, or for
> + * SNOW3G in UEA2 mode, this is the length of the IV (which
> + * must be the same as the block length of the cipher).
> + * For block ciphers in CTR mode, this is the length of the counter
> + * (which must be the same as the block length of the cipher).
> + * For AES-XTS, this is the 128bit tweak, i, from IEEE Std 1619-2007.
> + *
> + * The IV/Counter will be updated after every partial cryptographic
> + * operation.
> + */
> + __le32 iv_len;
> + /* length of source data */
> + __le32 src_data_len;
> + /* length of dst data */
> + __le32 dst_data_len;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_hash_para {
> + /* length of source data */
> + __le32 src_data_len;
> + /* hash result length */
> + __le32 hash_result_len;
> +};
> +
> +struct virtio_crypto_mac_para {
> + struct virtio_crypto_hash_para hash;
> +};
> +
> +struct virtio_crypto_aead_para {
> + /*
> + * Byte Length of valid IV data pointed to by the below iv_addr
> + * parameter.
> + *
> + * For GCM mode, this is either 12 (for 96-bit IVs) or 16, in which
> + * case iv_addr points to J0.
> + * For CCM mode, this is the length of the nonce, which can be in the
> + * range 7 to 13 inclusive.
> + */
> + __le32 iv_len;
> + /* length of additional auth data */
> + __le32 aad_len;
> + /* length of source data */
> + __le32 src_data_len;
> + /* length of dst data */
> + __le32 dst_data_len;
> +};
> +
> +struct virtio_crypto_cipher_data_req {
> + /* Device-readable part */
> + struct virtio_crypto_cipher_para para;
> + __u8 padding[24];
> +};
> +
> +struct virtio_crypto_hash_data_req {
> + /* Device-readable part */
> + struct virtio_crypto_hash_para para;
> + __u8 padding[40];
> +};
> +
> +struct virtio_crypto_mac_data_req {
> + /* Device-readable part */
> + struct virtio_crypto_mac_para para;
> + __u8 padding[40];
> +};
> +
> +struct virtio_crypto_alg_chain_data_para {
> + __le32 iv_len;
> + /* Length of source data */
> + __le32 src_data_len;
> + /* Length of destination data */
> + __le32 dst_data_len;
> + /* Starting point for cipher processing in source data */
> + __le32 cipher_start_src_offset;
> + /* Length of the source data that the cipher will be computed on */
> + __le32 len_to_cipher;
> + /* Starting point for hash processing in source data */
> + __le32 hash_start_src_offset;
> + /* Length of the source data that the hash will be computed on */
> + __le32 len_to_hash;
> + /* Length of the additional auth data */
> + __le32 aad_len;
> + /* Length of the hash result */
> + __le32 hash_result_len;
> + __le32 reserved;
> +};
> +
> +struct virtio_crypto_alg_chain_data_req {
> + /* Device-readable part */
> + struct virtio_crypto_alg_chain_data_para para;
> +};
> +
> +struct virtio_crypto_sym_data_req {
> + union {
> + struct virtio_crypto_cipher_data_req cipher;
> + struct virtio_crypto_alg_chain_data_req chain;
> + __u8 padding[40];
> + } u;
> +
> + /* See above VIRTIO_CRYPTO_SYM_OP_* */
> + __le32 op_type;
> + __le32 padding;
> +};
> +
> +struct virtio_crypto_aead_data_req {
> + /* Device-readable part */
> + struct virtio_crypto_aead_para para;
> + __u8 padding[32];
> +};
> +
> +/* The request of the data virtqueue's packet */
> +struct virtio_crypto_op_data_req {
> + struct virtio_crypto_op_header header;
> +
> + union {
> + struct virtio_crypto_sym_data_req sym_req;
> + struct virtio_crypto_hash_data_req hash_req;
> + struct virtio_crypto_mac_data_req mac_req;
> + struct virtio_crypto_aead_data_req aead_req;
> + __u8 padding[48];
> + } u;
> +};
> +
> +#define VIRTIO_CRYPTO_OK 0
> +#define VIRTIO_CRYPTO_ERR 1
> +#define VIRTIO_CRYPTO_BADMSG 2
> +#define VIRTIO_CRYPTO_NOTSUPP 3
> +#define VIRTIO_CRYPTO_INVSESS 4 /* Invalid session id */
> +
> +/* The accelerator hardware is ready */
> +#define VIRTIO_CRYPTO_S_HW_READY (1 << 0)
> +
> +struct virtio_crypto_config {
> + /* See VIRTIO_CRYPTO_OP_* above */
> + __u32 status;
> +
> + /*
> + * Maximum number of data queue
> + */
> + __u32 max_dataqueues;
> +
> + /*
> + * Specifies the services mask which the device support,
> + * see VIRTIO_CRYPTO_SERVICE_* above
> + */
> + __u32 crypto_services;
> +
> + /* Detailed algorithms mask */
> + __u32 cipher_algo_l;
> + __u32 cipher_algo_h;
> + __u32 hash_algo;
> + __u32 mac_algo_l;
> + __u32 mac_algo_h;
> + __u32 aead_algo;
> + /* Maximum length of cipher key */
> + __u32 max_cipher_key_len;
> + /* Maximum length of authenticated key */
> + __u32 max_auth_key_len;
> + __u32 reserve;
> + /* Maximum size of each crypto request's content */
> + __u64 max_size;
> +};
> +
> +struct virtio_crypto_inhdr {
> + /* See VIRTIO_CRYPTO_* above */
> + __u8 status;
> +};
> +#endif
> diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h
> index 3228d58..6d5c3b2 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -42,5 +42,6 @@
> #define VIRTIO_ID_GPU 16 /* virtio GPU */
> #define VIRTIO_ID_INPUT 18 /* virtio input */
> #define VIRTIO_ID_VSOCK 19 /* virtio vsock transport */
> +#define VIRTIO_ID_CRYPTO 20 /* virtio crypto */
>
> #endif /* _LINUX_VIRTIO_IDS_H */
> --
> 1.8.3.1
>
^ permalink raw reply
* RE: [PATCH] crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
From: Alison Wang @ 2016-12-06 3:29 UTC (permalink / raw)
To: Horia Geantă, Herbert Xu
Cc: David S. Miller, linux-crypto@vger.kernel.org, Dan Douglass
In-Reply-To: <1480928818-8166-1-git-send-email-horia.geanta@nxp.com>
> -----Original Message-----
> From: Horia Geantă [mailto:horia.geanta@nxp.com]
> Sent: Monday, December 05, 2016 5:07 PM
> To: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: David S. Miller <davem@davemloft.net>; linux-crypto@vger.kernel.org;
> Dan Douglass <dan.douglass@nxp.com>; Alison Wang <alison.wang@nxp.com>
> Subject: [PATCH] crypto: caam - fix pointer size for AArch64 boot
> loader, AArch32 kernel
>
> Start with a clean slate before dealing with bit 16 (pointer size) of
> Master Configuration Register.
> This fixes the case of AArch64 boot loader + AArch32 kernel, when the
> boot loader might set MCFGR[PS] and kernel would fail to clear it.
>
> Cc: <stable@vger.kernel.org>
> Reported-by: Alison Wang <alison.wang@nxp.com>
> Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
> ---
> drivers/crypto/caam/ctrl.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
> index be62a7f482ac..0a6ca3919270 100644
> --- a/drivers/crypto/caam/ctrl.c
> +++ b/drivers/crypto/caam/ctrl.c
> @@ -556,8 +556,9 @@ static int caam_probe(struct platform_device *pdev)
> * Enable DECO watchdogs and, if this is a PHYS_ADDR_T_64BIT
> kernel,
> * long pointers in master configuration register
> */
> - clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK, MCFGR_AWCACHE_CACH
> |
> - MCFGR_AWCACHE_BUFF | MCFGR_WDENABLE |
> MCFGR_LARGE_BURST |
> + clrsetbits_32(&ctrl->mcr, MCFGR_AWCACHE_MASK | MCFGR_LONG_PTR,
> + MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
> + MCFGR_WDENABLE | MCFGR_LARGE_BURST |
> (sizeof(dma_addr_t) == sizeof(u64) ? MCFGR_LONG_PTR :
> 0));
>
> /*
>
Reviewed-By: Alison Wang <Alison.wang@nxp.com>
Best Regards,
Alison Wang
^ permalink raw reply
* [PATCH v2] crypto/mcryptd: Check mcryptd algorithm compatibility
From: Tim Chen @ 2016-12-05 19:46 UTC (permalink / raw)
To: Mikulas Patocka, Herbert Xu, David S. Miller
Cc: Tim Chen, megha.dey, linux-crypto, dm-devel, Milan Broz,
Eric Biggers, stable
Algorithms not compatible with mcryptd could be spawned by mcryptd
with a direct crypto_alloc_tfm invocation using a "mcryptd(alg)" name
construct. This causes mcryptd to crash the kernel if an arbitrary
"alg" is incompatible and not intended to be used with mcryptd. It is
an issue if AF_ALG tries to spawn mcryptd(alg) to expose it externally.
But such algorithms must be used internally and not be exposed.
We added a check to enforce that only internal algorithms are allowed
with mcryptd at the time mcryptd is spawning an algorithm.
Link: http://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2
Cc: stable@vger.kernel.org
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
crypto/mcryptd.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/crypto/mcryptd.c b/crypto/mcryptd.c
index 94ee44a..c207458 100644
--- a/crypto/mcryptd.c
+++ b/crypto/mcryptd.c
@@ -254,18 +254,22 @@ static void *mcryptd_alloc_instance(struct crypto_alg *alg, unsigned int head,
goto out;
}
-static inline void mcryptd_check_internal(struct rtattr **tb, u32 *type,
+static inline bool mcryptd_check_internal(struct rtattr **tb, u32 *type,
u32 *mask)
{
struct crypto_attr_type *algt;
algt = crypto_get_attr_type(tb);
if (IS_ERR(algt))
- return;
- if ((algt->type & CRYPTO_ALG_INTERNAL))
- *type |= CRYPTO_ALG_INTERNAL;
- if ((algt->mask & CRYPTO_ALG_INTERNAL))
- *mask |= CRYPTO_ALG_INTERNAL;
+ return false;
+
+ *type |= algt->type & CRYPTO_ALG_INTERNAL;
+ *mask |= algt->mask & CRYPTO_ALG_INTERNAL;
+
+ if (*type & *mask & CRYPTO_ALG_INTERNAL)
+ return true;
+ else
+ return false;
}
static int mcryptd_hash_init_tfm(struct crypto_tfm *tfm)
@@ -492,7 +496,8 @@ static int mcryptd_create_hash(struct crypto_template *tmpl, struct rtattr **tb,
u32 mask = 0;
int err;
- mcryptd_check_internal(tb, &type, &mask);
+ if (!mcryptd_check_internal(tb, &type, &mask))
+ return -EINVAL;
halg = ahash_attr_alg(tb[1], type, mask);
if (IS_ERR(halg))
--
2.5.5
^ permalink raw reply related
* Re: [PATCH] crypto/mcryptd: Check mcryptd algorithm compatability
From: Tim Chen @ 2016-12-05 19:03 UTC (permalink / raw)
To: Herbert Xu
Cc: Mikulas Patocka, David S. Miller, megha.dey, linux-crypto,
dm-devel, Milan Broz, Eric Biggers, stable
In-Reply-To: <1480956650.3064.65.camel@linux.intel.com>
On Mon, 2016-12-05 at 08:50 -0800, Tim Chen wrote:
> On Mon, 2016-12-05 at 20:34 +0800, Herbert Xu wrote:
> >
> > On Fri, Dec 02, 2016 at 04:15:21PM -0800, Tim Chen wrote:
> > >
> > >
> > > Algorithms not compatible with mcryptd could be spawned by mcryptd
> > > with a direct crypto_alloc_tfm invocation using a "mcryptd(alg)"
> > > name construct. This causes mcryptd to crash the kernel if
> > > "alg" is incompatible and not intended to be used with mcryptd.
> > >
> > > A flag CRYPTO_ALG_MCRYPT is being added to mcryptd compatible
> > > algorithms' cra_flags. The compatability is checked when mcryptd spawn
> > > off an algorithm.
> > >
> > > Link: http://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2
> > > Cc: stable@vger.kernel.org
> > > Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Tested-by: Megha Dey <megha.dey@linux.intel.com>
> > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > Tim, I think we should instead make mcryptd refuse to generate
> > a non-internal algorithm. This way the user would not be able
> > to access it at all since they can only request for non-internal
> > algorithms.
> >
> > Basically you want to check at the start of mcryptd_create_hash
> > that the INTERNAL bit is set on both the type and mask as returned
> > by crypto_get_attr_type.
> I have thought about that. However, not all internal algorithms
> are compatible with mcryptd. We can still have trouble if some
> random internal algorithm is paired with mcryptd.
You're right. The mcryptd(alg) should be an internal algorithm itself.
So checking on the INTERNAL flag should be sufficient.
I'll generate another patch that uses the INTERNAL flag for checking.
Tim
^ permalink raw reply
* [PATCH v3 6/6] crypto: arm/crc32 - accelerated support based on x86 SSE implementation
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
This is a combination of the the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.
The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/crypto/Kconfig | 5 +
arch/arm/crypto/Makefile | 2 +
arch/arm/crypto/crc32-ce-core.S | 306 ++++++++++++++++++++
arch/arm/crypto/crc32-ce-glue.c | 242 ++++++++++++++++
4 files changed, 555 insertions(+)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index fce801fa52a1..de7bb20815bf 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -125,4 +125,9 @@ config CRYPTO_CRCT10DIF_ARM_CE
depends on KERNEL_MODE_NEON && CRC_T10DIF
select CRYPTO_HASH
+config CRYPTO_CRC32_ARM_CE
+ tristate "CRC32(C) digest algorithm using CRC and/or PMULL instructions"
+ depends on KERNEL_MODE_NEON && CRC32
+ select CRYPTO_HASH
+
endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index fc77265014b7..b578a1820ab1 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -14,6 +14,7 @@ ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM_CE) += crct10dif-arm-ce.o
+ce-obj-$(CONFIG_CRYPTO_CRC32_ARM_CE) += crc32-arm-ce.o
ifneq ($(ce-obj-y)$(ce-obj-m),)
ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
@@ -38,6 +39,7 @@ sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
+crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/crc32-ce-core.S b/arch/arm/crypto/crc32-ce-core.S
new file mode 100644
index 000000000000..e63d400dc5c1
--- /dev/null
+++ b/arch/arm/crypto/crc32-ce-core.S
@@ -0,0 +1,306 @@
+/*
+ * Accelerated CRC32(C) using ARM CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Using hardware provided PCLMULQDQ instruction to accelerate the CRC32
+ * calculation.
+ * CRC32 polynomial:0x04c11db7(BE)/0xEDB88320(LE)
+ * PCLMULQDQ is a new instruction in Intel SSE4.2, the reference can be found
+ * at:
+ * http://www.intel.com/products/processor/manuals/
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+ * Volume 2B: Instruction Set Reference, N-Z
+ *
+ * Authors: Gregory Prestas <Gregory_Prestas@us.xyratex.com>
+ * Alexander Boyko <Alexander_Boyko@xyratex.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ .text
+ .align 6
+ .arch armv8-a
+ .arch_extension crc
+ .fpu crypto-neon-fp-armv8
+
+.Lcrc32_constants:
+ /*
+ * [x4*128+32 mod P(x) << 32)]' << 1 = 0x154442bd4
+ * #define CONSTANT_R1 0x154442bd4LL
+ *
+ * [(x4*128-32 mod P(x) << 32)]' << 1 = 0x1c6e41596
+ * #define CONSTANT_R2 0x1c6e41596LL
+ */
+ .quad 0x0000000154442bd4
+ .quad 0x00000001c6e41596
+
+ /*
+ * [(x128+32 mod P(x) << 32)]' << 1 = 0x1751997d0
+ * #define CONSTANT_R3 0x1751997d0LL
+ *
+ * [(x128-32 mod P(x) << 32)]' << 1 = 0x0ccaa009e
+ * #define CONSTANT_R4 0x0ccaa009eLL
+ */
+ .quad 0x00000001751997d0
+ .quad 0x00000000ccaa009e
+
+ /*
+ * [(x64 mod P(x) << 32)]' << 1 = 0x163cd6124
+ * #define CONSTANT_R5 0x163cd6124LL
+ */
+ .quad 0x0000000163cd6124
+ .quad 0x00000000FFFFFFFF
+
+ /*
+ * #define CRCPOLY_TRUE_LE_FULL 0x1DB710641LL
+ *
+ * Barrett Reduction constant (u64`) = u` = (x**64 / P(x))`
+ * = 0x1F7011641LL
+ * #define CONSTANT_RU 0x1F7011641LL
+ */
+ .quad 0x00000001DB710641
+ .quad 0x00000001F7011641
+
+.Lcrc32c_constants:
+ .quad 0x00000000740eef02
+ .quad 0x000000009e4addf8
+ .quad 0x00000000f20c0dfe
+ .quad 0x000000014cd00bd6
+ .quad 0x00000000dd45aab8
+ .quad 0x00000000FFFFFFFF
+ .quad 0x0000000105ec76f0
+ .quad 0x00000000dea713f1
+
+ dCONSTANTl .req d0
+ dCONSTANTh .req d1
+ qCONSTANT .req q0
+
+ BUF .req r0
+ LEN .req r1
+ CRC .req r2
+
+ qzr .req q9
+
+ /**
+ * Calculate crc32
+ * BUF - buffer
+ * LEN - sizeof buffer (multiple of 16 bytes), LEN should be > 63
+ * CRC - initial crc32
+ * return %eax crc32
+ * uint crc32_pmull_le(unsigned char const *buffer,
+ * size_t len, uint crc32)
+ */
+ENTRY(crc32_pmull_le)
+ adr r3, .Lcrc32_constants
+ b 0f
+
+ENTRY(crc32c_pmull_le)
+ adr r3, .Lcrc32c_constants
+
+0: bic LEN, LEN, #15
+ vld1.8 {q1-q2}, [BUF, :128]!
+ vld1.8 {q3-q4}, [BUF, :128]!
+ vmov.i8 qzr, #0
+ vmov.i8 qCONSTANT, #0
+ vmov dCONSTANTl[0], CRC
+ veor.8 d2, d2, dCONSTANTl
+ sub LEN, LEN, #0x40
+ cmp LEN, #0x40
+ blt less_64
+
+ vld1.64 {qCONSTANT}, [r3]
+
+loop_64: /* 64 bytes Full cache line folding */
+ sub LEN, LEN, #0x40
+
+ vmull.p64 q5, d3, dCONSTANTh
+ vmull.p64 q6, d5, dCONSTANTh
+ vmull.p64 q7, d7, dCONSTANTh
+ vmull.p64 q8, d9, dCONSTANTh
+
+ vmull.p64 q1, d2, dCONSTANTl
+ vmull.p64 q2, d4, dCONSTANTl
+ vmull.p64 q3, d6, dCONSTANTl
+ vmull.p64 q4, d8, dCONSTANTl
+
+ veor.8 q1, q1, q5
+ vld1.8 {q5}, [BUF, :128]!
+ veor.8 q2, q2, q6
+ vld1.8 {q6}, [BUF, :128]!
+ veor.8 q3, q3, q7
+ vld1.8 {q7}, [BUF, :128]!
+ veor.8 q4, q4, q8
+ vld1.8 {q8}, [BUF, :128]!
+
+ veor.8 q1, q1, q5
+ veor.8 q2, q2, q6
+ veor.8 q3, q3, q7
+ veor.8 q4, q4, q8
+
+ cmp LEN, #0x40
+ bge loop_64
+
+less_64: /* Folding cache line into 128bit */
+ vldr dCONSTANTl, [r3, #16]
+ vldr dCONSTANTh, [r3, #24]
+
+ vmull.p64 q5, d3, dCONSTANTh
+ vmull.p64 q1, d2, dCONSTANTl
+ veor.8 q1, q1, q5
+ veor.8 q1, q1, q2
+
+ vmull.p64 q5, d3, dCONSTANTh
+ vmull.p64 q1, d2, dCONSTANTl
+ veor.8 q1, q1, q5
+ veor.8 q1, q1, q3
+
+ vmull.p64 q5, d3, dCONSTANTh
+ vmull.p64 q1, d2, dCONSTANTl
+ veor.8 q1, q1, q5
+ veor.8 q1, q1, q4
+
+ teq LEN, #0
+ beq fold_64
+
+loop_16: /* Folding rest buffer into 128bit */
+ subs LEN, LEN, #0x10
+
+ vld1.8 {q2}, [BUF, :128]!
+ vmull.p64 q5, d3, dCONSTANTh
+ vmull.p64 q1, d2, dCONSTANTl
+ veor.8 q1, q1, q5
+ veor.8 q1, q1, q2
+
+ bne loop_16
+
+fold_64:
+ /* perform the last 64 bit fold, also adds 32 zeroes
+ * to the input stream */
+ vmull.p64 q2, d2, dCONSTANTh
+ vext.8 q1, q1, qzr, #8
+ veor.8 q1, q1, q2
+
+ /* final 32-bit fold */
+ vldr dCONSTANTl, [r3, #32]
+ vldr d6, [r3, #40]
+ vmov.i8 d7, #0
+
+ vext.8 q2, q1, qzr, #4
+ vand.8 d2, d2, d6
+ vmull.p64 q1, d2, dCONSTANTl
+ veor.8 q1, q1, q2
+
+ /* Finish up with the bit-reversed barrett reduction 64 ==> 32 bits */
+ vldr dCONSTANTl, [r3, #48]
+ vldr dCONSTANTh, [r3, #56]
+
+ vand.8 q2, q1, q3
+ vext.8 q2, qzr, q2, #8
+ vmull.p64 q2, d5, dCONSTANTh
+ vand.8 q2, q2, q3
+ vmull.p64 q2, d4, dCONSTANTl
+ veor.8 q1, q1, q2
+ vmov r0, s5
+
+ bx lr
+ENDPROC(crc32_pmull_le)
+ENDPROC(crc32c_pmull_le)
+
+ .macro __crc32, c
+ subs ip, r2, #8
+ bmi .Ltail\c
+
+ tst r1, #3
+ bne .Lunaligned\c
+
+ teq ip, #0
+.Laligned8\c:
+ ldrd r2, r3, [r1], #8
+ARM_BE8(rev r2, r2 )
+ARM_BE8(rev r3, r3 )
+ crc32\c\()w r0, r0, r2
+ crc32\c\()w r0, r0, r3
+ bxeq lr
+ subs ip, ip, #8
+ bpl .Laligned8\c
+
+.Ltail\c:
+ tst ip, #4
+ beq 2f
+ ldr r3, [r1], #4
+ARM_BE8(rev r3, r3 )
+ crc32\c\()w r0, r0, r3
+
+2: tst ip, #2
+ beq 1f
+ ldrh r3, [r1], #2
+ARM_BE8(rev16 r3, r3 )
+ crc32\c\()h r0, r0, r3
+
+1: tst ip, #1
+ bxeq lr
+ ldrb r3, [r1]
+ crc32\c\()b r0, r0, r3
+ bx lr
+
+.Lunaligned\c:
+ tst r1, #1
+ beq 2f
+ ldrb r3, [r1], #1
+ subs r2, r2, #1
+ crc32\c\()b r0, r0, r3
+
+ tst r1, #2
+ beq 0f
+2: ldrh r3, [r1], #2
+ subs r2, r2, #2
+ARM_BE8(rev16 r3, r3 )
+ crc32\c\()h r0, r0, r3
+
+0: subs ip, r2, #8
+ bpl .Laligned8\c
+ b .Ltail\c
+ .endm
+
+ .align 5
+ENTRY(crc32_armv8_le)
+ __crc32
+ENDPROC(crc32_armv8_le)
+
+ .align 5
+ENTRY(crc32c_armv8_le)
+ __crc32 c
+ENDPROC(crc32c_armv8_le)
diff --git a/arch/arm/crypto/crc32-ce-glue.c b/arch/arm/crypto/crc32-ce-glue.c
new file mode 100644
index 000000000000..e1566bec1016
--- /dev/null
+++ b/arch/arm/crypto/crc32-ce-glue.c
@@ -0,0 +1,242 @@
+/*
+ * Accelerated CRC32(C) using ARM CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <asm/unaligned.h>
+
+#define PMULL_MIN_LEN 64L /* minimum size of buffer
+ * for crc32_pmull_le_16 */
+#define SCALE_F 16L /* size of NEON register */
+
+asmlinkage u32 crc32_pmull_le(const u8 buf[], u32 len, u32 init_crc);
+asmlinkage u32 crc32_armv8_le(u32 init_crc, const u8 buf[], u32 len);
+
+asmlinkage u32 crc32c_pmull_le(const u8 buf[], u32 len, u32 init_crc);
+asmlinkage u32 crc32c_armv8_le(u32 init_crc, const u8 buf[], u32 len);
+
+static u32 (*fallback_crc32)(u32 init_crc, const u8 buf[], u32 len);
+static u32 (*fallback_crc32c)(u32 init_crc, const u8 buf[], u32 len);
+
+static int crc32_cra_init(struct crypto_tfm *tfm)
+{
+ u32 *key = crypto_tfm_ctx(tfm);
+
+ *key = 0;
+ return 0;
+}
+
+static int crc32c_cra_init(struct crypto_tfm *tfm)
+{
+ u32 *key = crypto_tfm_ctx(tfm);
+
+ *key = ~0;
+ return 0;
+}
+
+static int crc32_setkey(struct crypto_shash *hash, const u8 *key,
+ unsigned int keylen)
+{
+ u32 *mctx = crypto_shash_ctx(hash);
+
+ if (keylen != sizeof(u32)) {
+ crypto_shash_set_flags(hash, CRYPTO_TFM_RES_BAD_KEY_LEN);
+ return -EINVAL;
+ }
+ *mctx = le32_to_cpup((__le32 *)key);
+ return 0;
+}
+
+static int crc32_init(struct shash_desc *desc)
+{
+ u32 *mctx = crypto_shash_ctx(desc->tfm);
+ u32 *crc = shash_desc_ctx(desc);
+
+ *crc = *mctx;
+ return 0;
+}
+
+static int crc32_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ *crc = crc32_armv8_le(*crc, data, length);
+ return 0;
+}
+
+static int crc32c_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ *crc = crc32c_armv8_le(*crc, data, length);
+ return 0;
+}
+
+static int crc32_final(struct shash_desc *desc, u8 *out)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ put_unaligned_le32(*crc, out);
+ return 0;
+}
+
+static int crc32c_final(struct shash_desc *desc, u8 *out)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ put_unaligned_le32(~*crc, out);
+ return 0;
+}
+
+static int crc32_pmull_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if (may_use_simd()) {
+ if ((u32)data % SCALE_F) {
+ l = min_t(u32, length, SCALE_F - ((u32)data % SCALE_F));
+
+ *crc = fallback_crc32(*crc, data, l);
+
+ data += l;
+ length -= l;
+ }
+
+ if (length >= PMULL_MIN_LEN) {
+ l = round_down(length, SCALE_F);
+
+ kernel_neon_begin();
+ *crc = crc32_pmull_le(data, l, *crc);
+ kernel_neon_end();
+
+ data += l;
+ length -= l;
+ }
+ }
+
+ if (length > 0)
+ *crc = fallback_crc32(*crc, data, length);
+
+ return 0;
+}
+
+static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if (may_use_simd()) {
+ if ((u32)data % SCALE_F) {
+ l = min_t(u32, length, SCALE_F - ((u32)data % SCALE_F));
+
+ *crc = fallback_crc32c(*crc, data, l);
+
+ data += l;
+ length -= l;
+ }
+
+ if (length >= PMULL_MIN_LEN) {
+ l = round_down(length, SCALE_F);
+
+ kernel_neon_begin();
+ *crc = crc32c_pmull_le(data, l, *crc);
+ kernel_neon_end();
+
+ data += l;
+ length -= l;
+ }
+ }
+
+ if (length > 0)
+ *crc = fallback_crc32c(*crc, data, length);
+
+ return 0;
+}
+
+static struct shash_alg crc32_pmull_algs[] = { {
+ .setkey = crc32_setkey,
+ .init = crc32_init,
+ .update = crc32_update,
+ .final = crc32_final,
+ .descsize = sizeof(u32),
+ .digestsize = sizeof(u32),
+
+ .base.cra_ctxsize = sizeof(u32),
+ .base.cra_init = crc32_cra_init,
+ .base.cra_name = "crc32",
+ .base.cra_driver_name = "crc32-arm-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = 1,
+ .base.cra_module = THIS_MODULE,
+}, {
+ .setkey = crc32_setkey,
+ .init = crc32_init,
+ .update = crc32c_update,
+ .final = crc32c_final,
+ .descsize = sizeof(u32),
+ .digestsize = sizeof(u32),
+
+ .base.cra_ctxsize = sizeof(u32),
+ .base.cra_init = crc32c_cra_init,
+ .base.cra_name = "crc32c",
+ .base.cra_driver_name = "crc32c-arm-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = 1,
+ .base.cra_module = THIS_MODULE,
+} };
+
+static int __init crc32_pmull_mod_init(void)
+{
+ if (elf_hwcap2 & HWCAP2_PMULL) {
+ crc32_pmull_algs[0].update = crc32_pmull_update;
+ crc32_pmull_algs[1].update = crc32c_pmull_update;
+
+ if (elf_hwcap2 & HWCAP2_CRC32) {
+ fallback_crc32 = crc32_armv8_le;
+ fallback_crc32c = crc32c_armv8_le;
+ } else {
+ fallback_crc32 = crc32_le;
+ fallback_crc32c = __crc32c_le;
+ }
+ } else if (!(elf_hwcap2 & HWCAP2_CRC32)) {
+ return -ENODEV;
+ }
+
+ return crypto_register_shashes(crc32_pmull_algs,
+ ARRAY_SIZE(crc32_pmull_algs));
+}
+
+static void __exit crc32_pmull_mod_exit(void)
+{
+ crypto_unregister_shashes(crc32_pmull_algs,
+ ARRAY_SIZE(crc32_pmull_algs));
+}
+
+module_init(crc32_pmull_mod_init);
+module_exit(crc32_pmull_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("crc32");
+MODULE_ALIAS_CRYPTO("crc32c");
--
2.7.4
^ permalink raw reply related
* [PATCH v3 5/6] crypto: arm64/crc32 - accelerated support based on x86 SSE implementation
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
This is a combination of the the Intel algorithm implemented using SSE
and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and
the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in
version 8 of the architecture. Two versions of the above combo are
provided, one for CRC32 and one for CRC32C.
The PMULL/NEON algorithm is faster, but operates on blocks of at least
64 bytes, and on multiples of 16 bytes only. For the remaining input,
or for all input on systems that lack the PMULL 64x64->128 instructions,
the CRC32 instructions will be used.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/crypto/Kconfig | 6 +
arch/arm64/crypto/Makefile | 3 +
arch/arm64/crypto/crc32-ce-core.S | 266 ++++++++++++++++++++
arch/arm64/crypto/crc32-ce-glue.c | 212 ++++++++++++++++
4 files changed, 487 insertions(+)
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d773c0659202..21835deb1ab9 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -28,6 +28,11 @@ config CRYPTO_CRCT10DIF_ARM64_CE
depends on KERNEL_MODE_NEON && CRC_T10DIF
select CRYPTO_HASH
+config CRYPTO_CRC32_ARM64_CE
+ tristate "CRC32 and CRC32C digest algorithms using PMULL instructions"
+ depends on KERNEL_MODE_NEON && CRC32
+ select CRYPTO_HASH
+
config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
@@ -58,4 +63,5 @@ config CRYPTO_CRC32_ARM64
tristate "CRC32 and CRC32C using optional ARMv8 instructions"
depends on ARM64
select CRYPTO_HASH
+
endif
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index 36fd3eb4201b..144387805a46 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -20,6 +20,9 @@ ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o
crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
+obj-$(CONFIG_CRYPTO_CRC32_ARM64_CE) += crc32-ce.o
+crc32-ce-y:= crc32-ce-core.o crc32-ce-glue.o
+
obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto
diff --git a/arch/arm64/crypto/crc32-ce-core.S b/arch/arm64/crypto/crc32-ce-core.S
new file mode 100644
index 000000000000..18f5a8442276
--- /dev/null
+++ b/arch/arm64/crypto/crc32-ce-core.S
@@ -0,0 +1,266 @@
+/*
+ * Accelerated CRC32(C) using arm64 CRC, NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+/* GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License version 2 for more details (a copy is included
+ * in the LICENSE file that accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see http://www.gnu.org/licenses
+ *
+ * Please visit http://www.xyratex.com/contact if you need additional
+ * information or have any questions.
+ *
+ * GPL HEADER END
+ */
+
+/*
+ * Copyright 2012 Xyratex Technology Limited
+ *
+ * Using hardware provided PCLMULQDQ instruction to accelerate the CRC32
+ * calculation.
+ * CRC32 polynomial:0x04c11db7(BE)/0xEDB88320(LE)
+ * PCLMULQDQ is a new instruction in Intel SSE4.2, the reference can be found
+ * at:
+ * http://www.intel.com/products/processor/manuals/
+ * Intel(R) 64 and IA-32 Architectures Software Developer's Manual
+ * Volume 2B: Instruction Set Reference, N-Z
+ *
+ * Authors: Gregory Prestas <Gregory_Prestas@us.xyratex.com>
+ * Alexander Boyko <Alexander_Boyko@xyratex.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ .text
+ .align 6
+ .cpu generic+crypto+crc
+
+.Lcrc32_constants:
+ /*
+ * [x4*128+32 mod P(x) << 32)]' << 1 = 0x154442bd4
+ * #define CONSTANT_R1 0x154442bd4LL
+ *
+ * [(x4*128-32 mod P(x) << 32)]' << 1 = 0x1c6e41596
+ * #define CONSTANT_R2 0x1c6e41596LL
+ */
+ .octa 0x00000001c6e415960000000154442bd4
+
+ /*
+ * [(x128+32 mod P(x) << 32)]' << 1 = 0x1751997d0
+ * #define CONSTANT_R3 0x1751997d0LL
+ *
+ * [(x128-32 mod P(x) << 32)]' << 1 = 0x0ccaa009e
+ * #define CONSTANT_R4 0x0ccaa009eLL
+ */
+ .octa 0x00000000ccaa009e00000001751997d0
+
+ /*
+ * [(x64 mod P(x) << 32)]' << 1 = 0x163cd6124
+ * #define CONSTANT_R5 0x163cd6124LL
+ */
+ .quad 0x0000000163cd6124
+ .quad 0x00000000FFFFFFFF
+
+ /*
+ * #define CRCPOLY_TRUE_LE_FULL 0x1DB710641LL
+ *
+ * Barrett Reduction constant (u64`) = u` = (x**64 / P(x))`
+ * = 0x1F7011641LL
+ * #define CONSTANT_RU 0x1F7011641LL
+ */
+ .octa 0x00000001F701164100000001DB710641
+
+.Lcrc32c_constants:
+ .octa 0x000000009e4addf800000000740eef02
+ .octa 0x000000014cd00bd600000000f20c0dfe
+ .quad 0x00000000dd45aab8
+ .quad 0x00000000FFFFFFFF
+ .octa 0x00000000dea713f10000000105ec76f0
+
+ vCONSTANT .req v0
+ dCONSTANT .req d0
+ qCONSTANT .req q0
+
+ BUF .req x0
+ LEN .req x1
+ CRC .req x2
+
+ vzr .req v9
+
+ /**
+ * Calculate crc32
+ * BUF - buffer
+ * LEN - sizeof buffer (multiple of 16 bytes), LEN should be > 63
+ * CRC - initial crc32
+ * return %eax crc32
+ * uint crc32_pmull_le(unsigned char const *buffer,
+ * size_t len, uint crc32)
+ */
+ENTRY(crc32_pmull_le)
+ adr x3, .Lcrc32_constants
+ b 0f
+
+ENTRY(crc32c_pmull_le)
+ adr x3, .Lcrc32c_constants
+
+0: bic LEN, LEN, #15
+ ld1 {v1.16b-v4.16b}, [BUF], #0x40
+ movi vzr.16b, #0
+ fmov dCONSTANT, CRC
+ eor v1.16b, v1.16b, vCONSTANT.16b
+ sub LEN, LEN, #0x40
+ cmp LEN, #0x40
+ b.lt less_64
+
+ ldr qCONSTANT, [x3]
+
+loop_64: /* 64 bytes Full cache line folding */
+ sub LEN, LEN, #0x40
+
+ pmull2 v5.1q, v1.2d, vCONSTANT.2d
+ pmull2 v6.1q, v2.2d, vCONSTANT.2d
+ pmull2 v7.1q, v3.2d, vCONSTANT.2d
+ pmull2 v8.1q, v4.2d, vCONSTANT.2d
+
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ pmull v2.1q, v2.1d, vCONSTANT.1d
+ pmull v3.1q, v3.1d, vCONSTANT.1d
+ pmull v4.1q, v4.1d, vCONSTANT.1d
+
+ eor v1.16b, v1.16b, v5.16b
+ ld1 {v5.16b}, [BUF], #0x10
+ eor v2.16b, v2.16b, v6.16b
+ ld1 {v6.16b}, [BUF], #0x10
+ eor v3.16b, v3.16b, v7.16b
+ ld1 {v7.16b}, [BUF], #0x10
+ eor v4.16b, v4.16b, v8.16b
+ ld1 {v8.16b}, [BUF], #0x10
+
+ eor v1.16b, v1.16b, v5.16b
+ eor v2.16b, v2.16b, v6.16b
+ eor v3.16b, v3.16b, v7.16b
+ eor v4.16b, v4.16b, v8.16b
+
+ cmp LEN, #0x40
+ b.ge loop_64
+
+less_64: /* Folding cache line into 128bit */
+ ldr qCONSTANT, [x3, #16]
+
+ pmull2 v5.1q, v1.2d, vCONSTANT.2d
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v5.16b
+ eor v1.16b, v1.16b, v2.16b
+
+ pmull2 v5.1q, v1.2d, vCONSTANT.2d
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v5.16b
+ eor v1.16b, v1.16b, v3.16b
+
+ pmull2 v5.1q, v1.2d, vCONSTANT.2d
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v5.16b
+ eor v1.16b, v1.16b, v4.16b
+
+ cbz LEN, fold_64
+
+loop_16: /* Folding rest buffer into 128bit */
+ subs LEN, LEN, #0x10
+
+ ld1 {v2.16b}, [BUF], #0x10
+ pmull2 v5.1q, v1.2d, vCONSTANT.2d
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v5.16b
+ eor v1.16b, v1.16b, v2.16b
+
+ b.ne loop_16
+
+fold_64:
+ /* perform the last 64 bit fold, also adds 32 zeroes
+ * to the input stream */
+ ext v2.16b, v1.16b, v1.16b, #8
+ pmull2 v2.1q, v2.2d, vCONSTANT.2d
+ ext v1.16b, v1.16b, vzr.16b, #8
+ eor v1.16b, v1.16b, v2.16b
+
+ /* final 32-bit fold */
+ ldr dCONSTANT, [x3, #32]
+ ldr d3, [x3, #40]
+
+ ext v2.16b, v1.16b, vzr.16b, #4
+ and v1.16b, v1.16b, v3.16b
+ pmull v1.1q, v1.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v2.16b
+
+ /* Finish up with the bit-reversed barrett reduction 64 ==> 32 bits */
+ ldr qCONSTANT, [x3, #48]
+
+ and v2.16b, v1.16b, v3.16b
+ ext v2.16b, vzr.16b, v2.16b, #8
+ pmull2 v2.1q, v2.2d, vCONSTANT.2d
+ and v2.16b, v2.16b, v3.16b
+ pmull v2.1q, v2.1d, vCONSTANT.1d
+ eor v1.16b, v1.16b, v2.16b
+ mov w0, v1.s[1]
+
+ ret
+ENDPROC(crc32_pmull_le)
+ENDPROC(crc32c_pmull_le)
+
+ .macro __crc32, c
+0: subs x2, x2, #16
+ b.mi 8f
+ ldp x3, x4, [x1], #16
+CPU_BE( rev x3, x3 )
+CPU_BE( rev x4, x4 )
+ crc32\c\()x w0, w0, x3
+ crc32\c\()x w0, w0, x4
+ b.ne 0b
+ ret
+
+8: tbz x2, #3, 4f
+ ldr x3, [x1], #8
+CPU_BE( rev x3, x3 )
+ crc32\c\()x w0, w0, x3
+4: tbz x2, #2, 2f
+ ldr w3, [x1], #4
+CPU_BE( rev w3, w3 )
+ crc32\c\()w w0, w0, w3
+2: tbz x2, #1, 1f
+ ldrh w3, [x1], #2
+CPU_BE( rev16 w3, w3 )
+ crc32\c\()h w0, w0, w3
+1: tbz x2, #0, 0f
+ ldrb w3, [x1]
+ crc32\c\()b w0, w0, w3
+0: ret
+ .endm
+
+ .align 5
+ENTRY(crc32_armv8_le)
+ __crc32
+ENDPROC(crc32_armv8_le)
+
+ .align 5
+ENTRY(crc32c_armv8_le)
+ __crc32 c
+ENDPROC(crc32c_armv8_le)
diff --git a/arch/arm64/crypto/crc32-ce-glue.c b/arch/arm64/crypto/crc32-ce-glue.c
new file mode 100644
index 000000000000..8594127d5e01
--- /dev/null
+++ b/arch/arm64/crypto/crc32-ce-glue.c
@@ -0,0 +1,212 @@
+/*
+ * Accelerated CRC32(C) using arm64 NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/crc32.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/unaligned.h>
+
+#define PMULL_MIN_LEN 64L /* minimum size of buffer
+ * for crc32_pmull_le_16 */
+#define SCALE_F 16L /* size of NEON register */
+
+asmlinkage u32 crc32_pmull_le(const u8 buf[], u64 len, u32 init_crc);
+asmlinkage u32 crc32_armv8_le(u32 init_crc, const u8 buf[], size_t len);
+
+asmlinkage u32 crc32c_pmull_le(const u8 buf[], u64 len, u32 init_crc);
+asmlinkage u32 crc32c_armv8_le(u32 init_crc, const u8 buf[], size_t len);
+
+static u32 (*fallback_crc32)(u32 init_crc, const u8 buf[], size_t len);
+static u32 (*fallback_crc32c)(u32 init_crc, const u8 buf[], size_t len);
+
+static int crc32_pmull_cra_init(struct crypto_tfm *tfm)
+{
+ u32 *key = crypto_tfm_ctx(tfm);
+
+ *key = 0;
+ return 0;
+}
+
+static int crc32c_pmull_cra_init(struct crypto_tfm *tfm)
+{
+ u32 *key = crypto_tfm_ctx(tfm);
+
+ *key = ~0;
+ return 0;
+}
+
+static int crc32_pmull_setkey(struct crypto_shash *hash, const u8 *key,
+ unsigned int keylen)
+{
+ u32 *mctx = crypto_shash_ctx(hash);
+
+ if (keylen != sizeof(u32)) {
+ crypto_shash_set_flags(hash, CRYPTO_TFM_RES_BAD_KEY_LEN);
+ return -EINVAL;
+ }
+ *mctx = le32_to_cpup((__le32 *)key);
+ return 0;
+}
+
+static int crc32_pmull_init(struct shash_desc *desc)
+{
+ u32 *mctx = crypto_shash_ctx(desc->tfm);
+ u32 *crc = shash_desc_ctx(desc);
+
+ *crc = *mctx;
+ return 0;
+}
+
+static int crc32_pmull_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if ((u64)data % SCALE_F) {
+ l = min_t(u32, length, SCALE_F - ((u64)data % SCALE_F));
+
+ *crc = fallback_crc32(*crc, data, l);
+
+ data += l;
+ length -= l;
+ }
+
+ if (length >= PMULL_MIN_LEN) {
+ l = round_down(length, SCALE_F);
+
+ kernel_neon_begin_partial(10);
+ *crc = crc32_pmull_le(data, l, *crc);
+ kernel_neon_end();
+
+ data += l;
+ length -= l;
+ }
+
+ if (length > 0)
+ *crc = fallback_crc32(*crc, data, length);
+
+ return 0;
+}
+
+static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u32 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if ((u64)data % SCALE_F) {
+ l = min_t(u32, length, SCALE_F - ((u64)data % SCALE_F));
+
+ *crc = fallback_crc32c(*crc, data, l);
+
+ data += l;
+ length -= l;
+ }
+
+ if (length >= PMULL_MIN_LEN) {
+ l = round_down(length, SCALE_F);
+
+ kernel_neon_begin_partial(10);
+ *crc = crc32c_pmull_le(data, l, *crc);
+ kernel_neon_end();
+
+ data += l;
+ length -= l;
+ }
+
+ if (length > 0) {
+ *crc = fallback_crc32c(*crc, data, length);
+ }
+
+ return 0;
+}
+
+static int crc32_pmull_final(struct shash_desc *desc, u8 *out)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ put_unaligned_le32(*crc, out);
+ return 0;
+}
+
+static int crc32c_pmull_final(struct shash_desc *desc, u8 *out)
+{
+ u32 *crc = shash_desc_ctx(desc);
+
+ put_unaligned_le32(~*crc, out);
+ return 0;
+}
+
+static struct shash_alg crc32_pmull_algs[] = { {
+ .setkey = crc32_pmull_setkey,
+ .init = crc32_pmull_init,
+ .update = crc32_pmull_update,
+ .final = crc32_pmull_final,
+ .descsize = sizeof(u32),
+ .digestsize = sizeof(u32),
+
+ .base.cra_ctxsize = sizeof(u32),
+ .base.cra_init = crc32_pmull_cra_init,
+ .base.cra_name = "crc32",
+ .base.cra_driver_name = "crc32-arm64-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = 1,
+ .base.cra_module = THIS_MODULE,
+}, {
+ .setkey = crc32_pmull_setkey,
+ .init = crc32_pmull_init,
+ .update = crc32c_pmull_update,
+ .final = crc32c_pmull_final,
+ .descsize = sizeof(u32),
+ .digestsize = sizeof(u32),
+
+ .base.cra_ctxsize = sizeof(u32),
+ .base.cra_init = crc32c_pmull_cra_init,
+ .base.cra_name = "crc32c",
+ .base.cra_driver_name = "crc32c-arm64-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = 1,
+ .base.cra_module = THIS_MODULE,
+} };
+
+static int __init crc32_pmull_mod_init(void)
+{
+ if (elf_hwcap & HWCAP_CRC32) {
+ fallback_crc32 = crc32_armv8_le;
+ fallback_crc32c = crc32c_armv8_le;
+ } else {
+ fallback_crc32 = crc32_le;
+ fallback_crc32c = __crc32c_le;
+ }
+
+ return crypto_register_shashes(crc32_pmull_algs,
+ ARRAY_SIZE(crc32_pmull_algs));
+}
+
+static void __exit crc32_pmull_mod_exit(void)
+{
+ crypto_unregister_shashes(crc32_pmull_algs,
+ ARRAY_SIZE(crc32_pmull_algs));
+}
+
+module_cpu_feature_match(PMULL, crc32_pmull_mod_init);
+module_exit(crc32_pmull_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
--
2.7.4
^ permalink raw reply related
* [PATCH v3 4/6] crypto: arm/crct10dif - port x86 SSE implementation to ARM
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on buffers that are 16 byte aligned (but of any size)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm/crypto/Kconfig | 5 +
arch/arm/crypto/Makefile | 2 +
arch/arm/crypto/crct10dif-ce-core.S | 427 ++++++++++++++++++++
arch/arm/crypto/crct10dif-ce-glue.c | 101 +++++
4 files changed, 535 insertions(+)
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 27ed1b1cd1d7..fce801fa52a1 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -120,4 +120,9 @@ config CRYPTO_GHASH_ARM_CE
that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
that is part of the ARMv8 Crypto Extensions
+config CRYPTO_CRCT10DIF_ARM_CE
+ tristate "CRCT10DIF digest algorithm using PMULL instructions"
+ depends on KERNEL_MODE_NEON && CRC_T10DIF
+ select CRYPTO_HASH
+
endif
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index fc5150702b64..fc77265014b7 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -13,6 +13,7 @@ ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
+ce-obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM_CE) += crct10dif-arm-ce.o
ifneq ($(ce-obj-y)$(ce-obj-m),)
ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
@@ -36,6 +37,7 @@ sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
+crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
diff --git a/arch/arm/crypto/crct10dif-ce-core.S b/arch/arm/crypto/crct10dif-ce-core.S
new file mode 100644
index 000000000000..ce45ba0c0687
--- /dev/null
+++ b/arch/arm/crypto/crct10dif-ce-core.S
@@ -0,0 +1,427 @@
+//
+// Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions
+//
+// Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+//
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License version 2 as
+// published by the Free Software Foundation.
+//
+
+//
+// Implement fast CRC-T10DIF computation with SSE and PCLMULQDQ instructions
+//
+// Copyright (c) 2013, Intel Corporation
+//
+// Authors:
+// Erdinc Ozturk <erdinc.ozturk@intel.com>
+// Vinodh Gopal <vinodh.gopal@intel.com>
+// James Guilford <james.guilford@intel.com>
+// Tim Chen <tim.c.chen@linux.intel.com>
+//
+// This software is available to you under a choice of one of two
+// licenses. You may choose to be licensed under the terms of the GNU
+// General Public License (GPL) Version 2, available from the file
+// COPYING in the main directory of this source tree, or the
+// OpenIB.org BSD license below:
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are
+// met:
+//
+// * Redistributions of source code must retain the above copyright
+// notice, this list of conditions and the following disclaimer.
+//
+// * Redistributions in binary form must reproduce the above copyright
+// notice, this list of conditions and the following disclaimer in the
+// documentation and/or other materials provided with the
+// distribution.
+//
+// * Neither the name of the Intel Corporation nor the names of its
+// contributors may be used to endorse or promote products derived from
+// this software without specific prior written permission.
+//
+//
+// THIS SOFTWARE IS PROVIDED BY INTEL CORPORATION ""AS IS"" AND ANY
+// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL CORPORATION OR
+// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+//
+// Function API:
+// UINT16 crc_t10dif_pcl(
+// UINT16 init_crc, //initial CRC value, 16 bits
+// const unsigned char *buf, //buffer pointer to calculate CRC on
+// UINT64 len //buffer length in bytes (64-bit data)
+// );
+//
+// Reference paper titled "Fast CRC Computation for Generic
+// Polynomials Using PCLMULQDQ Instruction"
+// URL: http://www.intel.com/content/dam/www/public/us/en/documents
+// /white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
+//
+//
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+#ifdef CONFIG_CPU_ENDIAN_BE8
+#define CPU_LE(code...)
+#else
+#define CPU_LE(code...) code
+#endif
+
+ .text
+ .fpu crypto-neon-fp-armv8
+
+ arg1_low32 .req r0
+ arg2 .req r1
+ arg3 .req r2
+
+ qzr .req q13
+
+ q0l .req d0
+ q0h .req d1
+ q1l .req d2
+ q1h .req d3
+ q2l .req d4
+ q2h .req d5
+ q3l .req d6
+ q3h .req d7
+ q4l .req d8
+ q4h .req d9
+ q5l .req d10
+ q5h .req d11
+ q6l .req d12
+ q6h .req d13
+ q7l .req d14
+ q7h .req d15
+
+ENTRY(crc_t10dif_pmull)
+ vmov.i8 qzr, #0 // init zero register
+
+ // adjust the 16-bit initial_crc value, scale it to 32 bits
+ lsl arg1_low32, arg1_low32, #16
+
+ // check if smaller than 256
+ cmp arg3, #256
+
+ // for sizes less than 128, we can't fold 64B at a time...
+ blt _less_than_128
+
+ // load the initial crc value
+ // crc value does not need to be byte-reflected, but it needs
+ // to be moved to the high part of the register.
+ // because data will be byte-reflected and will align with
+ // initial crc at correct place.
+ vmov s0, arg1_low32 // initial crc
+ vext.8 q10, qzr, q0, #4
+
+ // receive the initial 64B data, xor the initial crc value
+ vld1.64 {q0-q1}, [arg2, :128]!
+ vld1.64 {q2-q3}, [arg2, :128]!
+ vld1.64 {q4-q5}, [arg2, :128]!
+ vld1.64 {q6-q7}, [arg2, :128]!
+CPU_LE( vrev64.8 q0, q0 )
+CPU_LE( vrev64.8 q1, q1 )
+CPU_LE( vrev64.8 q2, q2 )
+CPU_LE( vrev64.8 q3, q3 )
+CPU_LE( vrev64.8 q4, q4 )
+CPU_LE( vrev64.8 q5, q5 )
+CPU_LE( vrev64.8 q6, q6 )
+CPU_LE( vrev64.8 q7, q7 )
+
+ vswp d0, d1
+ vswp d2, d3
+ vswp d4, d5
+ vswp d6, d7
+ vswp d8, d9
+ vswp d10, d11
+ vswp d12, d13
+ vswp d14, d15
+
+ // XOR the initial_crc value
+ veor.8 q0, q0, q10
+
+ adr ip, rk3
+ vld1.64 {q10}, [ip, :128] // xmm10 has rk3 and rk4
+
+ //
+ // we subtract 256 instead of 128 to save one instruction from the loop
+ //
+ sub arg3, arg3, #256
+
+ // at this section of the code, there is 64*x+y (0<=y<64) bytes of
+ // buffer. The _fold_64_B_loop will fold 64B at a time
+ // until we have 64+y Bytes of buffer
+
+
+ // fold 64B at a time. This section of the code folds 4 vector
+ // registers in parallel
+_fold_64_B_loop:
+
+ .macro fold64, reg1, reg2
+ vld1.64 {q11-q12}, [arg2, :128]!
+
+ vmull.p64 q8, \reg1\()h, d21
+ vmull.p64 \reg1, \reg1\()l, d20
+ vmull.p64 q9, \reg2\()h, d21
+ vmull.p64 \reg2, \reg2\()l, d20
+
+CPU_LE( vrev64.8 q11, q11 )
+CPU_LE( vrev64.8 q12, q12 )
+ vswp d22, d23
+ vswp d24, d25
+
+ veor.8 \reg1, \reg1, q8
+ veor.8 \reg2, \reg2, q9
+ veor.8 \reg1, \reg1, q11
+ veor.8 \reg2, \reg2, q12
+ .endm
+
+ fold64 q0, q1
+ fold64 q2, q3
+ fold64 q4, q5
+ fold64 q6, q7
+
+ subs arg3, arg3, #128
+
+ // check if there is another 64B in the buffer to be able to fold
+ bge _fold_64_B_loop
+
+ // at this point, the buffer pointer is pointing at the last y Bytes
+ // of the buffer the 64B of folded data is in 4 of the vector
+ // registers: v0, v1, v2, v3
+
+ // fold the 8 vector registers to 1 vector register with different
+ // constants
+
+ adr ip, rk9
+ vld1.64 {q10}, [ip, :128]!
+
+ .macro fold16, reg, rk
+ vmull.p64 q8, \reg\()l, d20
+ vmull.p64 \reg, \reg\()h, d21
+ .ifnb \rk
+ vld1.64 {q10}, [ip, :128]!
+ .endif
+ veor.8 q7, q7, q8
+ veor.8 q7, q7, \reg
+ .endm
+
+ fold16 q0, rk11
+ fold16 q1, rk13
+ fold16 q2, rk15
+ fold16 q3, rk17
+ fold16 q4, rk19
+ fold16 q5, rk1
+ fold16 q6
+
+ // instead of 64, we add 48 to the loop counter to save 1 instruction
+ // from the loop instead of a cmp instruction, we use the negative
+ // flag with the jl instruction
+ adds arg3, arg3, #(128-16)
+ blt _final_reduction_for_128
+
+ // now we have 16+y bytes left to reduce. 16 Bytes is in register v7
+ // and the rest is in memory. We can fold 16 bytes at a time if y>=16
+ // continue folding 16B at a time
+
+_16B_reduction_loop:
+ vmull.p64 q8, d14, d20
+ vmull.p64 q7, d15, d21
+ veor.8 q7, q7, q8
+
+ vld1.64 {q0}, [arg2, :128]!
+CPU_LE( vrev64.8 q0, q0 )
+ vswp d0, d1
+ veor.8 q7, q7, q0
+ subs arg3, arg3, #16
+
+ // instead of a cmp instruction, we utilize the flags with the
+ // jge instruction equivalent of: cmp arg3, 16-16
+ // check if there is any more 16B in the buffer to be able to fold
+ bge _16B_reduction_loop
+
+ // now we have 16+z bytes left to reduce, where 0<= z < 16.
+ // first, we reduce the data in the xmm7 register
+
+_final_reduction_for_128:
+ // check if any more data to fold. If not, compute the CRC of
+ // the final 128 bits
+ adds arg3, arg3, #16
+ beq _128_done
+
+ // here we are getting data that is less than 16 bytes.
+ // since we know that there was data before the pointer, we can
+ // offset the input pointer before the actual point, to receive
+ // exactly 16 bytes. after that the registers need to be adjusted.
+_get_last_two_regs:
+ add arg2, arg2, arg3
+ sub arg2, arg2, #16
+ vld1.64 {q1}, [arg2]
+CPU_LE( vrev64.8 q1, q1 )
+ vswp d2, d3
+
+ // get rid of the extra data that was loaded before
+ // load the shift constant
+ adr ip, tbl_shf_table + 16
+ sub ip, ip, arg3
+ vld1.8 {q0}, [ip]
+
+ // shift v2 to the left by arg3 bytes
+ vtbl.8 d4, {d14-d15}, d0
+ vtbl.8 d5, {d14-d15}, d1
+
+ // shift v7 to the right by 16-arg3 bytes
+ vmov.i8 q9, #0x80
+ veor.8 q0, q0, q9
+ vtbl.8 d18, {d14-d15}, d0
+ vtbl.8 d19, {d14-d15}, d1
+
+ // blend
+ vshr.s8 q0, q0, #7 // convert to 8-bit mask
+ vbsl.8 q0, q2, q1
+
+ // fold 16 Bytes
+ vmull.p64 q8, d18, d20
+ vmull.p64 q7, d19, d21
+ veor.8 q7, q7, q8
+ veor.8 q7, q7, q0
+
+_128_done:
+ // compute crc of a 128-bit value
+ vldr d20, rk5
+ vldr d21, rk6 // rk5 and rk6 in xmm10
+
+ // 64b fold
+ vext.8 q0, qzr, q7, #8
+ vmull.p64 q7, d15, d20
+ veor.8 q7, q7, q0
+
+ // 32b fold
+ vext.8 q0, q7, qzr, #12
+ vmov s31, s3
+ vmull.p64 q0, d0, d21
+ veor.8 q7, q0, q7
+
+ // barrett reduction
+_barrett:
+ vldr d20, rk7
+ vldr d21, rk8
+
+ vmull.p64 q0, d15, d20
+ vext.8 q0, qzr, q0, #12
+ vmull.p64 q0, d1, d21
+ vext.8 q0, qzr, q0, #12
+ veor.8 q7, q7, q0
+ vmov r0, s29
+
+_cleanup:
+ // scale the result back to 16 bits
+ lsr r0, r0, #16
+ bx lr
+
+_less_than_128:
+ teq arg3, #0
+ beq _cleanup
+
+ vmov.i8 q0, #0
+ vmov s3, arg1_low32 // get the initial crc value
+
+ vld1.64 {q7}, [arg2, :128]!
+CPU_LE( vrev64.8 q7, q7 )
+ vswp d14, d15
+ veor.8 q7, q7, q0
+
+ cmp arg3, #16
+ beq _128_done // exactly 16 left
+ blt _less_than_16_left
+
+ // now if there is, load the constants
+ vldr d20, rk1
+ vldr d21, rk2 // rk1 and rk2 in xmm10
+
+ // check if there is enough buffer to be able to fold 16B at a time
+ subs arg3, arg3, #32
+ addlt arg3, arg3, #16
+ blt _get_last_two_regs
+ b _16B_reduction_loop
+
+_less_than_16_left:
+ // shl r9, 4
+ adr ip, tbl_shf_table + 16
+ sub ip, ip, arg3
+ vld1.8 {q0}, [ip]
+ vmov.i8 q9, #0x80
+ veor.8 q0, q0, q9
+ vtbl.8 d18, {d14-d15}, d0
+ vtbl.8 d15, {d14-d15}, d1
+ vmov d14, d18
+ b _128_done
+ENDPROC(crc_t10dif_pmull)
+
+// precomputed constants
+// these constants are precomputed from the poly:
+// 0x8bb70000 (0x8bb7 scaled to 32 bits)
+ .align 4
+// Q = 0x18BB70000
+// rk1 = 2^(32*3) mod Q << 32
+// rk2 = 2^(32*5) mod Q << 32
+// rk3 = 2^(32*15) mod Q << 32
+// rk4 = 2^(32*17) mod Q << 32
+// rk5 = 2^(32*3) mod Q << 32
+// rk6 = 2^(32*2) mod Q << 32
+// rk7 = floor(2^64/Q)
+// rk8 = Q
+
+rk3: .quad 0x9d9d000000000000
+rk4: .quad 0x7cf5000000000000
+rk5: .quad 0x2d56000000000000
+rk6: .quad 0x1368000000000000
+rk7: .quad 0x00000001f65a57f8
+rk8: .quad 0x000000018bb70000
+rk9: .quad 0xceae000000000000
+rk10: .quad 0xbfd6000000000000
+rk11: .quad 0x1e16000000000000
+rk12: .quad 0x713c000000000000
+rk13: .quad 0xf7f9000000000000
+rk14: .quad 0x80a6000000000000
+rk15: .quad 0x044c000000000000
+rk16: .quad 0xe658000000000000
+rk17: .quad 0xad18000000000000
+rk18: .quad 0xa497000000000000
+rk19: .quad 0x6ee3000000000000
+rk20: .quad 0xe7b5000000000000
+rk1: .quad 0x2d56000000000000
+rk2: .quad 0x06df000000000000
+
+tbl_shf_table:
+// use these values for shift constants for the tbl/tbx instruction
+// different alignments result in values as shown:
+// DDQ 0x008f8e8d8c8b8a898887868584838281 # shl 15 (16-1) / shr1
+// DDQ 0x01008f8e8d8c8b8a8988878685848382 # shl 14 (16-3) / shr2
+// DDQ 0x0201008f8e8d8c8b8a89888786858483 # shl 13 (16-4) / shr3
+// DDQ 0x030201008f8e8d8c8b8a898887868584 # shl 12 (16-4) / shr4
+// DDQ 0x04030201008f8e8d8c8b8a8988878685 # shl 11 (16-5) / shr5
+// DDQ 0x0504030201008f8e8d8c8b8a89888786 # shl 10 (16-6) / shr6
+// DDQ 0x060504030201008f8e8d8c8b8a898887 # shl 9 (16-7) / shr7
+// DDQ 0x07060504030201008f8e8d8c8b8a8988 # shl 8 (16-8) / shr8
+// DDQ 0x0807060504030201008f8e8d8c8b8a89 # shl 7 (16-9) / shr9
+// DDQ 0x090807060504030201008f8e8d8c8b8a # shl 6 (16-10) / shr10
+// DDQ 0x0a090807060504030201008f8e8d8c8b # shl 5 (16-11) / shr11
+// DDQ 0x0b0a090807060504030201008f8e8d8c # shl 4 (16-12) / shr12
+// DDQ 0x0c0b0a090807060504030201008f8e8d # shl 3 (16-13) / shr13
+// DDQ 0x0d0c0b0a090807060504030201008f8e # shl 2 (16-14) / shr14
+// DDQ 0x0e0d0c0b0a090807060504030201008f # shl 1 (16-15) / shr15
+
+ .byte 0x0, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87
+ .byte 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f
+ .byte 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7
+ .byte 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe , 0x0
diff --git a/arch/arm/crypto/crct10dif-ce-glue.c b/arch/arm/crypto/crct10dif-ce-glue.c
new file mode 100644
index 000000000000..d428355cf38d
--- /dev/null
+++ b/arch/arm/crypto/crct10dif-ce-glue.c
@@ -0,0 +1,101 @@
+/*
+ * Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/crc-t10dif.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/neon.h>
+#include <asm/simd.h>
+
+#define CRC_T10DIF_PMULL_CHUNK_SIZE 16U
+
+asmlinkage u16 crc_t10dif_pmull(u16 init_crc, const u8 buf[], u32 len);
+
+static int crct10dif_init(struct shash_desc *desc)
+{
+ u16 *crc = shash_desc_ctx(desc);
+
+ *crc = 0;
+ return 0;
+}
+
+static int crct10dif_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u16 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if (!may_use_simd()) {
+ *crc = crc_t10dif_generic(*crc, data, length);
+ } else {
+ if (unlikely((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) {
+ l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE -
+ ((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE));
+
+ *crc = crc_t10dif_generic(*crc, data, l);
+
+ length -= l;
+ data += l;
+ }
+ if (length > 0) {
+ kernel_neon_begin();
+ *crc = crc_t10dif_pmull(*crc, data, length);
+ kernel_neon_end();
+ }
+ }
+ return 0;
+}
+
+static int crct10dif_final(struct shash_desc *desc, u8 *out)
+{
+ u16 *crc = shash_desc_ctx(desc);
+
+ *(u16 *)out = *crc;
+ return 0;
+}
+
+static struct shash_alg crc_t10dif_alg = {
+ .digestsize = CRC_T10DIF_DIGEST_SIZE,
+ .init = crct10dif_init,
+ .update = crct10dif_update,
+ .final = crct10dif_final,
+ .descsize = CRC_T10DIF_DIGEST_SIZE,
+
+ .base.cra_name = "crct10dif",
+ .base.cra_driver_name = "crct10dif-arm-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = CRC_T10DIF_BLOCK_SIZE,
+ .base.cra_module = THIS_MODULE,
+};
+
+static int __init crc_t10dif_mod_init(void)
+{
+ if (!(elf_hwcap2 & HWCAP2_PMULL))
+ return -ENODEV;
+
+ return crypto_register_shash(&crc_t10dif_alg);
+}
+
+static void __exit crc_t10dif_mod_exit(void)
+{
+ crypto_unregister_shash(&crc_t10dif_alg);
+}
+
+module_init(crc_t10dif_mod_init);
+module_exit(crc_t10dif_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("crct10dif");
--
2.7.4
^ permalink raw reply related
* [PATCH v3 3/6] crypto: arm64/crct10dif - port x86 SSE implementation to arm64
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
This is a transliteration of the Intel algorithm implemented
using SSE and PCLMULQDQ instructions that resides in the file
arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only
operate on buffers that are 16 byte aligned (but of any size)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
arch/arm64/crypto/Kconfig | 5 +
arch/arm64/crypto/Makefile | 3 +
arch/arm64/crypto/crct10dif-ce-core.S | 392 ++++++++++++++++++++
arch/arm64/crypto/crct10dif-ce-glue.c | 95 +++++
4 files changed, 495 insertions(+)
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9887e1..d773c0659202 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -23,6 +23,11 @@ config CRYPTO_GHASH_ARM64_CE
depends on ARM64 && KERNEL_MODE_NEON
select CRYPTO_HASH
+config CRYPTO_CRCT10DIF_ARM64_CE
+ tristate "CRCT10DIF digest algorithm using PMULL instructions"
+ depends on KERNEL_MODE_NEON && CRC_T10DIF
+ select CRYPTO_HASH
+
config CRYPTO_AES_ARM64_CE
tristate "AES core cipher using ARMv8 Crypto Extensions"
depends on ARM64 && KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index abb79b3cfcfe..36fd3eb4201b 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -17,6 +17,9 @@ sha2-ce-y := sha2-ce-glue.o sha2-ce-core.o
obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
+obj-$(CONFIG_CRYPTO_CRCT10DIF_ARM64_CE) += crct10dif-ce.o
+crct10dif-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
+
obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto
diff --git a/arch/arm64/crypto/crct10dif-ce-core.S b/arch/arm64/crypto/crct10dif-ce-core.S
new file mode 100644
index 000000000000..d5b5a8c038c8
--- /dev/null
+++ b/arch/arm64/crypto/crct10dif-ce-core.S
@@ -0,0 +1,392 @@
+//
+// Accelerated CRC-T10DIF using arm64 NEON and Crypto Extensions instructions
+//
+// Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+//
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License version 2 as
+// published by the Free Software Foundation.
+//
+
+//
+// Implement fast CRC-T10DIF computation with SSE and PCLMULQDQ instructions
+//
+// Copyright (c) 2013, Intel Corporation
+//
+// Authors:
+// Erdinc Ozturk <erdinc.ozturk@intel.com>
+// Vinodh Gopal <vinodh.gopal@intel.com>
+// James Guilford <james.guilford@intel.com>
+// Tim Chen <tim.c.chen@linux.intel.com>
+//
+// This software is available to you under a choice of one of two
+// licenses. You may choose to be licensed under the terms of the GNU
+// General Public License (GPL) Version 2, available from the file
+// COPYING in the main directory of this source tree, or the
+// OpenIB.org BSD license below:
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are
+// met:
+//
+// * Redistributions of source code must retain the above copyright
+// notice, this list of conditions and the following disclaimer.
+//
+// * Redistributions in binary form must reproduce the above copyright
+// notice, this list of conditions and the following disclaimer in the
+// documentation and/or other materials provided with the
+// distribution.
+//
+// * Neither the name of the Intel Corporation nor the names of its
+// contributors may be used to endorse or promote products derived from
+// this software without specific prior written permission.
+//
+//
+// THIS SOFTWARE IS PROVIDED BY INTEL CORPORATION ""AS IS"" AND ANY
+// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL CORPORATION OR
+// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+//
+// Function API:
+// UINT16 crc_t10dif_pcl(
+// UINT16 init_crc, //initial CRC value, 16 bits
+// const unsigned char *buf, //buffer pointer to calculate CRC on
+// UINT64 len //buffer length in bytes (64-bit data)
+// );
+//
+// Reference paper titled "Fast CRC Computation for Generic
+// Polynomials Using PCLMULQDQ Instruction"
+// URL: http://www.intel.com/content/dam/www/public/us/en/documents
+// /white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
+//
+//
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+ .text
+ .cpu generic+crypto
+
+ arg1_low32 .req w0
+ arg2 .req x1
+ arg3 .req x2
+
+ vzr .req v13
+
+ENTRY(crc_t10dif_pmull)
+ movi vzr.16b, #0 // init zero register
+
+ // adjust the 16-bit initial_crc value, scale it to 32 bits
+ lsl arg1_low32, arg1_low32, #16
+
+ // check if smaller than 256
+ cmp arg3, #256
+
+ // for sizes less than 128, we can't fold 64B at a time...
+ b.lt _less_than_128
+
+ // load the initial crc value
+ // crc value does not need to be byte-reflected, but it needs
+ // to be moved to the high part of the register.
+ // because data will be byte-reflected and will align with
+ // initial crc at correct place.
+ movi v10.16b, #0
+ mov v10.s[3], arg1_low32 // initial crc
+
+ // receive the initial 64B data, xor the initial crc value
+ ldp q0, q1, [arg2]
+ ldp q2, q3, [arg2, #0x20]
+ ldp q4, q5, [arg2, #0x40]
+ ldp q6, q7, [arg2, #0x60]
+ add arg2, arg2, #0x80
+
+CPU_LE( rev64 v0.16b, v0.16b )
+CPU_LE( rev64 v1.16b, v1.16b )
+CPU_LE( rev64 v2.16b, v2.16b )
+CPU_LE( rev64 v3.16b, v3.16b )
+CPU_LE( rev64 v4.16b, v4.16b )
+CPU_LE( rev64 v5.16b, v5.16b )
+CPU_LE( rev64 v6.16b, v6.16b )
+CPU_LE( rev64 v7.16b, v7.16b )
+
+CPU_LE( ext v0.16b, v0.16b, v0.16b, #8 )
+CPU_LE( ext v1.16b, v1.16b, v1.16b, #8 )
+CPU_LE( ext v2.16b, v2.16b, v2.16b, #8 )
+CPU_LE( ext v3.16b, v3.16b, v3.16b, #8 )
+CPU_LE( ext v4.16b, v4.16b, v4.16b, #8 )
+CPU_LE( ext v5.16b, v5.16b, v5.16b, #8 )
+CPU_LE( ext v6.16b, v6.16b, v6.16b, #8 )
+CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 )
+
+ // XOR the initial_crc value
+ eor v0.16b, v0.16b, v10.16b
+
+ ldr q10, rk3 // xmm10 has rk3 and rk4
+ // type of pmull instruction
+ // will determine which constant to use
+
+ //
+ // we subtract 256 instead of 128 to save one instruction from the loop
+ //
+ sub arg3, arg3, #256
+
+ // at this section of the code, there is 64*x+y (0<=y<64) bytes of
+ // buffer. The _fold_64_B_loop will fold 64B at a time
+ // until we have 64+y Bytes of buffer
+
+
+ // fold 64B at a time. This section of the code folds 4 vector
+ // registers in parallel
+_fold_64_B_loop:
+
+ .macro fold64, reg1, reg2
+ ldp q11, q12, [arg2], #0x20
+
+ pmull2 v8.1q, \reg1\().2d, v10.2d
+ pmull \reg1\().1q, \reg1\().1d, v10.1d
+
+CPU_LE( rev64 v11.16b, v11.16b )
+CPU_LE( rev64 v12.16b, v12.16b )
+
+ pmull2 v9.1q, \reg2\().2d, v10.2d
+ pmull \reg2\().1q, \reg2\().1d, v10.1d
+
+CPU_LE( ext v11.16b, v11.16b, v11.16b, #8 )
+CPU_LE( ext v12.16b, v12.16b, v12.16b, #8 )
+
+ eor \reg1\().16b, \reg1\().16b, v8.16b
+ eor \reg2\().16b, \reg2\().16b, v9.16b
+ eor \reg1\().16b, \reg1\().16b, v11.16b
+ eor \reg2\().16b, \reg2\().16b, v12.16b
+ .endm
+
+ fold64 v0, v1
+ fold64 v2, v3
+ fold64 v4, v5
+ fold64 v6, v7
+
+ subs arg3, arg3, #128
+
+ // check if there is another 64B in the buffer to be able to fold
+ b.ge _fold_64_B_loop
+
+ // at this point, the buffer pointer is pointing at the last y Bytes
+ // of the buffer the 64B of folded data is in 4 of the vector
+ // registers: v0, v1, v2, v3
+
+ // fold the 8 vector registers to 1 vector register with different
+ // constants
+
+ ldr q10, rk9
+
+ .macro fold16, reg, rk
+ pmull v8.1q, \reg\().1d, v10.1d
+ pmull2 \reg\().1q, \reg\().2d, v10.2d
+ .ifnb \rk
+ ldr q10, \rk
+ .endif
+ eor v7.16b, v7.16b, v8.16b
+ eor v7.16b, v7.16b, \reg\().16b
+ .endm
+
+ fold16 v0, rk11
+ fold16 v1, rk13
+ fold16 v2, rk15
+ fold16 v3, rk17
+ fold16 v4, rk19
+ fold16 v5, rk1
+ fold16 v6
+
+ // instead of 64, we add 48 to the loop counter to save 1 instruction
+ // from the loop instead of a cmp instruction, we use the negative
+ // flag with the jl instruction
+ adds arg3, arg3, #(128-16)
+ b.lt _final_reduction_for_128
+
+ // now we have 16+y bytes left to reduce. 16 Bytes is in register v7
+ // and the rest is in memory. We can fold 16 bytes at a time if y>=16
+ // continue folding 16B at a time
+
+_16B_reduction_loop:
+ pmull v8.1q, v7.1d, v10.1d
+ pmull2 v7.1q, v7.2d, v10.2d
+ eor v7.16b, v7.16b, v8.16b
+
+ ldr q0, [arg2], #16
+CPU_LE( rev64 v0.16b, v0.16b )
+CPU_LE( ext v0.16b, v0.16b, v0.16b, #8 )
+ eor v7.16b, v7.16b, v0.16b
+ subs arg3, arg3, #16
+
+ // instead of a cmp instruction, we utilize the flags with the
+ // jge instruction equivalent of: cmp arg3, 16-16
+ // check if there is any more 16B in the buffer to be able to fold
+ b.ge _16B_reduction_loop
+
+ // now we have 16+z bytes left to reduce, where 0<= z < 16.
+ // first, we reduce the data in the xmm7 register
+
+_final_reduction_for_128:
+ // check if any more data to fold. If not, compute the CRC of
+ // the final 128 bits
+ adds arg3, arg3, #16
+ b.eq _128_done
+
+ // here we are getting data that is less than 16 bytes.
+ // since we know that there was data before the pointer, we can
+ // offset the input pointer before the actual point, to receive
+ // exactly 16 bytes. after that the registers need to be adjusted.
+_get_last_two_regs:
+ add arg2, arg2, arg3
+ ldr q1, [arg2, #-16]
+CPU_LE( rev64 v1.16b, v1.16b )
+CPU_LE( ext v1.16b, v1.16b, v1.16b, #8 )
+
+ // get rid of the extra data that was loaded before
+ // load the shift constant
+ adr x4, tbl_shf_table + 16
+ sub x4, x4, arg3
+ ld1 {v0.16b}, [x4]
+
+ // shift v2 to the left by arg3 bytes
+ tbl v2.16b, {v7.16b}, v0.16b
+
+ // shift v7 to the right by 16-arg3 bytes
+ movi v9.16b, #0x80
+ eor v0.16b, v0.16b, v9.16b
+ tbl v7.16b, {v7.16b}, v0.16b
+
+ // blend
+ sshr v0.16b, v0.16b, #7 // convert to 8-bit mask
+ bsl v0.16b, v2.16b, v1.16b
+
+ // fold 16 Bytes
+ pmull v8.1q, v7.1d, v10.1d
+ pmull2 v7.1q, v7.2d, v10.2d
+ eor v7.16b, v7.16b, v8.16b
+ eor v7.16b, v7.16b, v0.16b
+
+_128_done:
+ // compute crc of a 128-bit value
+ ldr q10, rk5 // rk5 and rk6 in xmm10
+
+ // 64b fold
+ ext v0.16b, vzr.16b, v7.16b, #8
+ mov v7.d[0], v7.d[1]
+ pmull v7.1q, v7.1d, v10.1d
+ eor v7.16b, v7.16b, v0.16b
+
+ // 32b fold
+ ext v0.16b, v7.16b, vzr.16b, #4
+ mov v7.s[3], vzr.s[0]
+ pmull2 v0.1q, v0.2d, v10.2d
+ eor v7.16b, v7.16b, v0.16b
+
+ // barrett reduction
+_barrett:
+ ldr q10, rk7
+ mov v0.d[0], v7.d[1]
+
+ pmull v0.1q, v0.1d, v10.1d
+ ext v0.16b, vzr.16b, v0.16b, #12
+ pmull2 v0.1q, v0.2d, v10.2d
+ ext v0.16b, vzr.16b, v0.16b, #12
+ eor v7.16b, v7.16b, v0.16b
+ mov w0, v7.s[1]
+
+_cleanup:
+ // scale the result back to 16 bits
+ lsr x0, x0, #16
+ ret
+
+_less_than_128:
+ cbz arg3, _cleanup
+
+ movi v0.16b, #0
+ mov v0.s[3], arg1_low32 // get the initial crc value
+
+ ldr q7, [arg2], #0x10
+CPU_LE( rev64 v7.16b, v7.16b )
+CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 )
+ eor v7.16b, v7.16b, v0.16b // xor the initial crc value
+
+ cmp arg3, #16
+ b.eq _128_done // exactly 16 left
+ b.lt _less_than_16_left
+
+ ldr q10, rk1 // rk1 and rk2 in xmm10
+
+ // update the counter. subtract 32 instead of 16 to save one
+ // instruction from the loop
+ subs arg3, arg3, #32
+ b.ge _16B_reduction_loop
+
+ add arg3, arg3, #16
+ b _get_last_two_regs
+
+_less_than_16_left:
+ // shl r9, 4
+ adr x0, tbl_shf_table + 16
+ sub x0, x0, arg3
+ ld1 {v0.16b}, [x0]
+ movi v9.16b, #0x80
+ eor v0.16b, v0.16b, v9.16b
+ tbl v7.16b, {v7.16b}, v0.16b
+ b _128_done
+ENDPROC(crc_t10dif_pmull)
+
+// precomputed constants
+// these constants are precomputed from the poly:
+// 0x8bb70000 (0x8bb7 scaled to 32 bits)
+ .align 4
+// Q = 0x18BB70000
+// rk1 = 2^(32*3) mod Q << 32
+// rk2 = 2^(32*5) mod Q << 32
+// rk3 = 2^(32*15) mod Q << 32
+// rk4 = 2^(32*17) mod Q << 32
+// rk5 = 2^(32*3) mod Q << 32
+// rk6 = 2^(32*2) mod Q << 32
+// rk7 = floor(2^64/Q)
+// rk8 = Q
+
+rk1: .octa 0x06df0000000000002d56000000000000
+rk3: .octa 0x7cf50000000000009d9d000000000000
+rk5: .octa 0x13680000000000002d56000000000000
+rk7: .octa 0x000000018bb7000000000001f65a57f8
+rk9: .octa 0xbfd6000000000000ceae000000000000
+rk11: .octa 0x713c0000000000001e16000000000000
+rk13: .octa 0x80a6000000000000f7f9000000000000
+rk15: .octa 0xe658000000000000044c000000000000
+rk17: .octa 0xa497000000000000ad18000000000000
+rk19: .octa 0xe7b50000000000006ee3000000000000
+
+tbl_shf_table:
+// use these values for shift constants for the tbl/tbx instruction
+// different alignments result in values as shown:
+// DDQ 0x008f8e8d8c8b8a898887868584838281 # shl 15 (16-1) / shr1
+// DDQ 0x01008f8e8d8c8b8a8988878685848382 # shl 14 (16-3) / shr2
+// DDQ 0x0201008f8e8d8c8b8a89888786858483 # shl 13 (16-4) / shr3
+// DDQ 0x030201008f8e8d8c8b8a898887868584 # shl 12 (16-4) / shr4
+// DDQ 0x04030201008f8e8d8c8b8a8988878685 # shl 11 (16-5) / shr5
+// DDQ 0x0504030201008f8e8d8c8b8a89888786 # shl 10 (16-6) / shr6
+// DDQ 0x060504030201008f8e8d8c8b8a898887 # shl 9 (16-7) / shr7
+// DDQ 0x07060504030201008f8e8d8c8b8a8988 # shl 8 (16-8) / shr8
+// DDQ 0x0807060504030201008f8e8d8c8b8a89 # shl 7 (16-9) / shr9
+// DDQ 0x090807060504030201008f8e8d8c8b8a # shl 6 (16-10) / shr10
+// DDQ 0x0a090807060504030201008f8e8d8c8b # shl 5 (16-11) / shr11
+// DDQ 0x0b0a090807060504030201008f8e8d8c # shl 4 (16-12) / shr12
+// DDQ 0x0c0b0a090807060504030201008f8e8d # shl 3 (16-13) / shr13
+// DDQ 0x0d0c0b0a090807060504030201008f8e # shl 2 (16-14) / shr14
+// DDQ 0x0e0d0c0b0a090807060504030201008f # shl 1 (16-15) / shr15
+
+ .byte 0x0, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87
+ .byte 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f
+ .byte 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7
+ .byte 0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe , 0x0
diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
new file mode 100644
index 000000000000..60cb590c2590
--- /dev/null
+++ b/arch/arm64/crypto/crct10dif-ce-glue.c
@@ -0,0 +1,95 @@
+/*
+ * Accelerated CRC-T10DIF using arm64 NEON and Crypto Extensions instructions
+ *
+ * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/crc-t10dif.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+#include <crypto/internal/hash.h>
+
+#include <asm/neon.h>
+
+#define CRC_T10DIF_PMULL_CHUNK_SIZE 16U
+
+asmlinkage u16 crc_t10dif_pmull(u16 init_crc, const u8 buf[], u64 len);
+
+static int crct10dif_init(struct shash_desc *desc)
+{
+ u16 *crc = shash_desc_ctx(desc);
+
+ *crc = 0;
+ return 0;
+}
+
+static int crct10dif_update(struct shash_desc *desc, const u8 *data,
+ unsigned int length)
+{
+ u16 *crc = shash_desc_ctx(desc);
+ unsigned int l;
+
+ if (unlikely((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) {
+ l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE -
+ ((u64)data % CRC_T10DIF_PMULL_CHUNK_SIZE));
+
+ *crc = crc_t10dif_generic(*crc, data, l);
+
+ length -= l;
+ data += l;
+ }
+
+ if (length > 0) {
+ kernel_neon_begin_partial(14);
+ *crc = crc_t10dif_pmull(*crc, data, length);
+ kernel_neon_end();
+ }
+
+ return 0;
+}
+
+static int crct10dif_final(struct shash_desc *desc, u8 *out)
+{
+ u16 *crc = shash_desc_ctx(desc);
+
+ *(u16 *)out = *crc;
+ return 0;
+}
+
+static struct shash_alg crc_t10dif_alg = {
+ .digestsize = CRC_T10DIF_DIGEST_SIZE,
+ .init = crct10dif_init,
+ .update = crct10dif_update,
+ .final = crct10dif_final,
+ .descsize = CRC_T10DIF_DIGEST_SIZE,
+
+ .base.cra_name = "crct10dif",
+ .base.cra_driver_name = "crct10dif-arm64-ce",
+ .base.cra_priority = 200,
+ .base.cra_blocksize = CRC_T10DIF_BLOCK_SIZE,
+ .base.cra_module = THIS_MODULE,
+};
+
+static int __init crc_t10dif_mod_init(void)
+{
+ return crypto_register_shash(&crc_t10dif_alg);
+}
+
+static void __exit crc_t10dif_mod_exit(void)
+{
+ crypto_unregister_shash(&crc_t10dif_alg);
+}
+
+module_cpu_feature_match(PMULL, crc_t10dif_mod_init);
+module_exit(crc_t10dif_mod_exit);
+
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
--
2.7.4
^ permalink raw reply related
* [PATCH v3 2/6] crypto: testmgr - add/enhance test cases for CRC-T10DIF
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
The existing test cases only exercise a small slice of the various
possible code paths through the x86 SSE/PCLMULQDQ implementation,
and the upcoming ports of it for arm64. So add one that exceeds 256
bytes in size, and convert another to a chunked test.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
crypto/testmgr.h | 70 ++++++++++++--------
1 file changed, 42 insertions(+), 28 deletions(-)
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index e64a4ef9d8ca..9b656be7f52f 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -1334,36 +1334,50 @@ static struct hash_testvec rmd320_tv_template[] = {
}
};
-#define CRCT10DIF_TEST_VECTORS 3
+#define CRCT10DIF_TEST_VECTORS ARRAY_SIZE(crct10dif_tv_template)
static struct hash_testvec crct10dif_tv_template[] = {
{
- .plaintext = "abc",
- .psize = 3,
-#ifdef __LITTLE_ENDIAN
- .digest = "\x3b\x44",
-#else
- .digest = "\x44\x3b",
-#endif
- }, {
- .plaintext = "1234567890123456789012345678901234567890"
- "123456789012345678901234567890123456789",
- .psize = 79,
-#ifdef __LITTLE_ENDIAN
- .digest = "\x70\x4b",
-#else
- .digest = "\x4b\x70",
-#endif
- }, {
- .plaintext =
- "abcddddddddddddddddddddddddddddddddddddddddddddddddddddd",
- .psize = 56,
-#ifdef __LITTLE_ENDIAN
- .digest = "\xe3\x9c",
-#else
- .digest = "\x9c\xe3",
-#endif
- .np = 2,
- .tap = { 28, 28 }
+ .plaintext = "abc",
+ .psize = 3,
+ .digest = (u8 *)(u16 []){ 0x443b },
+ }, {
+ .plaintext = "1234567890123456789012345678901234567890"
+ "123456789012345678901234567890123456789",
+ .psize = 79,
+ .digest = (u8 *)(u16 []){ 0x4b70 },
+ .np = 2,
+ .tap = { 63, 16 },
+ }, {
+ .plaintext = "abcdddddddddddddddddddddddddddddddddddddddd"
+ "ddddddddddddd",
+ .psize = 56,
+ .digest = (u8 *)(u16 []){ 0x9ce3 },
+ .np = 8,
+ .tap = { 1, 2, 28, 7, 6, 5, 4, 3 },
+ }, {
+ .plaintext = "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "123456789012345678901234567890123456789",
+ .psize = 319,
+ .digest = (u8 *)(u16 []){ 0x44c6 },
+ }, {
+ .plaintext = "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "1234567890123456789012345678901234567890"
+ "123456789012345678901234567890123456789",
+ .psize = 319,
+ .digest = (u8 *)(u16 []){ 0x44c6 },
+ .np = 4,
+ .tap = { 1, 255, 57, 6 },
}
};
--
2.7.4
^ permalink raw reply related
* [PATCH v3 1/6] crypto: testmgr - avoid overlap in chunked tests
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
In-Reply-To: <1480963348-24203-1-git-send-email-ard.biesheuvel@linaro.org>
The IDXn offsets are chosen such that tap values (which may go up to
255) end up overlapping in the xbuf allocation. In particular, IDX1
and IDX3 are too close together, so update IDX3 to avoid this issue.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
crypto/testmgr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index ded50b67c757..670893bcf361 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -63,7 +63,7 @@ int alg_test(const char *driver, const char *alg, u32 type, u32 mask)
*/
#define IDX1 32
#define IDX2 32400
-#define IDX3 1
+#define IDX3 511
#define IDX4 8193
#define IDX5 22222
#define IDX6 17101
--
2.7.4
^ permalink raw reply related
* [PATCH v3 0/6] crypto: ARM/arm64 CRC-T10DIF/CRC32/CRC32C roundup
From: Ard Biesheuvel @ 2016-12-05 18:42 UTC (permalink / raw)
To: linux-crypto, herbert; +Cc: linux-arm-kernel, Ard Biesheuvel
This v3 combines the CRC-T10DIF and CRC32 implementations for both ARM and
arm64 that I sent out a couple of weeks ago, and adds support to the latter
for CRC32C.
Changes since v2:
- fix a couple of big-endian bugs in CRC32/CRC32C
- add back handling to the CRC-T10DIF routines of buffers that are not a
multiple of 16 bytes (but they still must be 16 byte aligned)
Ard Biesheuvel (6):
crypto: testmgr - avoid overlap in chunked tests
crypto: testmgr - add/enhance test cases for CRC-T10DIF
crypto: arm64/crct10dif - port x86 SSE implementation to arm64
crypto: arm/crct10dif - port x86 SSE implementation to ARM
crypto: arm64/crc32 - accelerated support based on x86 SSE
implementation
crypto: arm/crc32 - accelerated support based on x86 SSE
implementation
arch/arm/crypto/Kconfig | 10 +
arch/arm/crypto/Makefile | 4 +
arch/arm/crypto/crc32-ce-core.S | 306 ++++++++++++++
arch/arm/crypto/crc32-ce-glue.c | 242 +++++++++++
arch/arm/crypto/crct10dif-ce-core.S | 427 ++++++++++++++++++++
arch/arm/crypto/crct10dif-ce-glue.c | 101 +++++
arch/arm64/crypto/Kconfig | 11 +
arch/arm64/crypto/Makefile | 6 +
arch/arm64/crypto/crc32-ce-core.S | 266 ++++++++++++
arch/arm64/crypto/crc32-ce-glue.c | 212 ++++++++++
arch/arm64/crypto/crct10dif-ce-core.S | 392 ++++++++++++++++++
arch/arm64/crypto/crct10dif-ce-glue.c | 95 +++++
crypto/testmgr.c | 2 +-
crypto/testmgr.h | 70 ++--
14 files changed, 2115 insertions(+), 29 deletions(-)
create mode 100644 arch/arm/crypto/crc32-ce-core.S
create mode 100644 arch/arm/crypto/crc32-ce-glue.c
create mode 100644 arch/arm/crypto/crct10dif-ce-core.S
create mode 100644 arch/arm/crypto/crct10dif-ce-glue.c
create mode 100644 arch/arm64/crypto/crc32-ce-core.S
create mode 100644 arch/arm64/crypto/crc32-ce-glue.c
create mode 100644 arch/arm64/crypto/crct10dif-ce-core.S
create mode 100644 arch/arm64/crypto/crct10dif-ce-glue.c
--
2.7.4
^ permalink raw reply
* Re: [PATCH] crypto: rsa - fix a potential race condition in build
From: Yang Shi @ 2016-12-05 17:49 UTC (permalink / raw)
To: Herbert Xu; +Cc: davem, linux-crypto, linux-kernel
In-Reply-To: <20161205064800.GA9496@gondor.apana.org.au>
On 12/4/2016 10:48 PM, Herbert Xu wrote:
> On Fri, Dec 02, 2016 at 03:41:04PM -0800, Yang Shi wrote:
>> When building kernel with RSA enabled with multithreaded, the below
>> compile failure might be caught:
>>
>> | /buildarea/kernel-source/crypto/rsa_helper.c:18:28: fatal error: rsapubkey-asn1.h: No such file or directory
>> | #include "rsapubkey-asn1.h"
>> | ^
>> | compilation terminated.
>> | CC crypto/rsa-pkcs1pad.o
>> | CC crypto/algboss.o
>> | CC crypto/testmgr.o
>> | make[3]: *** [/buildarea/kernel-source/scripts/Makefile.build:289: crypto/rsa_helper.o] Error 1
>> | make[3]: *** Waiting for unfinished jobs....
>> | make[2]: *** [/buildarea/kernel-source/Makefile:969: crypto] Error 2
>> | make[1]: *** [Makefile:150: sub-make] Error 2
>> | make: *** [Makefile:24: __sub-make] Error 2
>>
>> The header file is not generated before rsa_helper is compiled, so
>> adding dependency to avoid such issue.
>>
>> Signed-off-by: Yang Shi <yang.shi@windriver.com>
> This should already be fixed in the latest crypto tree. Could
> you please double-check?
Thanks, Herbert, I just found the commit. Please ignore my patch, sorry
for the inconvenience.
I will backport that commit to our kernel tree.
Yang
>
> Thanks,
^ permalink raw reply
* Re: [PATCH] crypto/mcryptd: Check mcryptd algorithm compatability
From: Tim Chen @ 2016-12-05 16:50 UTC (permalink / raw)
To: Herbert Xu
Cc: Mikulas Patocka, David S. Miller, megha.dey, linux-crypto,
dm-devel, Milan Broz, Eric Biggers, stable
In-Reply-To: <20161205123403.GB10635@gondor.apana.org.au>
On Mon, 2016-12-05 at 20:34 +0800, Herbert Xu wrote:
> On Fri, Dec 02, 2016 at 04:15:21PM -0800, Tim Chen wrote:
> >
> > Algorithms not compatible with mcryptd could be spawned by mcryptd
> > with a direct crypto_alloc_tfm invocation using a "mcryptd(alg)"
> > name construct. This causes mcryptd to crash the kernel if
> > "alg" is incompatible and not intended to be used with mcryptd.
> >
> > A flag CRYPTO_ALG_MCRYPT is being added to mcryptd compatible
> > algorithms' cra_flags. The compatability is checked when mcryptd spawn
> > off an algorithm.
> >
> > Link: http://marc.info/?l=linux-crypto-vger&m=148063683310477&w=2
> > Cc: stable@vger.kernel.org
> > Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> > Tested-by: Megha Dey <megha.dey@linux.intel.com>
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Tim, I think we should instead make mcryptd refuse to generate
> a non-internal algorithm. This way the user would not be able
> to access it at all since they can only request for non-internal
> algorithms.
>
> Basically you want to check at the start of mcryptd_create_hash
> that the INTERNAL bit is set on both the type and mask as returned
> by crypto_get_attr_type.
I have thought about that. However, not all internal algorithms
are compatible with mcryptd. We can still have trouble if some
random internal algorithm is paired with mcryptd.
Tim
^ permalink raw reply
* [PATCH v4] crypto: AF_ALG - fix AEAD tag memory handling
From: Stephan Mueller @ 2016-12-05 14:26 UTC (permalink / raw)
To: herbert; +Cc: linux-crypto
Hi Herbert,
Changes v4: restore the old behavior -- if the caller does not provide sufficient
output buffer size, return an error.
---8<---
For encryption, the AEAD ciphers require AAD || PT as input and generate
AAD || CT || Tag as output and vice versa for decryption. Prior to this
patch, the AF_ALG interface for AEAD ciphers requires the buffer to be
present as input for encryption. Similarly, the output buffer for
decryption required the presence of the tag buffer too. This implies
that the kernel reads / writes data buffers from/to kernel space
even though this operation is not required.
This patch changes the AF_ALG AEAD interface to be consistent with the
in-kernel AEAD cipher requirements.
Due to this handling, he changes are transparent to user space with one
exception: the return code of recv indicates the mount of output buffer.
That output buffer has a different size compared to before the patch
which implies that the return code of recv will also be different.
For example, a decryption operation uses 16 bytes AAD, 16 bytes CT and
16 bytes tag, the AF_ALG AEAD interface before showed a recv return
code of 48 (bytes) whereas after this patch, the return code is 32
since the tag is not returned any more.
Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
---
crypto/algif_aead.c | 57 +++++++++++++++++++++++++++++++++--------------------
1 file changed, 36 insertions(+), 21 deletions(-)
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 6e95137..4359c9b 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -81,7 +81,11 @@ static inline bool aead_sufficient_data(struct aead_ctx *ctx)
{
unsigned as = crypto_aead_authsize(crypto_aead_reqtfm(&ctx->aead_req));
- return ctx->used >= ctx->aead_assoclen + as;
+ /*
+ * The minimum amount of memory needed for an AEAD cipher is
+ * the AAD and in case of decryption the tag.
+ */
+ return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
}
static void aead_reset_ctx(struct aead_ctx *ctx)
@@ -426,12 +430,15 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
goto unlock;
}
- used = ctx->used;
- outlen = used;
-
if (!aead_sufficient_data(ctx))
goto unlock;
+ used = ctx->used;
+ if (ctx->enc)
+ outlen = used + as;
+ else
+ outlen = used - as;
+
req = sock_kmalloc(sk, reqlen, GFP_KERNEL);
if (unlikely(!req))
goto unlock;
@@ -445,7 +452,7 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
aead_request_set_ad(req, ctx->aead_assoclen);
aead_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
aead_async_cb, sk);
- used -= ctx->aead_assoclen + (ctx->enc ? as : 0);
+ used -= ctx->aead_assoclen;
/* take over all tx sgls from ctx */
areq->tsgl = sock_kmalloc(sk,
@@ -462,7 +469,7 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
areq->tsgls = sgl->cur;
/* create rx sgls */
- while (iov_iter_count(&msg->msg_iter)) {
+ while (outlen > usedpages && iov_iter_count(&msg->msg_iter)) {
size_t seglen = min_t(size_t, iov_iter_count(&msg->msg_iter),
(outlen - usedpages));
@@ -492,16 +499,14 @@ static int aead_recvmsg_async(struct socket *sock, struct msghdr *msg,
last_rsgl = rsgl;
- /* we do not need more iovecs as we have sufficient memory */
- if (outlen <= usedpages)
- break;
-
iov_iter_advance(&msg->msg_iter, err);
}
- err = -EINVAL;
+
/* ensure output buffer is sufficiently large */
- if (usedpages < outlen)
- goto free;
+ if (usedpages < outlen) {
+ err = -EINVAL;
+ goto unlock;
+ }
aead_request_set_crypt(req, areq->tsgl, areq->first_rsgl.sgl.sg, used,
areq->iv);
@@ -572,6 +577,7 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
goto unlock;
}
+ /* data length provided by caller via sendmsg/sendpage */
used = ctx->used;
/*
@@ -586,16 +592,27 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
if (!aead_sufficient_data(ctx))
goto unlock;
- outlen = used;
+ /*
+ * Calculate the minimum output buffer size holding the result of the
+ * cipher operation. When encrypting data, the receiving buffer is
+ * larger by the tag length compared to the input buffer as the
+ * encryption operation generates the tag. For decryption, the input
+ * buffer provides the tag which is consumed resulting in only the
+ * plaintext without a buffer for the tag returned to the caller.
+ */
+ if (ctx->enc)
+ outlen = used + as;
+ else
+ outlen = used - as;
/*
* The cipher operation input data is reduced by the associated data
* length as this data is processed separately later on.
*/
- used -= ctx->aead_assoclen + (ctx->enc ? as : 0);
+ used -= ctx->aead_assoclen;
/* convert iovecs of output buffers into scatterlists */
- while (iov_iter_count(&msg->msg_iter)) {
+ while (outlen > usedpages && iov_iter_count(&msg->msg_iter)) {
size_t seglen = min_t(size_t, iov_iter_count(&msg->msg_iter),
(outlen - usedpages));
@@ -622,16 +639,14 @@ static int aead_recvmsg_sync(struct socket *sock, struct msghdr *msg, int flags)
last_rsgl = rsgl;
- /* we do not need more iovecs as we have sufficient memory */
- if (outlen <= usedpages)
- break;
iov_iter_advance(&msg->msg_iter, err);
}
- err = -EINVAL;
/* ensure output buffer is sufficiently large */
- if (usedpages < outlen)
+ if (usedpages < outlen) {
+ err = -EINVAL;
goto unlock;
+ }
sg_mark_end(sgl->sg + sgl->cur - 1);
aead_request_set_crypt(&ctx->aead_req, sgl->sg, ctx->first_rsgl.sgl.sg,
--
2.9.3
^ permalink raw reply related
* Loan Offer
From: Quick Loan @ 2016-12-05 10:51 UTC (permalink / raw)
To: Recipients
We can help you with a genuine loan to meet your needs.
Do you need a personal or business loan without stress and
quick approval? Do you need an urgent loan today? No Credit Checks
* LOAN APPROVAL IN 60MINS !!
* GUARANTEED SAME DAY TRANSFER !!
* 100% APPROVAL RATE !!
* LOW INTEREST RATE !!
Contact US for more information about loan offer and we will solve your
financial problem. contact us via email: quickloanss@hotmail.com
^ permalink raw reply
* Re: [PATCH v2] crypto: sun4i-ss: support the Security System PRNG
From: Corentin Labbe @ 2016-12-05 12:57 UTC (permalink / raw)
To: Herbert Xu
Cc: davem, maxime.ripard, wens, linux-kernel, linux-crypto,
linux-arm-kernel
In-Reply-To: <20161205123705.GA10732@gondor.apana.org.au>
On Mon, Dec 05, 2016 at 08:37:05PM +0800, Herbert Xu wrote:
> On Mon, Dec 05, 2016 at 11:48:42AM +0100, Corentin Labbe wrote:
> > From: LABBE Corentin <clabbe.montjoie@gmail.com>
> >
> > The Security System have a PRNG.
> > This patch add support for it as an hwrng.
> >
> > Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
>
> Please don't add PRNGs to hwrng. If we have existing PRNGs in
> there please let me know so that we can remove them.
>
So how to expose PRNG to user space ? or more generally how to "use" a PRNG ?
I found hisi-rng, crypto4xx_ and exynos-rng which seems to be PRNG used as hwrng.
Regards
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox